Find the right timing to perform preventive maintenance work

Selection of the correct interval to perform a preventive maintenance task is, by far, the most difficult job confronting the maintenance technician and analyst. We need to understand how physical processes and materials change over time, and how those changes ultimately lead to what we call failure modes. Understanding how failure rates can vary as a function of time is essential, and to work toward a solution we enter the world of statistical analysis.

The task selection process should establish at the outset whether we know the age-reliability relationship for the specific failure mode in question. If we know the age-reliability relationship, then we also have the information needed to select the interval for a time-directed (TD) task. That is, we have the failure density function (fdf) for the failure mode population, and we can select the task interval from the statistical knowledge by deciding on the level of consumer risk that we want to accept.

Suppose, for example, that the fdf looks like a bell-shaped curve where the x-axis is operating time and the y-axis is probability of failure. The left-hand tail may be quite long, thus signifying an extended period of time during which the probability of failure is quite small and, for all practical purposes, the item is in a constant failure rate condition.

However, as we proceed to the right, or as we see the probability of failure beginning to increase as additional operating time is accumulated, we can decide how far we want to proceed before doing the TD task. And this is where the level of consumer risk comes into play. We can pick that level of risk by selecting the percentage of area under the fdf that we can tolerate before taking action.

Say we choose 15%. This means that there is a 15% chance that the failure mode could occur before we take the preventive actions. We can choose any percentage value, but decreasing risk leads to more frequent PM actions and higher PM costs.
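The risk-based choice described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it treats the bell-shaped fdf as a normal distribution, and the mean life (10,000 hours) and standard deviation (1,500 hours) are hypothetical values, not data from the text. The function name td_interval is likewise made up for this example.

```python
from statistics import NormalDist


def td_interval(mean_life, std_dev, risk):
    """Operating time by which a fraction `risk` of the population
    is expected to have failed, assuming a normal (bell-shaped) fdf."""
    if not 0.0 < risk < 1.0:
        raise ValueError("risk must be strictly between 0 and 1")
    # inv_cdf returns the time t at which the area under the fdf equals `risk`
    return NormalDist(mu=mean_life, sigma=std_dev).inv_cdf(risk)


# Hypothetical failure mode: mean life 10,000 h, standard deviation 1,500 h
for risk in (0.05, 0.15, 0.50):
    hours = td_interval(10_000, 1_500, risk)
    print(f"accepted risk {risk:.0%}: do the TD task at about {hours:,.0f} h")
```

Lowering the accepted risk moves the interval to the left along the time axis, which is the trade-off between risk and PM frequency noted above.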

Notice that if we use the mean (or MTBF) for the bell-shaped fdf, there is a 50% chance of failure before we take preventive actions. For other fdfs, the chance of failure can be as large as 67% when the mean is used. This is not an acceptable level of risk in most circumstances—hence, using an MTBF value is not really a valid and useful technique for selecting task intervals.
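To see why the mean is a poor choice of interval, the short sketch below evaluates the cumulative probability of failure at the mean for two textbook distributions. The normal case stands in for the bell-shaped fdf. The exponential case is an assumption added here as one example of "other fdfs"; the text does not say which distributions it has in mind. The numeric parameters are hypothetical.

```python
import math
from statistics import NormalDist


def risk_at_mean_normal(mean_life, std_dev):
    """Probability of failure before t = mean for a normal fdf."""
    return NormalDist(mean_life, std_dev).cdf(mean_life)


def risk_at_mean_exponential(mtbf):
    """Probability of failure before t = MTBF for an exponential fdf,
    using F(t) = 1 - exp(-t / MTBF)."""
    return 1.0 - math.exp(-mtbf / mtbf)


print(f"normal fdf:      {risk_at_mean_normal(10_000, 1_500):.1%}")
print(f"exponential fdf: {risk_at_mean_exponential(10_000):.1%}")
```

For the symmetric bell curve the result is exactly one half. For the exponential case it is 1 - 1/e, or roughly 63%, whatever the MTBF value.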

The foregoing discussion has briefly outlined the ideal situation for selecting task intervals. This ideal is not encountered often, because we usually do not have sufficient data from operating experience to define the fdf. So let’s discuss what we can do in the non-ideal situations more commonly encountered.

The first situation is one wherein we have a partial knowledge of the age-reliability relationship. This means that the failure cause information on the FMEA leads us to conclude that aging or wearout mechanisms are at play. Or perhaps we have some operating experience to support the conclusion that aging/wearout mechanisms exist. But, in either case, we do not have any statistical data to define when this would be expected to occur.

So we tend to use our experience to guess at a task interval for the TD actions. In so doing, there is overwhelming evidence to show that this process is highly conservative. That is, we tend to pick intervals that are way too short. We might overhaul a large electric motor every three years when, in reality, the correct interval turns out to be 10 years.

The second situation is one in which we have no idea what the age-reliability relationship might be, and we are now moving on to look for candidate condition-directed (CD) tasks. If the failure mode is hidden, we also extend our search to include candidate failure-finding (FF) tasks. These tasks, too, must have intervals specified for the non-intrusive data acquisition and inspection actions that must be accomplished. And, here again, the statistical basis for specifying these intervals is usually missing, so we guess at what they will be, usually with great conservatism. The refinement technique described next, Age Exploration, is therefore useful with CD and FF tasks as well as with TD tasks.

When good statistical data is not available, using our experience to guess at task intervals is really the only option that is available to us initially. But there is a proven technique that we can employ to refine that “guesstimate” over time, and to predict more accurately the correct task interval. It is called Age Exploration, or AE. The AE technique is strictly empirical, and works like this (using a TD task for illustrative purposes).

Say our initial overhaul interval for a fan motor is 3 years. When we do the first overhaul, we meticulously inspect and record the as-found condition of the motor and all of its parts and assemblies where aging and wearout are thought to be possible. If our inspection reveals no such wearout or aging signs, when the next fan motor comes due for overhaul we automatically increase the interval by 10% (or more), and repeat the process, continuing until, on one of the overhauls, we see the incipient signs of wearout or aging. At this point, we stop the AE process, perhaps back off by 10%, and define this as our final task interval.
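The AE loop just described can be written down as a small Python sketch. It is a simplified model under stated assumptions: each overhaul is reduced to a single yes/no finding (were incipient wearout signs seen?), the step and back-off are both fixed at 10%, and the function name age_exploration and the sample findings are hypothetical, made up for illustration.

```python
def age_exploration(initial_interval, wear_found, step=0.10, back_off=0.10):
    """Walk through successive overhaul findings and adjust the interval.

    wear_found: iterable of booleans, one per overhaul, True when the
    as-found inspection shows incipient wearout or aging.
    Returns (interval, history, settled); settled is True once wear was seen.
    """
    interval = initial_interval
    history = [interval]
    for found in wear_found:
        if found:
            # Wearout signs seen: stop exploring and back off
            interval *= 1.0 - back_off
            history.append(interval)
            return interval, history, True
        # No wearout signs: extend the interval for the next overhaul
        interval *= 1.0 + step
        history.append(interval)
    return interval, history, False


# Hypothetical fan motor: 3-year start, three clean overhauls, then wear found
final, history, settled = age_exploration(3.0, [False, False, False, True])
print([round(years, 2) for years in history], "settled:", settled)
```

In practice the step size, the back-off, and the judgment of what counts as incipient wear are engineering decisions; the sketch only captures the bookkeeping.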

Figure 1 illustrates how this AE process was successfully used by United Airlines for one of their hydraulic pumps. On the top half of Figure 1, we see that the overhaul interval started at about 6,000 hours, and that the AE process was then employed over a four-year period to extend the interval to 14,000 hours.

The bottom half of Figure 1 presents a second very interesting statistic for the same population of pumps over the same four-year interval. The statistic is premature removal rate (or the rate at which corrective maintenance actions were required). The interesting point here is that the premature removal rate has a definite decreasing value over the four-year period where the overhaul interval was increasing. We interpret this to suggest that as the amount of human handling and intrusive overhaul maintenance actions decreased, so did the human error resulting from such actions, with the net effect that corrective maintenance actions likewise decreased.

Printed with permission from Butterworth-Heinemann, a division of Elsevier, from RCM--Gateway to World Class Maintenance, by Anthony M. Smith, AMS Associates Inc. in California, and Glenn R. Hinchcliffe, Consulting Professional Engineer, G&S Associates Inc. in North Carolina. Copyright 2004.
