Choosing condition monitoring inspection intervals for Reliability Centered Maintenance.

Once you select a predictive maintenance strategy in an RCM analysis, you will apply condition monitoring to the component. But how do you set the condition monitoring inspection frequency without knowing the component's failure history?



Hi Mike,

While preparing an RCM draft for Boiler Feed Water Pumps I found some ambiguity. The end result of RCM is a proposed task with a proposed inspection interval. How can we propose intervals without statistical analysis?



Hello Fahad,

Once you decide to use a predictive maintenance strategy you need to select a suitable condition monitoring technique and choose the condition monitoring technology that will spot the start of a failure. You also need to set an inspection frequency that finds early evidence of failure and leaves you time to plan and organise the work and do the rectification.


You are concerned that condition monitoring frequency should be set from a statistical analysis of component failure data that determines the probable frequency of component failure. That is not usually possible in an operating plant. To get valid data for a part you need about ten failure points per component failure mode. For much equipment that volume of data would take many decades to collect on a working plant. By the end of that time the plant would most likely not be operating as it was at the start, so data collected across the period would be suspect.

Nonetheless, we can take an example of a Weibull plot statistical analysis, like the one below, and see the value in having enough failure data points to select component condition monitoring frequency.

Weibull plot example of probable component life

The Weibull graph uses 12 historic failures to predict that 1 percent of all such items fail by 50 days and 99 percent fail by 900 days from a particular failure mode in a particular application. Notice that the frequency of failure increased as the component aged. The Weibull plot warns us that the component can fail in 50 days and it can also fail in 900 days. We do not know what the actual life of the component now in the equipment will be. How long a part survives depends on the stress levels in its material of construction. Higher stresses lead to a shorter life. Lower stresses lead to a longer life.
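As a rough illustration of how those two percentile figures pin down the curve, the sketch below backs out the shape and scale of a Weibull distribution from the 1 percent and 99 percent points quoted above. It assumes the plot is a two-parameter Weibull and that 50 days and 900 days lie exactly on the fitted line. The function names are illustrative and do not come from any particular reliability package.

```python
import math


def weibull_from_two_points(t1, f1, t2, f2):
    """Return (shape, scale) of a two-parameter Weibull through two CDF points.

    Uses the linearised form ln(-ln(1 - F)) = shape * (ln t - ln scale).
    Assumption: both points lie exactly on the fitted Weibull line.
    """
    y1 = math.log(-math.log(1.0 - f1))
    y2 = math.log(-math.log(1.0 - f2))
    shape = (y2 - y1) / (math.log(t2) - math.log(t1))
    scale = t2 * math.exp(-y2 / shape)
    return shape, scale


def weibull_cdf(t, shape, scale):
    """Fraction of items expected to have failed by age t."""
    return 1.0 - math.exp(-((t / scale) ** shape))


# 1 percent failed by 50 days, 99 percent failed by 900 days
shape, scale = weibull_from_two_points(50.0, 0.01, 900.0, 0.99)
print(f"shape = {shape:.2f}, scale = {scale:.0f} days")
```

Under these assumptions the shape comes out at about 2.1 and the scale at roughly 440 days. A shape above 1 means the failure rate rises with age, which is consistent with the failures becoming more frequent as the component aged.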

From the statistical analysis of component failure data you might think to establish a two-month inspection interval. But statistical analysis alone is not sufficient for choosing the inspection frequency. What if the P-F interval is two weeks long? (The P-F interval is the time window from detection of a Potential failure to the point of Functional failure, when the equipment can no longer be used.) With a two-monthly sampling frequency and a two-week P-F period you will have many breakdowns. A two-month inspection frequency needs a P-F interval that is much longer than two months. In other words, a statistical analysis of component life has little value in setting condition monitoring frequency, though it can help you decide when to start taking condition monitoring samples.
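The arithmetic behind that warning can be sketched as follows. The sketch assumes that a detectable defect is equally likely to start at any moment between two inspections, and that an inspection always finds a defect that is present. Both are simplifications, and the function name is illustrative.

```python
def fraction_caught(pf_interval_days, inspection_interval_days):
    """Upper bound on the share of developing failures that are inspected
    at least once before functional failure.

    Assumptions: defect onset is uniformly random relative to the
    inspection schedule, and detection is perfect when a defect exists.
    """
    if pf_interval_days <= 0 or inspection_interval_days <= 0:
        raise ValueError("intervals must be positive")
    return min(1.0, pf_interval_days / inspection_interval_days)


# Two-week P-F interval with two-monthly inspections (taken here as 60 days)
caught = fraction_caught(14, 60)
print(f"share of failures seen before breakdown: {caught:.2f}")
```

With these figures the result is 14/60, or about 0.23, so roughly three in four developing failures would reach functional failure without any inspection falling inside the warning window. The real result would be worse, because a defect found late in the window leaves no time to repair it.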

Do not be concerned about not having failure mode data for a statistical analysis of component condition sampling frequency. Failure data is not important for setting the condition monitoring inspection cycle. The real concerns are explained below.

It is reasonable to assume that equipment fails when any of its working parts fail. First a working part fails then the equipment stops working. If an item of plant has many working parts there is regular opportunity for one of them to fail.

This view of failure lets us assume equipment failure will occur randomly throughout the equipment’s life. In fact there are three zones of failure (early life, random and end-of-life), with the random period having the lowest failure frequency and the least predictable timing of failure. Once failure is random you must adopt a predictive maintenance strategy and monitor component condition all the time. The requirement for continuous monitoring explains why more and more equipment manufacturers supply their products with on-board monitoring sensors, and why more and more operations are retrofitting online continuous monitoring to critical equipment.

If continuous monitoring of an equipment component is not possible, you revert to sampling component condition at a set frequency.

Condition monitoring sample frequency must take account of 1) the P-F interval, 2) how early the condition monitoring technology can detect a potential failure (e.g. do you use a temperature gun or vibration analysis to find failing roller bearings? A temperature gun might give you a few days' warning prior to failure, whereas vibration analysis should give you a few months' warning), 3) the length of time taken to organise and effect the corrective maintenance on the failing component, and 4) the likelihood that an inspection will not find the evidence of an initiated failure (even vibration analysis will miss some impending failures).

It makes sense to sample at least twice during the P-F interval. The second inspection is confirmation that the first inspection result was correct. If the second inspection confirms the part is failing you still need enough time left to do the corrective maintenance before the equipment becomes unusable. If you carry all necessary parts on site the repair can be done quickly. If parts need to come from overseas the repair could be weeks away. For large equipment, like crushers, mills, kilns and their drives, some components can take 18 to 24 months to fabricate and transport to site.
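One way to turn this reasoning into a number is sketched below. It assumes the usable detection window is the P-F interval less the time needed to organise and carry out the repair, and that this window must hold the chosen number of inspections. The function name and the example figures are hypothetical and are not taken from the boiler feed water pump analysis.

```python
def max_inspection_interval(pf_days, repair_lead_days, inspections_in_window=2):
    """Longest inspection interval that still fits the required number of
    inspections into the P-F interval while leaving time for the repair.

    Returns None when the repair lead time uses up the whole P-F interval,
    meaning no sampling interval can protect against this failure mode.
    """
    if inspections_in_window < 1:
        raise ValueError("need at least one inspection in the window")
    usable_days = pf_days - repair_lead_days
    if usable_days <= 0:
        return None
    return usable_days / inspections_in_window


# Hypothetical example: 90-day P-F interval, 30 days to get parts and repair
print(max_inspection_interval(90, 30))   # parts held on site or nearby
print(max_inspection_interval(14, 21))   # repair takes longer than the warning
```

In the first case the two inspections must fit into 60 usable days, giving an interval of 30 days. In the second case the function returns None, which is the situation where a long parts lead time makes sampled condition monitoring unable to prevent the breakdown unless spares are held or a technique with earlier warning is used.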

Basically, you want to be sure to find a potential failure and have time to correct it before the equipment can no longer be used. The key drivers in setting component condition monitoring sampling frequency are the duration of the P-F window and the certainty that the condition monitoring technique will detect a problem when sampling. If the inspection technique has poor certainty of finding a problem, you want more samples during the P-F interval than you would need with a technique that has high certainty of finding a potential failure condition.
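The link between detection certainty and the number of samples can be sketched with a simple probability model. It assumes each inspection inside the P-F interval has the same chance of finding the defect and that inspections are independent of one another, which real inspections only approximate. The function name and the probabilities used are hypothetical.

```python
import math


def inspections_needed(detect_prob, target_confidence):
    """Smallest number of inspections inside the P-F interval so that the
    chance of at least one detection reaches the target.

    Assumption: each inspection independently finds the defect with
    probability detect_prob, so P(miss every time) = (1 - detect_prob) ** n.
    """
    if not 0.0 < detect_prob <= 1.0:
        raise ValueError("detect_prob must be in (0, 1]")
    if not 0.0 < target_confidence < 1.0:
        raise ValueError("target_confidence must be in (0, 1)")
    if detect_prob == 1.0:
        return 1
    n = math.log(1.0 - target_confidence) / math.log(1.0 - detect_prob)
    return max(1, math.ceil(n))


print(inspections_needed(0.9, 0.95))   # high-certainty technique
print(inspections_needed(0.6, 0.95))   # low-certainty technique
```

Under these assumptions a technique that finds the defect 90 percent of the time needs two inspections in the P-F interval to reach 95 percent confidence, while a technique that finds it 60 percent of the time needs four.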

You can do many component condition monitoring inspections and find nothing wrong. It will cost you a lot of money year after year without finding an impending failure. But you cannot stop doing the sampling even if there are no component problems. You do the condition monitoring to prevent a failure that will impact your operation badly should it happen. The cost of the condition monitoring is insurance against a far greater cost from a failure. The only time you do not need condition monitoring is if you practice world class precision maintenance.

The ability to do statistical analysis of failure data to set condition monitoring inspection frequency is unimportant. You set component condition inspection frequency using the likely P-F period and raise the corrective maintenance work order in time to rectify the problem before functional failure. Statistical analysis of component failure data is useful for setting the start date of a con-mon inspection routine. When the data analysis proves with high certainty that the first failure will not occur until 50 days after a part is replaced, you can then use that knowledge to set the date for the first time a con-mon sample is taken. It will be 50 days minus the most likely P-F window duration.
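The start-date rule is simple subtraction, sketched below. The 50 days comes from the Weibull example, while the two-week P-F window is only an illustrative value and the function name is hypothetical.

```python
def first_sample_day(earliest_failure_day, pf_days):
    """Day after part replacement on which to take the first condition sample.

    If the P-F window is longer than the failure-free period, sampling
    starts immediately (day 0).
    """
    return max(0, earliest_failure_day - pf_days)


# First failure not expected before day 50, assumed P-F window of 14 days
print(first_sample_day(50, 14))
```

With those figures the first sample is taken on day 36 after the part is replaced, and the regular sampling interval applies from then on.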

I hope that the above information is useful to you.

My best regards to you,

Mike Sondalini
Managing Director
Lifetime Reliability Solutions HQ