Your company is a dynamic, responsive system that reacts to changes in unknown and unknowable ways. If you try to solve the problems listed in a FRACAS with point-in-place and point-in-time changes, you can cause impacts across time and space that you never saw coming, while the original problem still returns.
For successful problem solving you must know whether you have a common cause or a special cause situation. You fix a common cause event with holistic business system design changes, while you must stop a special cause event from entering your company.
I was looking for information on FRACAS and Failure Codes and came across a PDF article of yours called Never use FRACAS like this.
I was intrigued by it, especially because ISO 55001 has a full paragraph on Improvement based on identifying and correcting failure modes. It seems that you have had some positive experiences in this field with your approach.
Do you have more information on this that you are willing to share?
Best regards, Riaan
Just because the ISO 55001 asset management standard says you must fix root causes, it does not automatically follow that a FRACAS (the acronym for Failure Reporting, Analysis and Corrective Action System) is the best methodology to adopt to address your failures.
Fundamentally, a FRACAS is a list of your operating plant failures and production troubles that need to be corrected, coupled with some sort of root cause analysis technique. I have five concerns with companies using a FRACAS to eliminate business problems. With a change in perspective and a change in your investigative methods you can address those concerns and make a FRACAS useful to have.
The first concern is that a FRACAS does not separate the two types of problems: common and special. Common-type problems are generated internally by the system and need holistic business process design changes to solve. Special-type problems are random effects introduced from outside the system and are solved by stopping the causes of the event.
Can you tell me which problems in a FRACAS list are of the common type and which are of the special type? If you can’t sort each item in the FRACAS into its correct problem type, you will treat every problem as a special cause and try to fix it as if it were a special-type incident. You will make changes that disrupt your business without stopping the true causes.
That brings us to the second concern with FRACAS. Your business is a system that behaves like an organism. When you tamper with a business you are working on a system. Your company is “alive” and reacts to what is done to it. If your FRACAS leads you to make a change, the change will be made to one part of the system. A change anywhere in a system has repercussions throughout the system. Do you know all the business-wide effects on your company performance when you act to address each FRACAS problem?
Thirdly, every common-type problem has multiple root causes embedded throughout your business system. It’s normal to expect from four to a dozen contributing issues with every problem. A FRACAS makes you do something to address the problem so you can close the record and get it off the FRACAS list. Any proposal that looks acceptable will be approved. But if the problem had a dozen root causes throughout your company, and you addressed just one or two of them so you could close the FRACAS investigation and call it “completed”, the problem will reappear some time later because its other 10 or 11 causes are still alive in your business to create more problems.
The fourth danger with using a FRACAS is that once the people doing the improvement project arrive at a proposal, they will implement their suggestions in isolation. They will meddle with a system and introduce a change at one point in time and one place in an interconnected process. They make point changes to a holistic system whose integrated behaviours they don’t fully understand. Thus the changes made are based totally on guesswork in circumstances of massive ignorance.
Fifth, a FRACAS does not acknowledge the presence of a system. When deciding how to address a gearbox bearing failure listed in your FRACAS you will need to consider the complete life cycle of the bearing, the gearbox and the machine they drive, since the bearing and the gearbox are parts in the machine system. The machine itself is part of the production system. The production system is part of the business system. The equipment, its subassemblies and its parts started life on the drawing board in a design office. The bearing failure is the end result of all that happened to the bearing, gearbox and machine across their lifetimes. Over a lifetime it is not surprising to find 10 to 12 contributing factors to the failure when you investigate the range of life-cycle opportunities to make mistakes. If all you do is replace the bearing and put the gearbox back in service, you have not prevented another failure. Everything that was done wrongly to the original bearing, gearbox and machinery during their lifetimes remains in place, and the replacement bearing will fail in the same way as its predecessor.
Because a FRACAS does not recognize that your plant, and indeed your company, is a system; because it does not separate common and special events; because it pushes you to close records without addressing the total range of a problem’s causes; because every isolated process change can affect and disrupt the system-wide operation and long-term health of a company, while never actually preventing the problem from recurring; because you use point-in-time and point-in-place solutions in dynamic systems where they cause unimagined effects, using a FRACAS has a good chance of leading you to do more harm than good. You can easily waste a lot of time and money on solutions that will not work sustainably.
The correct approach to use when addressing problems is to first identify the problem type. If it is a common-type problem, do nothing except collect more data and history on it so you can find all its roots. Once you know all the system-wide causes of the problem, address them with a holistic, process re-engineering solution. If the problem is a special-type, then go and find its causes and stop them so the problem can never again come into your company.
The typical method used to identify special and common cause problems in a string of historic events is to take all those points within plus or minus three standard deviations of the average as being common cause events. Those events outside the upper and lower control limits are considered to be special cause problems. Once you solve the special cause issues the distribution of events narrows to what the process design naturally generates. The common events can only be stopped by changing the process to remove their causes.
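As a rough illustration of the three-standard-deviation rule just described, here is a minimal Python sketch. The function name classify_events and the monthly failure counts are made up for the example; they do not come from any real plant or FRACAS. The sketch uses the plain sample standard deviation of the history to set the limits, which is the simplest reading of the rule. Control chart practitioners often estimate the spread differently, for example from moving ranges, so treat this as a starting point only.

```python
from statistics import mean, stdev


def classify_events(values, sigma_limit=3.0):
    """Label each value 'common' or 'special' using average +/- sigma_limit * std dev.

    Hypothetical helper for illustration; the limits come from the data itself.
    """
    centre = mean(values)
    spread = stdev(values)  # sample standard deviation of the whole history
    lower = centre - sigma_limit * spread
    upper = centre + sigma_limit * spread
    labels = ["common" if lower <= v <= upper else "special" for v in values]
    return lower, upper, labels


# Made-up history: number of failures recorded in each of 20 months.
monthly_failures = [4, 5, 3, 6, 4, 5, 4, 3, 5, 4, 6, 5, 4, 3, 5, 4, 5, 4, 6, 30]

lower, upper, labels = classify_events(monthly_failures)

# Months (counted from 1) whose failure count falls outside the control limits.
special_months = [
    month for month, label in enumerate(labels, start=1) if label == "special"
]
print("Control limits:", round(lower, 2), "to", round(upper, 2))
print("Special cause months:", special_months)

# With the special cause events set aside, recompute the limits on what remains.
# The band narrows toward what the process design naturally generates.
common_only = [v for v, label in zip(monthly_failures, labels) if label == "common"]
new_lower, new_upper, _ = classify_events(common_only)
print("Limits without special causes:", round(new_lower, 2), "to", round(new_upper, 2))
```

The points flagged as special are the ones to trace back to an outside cause and block. Everything left inside the recomputed limits is the system’s own output, and only a change to the process design will shift it.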
Until your problem solving methodology adopts a holistic, life-cycle-long, system-wide perspective to understand the full extent of where your business problems started, you can be sure you will keep having business problems forevermore. That is why I wrote the article “Never use FRACAS like this”. A FRACAS is a dangerous technique to use to stop your business problems because it makes you choose point interventions to correct problems that extend throughout a dynamic, life-cycle-long, living system.
Let me know if you have any questions on the above.
All the best to you,
Lifetime Reliability Solutions HQ