Maintenance and Reliability Strategy Selection for Electromechanical Systems, Electromechanical Equipment and Components

Electromechanical equipment presents a maintenance challenge: the mechanical parts fail from use and abuse, whereas the electrical and electronic components may never fail during their service life but for Acts of God.



Hi Mike,

I have an interview for a job as a Reliability Electromechanical Maintenance Engineer. In the interview they will ask me how to pragmatically implement reliability in electromechanical maintenance work. Can you guide me, or give any suggestions, on how I can answer the question?



Hello Friend,

I will go back to reliability and maintenance fundamentals and build the case for how to do pragmatic maintenance on electromechanical systems and their components.

An electromechanical device is one made of mechanical parts that requires electricity to work. Electromechanical systems involve a symbiotic collection of mechanical and electrical elements working together to deliver the necessary function. Electromechanical equipment and controls are commonly used in industrial machinery, control equipment, and consumer products.

Examples of electromechanical equipment include:

  • Synchronous motors
  • Stepper motors
  • Assemblies like switches, solenoids, electric valve actuators
  • Servomechanisms like servomotors, positioning devices
  • Chart recorders and power meters
  • Automatic controls like relays, thermostats, heating/cooling controls

Examples of electromechanical systems include:

  • Mechatronic equipment like robots
  • Automated, distributed process controls like in a chemical process plant, unmanned off-shore plant and equipment
  • Equipment control panels e.g. production plant control rooms
  • Optical telescope and radio telescope positioning mechanism
  • Computer numerically controlled (CNC) metal and wood working machinery
  • Boiler gas, steam and water controls

These devices and systems have both mechanical and electrical parts, so they suffer from both mechanical and electrical failures.

Mechanical component failures are likely to be wear related if items are used a lot or used harshly, and age related as the item gets old (e.g. corrosion, fatigue failures). Early life mechanical failures after installation and intrusive maintenance will most likely be caused by human error (though manufacturing errors, themselves probably caused by human factors, may also occur, as may commissioning mistakes). Mechanical component failure during normal operating life is likely to be due to a huge overload event in a localised area of the atomic structure, the cumulative effect of many high localised stress incidents, or rapid component degradation from poor local environment factors, e.g. high temperature, dust, water, process chemical attack, etc.

Electrical component failures are likely to be random events due to voltage instability, current surges, excess temperature, contamination ingress, moisture ingress, circuit wiring errors, electric circuit component failures, along with many other electrical and electronic item failure causes (including human error).

Maximising the reliability of electromechanical systems requires minimising the probability of their equipment and component failures. To successfully support an electromechanical system and deliver high reliability you need to adopt four practices:

  • apply impeccable quality control during installation, during intrusive maintenance and when commissioning,
  • always operate system equipment and components at steady, stable duty substantially below the design loads and stresses,
  • sustain the correct health of the immediate environment surrounding electromechanical items, and
  • proactively replace high likelihood-of-failure components before system failure rates rise.

The first practice to create highly reliable electromechanical systems requires that you have a quality assurance process. The aim of the quality assurance is to ensure your electromechanical parts and assemblies meet the manufacturing and installation standards which produce the component reliability required for the system reliability you need. Along with that you will need your people to actually follow the quality assurance process and to continually improve it.

The second practice will minimise stress related degradation. It needs equipment to be run steadily with no pressure surges, no temperature fluctuations, no vibration, no forceful impacts, and the steady control of all the other Physics of Failure factors that affect your electromechanical equipment.

The third practice requires that you have operating and maintenance procedures that keep the immediate environment around electromechanical equipment in the condition that maximises reliability during its service life. To these must be added the appropriate training of engineers, operators and maintainers so they know, and can faithfully deliver, the necessary health standards for high electromechanical system reliability.

The last practice requires that you monitor the failure rates of electromechanical parts. When a rate starts to climb beyond what you want, you replace all similar components that have suffered similar operating life conditions in a well-focused and well-planned maintenance campaign. This requires that you capture the replacement dates and reasons for replacement of all the types of electromechanical parts in your operation and trend their failure rates by failure cause. Before the frequency of failure becomes too high you replace all similar items suffering the same level of operating risk.
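The trending described above can be sketched in a few lines of Python. This is only a minimal illustration: the log entries, the part-type names, the "sticking plunger" failure cause and the one-failure-per-year limit are all made-up assumptions, and a real operation would pull this data from its maintenance management system.

```python
from collections import Counter
from datetime import date

# Hypothetical replacement log: (part type, replacement date, failure cause).
# These entries are invented illustration data, not real plant records.
REPLACEMENT_LOG = [
    ("solenoid 15-40mm", date(2020, 3, 4), "burnt coil"),
    ("solenoid 15-40mm", date(2022, 2, 9), "burnt coil"),
    ("solenoid 15-40mm", date(2022, 11, 30), "sticking plunger"),
    ("solenoid 50-80mm", date(2022, 5, 17), "burnt coil"),
]


def failures_by_year_and_cause(log, part_type):
    """Count replacements of one part type, keyed by (year, failure cause)."""
    counts = Counter()
    for part, replaced_on, cause in log:
        if part == part_type:
            counts[(replaced_on.year, cause)] += 1
    return counts


def years_over_limit(counts, max_failures_per_year):
    """Return the years whose total failures (all causes) exceed the limit."""
    totals = Counter()
    for (year, _cause), n in counts.items():
        totals[year] += n
    return sorted(y for y, n in totals.items() if n > max_failures_per_year)


counts = failures_by_year_and_cause(REPLACEMENT_LOG, "solenoid 15-40mm")
flagged = years_over_limit(counts, max_failures_per_year=1)
print(flagged)  # years in which the assumed limit was exceeded
```

Keeping the failure cause in the key is the point of the sketch: it lets you see whether a rising total comes from one cause that a focused campaign could remove.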

The four-factor approach described above covering design, installation, operation and maintenance will put you on the right path to long-life, high reliability electromechanical systems.

Your question asks how to do pragmatic maintenance of electromechanical systems to create reliability. Pragmatic means adjusting for the practical, day-to-day reality. In other words, we want to use our daily maintenance efforts to produce high reliability in our electromechanical systems.

Reactive maintenance does not fit this requirement. If you do maintenance when an electromechanical item fails you are maintaining reactively, from failure to failure. You will need to carry many spares. You will suffer excessive downtime as equipment waits for the handover, the parts and the maintainer to arrive and correct the problem. The maintenance will be done poorly and mistakes will be many because everyone is under time and production pressure.

If you want high reliability you need to use your maintenance to reduce the risks of failure and not to fix failures. This requires that you initiate proactive maintenance and act to prevent all possible failures before the operational risk from an electromechanical item failing gets too great.

For example, you establish a monitoring program of the failures of solenoid valves in size ranges 15mm to 40mm and 50mm to 80mm. You view the data each month and notice that where the 15mm to 40mm solenoids were failing at a rate of one every three years when the plant and equipment were younger, they have now reached a failure rate of one every two years. What do you do? Do you increase your spares holding and replace each item on failure (that is reactive maintenance), or do you replace them all in a block of planned maintenance and renew all old 15mm to 40mm solenoids? If you have 100 such solenoids in your plant you will have a major maintenance cost and a major production disruption. (If you had only 10 such solenoids you might justify a campaign to replace them one-by-one as each can be accessed on a planned outage. If you could do all ten at once you would.) Fortunately, we have yet to investigate the risk of solenoid valve failure.

Your solenoid valve failure monitoring program is also categorised by failure mode (what you see when an item fails) and by the plant area of the item that failed. You can identify that the failures have historically been in a very damp part of the operation and the failure modes are mostly burnt coils due to moisture ingress (e.g. in a beverage manufacturer, a process chemical plant, an off-shore facility, etc.). It is clear that not all of the 100 solenoids are at risk; only those in damp operational areas are. You only need to campaign those solenoids in damp areas that are of the same design, have the same Ingress Protection (IP) rating or lower, and are of similar operational age. That is a pragmatic approach to maintenance work selection.
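The selection rule in the paragraph above can be expressed as a simple filter over an asset register. The sketch below is an assumption-laden illustration: the valve tags, the design names, the ages and the age threshold are hypothetical, and comparing only the second (water ingress) digit of the IP code is a simplification of how IP ratings are specified.

```python
from dataclasses import dataclass


@dataclass
class SolenoidValve:
    tag: str            # plant tag number (hypothetical naming)
    design: str         # design or model family
    water_digit: int    # second digit of the IP code (water ingress protection)
    in_damp_area: bool  # True if installed in a damp operational area
    age_years: float    # operational age


def campaign_candidates(valves, failed_design, failed_water_digit, min_age_years):
    """Select valves sharing the risk profile of the ones that failed.

    Simplification: a lower water digit is treated as weaker sealing,
    and "similar age" is reduced to a minimum age threshold.
    """
    return [
        v for v in valves
        if v.in_damp_area
        and v.design == failed_design
        and v.water_digit <= failed_water_digit
        and v.age_years >= min_age_years
    ]


# Invented register entries for illustration only.
valves = [
    SolenoidValve("SV-101", "type A", 4, True, 9.0),
    SolenoidValve("SV-102", "type A", 4, False, 9.0),  # dry area
    SolenoidValve("SV-103", "type A", 7, True, 9.0),   # better sealed
    SolenoidValve("SV-104", "type A", 4, True, 1.5),   # recently renewed
    SolenoidValve("SV-105", "type B", 4, True, 9.0),   # different design
]

selected = campaign_candidates(valves, "type A", 4, min_age_years=8.0)
print([v.tag for v in selected])
```

Each condition in the filter removes valves that do not share the failure risk, which is how the campaign shrinks from the whole population to the items that actually need attention.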

We can still do other pragmatic maintenance to maximise electromechanical system reliability. If we can control the environment immediately in contact with the solenoid valves and keep the moisture out of the coils we will have removed the scientific root cause of the failure. We probably cannot address the latent root cause of the failures, which was why the company allowed solenoid valves with such an IP rating to be used in damp areas in the first place. We can only do what we can in the situation, and that will be to prevent moisture ingress. For our maintenance to be pragmatic in this situation we set up a campaign to improve the sealing of all solenoids in damp areas against moisture ingress. To that we add a planned preventive maintenance route for valves at high risk of moisture ingress, in which their sealing is methodically inspected every six months and kept in good condition. Methodical inspection would require clear pass/reject criteria for the sealing and a record of the condition the valve was found in and the condition it was left in. These days I would also expect before and after photographs of each item as part of its historic maintenance record (thermographic images probably should be included too).
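One way to picture the inspection record described above is as a small data structure holding the as-found and as-left condition against each pass/reject check. This is a sketch under stated assumptions: the check names, the valve tag and the photo file names are hypothetical, and your own criteria would come from the valve manufacturer and site standards.

```python
import calendar
from dataclasses import dataclass, field
from datetime import date


@dataclass
class SealInspection:
    valve_tag: str
    inspected_on: date
    as_found: dict   # check name -> True (acceptable) or False (defect)
    as_left: dict    # same checks, recorded after any rectification
    photos: list = field(default_factory=list)  # before/after image file names


def verdict(checks):
    """A valve passes only if every individual check is acceptable."""
    return "pass" if all(checks.values()) else "reject"


def add_months(d, months):
    """Add calendar months, clamping the day to the end of short months."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


# Hypothetical record: the check names below are illustrative criteria only.
record = SealInspection(
    valve_tag="SV-101",
    inspected_on=date(2023, 8, 31),
    as_found={"cover gasket intact": False, "cable gland tight": True},
    as_left={"cover gasket intact": True, "cable gland tight": True},
    photos=["SV-101_before.jpg", "SV-101_after.jpg"],
)

print(verdict(record.as_found), verdict(record.as_left))
print(add_months(record.inspected_on, 6))  # next six-monthly inspection date
```

Recording each check separately, rather than a single overall tick, is what lets you trend which sealing defects keep recurring.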

I have not yet discussed the use of condition monitoring. That is because a predictive maintenance strategy, which makes use of condition monitoring technologies and techniques, does not produce reliability improvement. Reliability requires ways to reduce the rate of failure. Going from a failure once every two years, to once every three years, to once every five years is reliability improvement. Predictive maintenance does not increase the time between failures; it only finds the failure, hopefully well before it becomes a breakdown.

Condition monitoring would increase reliability if it were used to identify the failure rate and to find the root cause of the failure. Once the failure rate is known we use the operating risk management options discussed above to renew components, install more robust assemblies, and even replace entire machines with more reliable ones. Once condition monitoring helps us to find the root cause of failure we remove the cause. It is removal of the failure root cause that produces reliability, not the condition monitoring.

All of the above depends on one other very pragmatic issue: is it going to make money for the business? All maintenance and reliability decisions need to be financially sound for the future wellbeing of the business. If your plant is going to be decommissioned in six months you would not do most of what I suggested above. Only the additional sealing campaign might be justifiable. It bothers me greatly when maintenance decisions are made without a sound financial analysis of the available choices. The right maintenance to do is that which makes the most money for the business. That includes, if necessary, maintenance people training operators and engineers in better skills and practices.
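A very rough way to compare the choices financially is to estimate the cost of each over the remaining life of the plant. The sketch below is illustrative only: every figure is a made-up assumption, and it ignores discounting, spares holding costs and other items a proper life cycle analysis would include.

```python
def run_to_failure_cost(failures_per_year, years_remaining, cost_per_failure):
    """Expected cost of fixing items only when they fail."""
    return failures_per_year * years_remaining * cost_per_failure


def campaign_cost(campaign_price, residual_failures_per_year,
                  years_remaining, cost_per_failure):
    """Campaign price plus the cost of the failures that still occur afterwards."""
    residual = residual_failures_per_year * years_remaining * cost_per_failure
    return campaign_price + residual


def cheaper_option(years_remaining, failures_per_year=0.5,
                   residual_failures_per_year=0.1,
                   cost_per_failure=20_000, campaign_price=15_000):
    """Compare the two options; all default figures are hypothetical."""
    reactive = run_to_failure_cost(failures_per_year, years_remaining,
                                   cost_per_failure)
    planned = campaign_cost(campaign_price, residual_failures_per_year,
                            years_remaining, cost_per_failure)
    return "campaign" if planned < reactive else "run to failure"


print(cheaper_option(years_remaining=0.5))  # plant closing in six months
print(cheaper_option(years_remaining=10))   # plant with a long life ahead
```

With these assumed figures the campaign does not pay for itself when the plant closes in six months, but it does over a ten year horizon, which mirrors the decommissioning point made above.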

Electromechanical equipment is used everywhere, and increasingly so. The most beneficial pragmatic maintenance strategy to use for electromechanical systems is to keep their components safe from the risks that cause their failure. Overriding that recommendation is the need that all activities you do in maintenance must make the most life cycle profit for the business.

I hope that the information above is helpful.


My best regards to you,

Mike Sondalini
Managing Director
Lifetime Reliability Solutions HQ