The Five Common Mistakes of Reliability Engineers

We all screw up, and that’s okay.

In my years working as a Reliability Engineer, I have made many mistakes, felt frequent frustrations, and found myself stuck in the mud, unable to yank my boots free from unproductive methodology. I thought everything followed a rulebook, but that mindset failed me over and over. I had to learn how to solve problems on my own, and once I did, my productivity soared. As I watch other engineers stumble as I did, I have come to recognise five mistakes we all commonly make, and I’m compelled to steer others away from them.

Mistake 1 – Not knowing the problems you have

People don’t always realise they have an issue, which is a problem in itself: you cannot treat an illness you haven’t noticed. Alternatively, you know you have a problem, but mislabel and thus misdiagnose it. The illness goes untreated and you suffer the side-effects of the medicine without reward. I sort issues into two categories: chronic and acute. Acute issues are small flukes and one-offs, while chronic issues show up as trends and cannot be resolved with a simple fix. If you treat only the symptoms of chronic problems, as though they were acute problems, you’ll soon find yourself playing an endless game of whack-a-mole. You whack the problem here, another pops up somewhere else, you whack that, and the first one pops back up where you just fixed it. You’ll need more than a hammer to get rid of these “moles” altogether, and you’ll need to know how they’re thriving beneath the surface. Once you identify the presence of a problem and categorise it accordingly, you’ve made progress, but this is only the first in a series of corrections.
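As a rough illustration of the chronic-versus-acute sorting described above, here is a minimal Python sketch. It assumes failures are logged as (asset, failure mode) pairs and that anything recurring three or more times counts as chronic. The function name, the log format, the sample entries, and the threshold are all hypothetical choices made for this example, not a standard.

```python
from collections import Counter


def classify_failures(events, chronic_threshold=3):
    """Label each (asset, failure mode) pair as 'chronic' or 'acute'.

    events: iterable of (asset_id, failure_mode) tuples, e.g. from a
        work-order log (hypothetical format).
    chronic_threshold: assumed cut-off; a pair seen at least this many
        times is treated as a trend rather than a one-off.
    """
    counts = Counter(events)
    return {
        key: "chronic" if n >= chronic_threshold else "acute"
        for key, n in counts.items()
    }


# Made-up sample log, for illustration only.
log = [
    ("pump-1", "bearing failure"),
    ("pump-1", "bearing failure"),
    ("pump-1", "bearing failure"),
    ("fan-2", "belt snapped"),
]
labels = classify_failures(log)
```

A simple count like this only flags where a trend exists; it says nothing about why the failure keeps coming back.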

Mistake 2 – Misunderstanding the complications that occur

Many Reliability Engineers can identify the problems they have, but not why they’re happening. If problems are ignored, misunderstood, or dismissed as one-off incidents, they’ll continue to haunt you. To resolve a problem, you should follow a process called Root Cause Analysis (RCA). To summarise, you analyse patterns and data and keep asking “why” until you find the root cause. Maybe a machine part wasn’t the right part for the job. Maybe the machine wasn’t lubricated properly. Maybe the lubricant was contaminated. These root causes won’t become apparent on their own, but they can be found if you look deep enough for them.

So you found the root cause, and you apply a solution according to the rules you have been trained to follow. Did the problem go away? No? Then we come to our third error:

Mistake 3 – Not questioning the rulebook

You can’t expect different results by applying the same, tired solution over and over. When you have a problem that your regular methods fail to fix, you must find another way. Many Reliability Engineers repeat the same method simply because that is all they and their colleagues know. As Abraham Maslow said, “When all you have is a hammer, everything looks like a nail.” Trying something different is often discouraged because it is risky, but innovation is often the only way forward, especially in this world of rapidly evolving technology. Many companies unfortunately see maintenance as a cost rather than an investment, and thus a change in routine is seen as a gamble. To be given the freedom to try a different method, you will need to challenge the mindset of management, and for that, you need data.

Mistake 4 – Ignoring Data

When explaining ideas, concepts, or issues to management, it is imperative that you use data, especially data that is easy to understand and explain. In the business world today, you cannot progress without data to back you up, and frankly, why ignore it? Data is easy to access, analyse, and present, so using it is one of the simplest ways to improve. I once had a boss who had a sign on his door that read, “When you enter this office, please choose the type of debate you want to have.” The choices were a data-free debate, a data-limited debate, and a data-driven debate. When I started a data-limited debate with him, he encouraged me to upgrade to a data-driven one. Presenting the right data is vital if you are to pitch the needed improvement to management, because only then can the fifth mistake be overcome.

Mistake 5 – Failing to facilitate change

You could have the right idea or strategy, but it doesn’t change anything unless it is implemented. If your strategy gets rejected, it will be as if you have made no progress at all. There were once some New Zealand engineers who brought some shipping container offices to their worksite, and every day for two years they would go inside these containers to optimise their strategy, and then come back out. No one outside the containers knew what was happening, so when the engineers presented their fully optimised strategy to management, it got rejected. If they had kept management involved during the process, two years of hard work would likely not have been thrown away like that. As we discussed before, only if you use data to catch management’s interest will progress be made.

As you may have noticed, these five errors are closely linked. If you work to fix one problem, you might find yourself solving many others in the process, and that can lead you along a path of ever-growing relief. Don’t feel ashamed of these mistakes; we all make them, and we can all learn from them.

Do you want to learn more? Be sure to check out my book, ‘5 Habits of an Extraordinary Reliability Engineer’, for more in-depth advice on how to correct the five mistakes mentioned here.
