Why Engineers Should Try to Reproduce Production Issues Locally

23 Jan 2023

As engineers, one of our primary responsibilities is to ensure that the systems we build are stable and reliable.

However, despite our best efforts, bugs and issues will inevitably arise in production environments. When this happens, it can be tempting to patch the problem and move on quickly.

However, recreating production issues locally is a critical step in the debugging and resolution process.

In this post, I'll explain the benefits of reproducing production issues locally.


Photo by Aaron Huber on Unsplash

Controlled environment


One of the main benefits of recreating production issues locally is that it lets us study the problem in a controlled environment.

By doing so, we can more easily isolate and identify the root cause. This is much harder when the issue only appears in production, where many other factors are at play.

Validate resolutions


Recreating production issues locally allows us to test and validate any proposed fixes before deploying them. This is essential for preventing regressions and ensuring the fix resolves the issue. Recreating the issue locally can also help identify and resolve other problems that may have been overlooked.
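As a rough illustration of validating a fix locally, here is a minimal sketch. Everything in it is hypothetical: the parsing functions, the `reproduces_issue` helper and the captured input are invented for the example and do not come from a real incident. The idea is to confirm that the production input fails against the old code, then confirm that the proposed fix handles it without breaking an input that already worked.

```python
def parse_amount_buggy(raw: str) -> float:
    # Hypothetical original behaviour: fails on a thousands separator.
    return float(raw)


def parse_amount_fixed(raw: str) -> float:
    # Hypothetical proposed fix: strip thousands separators first.
    return float(raw.replace(",", ""))


def reproduces_issue(parser, raw: str) -> bool:
    """Return True if the parser fails on the captured input."""
    try:
        parser(raw)
    except ValueError:
        return True
    return False


# Hypothetical value copied from a production log.
PRODUCTION_INPUT = "1,250.00"

if __name__ == "__main__":
    # Step 1: confirm the issue reproduces locally with the old code.
    print("old code fails:", reproduces_issue(parse_amount_buggy, PRODUCTION_INPUT))
    # Step 2: confirm the proposed fix resolves it.
    print("new code fails:", reproduces_issue(parse_amount_fixed, PRODUCTION_INPUT))
    # Step 3: confirm an input that already worked still works.
    print("plain input still parses:", parse_amount_fixed("42") == 42.0)
```

Checking the old code first matters: if the issue does not reproduce against it, a passing result for the fix proves nothing.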

Insights into the system


Another benefit of recreating production issues locally is that it can help improve our understanding of the system. By closely examining the issue, we can gain valuable insights into how the various components of the system interact, which can help us prevent similar problems from arising.

Improve communication and collaboration


Photo by thisisengineering

I've also found that recreating production bugs locally improves the engineering team's communication and collaboration. By working together to reproduce and resolve an issue, we shared knowledge and learnt from one another, which helped the team's morale and motivation.

Is it necessary to reproduce every defect before identifying and fixing it?

Of course, not every issue can be reproduced locally, because some depend on complex conditions and environments, but engineers should at least try. Examples include the hardware the software runs on, the extensions installed in a user's browser or, as I once encountered, users within China experiencing different network conditions from those outside when connecting to an AWS instance in a Chinese region.

When a team cannot reproduce an issue locally, the team and the business have to decide whether a blind patch is a good idea or whether it risks making the situation worse. One position you don't want to find yourself in is chasing a similar-looking problem that has the same consequence but a different root cause. Adding more instrumentation can help to prevent this.

Then, of course, there are the pressures on engineering teams to produce a fix.

These pressures could be internal, from management wanting to see action, or external, to calm down an angry customer.
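On the instrumentation point above, here is a minimal sketch of what extra context might look like, using Python's standard `logging` module. The logger name, the context fields and the `run_with_context` helper are assumptions made for illustration; the right fields depend on your own system. The aim is to record enough about the environment when something fails that two issues with the same symptom can be told apart later.

```python
import logging
import platform

# Hypothetical logger name, chosen for this example.
logger = logging.getLogger("checkout")


def build_context(request_id: str, region: str, user_agent: str) -> dict:
    """Collect details that are hard to recover after the fact."""
    return {
        "request_id": request_id,
        "region": region,
        "user_agent": user_agent,
        "python_version": platform.python_version(),
    }


def run_with_context(operation, context: dict):
    """Run an operation; on failure, log the context and re-raise."""
    try:
        return operation()
    except Exception:
        # logger.exception records the traceback alongside the message.
        logger.exception("operation failed; context=%s", context)
        raise


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    ctx = build_context("req-123", "cn-north-1", "ExampleBrowser/1.0")
    try:
        run_with_context(lambda: 1 / 0, ctx)
    except ZeroDivisionError:
        pass  # The failure has been logged together with its context.
```

The exception is re-raised rather than swallowed, so the instrumentation adds information without changing how the failure behaves.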

Conclusion

Engineers should attempt to recreate production bugs locally to improve the reliability and stability of their software and to protect the company's credibility and reputation.

One of the best ways of recreating production issues locally is with tests, which I'll discuss in the next post.
