In the world of software development, the journey from a pull request to a production rollout can be filled with surprises. Testing is important in this process, everyone agrees, but Paige has a different idea about when to test. As she says, “There is no place like production!.”
To kick the topic off in the face of her upcoming Shift Miami talk, Paige shared a story about how her deployment brought down production:
After rolling back, I was able to uncover the SNAFU that occurred by inspecting each step of the journey from the original pull request to the production rollout. Surprise, surprise, there was a major difference between what was running in lower environments and what was running in production.
For reasons I’ll explain in the talk at Shift Miami 2023. – the major difference between environments was not immediately apparent to me or my pull request reviewers in the file I introduced the change, didn’t raise any flags in the continuous integration testing phase, and had passed through lower environments with flying colors.
Local development environments are just an approximation
Paige emphasizes, “No matter how close of an approximation your local development environment is to production, it’s just that—an approximation.” While testing in production may be regarded as a controversial practice, Paige argues that it doesn’t have to be a dirty word. In fact, she asserts that everyone tests in production, even if they won’t readily admit it.
If you deploy a code change that passes all checks in local development, staging, and then encounter an unforeseen scenario in production that forces a rollback or feature flag toggle, then surprise! Production ‘tested’ your code.
This illustrates the importance of considering the production environment as the ultimate arbiter of code reliability. While operators and Site Reliability Engineers (SREs) possess a good understanding of the differences between environments, including configuration and cloud components, Paige acknowledges that developers may struggle to discern such discrepancies.
She humorously compares the task to playing the game of “Crow or No”:
To the untrained eye – many black birds look like a crow or a raven.
Embracing the Reality of Production
No matter how meticulously developers test in other environments, production will always hold surprises, Paige concludes. By embracing the reality of production, developers can enhance their code’s resilience and ensure a smoother rollout experience for their applications.
For a more detailed version of what happened when she broke production, and notes on the right and the wrong way of testing in production, catch Paige’s talk “There’s No Place Like Production” on Shift Miami!