IT Operating Environments Best Practices
Test for environment-specific failures - do not assume that success in a lower environment guarantees success in a higher one
Overview
Even with well-defined parity standards, documented configuration differences, and infrastructure-as-code enforcement, lower environments will always differ from Production in some respects. These differences make environment-specific failures possible: defects that stay hidden in lower environments because the condition that triggers them is absent there, and that surface in higher environments or in Production, where that condition is present. Environment-specific failures are among the most disruptive categories of Production incident for two reasons. They cannot be reproduced in the lower environments where debugging is most convenient, and they create the false impression that the solution was adequately tested when it was never tested under the conditions that reveal the defect.
Best Practice
Build explicit awareness of environment-specific failure risk into the testing strategy for every promotion gate, and invest in testing techniques that reduce the probability of environment-specific failures reaching Production. Before every promotion to a higher environment, review the configuration difference register for the target environment and identify the differences that may affect the behavior of the solution being promoted. Design targeted tests that exercise the solution under the conditions of the target environment rather than assuming that success in the source environment fully predicts success in the destination. In PSTG, conduct Production-equivalent load testing and operational validation that lower environments cannot provide, explicitly testing the behaviors that depend on Production-scale infrastructure, Production-equivalent network latency, and Production-scale data volumes.
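The register review described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the `ConfigDifference` record, the `differences_to_test` helper, the component names, and every register value below are hypothetical and exist only to show how a promotion gate might narrow the register down to the differences that deserve a targeted test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfigDifference:
    """One entry in a hypothetical configuration difference register."""
    target_env: str    # environment the entry applies to, e.g. "PSTG"
    component: str     # component whose configuration differs
    source_value: str  # value in the lower (source) environment
    target_value: str  # value in the higher (target) environment


def differences_to_test(register, target_env, solution_components):
    """Return the entries for target_env that touch the solution's components."""
    return [
        diff for diff in register
        if diff.target_env == target_env and diff.component in solution_components
    ]


# Illustrative register contents; all values are invented for this example.
REGISTER = [
    ConfigDifference("PSTG", "database", "single node", "3-node cluster"),
    ConfigDifference("PSTG", "network", "same subnet", "cross-datacenter"),
    ConfigDifference("PSTG", "cache", "disabled", "enabled"),
    ConfigDifference("Production", "database", "3-node cluster", "5-node cluster"),
]

if __name__ == "__main__":
    # Assume the solution being promoted depends on the database and the network.
    for diff in differences_to_test(REGISTER, "PSTG", {"database", "network"}):
        print(f"Design a targeted test for {diff.component}: "
              f"{diff.source_value} -> {diff.target_value}")
```

Each entry the helper returns is a candidate for a targeted test before promotion. Entries for components the solution does not use are filtered out, which keeps the review focused on the differences that can actually change the solution's behavior.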
Benefit(s)
Explicit testing for environment-specific failure risks reduces the probability of Production surprises caused by validation performed under conditions that did not reflect the Production environment. The organization develops a testing discipline that actively accounts for the limitations of lower environment testing rather than implicitly assuming that lower environment success is a complete predictor of Production success. Environment-specific defects caught in PSTG or in targeted pre-Production testing are resolved before Production exposure, which keeps them from becoming the most disruptive category of Production incident - the one that cannot be reproduced in the environment where debugging is easiest.
Copyright for the International Foundation for Information Technology (IF4IT): 2008 - Present