IT Operating Environments Best Practices - Govern environment downtime and maintenance windows in alignment with the teams and processes that depend on them
Overview
Planned environment downtime (maintenance windows, infrastructure upgrades, configuration updates, and environment reprovisioning) is an unavoidable part of environment lifecycle management. The question is not whether environments will experience planned downtime, but whether that downtime is scheduled and communicated so as to minimize its organizational impact, or executed without coordinating with the teams whose delivery activities it disrupts. Uncoordinated planned downtime in environments that are heavily used during delivery sprints causes the same immediate productivity loss as unplanned downtime, with the added frustration that it was predictable and could have been avoided through better coordination.
Best Practice
Govern planned environment downtime through a formal maintenance window process that is coordinated with the teams and delivery schedules that depend on each environment. For every planned maintenance window, require the Environment Steward to publish advance notification to all teams and individuals with access to the environment, with enough lead time for teams to adjust their delivery plans: at minimum 48 hours for short maintenance windows and one week for extended maintenance. Schedule maintenance windows for heavily used shared environments outside core delivery hours wherever technically feasible (early morning, late evening, or weekends), aligned with the delivery rhythm of the teams that use the environment. For environments used across multiple time zones, define maintenance windows that are least disruptive across the full geographic range of the user population. Require post-maintenance validation confirming that the environment has returned to its defined operational state before notifying teams that the maintenance window has concluded and the environment is available for use.
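The lead-time and scheduling rules above can be illustrated with a minimal Python sketch. It is not a prescribed implementation. Several details are assumptions made for illustration only: the practice does not say where a "short" window ends and an "extended" one begins, so the 4-hour threshold is hypothetical; core delivery hours are assumed to be 09:00 to 17:00 on weekdays; and teams are represented by fixed UTC offsets, which ignores daylight saving time. All function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Lead times taken from the practice: 48 hours (short), one week (extended).
SHORT_LEAD = timedelta(hours=48)
EXTENDED_LEAD = timedelta(weeks=1)
# Assumption: the practice does not define "extended"; 4 hours is a placeholder.
EXTENDED_THRESHOLD = timedelta(hours=4)


def required_lead_time(duration):
    """Return the minimum notice period for a window of the given duration."""
    return EXTENDED_LEAD if duration > EXTENDED_THRESHOLD else SHORT_LEAD


def notice_is_sufficient(notified_at, window_start, duration):
    """True if the notification was published early enough for this window."""
    return window_start - notified_at >= required_lead_time(duration)


def disrupted_team_hours(window_start, duration, team_utc_offsets,
                         core_start=9, core_end=17):
    """Count (team, hour) slots of the window that fall in weekday core hours.

    team_utc_offsets holds one fixed UTC offset in hours per team
    (a simplification; real zones have daylight saving transitions).
    """
    count = 0
    window_end = window_start + duration
    for offset in team_utc_offsets:
        tz = timezone(timedelta(hours=offset))
        t = window_start
        while t < window_end:  # step through the window one hour at a time
            local = t.astimezone(tz)
            if local.weekday() < 5 and core_start <= local.hour < core_end:
                count += 1
            t += timedelta(hours=1)
    return count


def least_disruptive_start(candidates, duration, team_utc_offsets):
    """Pick the candidate start time that overlaps the fewest core hours."""
    return min(candidates,
               key=lambda s: disrupted_team_hours(s, duration, team_utc_offsets))
```

An Environment Steward could run `notice_is_sufficient` before publishing a window and use `least_disruptive_start` to compare a handful of candidate start times for an environment shared across regions. Counting disrupted team-hours is only one possible measure of disruption; weighting by team size or sprint milestones would be an equally reasonable choice.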
Benefit(s)
Coordinated maintenance window governance prevents the delivery disruption and productivity loss that uncoordinated planned downtime causes. Teams can plan around scheduled maintenance because they know when it is coming and how long it will last. Scheduling maintenance outside core delivery hours minimizes its impact on delivery velocity. Post-maintenance validation confirms that environments are operational before teams return to them, preventing the wasted debugging time that results when teams try to use environments that appear available but are actually in a degraded post-maintenance state.
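The validation gate described above can be sketched in Python as follows. This is one possible shape, not a definitive implementation: the check names, the `notify_teams` callback, and both function names are hypothetical, and what counts as the "defined operational state" is left to each environment's own checks.

```python
from typing import Callable, Dict, List


def validate_environment(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every check and return the names of those that failed.

    A check that raises an exception is treated as failed, so a broken
    probe cannot accidentally mark the environment as healthy.
    """
    failed = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            ok = False
        if not ok:
            failed.append(name)
    return failed


def close_maintenance_window(checks: Dict[str, Callable[[], bool]],
                             notify_teams: Callable[[str], None]) -> List[str]:
    """Announce availability only if all validation checks pass.

    Returns the list of failed checks; an empty list means the window
    was closed and teams were notified.
    """
    failed = validate_environment(checks)
    if not failed:
        notify_teams("Maintenance complete: environment validated and available.")
    return failed
```

The design point is the ordering: the availability notice is sent from inside the gate, so there is no code path that tells teams the environment is ready while a check is still failing.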
Copyright for the International Foundation for Information Technology (IF4IT): 2008 - Present