Infrastructure as Code Drift
Teams adopt Infrastructure as Code (Terraform, Pulumi, CloudFormation) to make infrastructure reproducible, version-controlled, and auditable. The initial setup is clean. Then someone makes a manual change in the console during an incident. Then another. Then a new service gets provisioned outside IaC because it was "just a quick test." Drift accumulates. The IaC state file says one thing; reality says another. The infrastructure is no longer reproducible, and nobody knows whether the code or the live environment is the correct version.
What people believe
“Infrastructure as Code ensures reproducible, auditable infrastructure that matches the declared state.”
| Metric | Expected | Observed | Gap |
|---|---|---|---|
| Organizations experiencing IaC drift | 0% | 60-70% | Majority |
| Manual console changes per month | 0 | 5-20 per team | Regular occurrence |
| Confidence in `terraform apply` | High | Low (fear of breaking production) | Inverted |
| Time to provision new environment | Minutes (IaC promise) | Days (drift + manual fixes) | +1000% |
Don't If
- Your team doesn't have the discipline to back-port every manual change to IaC
- You don't have automated drift detection running continuously
If You Must
1. Run automated drift detection daily and alert on any divergence immediately
2. Establish a 'no manual changes' policy with exceptions requiring documented back-port within 24 hours
3. Use policy-as-code (OPA, Sentinel) to prevent manual changes that bypass IaC
4. Keep state files in remote backends with locking and versioning
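The daily drift check in the list above can be sketched around Terraform's `plan -detailed-exitcode` flag, which exits with 0 when the plan is empty, 1 on error, and 2 when the plan contains changes. The Python below is a minimal sketch, not a definitive implementation: it assumes `terraform` is on the PATH and the working directory is already initialized, and the names `DriftStatus`, `classify_plan_exit_code`, and `check_drift` are hypothetical. Note that a non-empty plan can also mean unapplied code changes, so this treats any diff as something worth a look.

```python
import subprocess
from enum import Enum


class DriftStatus(Enum):
    IN_SYNC = "in_sync"
    DRIFTED = "drifted"
    ERROR = "error"


def classify_plan_exit_code(code: int) -> DriftStatus:
    """Map the exit code of `terraform plan -detailed-exitcode` to a status."""
    if code == 0:
        return DriftStatus.IN_SYNC  # empty plan: code and reality agree
    if code == 2:
        return DriftStatus.DRIFTED  # non-empty plan: something diverged
    return DriftStatus.ERROR        # 1 (or anything else): the plan itself failed


def check_drift(workdir: str) -> DriftStatus:
    """Run a plan in `workdir` (assumed already initialized) and classify it."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
        check=False,  # exit code 2 is an expected outcome, not a failure
    )
    return classify_plan_exit_code(result.returncode)
```

A scheduled job could call `check_drift` once per workspace and page the owning team on `DRIFTED` or `ERROR`. Keeping the exit-code mapping in its own function makes the alerting logic testable without a Terraform installation.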
Alternatives
- GitOps with drift detection: continuous reconciliation between declared state and actual state, which auto-corrects drift
- Immutable infrastructure: never modify; destroy and recreate. Eliminates drift by design.
- Platform engineering abstraction: an internal platform handles IaC complexity, and developers interact with simpler abstractions
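The continuous reconciliation idea behind the GitOps alternative can be illustrated with a small diff function. This is a sketch under stated assumptions: resources are modeled as plain dictionaries keyed by name, and `plan_reconciliation` is a hypothetical helper, not the API of any GitOps tool. A real controller would run this comparison in a loop and then apply the resulting actions.

```python
from typing import Any, Dict, List, Tuple

Resource = Dict[str, Any]
Action = Tuple[str, str]  # (verb, resource name)


def plan_reconciliation(
    declared: Dict[str, Resource], actual: Dict[str, Resource]
) -> List[Action]:
    """Return the actions needed to make `actual` match `declared`."""
    actions: List[Action] = []
    for name in sorted(declared.keys() | actual.keys()):
        if name not in actual:
            actions.append(("create", name))  # declared but missing
        elif name not in declared:
            actions.append(("delete", name))  # exists but never declared
        elif declared[name] != actual[name]:
            actions.append(("update", name))  # attributes have drifted
    return actions


# Hypothetical example data: one drifted, one missing, one unmanaged resource.
declared = {"web": {"size": "small"}, "db": {"size": "large"}}
actual = {"web": {"size": "medium"}, "cache": {"size": "small"}}
print(plan_reconciliation(declared, actual))
# [('delete', 'cache'), ('create', 'db'), ('update', 'web')]
```

The declared state always wins in this model, which is what makes the correction automatic; it also means a manual incident fix is reverted unless it is back-ported first.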
This analysis is wrong if:
- Organizations using IaC maintain zero drift between declared and actual infrastructure state over 12 months
- Manual changes during incidents are consistently back-ported to IaC within 24 hours
- IaC state files accurately represent production infrastructure at all times without drift detection tooling
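The 24-hour back-port criterion above can be checked mechanically if manual changes are logged. The following is a minimal sketch, assuming each manual change is recorded with the time it was made and, once done, the time it was back-ported; the `ManualChange` record, the `overdue_backports` helper, and the resource names in the usage example are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

BACKPORT_WINDOW = timedelta(hours=24)


@dataclass
class ManualChange:
    resource: str
    changed_at: datetime
    backported_at: Optional[datetime] = None  # None means still pending


def overdue_backports(changes: List[ManualChange], now: datetime) -> List[ManualChange]:
    """Return changes back-ported late, or still pending past the window."""
    overdue = []
    for change in changes:
        deadline = change.changed_at + BACKPORT_WINDOW
        # A pending change is judged against the current time.
        finished = change.backported_at if change.backported_at is not None else now
        if finished > deadline:
            overdue.append(change)
    return overdue
```

For example, a change made at 09:00 and back-ported at 10:00 the next day is one hour late and would be reported, while a change made four hours ago with no back-port yet would not.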