Drift Remediation Strategies for Terraform: Revert, Align, or Ignore
You’ve detected drift in your Terraform-managed infrastructure. Now what? The answer depends entirely on context — what changed, why it changed, and whether the change should persist. Making the wrong call here can be worse than the drift itself: blindly reverting an emergency scaling event could cause an outage, while ignoring a security group modification could leave you exposed.
This guide provides a decision framework for choosing the right remediation strategy based on the type of drift you’re dealing with. For background on how drift happens and how to detect it, see our comprehensive guide to Terraform drift detection.
The Three Remediation Philosophies
Every drift remediation decision comes down to one of three actions: revert the infrastructure to match your code, update your code to match the infrastructure, or acknowledge the drift and move on. Each has legitimate use cases, and the right choice depends on the specific situation.
Revert Infrastructure (Enforce Desired State)
This is the default instinct: run terraform apply and force infrastructure back to what your configuration declares. It’s the right call when drift is unauthorized or unintentional — someone opened a security group port that shouldn’t be open, or a manual change broke the expected configuration.
When to revert:
- Security-sensitive changes (modified IAM policies, opened ports, disabled encryption)
- Changes that violate compliance requirements
- Unintentional modifications from ad-hoc debugging sessions
- Changes that break dependent infrastructure or deployment pipelines
When not to revert:
- Emergency scaling events that are still needed
- Hotfixes applied during an active incident
- Changes made by other automation tools (auto-scaling, self-healing systems)
The risk with reverting is timing. If you revert automatically at 3 AM without understanding the context, you might undo an emergency change that’s keeping production running. This is why platforms like Scalr deliberately keep a human in the loop for remediation — presenting the drift and letting an engineer decide, rather than auto-reverting.
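In CLI terms, a careful revert is a plan, a human review, then an apply of exactly the reviewed plan. A minimal sketch (assumes you run it from the root module of the affected configuration):

```shell
# Preview the revert; -detailed-exitcode returns 2 when changes are pending.
terraform plan -detailed-exitcode -out=revert.tfplan

# Inspect the saved plan before enforcing it.
terraform show revert.tfplan

# Apply exactly the reviewed plan, nothing else.
terraform apply revert.tfplan
```

Saving the plan to a file and applying that file guarantees the apply matches what was reviewed, even if the infrastructure drifts further in between.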
Update Code (Align Configuration)
Sometimes the real world is right and your code is wrong. An emergency change was applied manually during an incident, and now you need your Terraform configuration to reflect that new reality. In this case, you update your .tf files and state to match what actually exists.
When to align:
- Legitimate emergency changes that should become permanent
- Infrastructure changes driven by business decisions (new scaling requirements, new regions)
- Changes from other IaC tools or automation that are authoritative for those resources
- Resources that were manually created and now need to be brought under Terraform management
The mechanics depend on the situation. For simple attribute changes, updating the .tf file and running terraform plan to confirm a zero diff is straightforward. For resources that were created outside Terraform entirely, you’ll need terraform import to bring them into state.
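For the import path, a matching resource block must exist in configuration before the import runs. A sketch using a hypothetical EC2 instance (the resource address and instance ID are placeholders):

```shell
# 1. Declare a skeletal resource block in your .tf files first:
#      resource "aws_instance" "emergency_box" {}
# 2. Import the real resource into Terraform state:
terraform import aws_instance.emergency_box i-0abc123def456
# 3. Iterate on the resource block until the plan shows no changes:
terraform plan
```

On Terraform 1.5 and later, a declarative import block combined with terraform plan -generate-config-out can generate the matching configuration for you instead of step 3’s manual iteration.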
Acknowledge and Ignore
Not all drift requires action. Some changes are expected, temporary, or managed by other systems. The key is to acknowledge them deliberately rather than leaving them as unreviewed noise in your drift reports.
When to ignore:
- Auto-generated tags or metadata added by cloud providers
- Temporary scaling changes that will revert on their own
- Changes to resources managed by another team’s Terraform configuration
- Known discrepancies in dynamic resources (Lambda function hashes, ECS task definitions)
The danger of ignoring drift is that it becomes habitual. If your team starts dismissing all drift alerts, you’ll miss the one that actually matters. Good drift hygiene means reviewing every detection, making an explicit decision, and documenting why you chose to ignore it.
A Decision Framework for Drift Remediation
When you detect drift, run through these questions in order:
1. Is the change security-sensitive? If an IAM policy, security group, encryption setting, or access control was modified, treat it as high-priority. Revert immediately unless you can confirm the change was authorized and intentional.
2. Was the change intentional? Check with your team. If someone made a deliberate change during an incident or as part of a planned activity, the right path is usually to align your code rather than revert.
3. Is the change still needed? Emergency scaling events are intentional but temporary. If the incident is resolved and the extra capacity isn’t needed, revert. If it is, align your code.
4. Is it managed by another system? Auto-scaling groups, Kubernetes operators, and other automation tools legitimately modify resources. If another system is authoritative for that resource attribute, consider using lifecycle { ignore_changes } in your Terraform configuration to prevent false positives going forward.
5. Can you explain why you’re ignoring it? If you can’t articulate a clear reason, don’t ignore it. The inability to explain drift is itself a signal that something unexpected happened.
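The lifecycle { ignore_changes } escape hatch from step 4 looks like this in configuration. A sketch assuming an ECS service whose desired_count is owned by an autoscaler (resource names are hypothetical):

```hcl
resource "aws_ecs_service" "api" {
  name            = "api"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.api.arn
  desired_count   = 2 # initial value only; the autoscaler owns it afterward

  lifecycle {
    # Application Auto Scaling adjusts desired_count at runtime,
    # so changes to it should not be reported as drift.
    ignore_changes = [desired_count]
  }
}
```

Scope ignore_changes to individual attributes like this rather than using ignore_changes = all, which silences every future drift signal for the resource.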
Remediation in Practice: Platform Workflows
In Scalr, detected drift surfaces in a dedicated Drift Detection tab with three built-in actions that map directly to the philosophies above:
- Revert Infrastructure triggers a plan and apply to enforce your declared configuration. Scalr shows you what will change before applying, so you’re not reverting blind.
- Sync State runs a refresh-only operation that updates your state file to match current infrastructure without changing anything. This is the “align” path when the infrastructure change is correct but state is stale — though you’ll usually pair it with a matching update to your .tf files, or the next plan will propose reverting the change.
- Ignore acknowledges the drift and clears the alert. The decision is recorded, creating an audit trail of acknowledged drift.
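Outside a platform, the CLI equivalent of Sync State is a refresh-only run, which rewrites state without touching infrastructure:

```shell
# Show which state updates a refresh would make, without applying them.
terraform plan -refresh-only

# Update the state file to match real infrastructure. No resources change.
terraform apply -refresh-only
```

Unlike the older standalone terraform refresh command, the -refresh-only workflow shows you the proposed state changes and asks for approval before writing them.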
The advantage of handling remediation through a platform rather than CLI commands is visibility. When an engineer reverts drift from Scalr, the action is logged, attributed to a user, and visible to the team. When someone runs terraform apply from their laptop to fix drift, nobody else knows it happened.
Preventing Repeat Drift
Remediation is reactive. The long-term goal is prevention. After remediating drift, ask: what process or permission allowed this drift to happen, and how do we prevent it next time?
Restrict console access — If drift came from a manual cloud console change, tighten IAM permissions so production modifications require Terraform. Read-only console access for production environments eliminates the most common source of drift.
Use lifecycle ignore_changes — For attributes that are legitimately managed outside Terraform (auto-scaling counts, dynamic tags), add explicit ignore rules so they stop triggering false drift alerts.
Enforce GitOps workflows — Require all infrastructure changes to go through version-controlled pull requests. This creates an audit trail and ensures peer review before changes reach production.
Schedule drift detection — If you’re not already running scheduled checks, start. The faster you catch drift, the easier it is to remediate. See our guide on setting up scheduled drift detection for step-by-step instructions.
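A minimal scheduled check can hang off terraform plan’s -detailed-exitcode flag, where exit status 2 means the plan found pending changes, i.e. drift. A sketch suitable for a cron job or scheduled CI step (the alerting hook is a placeholder you would fill in):

```shell
#!/bin/sh
# Run from the root module of the configuration to check.
terraform plan -detailed-exitcode -input=false -lock=false > /dev/null 2>&1
status=$?

if [ "$status" -eq 2 ]; then
  echo "Drift detected: terraform plan reports pending changes."
  # Alert your team here (Slack webhook, PagerDuty, etc.).
  exit 1
elif [ "$status" -ne 0 ]; then
  echo "terraform plan failed; investigate before trusting drift results."
  exit "$status"
fi
echo "No drift detected."
```

Passing -lock=false keeps the read-only check from blocking real applies against the same state, and -input=false makes the run safe for unattended execution.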