Terraform State Files Best Practices
Learn what Terraform state is, best practices to use with state, and how to manipulate it.
Terraform state is the core record of your managed infrastructure. It is Terraform's "source of truth," tracking which real-world resources it controls. Without it, Terraform cannot accurately plan or apply changes, because every plan is computed by comparing your configuration against what the state says exists. Many beginners overlook its significance, which often leads to major issues. Effective state management is critical for stable and predictable infrastructure deployments.
What is Terraform State?
Terraform state maps your configuration to actual infrastructure resources. It tracks metadata, resource IDs, attributes, and dependencies, enabling Terraform to understand relationships and manage updates. This data is stored in JSON format, by default in a file named terraform.tfstate. The file also stores outputs and can contain sensitive data, so it must be handled carefully. Never edit this file by hand. Understanding its contents and purpose is fundamental to using Terraform effectively.
State File Components
Here's a breakdown of its key components:
- Metadata: The state file begins with metadata about the state format itself and the Terraform version that last updated it. This helps Terraform understand how to read and interpret the file and ensures compatibility.
- Outputs: Any outputs you define in your Terraform configuration (e.g., output "instance_ip" { value = aws_instance.web.public_ip }) are stored here. These values can then be accessed by other Terraform configurations or external tools. Be cautious: output values, including passwords, are stored in the state in plain text. Marking an output as sensitive hides it from CLI output but does not encrypt it in the state file.
- Resources Array: This is the most crucial part of the state file. It lists every resource that Terraform is currently managing. For each resource, you'll find:
  - Resource Type and Name: A clear identifier linking to your Terraform configuration (e.g., aws_instance.web).
  - Provider Information: Details about the Terraform provider used to manage the resource (e.g., provider["registry.terraform.io/hashicorp/aws"]).
  - Instance Details: Each resource instance (if there are multiple) has its own entry. This includes:
    - Unique ID: The actual ID of the resource as assigned by the cloud provider (e.g., i-0abcdef1234567890 for an AWS EC2 instance). This is how Terraform links its configuration to the real-world object.
    - Attributes: All the attributes of the resource, including those you defined in your configuration and those automatically assigned by the provider (e.g., public IP, ARN, security group IDs, instance state). These attributes represent the current known state of the resource.
    - Dependencies: Implicit or explicit dependencies between resources. This allows Terraform to understand the order in which resources must be created, updated, or destroyed.
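As a minimal sketch of the Outputs component described above, the following marks an output as sensitive. The variable and output names are hypothetical. Keep in mind that sensitive only redacts the value from CLI output; the value is still written to the state file, so the state itself has to be protected.

```hcl
# Hypothetical input variable holding a secret supplied at plan/apply time.
variable "db_password" {
  type      = string
  sensitive = true
}

# The output is redacted in plan/apply output,
# but its value is still recorded in the state file.
output "db_password" {
  value     = var.db_password
  sensitive = true
}
```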
Local vs. Remote State: Remote is King
Using local Terraform state introduces significant problems. Collaboration becomes difficult because sharing the file between teammates invites merge conflicts, and the risk of accidental deletion or corruption is high. Local state also lacks versioning and can expose sensitive data on individual machines. Remote state solves these issues. It centralizes storage, which enables team collaboration, provides state locking to prevent concurrent modifications, supports versioning for rollbacks, and offers greater durability and security. For any team environment or production workload, remote state is essential.
Choosing a Remote Backend
Several robust remote backends are available for Terraform state. Popular choices include AWS S3, Azure Blob Storage, Google Cloud Storage, Scalr, and Terraform Cloud/Enterprise.
Here’s how you might configure some:
AWS S3 Backend Example:
terraform {
  backend "s3" {
    bucket         = "my-terraform-state-bucket"
    key            = "path/to/my/infra.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-lock-table"
    encrypt        = true
  }
}
Scalr Example:
terraform {
  backend "remote" {
    hostname     = "<account-name>.scalr.io"
    organization = "<scalr-environment-name>"
    workspaces {
      name = "<workspace-name>"
    }
  }
}
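Azure Blob Storage, also listed above, follows the same pattern. This is a sketch only: the resource group, storage account, and container names are hypothetical placeholders, and authentication settings are omitted because they depend on your environment.

```hcl
terraform {
  backend "azurerm" {
    # All values below are placeholder assumptions, not real resources.
    resource_group_name  = "my-terraform-state-rg"
    storage_account_name = "mytfstateaccount"
    container_name       = "tfstate"
    key                  = "path/to/my/infra.tfstate"
  }
}
```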
When choosing a backend, consider factors like cost, your existing cloud provider ecosystem, your team's familiarity with the service, and any required features like policy as code or advanced collaboration capabilities.
Best Practices
Effective state management relies on specific practices. Here are some of our top recommendations:
- Always use a remote backend: Centralize your state (e.g., AWS S3 with DynamoDB, Azure Blob Storage, Terraform Cloud) for team collaboration, state locking, and durability.
- Enable state file versioning: Allow for rollbacks and an audit trail of all state changes.
- Avoid manually editing the state file: Rely solely on Terraform's terraform state commands to avoid corruption.
- Separate state files: Isolate state for different environments (dev, staging, prod) or logical components to reduce error impact.
- Avoid sensitive data in state: Do not store secrets directly in the state file; use the sensitive attribute for outputs and integrate with external secret managers.
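For the S3 backend shown earlier, the first two practices might be bootstrapped roughly as follows. This is a sketch assuming AWS provider v4 or later; the bucket and table names simply mirror the earlier backend example, and such resources are usually kept in a small separate configuration so that the state bucket does not hold its own state.

```hcl
# Bucket that will hold the state objects (name is a placeholder).
resource "aws_s3_bucket" "state" {
  bucket = "my-terraform-state-bucket"
}

# Versioning keeps prior copies of the state for rollback and auditing.
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}

# Lock table for the S3 backend; it expects a string hash key named LockID.
resource "aws_dynamodb_table" "lock" {
  name         = "terraform-lock-table"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```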
State Manipulation Commands with Examples
Terraform provides specific commands to interact with and manage the state file. These commands allow you to inspect, modify, and manage resources within the state without directly editing the JSON file. Great caution should be taken when manipulating state files.
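terraform refresh: Updates the state file with the latest attributes from the real-world infrastructure. terraform plan also performs a refresh implicitly, but only in memory, so running a refresh explicitly can be useful to see whether any drift has occurred before planning changes. In recent Terraform releases this command is deprecated in favor of terraform apply -refresh-only, which shows the detected changes and asks for approval before writing them to state.
terraform refresh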
terraform state rm <resource_address>: Removes a resource from the state file. This does not destroy the actual infrastructure resource. Use this with extreme caution when you want Terraform to "forget" about a resource it no longer manages, perhaps because it's now managed manually or by another process.
terraform state rm aws_instance.web
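If you are on Terraform 1.7 or later, the same "forget without destroying" intent can also be declared in configuration with a removed block, which makes the change reviewable in a plan. A minimal sketch, assuming the aws_instance.web resource block has already been deleted from the configuration:

```hcl
# Tells Terraform to drop aws_instance.web from state on the next apply.
removed {
  from = aws_instance.web

  lifecycle {
    # false = keep the real instance; only the state entry is removed.
    destroy = false
  }
}
```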
terraform state mv <source_address> <destination_address>: Moves a resource within the state. This is useful when refactoring your Terraform configuration, such as moving a resource into a module.
# Before: aws_instance.old_name
# After: module.web_server.aws_instance.new_name
terraform state mv 'aws_instance.old_name' 'module.web_server.aws_instance.new_name'
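On Terraform 1.1 or later, the same refactor can be recorded declaratively with a moved block instead of a one-off CLI call, so every collaborator's next plan picks up the rename. A sketch using the addresses from the example above, and assuming the module.web_server module (hypothetical here) already declares aws_instance.new_name:

```hcl
# Placed in the root module alongside the module "web_server" call.
moved {
  from = aws_instance.old_name
  to   = module.web_server.aws_instance.new_name
}
```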
terraform state show <resource_address>: Displays the attributes of a specific resource as recorded in the state.
terraform state show aws_instance.web
Example Output (partial):
# aws_instance.web:
resource "aws_instance" "web" {
    id            = "i-0abcdef1234567890"
    ami           = "ami-0abcdef1234567890"
    instance_type = "t2.micro"
    // ... other attributes
}
terraform state list: Shows a list of all resources tracked in the current state file.
terraform state list
Example Output:
aws_instance.web
aws_vpc.main
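terraform import <resource_address> <id>: Brings an existing resource that was created outside Terraform under management by recording it in the state file. It does not write configuration for you, so a matching resource block must already exist at the given address.
terraform import aws_instance.example i-abcd1234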
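On Terraform 1.5 or later, the import above can also be expressed as an import block, so that it appears in a plan before anything is written to state. The following is a sketch only; the ami and instance_type values are placeholder assumptions and would have to match the real instance to avoid a diff.

```hcl
# Declares that the existing instance should be adopted at this address.
import {
  to = aws_instance.example
  id = "i-abcd1234"
}

# Matching resource block; argument values here are placeholders.
resource "aws_instance" "example" {
  ami           = "ami-0abcdef1234567890"
  instance_type = "t2.micro"
}
```

Rather than writing the resource block by hand, terraform plan -generate-config-out=generated.tf can draft it for import blocks whose target has no configuration yet.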
These commands provide controlled ways to manipulate the state, reducing the risk of corruption compared to manual edits.
Advanced State Management: Workspaces and Isolation
For more complex environments, advanced state management techniques become vital. Terraform Workspaces can isolate different environments (e.g., dev, staging) within a single configuration using commands like terraform workspace new <name> and terraform workspace select <name>. While convenient, for true isolation and blast radius reduction, using separate directories with distinct remote backends per environment or component is often a more robust state isolation strategy.
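Within a single configuration, the selected workspace is available as terraform.workspace, which is how per-environment differences are usually expressed. A minimal sketch; the workspace names and instance sizes are illustrative assumptions:

```hcl
locals {
  # Hypothetical per-workspace settings.
  instance_type_by_workspace = {
    dev     = "t3.micro"
    staging = "t3.small"
  }

  # Falls back to t3.micro for any workspace not listed (including "default").
  instance_type = lookup(local.instance_type_by_workspace, terraform.workspace, "t3.micro")
}

output "selected_instance_type" {
  value = local.instance_type
}
```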
To share information between different state files, use the terraform_remote_state data source. This allows one configuration to read outputs from another, facilitating modular and interconnected infrastructure deployments.
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "my-network-state-bucket"
    key    = "network.tfstate"
    region = "us-east-1"
  }
}

resource "aws_instance" "web" {
  subnet_id = data.terraform_remote_state.network.outputs.web_subnet_id
  # ...
}
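The data source only exposes root-module outputs of the other configuration, so the network configuration has to declare web_subnet_id explicitly. A sketch of what that producing side might look like; the VPC and subnet definitions and their CIDR ranges are assumptions for illustration:

```hcl
# In the network configuration whose state lives at network.tfstate.
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "web" {
  vpc_id     = aws_vpc.main.id
  cidr_block = "10.0.1.0/24"
}

# Root-level output read via data.terraform_remote_state.network.outputs.
output "web_subnet_id" {
  value = aws_subnet.web.id
}
```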
Common Pitfalls and Troubleshooting
Even with best practices, you may encounter issues. State corruption can occur due to network problems, manual edits, or abrupt process termination. Recovery often involves restoring from a versioned backup, with manual repair as a last resort. Concurrency issues are largely prevented by robust state locking mechanisms.
Drift detection is critical; terraform plan helps identify discrepancies between your state and the actual infrastructure. For sensitive data in state, leverage the sensitive attribute for outputs and use external secret management tools instead of storing secrets directly in state. Finally, large state files can impact performance. Address this by breaking down your infrastructure into smaller, modular components with their own state files.