Terraform Backend Options
Learn about Terraform backends and how to configure them
Terraform backends determine where your state files are stored and how they are managed. The state file is a crucial component that maps your configuration to your real-world infrastructure. Choosing the right backend is essential for team collaboration, state locking, and security.
Types of Backends
There are two primary categories of Terraform backends: local and remote.
Local Backend
The local backend is the default. It stores the state file, terraform.tfstate, directly on your local filesystem in the same directory where you run Terraform.
How to Use: You don't need to specify a backend block for the local backend as it's the default. However, you can explicitly define it for clarity.
terraform {
  backend "local" {
    path = "terraform.tfstate"
  }
}
Pros & Cons:
- Pros: Simple, requires no external configuration, and is perfect for individual use or quick testing.
- Cons: Not suitable for teams. It lacks state locking, which means simultaneous terraform apply operations can corrupt the state file. The state file is not easily shared, and if your local machine fails, you lose your state.
Remote Backends
Remote backends store the state file in a centralized, remote location. This is the recommended approach for any team environment. Most remote backends offer state locking, versioning, and secure storage, which are critical for preventing conflicts and data loss.
Common Remote Backends and Examples:
Amazon S3: A popular choice for AWS users. It's durable, scalable, and can be configured with state locking using a DynamoDB table.
terraform {
  backend "s3" {
    bucket         = "my-terraform-state-bucket"
    key            = "path/to/my-project/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-lock-table"
    encrypt        = true
  }
}
- bucket: The S3 bucket to store the state.
- key: The path and filename for the state file within the bucket.
- region: The AWS region of the S3 bucket.
- dynamodb_table: The DynamoDB table name used for state locking. It's crucial for preventing concurrent modifications.
Google Cloud Storage (GCS): The backend for GCP. It's a simple and reliable choice.
terraform {
  backend "gcs" {
    bucket = "my-gcp-terraform-state"
    prefix = "terraform/state"
  }
}
- bucket: The GCS bucket name.
- prefix: The path prefix within the bucket to store the state file.
Azure Storage: The equivalent for Azure. It uses a storage account and a container to store the state file.
terraform {
  backend "azurerm" {
    resource_group_name  = "terraform-resource-group"
    storage_account_name = "mystorageaccountforstate"
    container_name       = "tfstate"
    key                  = "my-project.tfstate"
  }
}
- resource_group_name: The resource group where the storage account is located.
- storage_account_name: The Azure storage account name.
- container_name: The blob container name.
- key: The name of the blob file for the state.
Scalr: The Scalr backend not only stores state, but also performs remote operations. Scalr handles state locking, OPA policy integration, centralized credential management, and more.
terraform {
  backend "remote" {
    hostname     = "<your-account>.scalr.io"
    organization = "my-organization"

    workspaces {
      name = "my-project-workspace"
    }
  }
}
- hostname: The Scalr URL.
- organization: Your Scalr environment name.
- workspaces: Specifies the workspace to use. Workspaces provide logical separation.
HashiCorp Cloud Platform (HCP) Terraform: An "enhanced" backend that provides a fully managed solution. It handles state, locking, and offers additional features like remote runs, collaboration tools, and policy as code.
terraform {
  cloud {
    organization = "my-org"

    workspaces {
      name = "my-workspace"
    }
  }
}
- organization: Your HCP Terraform organization name.
- workspaces: Specifies the workspace to use. Workspaces provide logical separation for different environments (e.g., dev, prod).
Partial Configuration and Using Files
Hardcoding all backend details directly in your main.tf file can be problematic, especially for different environments (dev, staging, prod) or when you want to avoid committing sensitive information to version control. Partial configuration is the solution, allowing you to provide backend configuration values at initialization time.
This approach is highly recommended and works by defining the type of backend in your configuration but leaving the specific details out. You then supply those details using a file or command-line flags when you run terraform init.
How to Use Partial Configuration
Define a minimal backend block in your Terraform configuration file (e.g., main.tf), specifying only the backend type.
# main.tf
terraform {
  backend "s3" {}
}
Notice that the S3 backend is defined, but no bucket, key, or region is specified.
Create a separate backend configuration file for each environment. These files are typically named backend.conf or similar and should not be committed to your Git repository as they may contain sensitive information.
For a development environment:
# backend-dev.conf
bucket = "my-terraform-state-bucket-dev"
key = "dev/my-project/terraform.tfstate"
region = "us-east-1"
dynamodb_table = "terraform-lock-table-dev"
For a production environment:
# backend-prod.conf
bucket = "my-terraform-state-bucket-prod"
key = "prod/my-project/terraform.tfstate"
region = "us-east-1"
dynamodb_table = "terraform-lock-table-prod"
Initialize Terraform for the desired environment using the -backend-config flag.
For the dev environment:
terraform init -backend-config=backend-dev.conf
For the prod environment:
terraform init -backend-config=backend-prod.conf
This command combines the backend type from your main.tf with the specific values from the configuration file, completing the backend setup.
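If you prefer not to keep a file at all (for example in a CI pipeline), the same values can be passed as individual key/value pairs on the command line. The values below simply repeat the dev example from the file above:
terraform init \
  -backend-config="bucket=my-terraform-state-bucket-dev" \
  -backend-config="key=dev/my-project/terraform.tfstate" \
  -backend-config="region=us-east-1" \
  -backend-config="dynamodb_table=terraform-lock-table-dev"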
Benefits of This Approach
- Separation of Concerns: Keeps backend configuration separate from your main infrastructure code.
- Environment-Specific Configurations: Easily switch between different environments without changing your core Terraform files.
- Security: Avoids committing backend credentials or sensitive paths to your version control system.
State Locking
State locking is a mechanism that prevents multiple users or processes from concurrently modifying the same state file. Without it, two people running terraform apply at the same time could overwrite each other's changes, leading to a corrupted or inconsistent state file and potential infrastructure drift.
How it Works
Before Terraform performs a write operation (like apply), it attempts to acquire a lock on the backend. This lock is a small file or database entry that signals the state is in use. If the lock is acquired, the operation proceeds. If another user already holds the lock, Terraform reports an error and aborts (or waits up to a duration you specify with -lock-timeout) rather than risking a conflict. Once the operation is complete, the lock is automatically released.
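In day-to-day use this shows up on the command line. For example, you can ask Terraform to wait for an existing lock instead of failing immediately, or, as a last resort, release a stale lock left behind by a crashed run (the lock ID below is a placeholder; the real ID is printed in the error message):
# Wait up to 5 minutes for another run's lock to be released
terraform apply -lock-timeout=5m

# Last resort: remove a stale lock; the ID appears in the
# "Error acquiring the state lock" message
terraform force-unlock <LOCK_ID>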
Examples by Backend
Amazon S3: State locking is achieved by using a separate, low-cost DynamoDB table. Terraform uses a unique LockID entry in this table to manage the lock. It's a best practice to create this table with a LockID primary key (the bootstrap example in the Advanced Use Cases section below shows how).
terraform {
  backend "s3" {
    bucket         = "my-terraform-state-bucket"
    key            = "path/to/my-project/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-lock-table" # The DynamoDB table name for locking
  }
}
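Note: recent Terraform releases (1.10 and later) also offer S3-native locking via a use_lockfile argument on the s3 backend, which stores a lock object next to the state and removes the need for a DynamoDB table. Verify support in your Terraform version before relying on it; a minimal sketch:
terraform {
  backend "s3" {
    bucket       = "my-terraform-state-bucket"
    key          = "path/to/my-project/terraform.tfstate"
    region       = "us-east-1"
    use_lockfile = true # S3-native state locking (Terraform >= 1.10); no DynamoDB table needed
  }
}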
Azure Storage: The Azure backend locks state using native Blob Storage leases. When a terraform apply is initiated, Terraform acquires a lease on the state blob, which acts as an exclusive write lock. This is handled natively without needing an additional service.
Google Cloud Storage (GCS): The GCS backend also has a built-in locking mechanism, enabled by default, so no extra configuration is required to prevent concurrent operations.
Best Practices
- Always use a remote backend for team projects. Never store state files in a version control system like Git. The state file can contain sensitive data.
- Enable versioning in your backend storage. For S3 or GCS buckets, turn on object versioning. This provides a backup and allows you to revert to a previous state if something goes wrong.
- Use state locking. Ensure your chosen backend supports state locking to prevent conflicts when multiple people run Terraform at the same time. Services like DynamoDB for S3 and native features in Azure and GCP handle this.
- Isolate environments. Create separate backend configurations and state files for different environments (e.g., dev, staging, prod). This prevents a mistake in one environment from affecting another. A common approach is to use a directory structure with a dedicated backend file for each environment, as sketched after this list.
- Separate backend configuration from code. You can use a separate file for your backend configuration (e.g., backend.tfvars) and load it with terraform init -backend-config=backend.tfvars. This keeps your secrets out of your main configuration files.
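For example, a simple layout that keeps per-environment backend settings next to, but separate from, the code might look like this (names are illustrative):
my-project/
├── main.tf              # contains backend "s3" {} plus the infrastructure code
├── backend-dev.conf     # dev bucket, key, region, lock table
├── backend-prod.conf    # prod bucket, key, region, lock table
└── bootstrap/
    └── main.tf          # creates the state bucket and lock table (local backend)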
Advanced Use Cases
Creating the Backend Infrastructure with Terraform Itself
It's a common chicken-and-egg problem: how do you use Terraform to create the backend infrastructure (like an S3 bucket) that your main Terraform code will then use? The solution is to use a separate, minimal Terraform configuration with a local backend to bootstrap the remote backend.
Example: Bootstrapping an S3 backend
Create a separate directory (e.g., bootstrap). In this directory, create a main.tf file to define the S3 bucket and DynamoDB table. This configuration uses the default local backend.
# bootstrap/main.tf
resource "aws_s3_bucket" "terraform_state" {
bucket = "my-terraform-state-bucket"
tags = {
Name = "Terraform-State"
}
}
resource "aws_dynamodb_table" "terraform_locks" {
name = "terraform-lock-table"
hash_key = "LockID"
attribute {
name = "LockID"
type = "S"
}
}
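Since the best practices above recommend versioning the state bucket, you can enable that (and, optionally, server-side encryption) in the same bootstrap configuration. A minimal sketch, assuming AWS provider v4 or later:
resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  versioning_configuration {
    status = "Enabled" # keeps prior state versions so you can roll back if needed
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}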
Initialize and apply this bootstrap configuration.
cd bootstrap
terraform init
terraform apply
This will create the necessary resources in your AWS account.
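Note that the bootstrap configuration's own state still lives locally in the bootstrap directory. If you prefer, you can later add the same backend "s3" block to bootstrap/main.tf and re-run terraform init -migrate-state to move that state into the bucket it just created; otherwise, keep the local bootstrap state file somewhere safe.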
Now, in your main project directory, you can configure the S3 backend and run terraform init to connect to it.
# main_project/main.tf
terraform {
  backend "s3" {
    bucket         = "my-terraform-state-bucket"
    key            = "path/to/my-project/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-lock-table"
  }
}
Summary
Choosing the right backend for your Terraform state is one of the most critical decisions you'll make when setting up your infrastructure-as-code workflow. While the default local backend is suitable for personal use, a remote backend is a non-negotiable requirement for team environments.
By adopting a remote backend like S3, Azure Storage, GCS, or Scalr, you gain:
- Collaboration: A single source of truth for your infrastructure state.
- Security & Durability: Secure, versioned, and durable storage for your state file.
- State Locking: A robust mechanism to prevent concurrent operations and state file corruption.
Furthermore, integrating best practices like partial configuration allows you to manage environment-specific settings and sensitive data securely, promoting a clean and scalable workflow. By understanding these options and applying the recommended practices, you can build a reliable and collaborative foundation for managing your infrastructure with Terraform.