Beginner's Guide to Terragrunt
Learn Terragrunt basics, modular Terraform, DRY configs, and automation in this beginner-friendly step-by-step guide.
What is Terragrunt?
Terragrunt is a thin, language-agnostic wrapper around Terraform and OpenTofu that helps teams maintain DRY (Don't Repeat Yourself) configurations at scale. Rather than replacing Terraform, it enhances it by providing a framework for managing multiple modules, environments, and regions without the repetitive boilerplate that typically accompanies large-scale Terraform deployments.
At its core, Terragrunt addresses a fundamental problem: as infrastructure complexity grows across multiple environments and cloud accounts, managing identical configurations across dozens or hundreds of modules becomes tedious and error-prone. Terragrunt introduces patterns and tooling to abstract away this repetition while maintaining clarity and control.
Key Components
Terragrunt centers on a few critical concepts:
- Modules: The actual Terraform/OpenTofu code (your infrastructure definitions)
- Configurations: Terragrunt's HCL-based configuration files (terragrunt.hcl) that manage module instantiation
- Dependencies: Mechanisms to orchestrate execution order and share outputs between modules
- Remote State: Automated backend configuration for storing and managing Terraform state
Why Use Terragrunt?
Terragrunt solves several fundamental challenges that teams encounter when scaling Terraform:
1. Reducing Configuration Repetition
In vanilla Terraform, managing the same module across development, staging, and production environments means repeating backend configurations, variable definitions, and module blocks. Terragrunt consolidates these through inheritance and templating:
# Parent configuration (env.hcl)
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket         = "my-company-tfstate-${get_aws_account_id()}"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "my-company-tfstate-lock-${get_aws_account_id()}"
  }
}
Child configurations inherit this automatically, eliminating duplication across your entire infrastructure codebase.
2. Managing Multiple Environments
Terragrunt enables hierarchical variable inheritance and environment-specific overrides. A single component (like a VPC module) can be deployed across multiple environments with minimal configuration differences:
# Child configuration inherits parent settings
include "root" {
  path = find_in_parent_folders()
}
inputs = {
  environment    = "production"
  instance_count = 10
  cidr_block     = "10.0.0.0/16"
}
3. Orchestrating Dependencies
Vanilla Terraform has no built-in way to order separate root modules or pass values between them beyond remote state lookups and manual terraform output calls. Terragrunt provides explicit dependency management, determining execution order and sharing outputs between modules automatically:
dependency "vpc" {
  config_path = "../vpc"
}
inputs = {
  vpc_id = dependency.vpc.outputs.vpc_id
}
4. Simplifying State Management
Terragrunt automates backend configuration generation, including creation of S3 buckets and DynamoDB tables for state locking. This removes manual setup overhead and ensures consistent state storage patterns across your organization.
Installation and Setup
Prerequisites
- Terraform or OpenTofu installed (v0.12.26 or later for Terraform)
- Terragrunt binary available in your PATH
- Cloud provider credentials configured (AWS, Azure, GCP, etc.)
Installing Terragrunt
macOS (via Homebrew):
brew install terragrunt
Linux:
# v0.50.x is a placeholder; substitute a real release tag from the releases page
wget https://github.com/gruntwork-io/terragrunt/releases/download/v0.50.x/terragrunt_linux_amd64
chmod +x terragrunt_linux_amd64
sudo mv terragrunt_linux_amd64 /usr/local/bin/terragrunt
Windows: Download the binary from the Terragrunt releases page and add it to your PATH.
Project Structure
A typical Terragrunt project follows this organization:
project/
├── terragrunt.hcl              # Root configuration
├── components/
│   ├── vpc/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── eks/
│   │   ├── main.tf
│   │   └── ...
│   └── rds/
│       └── ...
├── environments/
│   ├── dev/
│   │   ├── terragrunt.hcl
│   │   ├── vpc/
│   │   │   └── terragrunt.hcl
│   │   └── eks/
│   │       └── terragrunt.hcl
│   ├── staging/
│   └── production/
└── common/
    ├── common.hcl
    ├── region.hcl
    └── env.hcl
DRY Configurations: The include and find_in_parent_folders Pattern
Understanding include Blocks
The include block is Terragrunt's primary mechanism for code reuse. It allows child configurations to inherit and extend parent configurations:
# Child configuration
include "root" {
  path = find_in_parent_folders()
}
# Additional configuration merged with parent
inputs = {
  tags = {
    Environment = "staging"
  }
}
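How much of the parent is inherited, and whether the child can read the parent's values, is controlled by attributes on the include block itself. The sketch below uses the expose and merge_strategy attributes; it assumes the root file is named env.hcl and that it defines a local called project_name, which is a hypothetical name chosen for illustration.

```hcl
include "root" {
  path           = find_in_parent_folders("env.hcl")
  expose         = true   # makes the parent's parsed config readable as include.root
  merge_strategy = "deep" # merge nested maps such as tags instead of replacing them
}

inputs = {
  # Assumes env.hcl contains: locals { project_name = "..." } (hypothetical)
  name = "${include.root.locals.project_name}-vpc"

  tags = {
    Environment = "staging"
  }
}
```

With the default shallow merge, a tags map in the child would replace a tags map in the parent; the deep strategy combines the two.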
The find_in_parent_folders Challenge
Called with no argument, find_in_parent_folders() searches up the directory tree for the nearest file named terragrunt.hcl, which becomes ambiguous when the root file has a different name or when several directory levels contain a terragrunt.hcl. Best practice has shifted to explicitly specifying the parent file name:
# More explicit and reliable
include "root" {
  path = find_in_parent_folders("env.hcl")
}
Avoiding the Deep Include Chain
While layered include hierarchies reduce repetition, they can obscure where settings originate and add HCL parsing overhead at scale. Keep the hierarchy to 2-3 levels at most. Terragrunt allows only one level of include (an included file cannot itself include another), so the child names each parent in its own include block:
# Root level (env.hcl)
remote_state { ... }
terraform { ... }
# Mid level (region.hcl)
locals { ... }
# Child level (terragrunt.hcl)
include "root" {
  path = find_in_parent_folders("env.hcl")
}
include "region" {
  path = find_in_parent_folders("region.hcl")
}
Common Anti-Pattern: Copy-Pasting Configurations
Despite DRY principles, teams often copy-paste terragrunt.hcl files when include and locals aren't understood. This negates Terragrunt's benefits. Always leverage include and locals to avoid duplication.
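One way to replace copy-pasted files is to move the settings a component shares across environments into a single file and include it alongside the root configuration. This is a minimal sketch, not a prescribed layout: common/vpc.hcl, the components/vpc path, the env.hcl root file name, and the input names are all assumptions made for illustration.

```hcl
# common/vpc.hcl (hypothetical shared file): settings every VPC deployment uses
terraform {
  source = "${dirname(find_in_parent_folders("env.hcl"))}/components/vpc"
}

inputs = {
  enable_dns_hostnames = true # assumed module variable
}
```

```hcl
# environments/dev/vpc/terragrunt.hcl: only what differs per environment
include "root" {
  path = find_in_parent_folders("env.hcl")
}

include "vpc" {
  path = "${dirname(find_in_parent_folders("env.hcl"))}/common/vpc.hcl"
}

inputs = {
  environment = "dev"
  cidr_block  = "10.1.0.0/16"
}
```

Each additional environment then needs only the second file with its own input values, while the module source and shared inputs live in one place.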
Dependencies and Run Order
Explicit Dependency Declaration
Terragrunt analyzes dependency blocks to determine execution order when using run-all commands:
# Module: app/terragrunt.hcl
dependency "vpc" {
  config_path = "../vpc"
}
dependency "rds" {
  config_path = "../rds"
}
inputs = {
  vpc_id      = dependency.vpc.outputs.vpc_id
  db_endpoint = dependency.rds.outputs.endpoint
}
Terragrunt will ensure VPC and RDS are applied before the app module.
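When a module only needs to run after others and does not read any of their outputs, the ordering can be declared with a dependencies block instead of individual dependency blocks. A short sketch using the same sibling paths as the example above:

```hcl
# app/terragrunt.hcl: order-only relationship, no outputs consumed
dependencies {
  paths = ["../vpc", "../rds"]
}
```

The dependency block is still the one to use whenever a value such as vpc_id has to flow into inputs.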
Mock Outputs for Planning
Mock outputs enable planning without applying dependencies first:
dependency "vpc" {
  config_path = "../vpc"
  mock_outputs = {
    vpc_id     = "vpc-mock123"
    subnet_ids = ["subnet-mock1", "subnet-mock2"]
  }
}
However, be aware that mock outputs don't account for actual resource IDs, which can cause discrepancies during apply.
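One way to limit that risk is to restrict which commands may fall back to mocks, so that an apply can never run against placeholder values. The sketch below reuses the mock values from the example above and adds the mock_outputs_allowed_terraform_commands attribute; the choice of validate and plan is an assumption about a typical workflow.

```hcl
dependency "vpc" {
  config_path = "../vpc"

  mock_outputs = {
    vpc_id     = "vpc-mock123"
    subnet_ids = ["subnet-mock1", "subnet-mock2"]
  }

  # Mocks are used only for these commands; apply requires real outputs.
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}
```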
The run-all Complexity
The run-all plan and run-all apply commands execute across multiple modules, so it pays to understand how they behave:
- run-all plan: If a dependency has unapplied changes, dependent modules' plans use the old state
- run-all destroy: Destroys every module under the working directory in reverse dependency order; a plain destroy in a single module leaves the modules that depend on it in place
- run-all show -json: Outputs concatenated JSON (not a valid single JSON document), making parsing difficult
For safety, use flags to control parallelism and error handling:
# Sequential execution for critical operations
terragrunt run-all apply --terragrunt-parallelism 1
# Continue on errors with caution
terragrunt run-all plan --terragrunt-ignore-dependency-errors
Performance Optimization
Terragrunt's --dependency-fetch-output-from-state flag speeds up dependency resolution for S3 backends by reading state directly instead of invoking terraform output:
terragrunt run-all plan --dependency-fetch-output-from-state
This can significantly reduce execution time in large deployments.
Remote State Management
Automatic Backend Configuration
Terragrunt generates backend configurations automatically based on remote_state blocks, eliminating manual backend setup:
# Root configuration generates backend.tf
locals {
  aws_region = "us-east-1" # Terragrunt has no get_aws_region() helper, so the region is set here
}
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket         = "my-tfstate-${get_aws_account_id()}-${local.aws_region}"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = local.aws_region
    encrypt        = true
    dynamodb_table = "my-tfstate-lock-${get_aws_account_id()}"
    s3_bucket_tags = {
      owner = "infra-team"
      name  = "Terraform State"
    }
  }
}
Limitations and Workarounds
Automatic backend creation has limitations:
- S3 auto-created log buckets may lack stringent security defaults
- disable_init can inadvertently disable all backend initialization
- Fine-grained IAM or KMS key customization isn't fully supported
- Azure backends don't support auto-creation; storage accounts must be pre-created
For production environments requiring tight security controls, consider managing state infrastructure separately outside of Terragrunt's auto-generation.
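If the bucket and lock table are provisioned elsewhere, one option is to have Terragrunt write the backend file with a generate block, which only produces backend.tf and does not create any state infrastructure. This is a sketch under that assumption; the bucket and table names are hypothetical and must already exist.

```hcl
# Root configuration: writes backend.tf into each module, creates nothing in AWS
generate "backend" {
  path      = "backend.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
terraform {
  backend "s3" {
    bucket         = "my-prebuilt-tfstate"      # hypothetical, created outside Terragrunt
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "my-prebuilt-tfstate-lock" # hypothetical, created outside Terragrunt
  }
}
EOF
}
```

The trade-off is that bucket encryption, access policies, and logging become the responsibility of whatever tooling creates the bucket.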
Multi-Account State Management
For cross-account deployments, explicitly configure role assumptions:
remote_state {
  backend = "s3"
  config = {
    bucket   = "terraform-state-account-b"
    key      = "${path_relative_to_include()}/terraform.tfstate"
    role_arn = "arn:aws:iam::ACCOUNT_B:role/TerraformRole"
    region   = "us-east-1"
  }
}
Environment Management
Hierarchical Variable Inheritance
Terragrunt supports layered variable definition, allowing broad defaults with environment-specific overrides:
# Root (root.hcl): common variables
inputs = {
  project_name      = "myapp"
  team              = "platform"
  enable_monitoring = true
}
# Region level (region.hcl): region-specific overrides
inputs = {
  aws_region = "us-east-1"
}
# Environment level (terragrunt.hcl): environment-specific configuration.
# An included file cannot itself include another, so both parents are included here.
include "root" {
  path = find_in_parent_folders("root.hcl")
}
include "region" {
  path = find_in_parent_folders("region.hcl")
}
inputs = {
  environment              = "production"
  instance_count           = 20
  enable_high_availability = true
}
Using locals for Computed Values
Locals enable complex computations and conditionals:
locals {
  environment   = read_terragrunt_config(find_in_parent_folders("env.hcl"))
  region_config = read_terragrunt_config(find_in_parent_folders("region.hcl"))
  instance_type = local.environment.inputs.environment == "production" ? "t3.large" : "t3.medium"
  tags = merge(
    local.environment.inputs.common_tags,
    { Environment = local.environment.inputs.environment }
  )
}
inputs = {
  instance_type = local.instance_type
  tags          = local.tags
}
Multi-Region and Multi-Account Patterns
Organizing infrastructure across regions and accounts:
environments/
├── us-east-1/
│ ├── region.hcl
│ ├── dev/
│ │ ├── terragrunt.hcl
│ │ ├── vpc/
│ │ └── eks/
│ └── prod/
├── eu-west-1/
└── ap-southeast-1/
Each region can have region-specific configurations while inheriting organization-wide defaults.
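A common way to wire this up is to keep one small region.hcl per region directory and let the root configuration read whichever one sits above the module being run. The sketch below assumes region.hcl holds a local named aws_region; that name and the generated provider.tf are illustrative choices, not part of the layout above.

```hcl
# environments/us-east-1/region.hcl (assumed contents)
locals {
  aws_region = "us-east-1"
}
```

```hcl
# Root configuration: picks up the nearest region.hcl above the current module
locals {
  region_vars = read_terragrunt_config(find_in_parent_folders("region.hcl"))
  aws_region  = local.region_vars.locals.aws_region
}

generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  region = "${local.aws_region}"
}
EOF
}
```

Adding eu-west-1 then means creating the directory with its own region.hcl; the root file stays unchanged.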
Common Issues and Solutions
1. Performance Regressions (v0.50.15+)
Recent versions show significant slowdowns (15x in some cases) due to O(n²) complexity in locals evaluation:
Symptom: terragrunt run-all plan taking 8+ minutes for 30-50 modules
Solution:
- Downgrade to v0.48.x or earlier if using affected versions
- Avoid reading parent terragrunt configs repeatedly; cache results in locals
- Consider upgrading to latest version when performance patches are released
2. Path Resolution Failures
Incorrect path references are a leading cause of Terragrunt issues:
# Incorrect - breaks when executed from different directories
locals {
  config_path = "./config.yaml" # ❌ Relative path doesn't work
}
# Correct - uses Terragrunt execution context
locals {
  config_path = "${get_parent_terragrunt_dir()}/config.yaml" # ✓
}
Key functions:
- get_terragrunt_dir(): Current module's directory
- get_parent_terragrunt_dir(): Parent directory in hierarchy
- find_in_parent_folders(): Search up for named file
3. CI/CD Integration Challenges
GitHub Actions:
- uses: hashicorp/setup-terraform@v2
  with:
    terraform_version: 1.5.0
    terraform_wrapper: false # Critical for Terragrunt
Atlantis integration requires a custom Docker image that includes Terragrunt:
FROM runatlantis/atlantis:latest
RUN curl -L https://github.com/gruntwork-io/terragrunt/releases/download/v0.50.0/terragrunt_linux_amd64 \
    -o /usr/local/bin/terragrunt && \
    chmod +x /usr/local/bin/terragrunt
Note: Terraform Cloud (TFC) doesn't natively support Terragrunt; organizations using TFC must build custom runners or wrapper scripts.
4. State Locking Race Conditions
Parallel execution can trigger state lock failures:
Error: Error locking state: Error acquiring the state lock
ConditionalCheckFailedException: The conditional request failed
Solution: Reduce parallelism for operations with shared dependencies:
# Sequential execution for safety
terragrunt run-all apply --terragrunt-parallelism 1
# Or, in terragrunt.hcl, limit Terraform's own resource concurrency within each module
terraform {
  extra_arguments "serial_locking" {
    commands  = ["apply", "destroy"]
    arguments = ["-parallelism=1"]
  }
}
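Another option for lock contention is to let Terraform wait for the lock rather than fail immediately. The sketch below passes Terraform's -lock-timeout flag to every command that takes a state lock; the 20-minute value is an arbitrary example and should be tuned to how long your applies usually run.

```hcl
terraform {
  extra_arguments "retry_lock" {
    # Built-in helper returning the Terraform commands that acquire a state lock
    commands  = get_terraform_commands_that_need_locking()
    arguments = ["-lock-timeout=20m"]
  }
}
```

This keeps parallel runs enabled while making a briefly held lock a delay instead of an error.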
5. Cross-Account AWS Authentication
AWS assume role configurations require explicit setup:
remote_state {
  backend = "s3"
  config = {
    # bucket, key, and region omitted for brevity
    role_arn = "arn:aws:iam::ACCOUNT_B:role/TerraformRole"
  }
}
terraform {
  extra_arguments "assume_role" {
    commands = get_terraform_commands_that_need_vars()
    env_vars = {
      AWS_ROLE_ARN = "arn:aws:iam::ACCOUNT_A:role/ResourceRole"
    }
  }
}
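Terragrunt also has a top-level iam_role attribute that makes it assume a role itself before invoking Terraform, which can be simpler than passing environment variables. A minimal sketch, reusing the placeholder ARN from the example above:

```hcl
# terragrunt.hcl: Terragrunt assumes this role, then runs Terraform with the resulting credentials
iam_role = "arn:aws:iam::ACCOUNT_A:role/ResourceRole"
```

Whether this replaces or complements the role_arn on the state backend depends on whether state and resources live in the same account.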
6. Memory Usage at Scale
Large deployments can consume excessive RAM due to repeated config parsing:
Symptoms: 16GB+ RAM usage for 50 modules
Solutions:
- Use provider cache to avoid re-downloading: --terragrunt-provider-cache
- Minimize deep include chains and repeated read_terragrunt_config() calls
- Consider reducing module count through composition
Terragrunt vs Atmos Comparison
While both tools enhance Terraform orchestration, they differ significantly in approach:
| Aspect | Terragrunt | Atmos |
|---|---|---|
| Config Language | HCL | YAML |
| Structure | File hierarchy-based | Component + Stack-based |
| Inheritance | include blocks, find_in_parent_folders | Deep YAML merging, imports |
| Learning Curve | Moderate | Steeper (new paradigm) |
| Flexibility | High (for experienced teams) | High (structured approach) |
| Tooling Integration | Minimal; mostly CLI | Broader Cloud Posse ecosystem |
| Community | Large, mature | Growing, Cloud Posse-centric |
| Operational Overhead | Self-managed | Self-managed (more extensive) |
Atmos is particularly strong for teams invested in Cloud Posse components and infrastructure patterns. Terragrunt is better for teams already using Terraform and wanting minimal tooling overhead.
Best Practices for 2026
1. Keep Configurations Simple and Shallow
Avoid deep include hierarchies. Maintain no more than 3 levels:
root.hcl (remote_state, terraform block)
└── region.hcl (region-specific settings)
    └── terragrunt.hcl (module instantiation)
2. Use explicit find_in_parent_folders()
Always specify the file name:
include "root" {
  path = find_in_parent_folders("root.hcl") # Not just find_in_parent_folders()
}
3. Minimize Locals Re-evaluation
Cache file reads to avoid repeated parsing:
locals {
  root_config = read_terragrunt_config(find_in_parent_folders("root.hcl"))
}
inputs = merge(
  local.root_config.inputs,
  { environment = "prod" }
)
4. Implement Provider Caching
For multi-module operations, use provider caching:
terragrunt run-all plan --terragrunt-provider-cache
5. Use Mock Outputs Carefully
Mock outputs help with planning but can mask dependency issues. Validate mocks match reality:
mock_outputs = {
  vpc_id = "vpc-mock123" # Update when actual IDs change
}
6. Enforce Sequential Execution for Destructive Operations
Prevent race conditions and unexpected cascading destruction:
terragrunt run-all destroy --terragrunt-parallelism 1
7. Document Variable Inheritance
Clearly document where variables originate and can be overridden. Use comments:
# From root: project_name, team
# From region: aws_region, availability_zones
# Override here: instance_count, environment
inputs = {
  instance_count = 5
}
8. Monitor and Alert on Performance
Track terragrunt run-all execution times and alert on regressions. Performance degradation often indicates configuration or version issues.
9. Consider Platform Alternatives for Enterprise Use Cases
For organizations requiring:
- Tight RBAC and governance
- Integrated cost estimation and policy enforcement
- Managed CI/CD and GitOps workflows
- Unified environment promotion
- Built-in team collaboration
Platforms like Scalr, Terraform Cloud, or Env0 provide these capabilities out-of-the-box, reducing operational overhead compared to self-managed Terragrunt.
10. Test Module Compositions Independently
While Terragrunt orchestrates modules, test each module in isolation to catch issues early:
cd components/vpc
terraform init
terraform plan
This ensures modules remain reusable and independently testable.
Conclusion
Terragrunt is a powerful tool for managing Terraform at scale, offering DRY configuration patterns, dependency orchestration, and environment management that vanilla Terraform struggles with. However, it introduces its own complexity layer that requires careful management.
The best outcomes come from teams that:
- Keep configurations simple and well-documented
- Understand the performance characteristics of their setup
- Implement gradual, measured adoption rather than applying Terragrunt to all projects immediately
- Monitor for performance regressions and stay current with releases
- Consider platform alternatives if operational overhead becomes unsustainable
For many organizations, Terragrunt strikes the right balance between flexibility and structure. For others, managed IaC platforms provide a more streamlined path forward. Evaluate your team's needs, existing expertise, and operational constraints when deciding whether Terragrunt is the right fit.