Beginner's Guide to Terragrunt

Learn Terragrunt basics, modular Terraform, DRY configs, and automation in this beginner-friendly step-by-step guide.

Sebastian Stadil

27 May 2025 • 8 min read

What is Terragrunt?

Terragrunt is a thin wrapper around Terraform and OpenTofu that helps teams keep configurations DRY (Don't Repeat Yourself) at scale. Rather than replacing Terraform, it enhances it by providing a framework for managing multiple modules, environments, and regions without the repetitive boilerplate that typically accompanies large-scale Terraform deployments.

At its core, Terragrunt addresses a fundamental problem: as infrastructure complexity grows across multiple environments and cloud accounts, managing identical configurations across dozens or hundreds of modules becomes tedious and error-prone. Terragrunt introduces patterns and tooling to abstract away this repetition while maintaining clarity and control.

Key Components

Terragrunt centers on a few critical concepts:

Modules: The actual Terraform/OpenTofu code (your infrastructure definitions)
Configurations: Terragrunt's HCL-based configuration files (terragrunt.hcl) that manage module instantiation
Dependencies: Mechanisms to orchestrate execution order and share outputs between modules
Remote State: Automated backend configuration for storing and managing Terraform state

Why Use Terragrunt?

Terragrunt solves several fundamental challenges that teams encounter when scaling Terraform:

1. Reducing Configuration Repetition

In vanilla Terraform, managing the same module across development, staging, and production environments means repeating backend configurations, variable definitions, and module blocks. Terragrunt consolidates these through inheritance and templating:

# Parent configuration (root terragrunt.hcl)
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket         = "my-company-tfstate-${get_aws_account_id()}"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "my-company-tfstate-lock-${get_aws_account_id()}"
  }
}

Child configurations that include this file inherit the block, so the backend settings are written once for the entire infrastructure codebase.

2. Managing Multiple Environments

Terragrunt enables hierarchical variable inheritance and environment-specific overrides. A single component (like a VPC module) can be deployed across multiple environments with minimal configuration differences:

# Child configuration inherits parent settings
include "root" {
  path = find_in_parent_folders()
}

inputs = {
  environment = "production"
  instance_count = 10
  cidr_block = "10.0.0.0/16"
}

3. Orchestrating Dependencies

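In vanilla Terraform, wiring separately applied modules together means terraform_remote_state data sources or manual output lookups, and nothing enforces the order in which they are applied. Terragrunt provides explicit dependency management, determining execution order automatically and sharing outputs between modules without manual terraform output commands: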

dependency "vpc" {
  config_path = "../vpc"
}

inputs = {
  vpc_id = dependency.vpc.outputs.vpc_id
}

4. Simplifying State Management

Terragrunt automates backend configuration generation, including creation of S3 buckets and DynamoDB tables for state locking. This removes manual setup overhead and ensures consistent state storage patterns across your organization.

Installation and Setup

Prerequisites

Terraform or OpenTofu installed (v0.12.26 or later for Terraform)
Terragrunt binary available in your PATH
Cloud provider credentials configured (AWS, Azure, GCP, etc.)

Installing Terragrunt

macOS (via Homebrew):

brew install terragrunt

Linux:

# v0.50.x is a placeholder: substitute an actual release tag from the releases page
wget https://github.com/gruntwork-io/terragrunt/releases/download/v0.50.x/terragrunt_linux_amd64
chmod +x terragrunt_linux_amd64
sudo mv terragrunt_linux_amd64 /usr/local/bin/terragrunt

Windows: Download the binary from the Terragrunt releases page and add it to your PATH.

Project Structure

A typical Terragrunt project follows this organization:

project/
├── terragrunt.hcl                 # Root configuration
├── components/
│   ├── vpc/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   ├── eks/
│   │   ├── main.tf
│   │   └── ...
│   └── rds/
│       └── ...
├── environments/
│   ├── dev/
│   │   ├── terragrunt.hcl
│   │   ├── vpc/
│   │   │   └── terragrunt.hcl
│   │   └── eks/
│   │       └── terragrunt.hcl
│   ├── staging/
│   └── production/
└── common/
    ├── common.hcl
    ├── region.hcl
    └── env.hcl
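
To connect the two halves of this layout, each terragrunt.hcl under environments/ typically points at a module under components/ through a terraform block. A minimal sketch, assuming the project is a Git repository (so get_repo_root() can resolve its root) and using an illustrative CIDR value:

```hcl
# environments/dev/vpc/terragrunt.hcl

# Pull in shared settings from the nearest terragrunt.hcl above this directory
# (environments/dev/terragrunt.hcl in the tree above)
include "env" {
  path = find_in_parent_folders()
}

terraform {
  # Module code to instantiate; assumes a Git repository root
  source = "${get_repo_root()}/components/vpc"
}

inputs = {
  cidr_block = "10.10.0.0/16" # illustrative value
}
```

Terragrunt copies the module into a working directory, passes inputs to it as variables, and runs Terraform or OpenTofu there.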

DRY Configurations: The include and find_in_parent_folders Pattern

Understanding include Blocks

The include block is Terragrunt's primary mechanism for code reuse. It allows child configurations to inherit and extend parent configurations:

# Child configuration
include "root" {
  path = find_in_parent_folders()
}

# Additional configuration merged with parent
inputs = {
  tags = {
    Environment = "staging"
  }
}
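
By default the merge is shallow: a top-level key such as tags in the child's inputs replaces the parent's value rather than being combined with it. The include block accepts optional merge_strategy and expose attributes to change that. A sketch, assuming a parent file named root.hcl that defines a hypothetical local called project_name:

```hcl
include "root" {
  path           = find_in_parent_folders("root.hcl")
  merge_strategy = "deep" # merge nested maps recursively instead of replacing them
  expose         = true   # make the parent's parsed config available as include.root
}

inputs = {
  # project_name is a hypothetical local defined in root.hcl
  name = "${include.root.locals.project_name}-vpc"

  # With the deep strategy, this map is combined with any tags set in the parent
  tags = {
    Environment = "staging"
  }
}
```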

The find_in_parent_folders Challenge

Called with no argument, find_in_parent_folders() searches up the directory tree for a file named terragrunt.hcl. That is ambiguous when the root file shares its name with every child configuration, and it does not work at all with non-standard file names. Best practice has shifted to naming the parent file explicitly:

# More explicit and reliable
include "root" {
  path = find_in_parent_folders("env.hcl")
}

Avoiding the Deep Include Chain

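While layered configuration reduces repetition, it can obscure where settings originate and add HCL parsing overhead at scale. Terragrunt also does not support nested includes: a file pulled in through include cannot contain an include block of its own. To layer settings, have the child include each level directly, and keep it to 2-3 levels maximum:

# Root level (env.hcl)
remote_state { ... }
terraform { ... }

# Mid level (region.hcl): no include block here
locals { ... }
inputs = { ... }

# Child level (terragrunt.hcl): includes both levels directly
include "root" {
  path = find_in_parent_folders("env.hcl")
}

include "region" {
  path = find_in_parent_folders("region.hcl")
}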

Common Anti-Pattern: Copy-Pasting Configurations

Despite DRY principles, teams often copy-paste terragrunt.hcl files when include and locals aren't understood. This negates Terragrunt's benefits. Always leverage include and locals to avoid duplication.
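
One way to avoid the copy-paste, sketched here under the assumption of a root.hcl at the top of the repository and a hypothetical _envcommon/vpc.hcl holding settings shared by every environment's VPC, is to include both files and keep only the differences in the leaf configuration:

```hcl
include "root" {
  path = find_in_parent_folders("root.hcl")
}

include "envcommon" {
  # dirname() strips the file name, leaving the directory that holds root.hcl;
  # _envcommon/vpc.hcl is a hypothetical shared file
  path = "${dirname(find_in_parent_folders("root.hcl"))}/_envcommon/vpc.hcl"
}

# Only what differs for this environment stays here
inputs = {
  environment = "staging"
  cidr_block  = "10.1.0.0/16" # illustrative value
}
```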

Dependencies and Run Order

Explicit Dependency Declaration

Terragrunt analyzes dependency blocks to determine execution order when using run-all commands:

# Module: app/terragrunt.hcl
dependency "vpc" {
  config_path = "../vpc"
}

dependency "rds" {
  config_path = "../rds"
}

inputs = {
  vpc_id = dependency.vpc.outputs.vpc_id
  db_endpoint = dependency.rds.outputs.endpoint
}

Terragrunt will ensure VPC and RDS are applied before the app module.
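
When a module must wait for another but does not consume any of its outputs, a dependencies block declares ordering alone. A short sketch, reusing the relative paths from the example above:

```hcl
# Ordering only: run-all applies these before the current module,
# but no outputs are read from them
dependencies {
  paths = ["../vpc", "../rds"]
}
```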

Mock Outputs for Planning

Mock outputs enable planning without applying dependencies first:

dependency "vpc" {
  config_path = "../vpc"

  mock_outputs = {
    vpc_id = "vpc-mock123"
    subnet_ids = ["subnet-mock1", "subnet-mock2"]
  }
}

However, be aware that mock outputs don't account for actual resource IDs, which can cause discrepancies during apply.
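
One way to limit that risk is to restrict which commands may fall back to mocks, so that an apply fails rather than proceeding with placeholder values. A sketch using the dependency block's mock_outputs_allowed_terraform_commands attribute, with the same illustrative values as above:

```hcl
dependency "vpc" {
  config_path = "../vpc"

  mock_outputs = {
    vpc_id     = "vpc-mock123"
    subnet_ids = ["subnet-mock1", "subnet-mock2"]
  }

  # Mocks are used only for these commands; apply requires real outputs
  mock_outputs_allowed_terraform_commands = ["validate", "plan"]
}
```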

The run-all Complexity

The run-all commands execute across multiple modules, and several of their behaviors are easy to misread:

run-all plan: If a dependency has unapplied changes, dependent modules are planned against its currently applied outputs (or its mock outputs), not against the pending changes
run-all destroy: Destroys every module found under the working directory, in reverse dependency order; to remove a single module, run terragrunt destroy inside that module's directory instead
run-all show -json: Outputs concatenated JSON (not a single valid JSON document), making parsing difficult

For safety, use flags to control parallelism and error handling:

# Sequential execution for critical operations
terragrunt run-all apply --terragrunt-parallelism 1

# Continue on errors with caution
terragrunt run-all plan --terragrunt-ignore-dependency-errors

Performance Optimization

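Terragrunt's --dependency-fetch-output-from-state flag (spelled --terragrunt-fetch-dependency-output-from-state in releases that use the --terragrunt- prefix) speeds up dependency resolution for S3 backends by reading state directly instead of invoking terraform output: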

terragrunt run-all plan --dependency-fetch-output-from-state

This can significantly reduce execution time in large deployments.

Remote State Management

Automatic Backend Configuration

Terragrunt generates backend configurations automatically based on remote_state blocks, eliminating manual backend setup:

# Root configuration generates backend.tf
locals {
  # Region is defined once here and reused below
  state_region = "us-east-1"
}

remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket         = "my-tfstate-${get_aws_account_id()}-${local.state_region}"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = local.state_region
    encrypt        = true
    dynamodb_table = "my-tfstate-lock-${get_aws_account_id()}"
    s3_bucket_tags = {
      owner = "infra-team"
      name  = "Terraform State"
    }
  }
}

Limitations and Workarounds

Automatic backend creation has limitations:

S3 auto-created log buckets may lack stringent security defaults
disable_init can inadvertently disable all backend initialization
Fine-grained IAM or KMS key customization isn't fully supported
Azure backends don't support auto-creation; storage accounts must be pre-created

For production environments requiring tight security controls, consider managing state infrastructure separately outside of Terragrunt's auto-generation.
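
A minimal sketch of that separation, assuming the bucket and lock table already exist under the placeholder names shown: a generate block writes backend.tf as plain text, so Terragrunt does not attempt to create any state infrastructure itself.

```hcl
generate "backend" {
  path      = "backend.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
terraform {
  backend "s3" {
    # Placeholder names for resources created outside Terragrunt
    bucket         = "my-precreated-tfstate-bucket"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "my-precreated-lock-table"
  }
}
EOF
}
```

Terragrunt still interpolates path_relative_to_include() inside the heredoc, so each module keeps its own state key.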

Multi-Account State Management

For cross-account deployments, explicitly configure role assumptions:

remote_state {
  backend = "s3"
  config = {
    bucket         = "terraform-state-account-b"
    key            = "${path_relative_to_include()}/terraform.tfstate"
    role_arn       = "arn:aws:iam::ACCOUNT_B:role/TerraformRole"
    region         = "us-east-1"
  }
}

Environment Management

Hierarchical Variable Inheritance

Terragrunt supports layered variable definition, allowing broad defaults with environment-specific overrides. Because an included file cannot itself contain an include block, the most specific configuration includes each layer directly; inputs are merged, with the child's values taking precedence on conflicts:

# Root (root.hcl): common variables
inputs = {
  project_name      = "myapp"
  team              = "platform"
  enable_monitoring = true
}

# Region level (region.hcl): region-specific values, no include block of its own
inputs = {
  aws_region = "us-east-1"
}

# Environment level (terragrunt.hcl): includes both layers, then adds its own values
include "root" { path = find_in_parent_folders("root.hcl") }
include "region" { path = find_in_parent_folders("region.hcl") }
inputs = {
  environment              = "production"
  instance_count           = 20
  enable_high_availability = true
}

Using locals for Computed Values

Locals enable complex computations and conditionals:

locals {
  environment = read_terragrunt_config(find_in_parent_folders("env.hcl"))
  region_config = read_terragrunt_config(find_in_parent_folders("region.hcl"))

  instance_type = local.environment.inputs.environment == "production" ? "t3.large" : "t3.medium"
  tags = merge(
    local.environment.inputs.common_tags,
    { Environment = local.environment.inputs.environment }
  )
}

inputs = {
  instance_type = local.instance_type
  tags = local.tags
}
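
For the conditional above to evaluate, env.hcl has to expose environment and common_tags under inputs. The article does not show that file, so the following is only an assumed shape with illustrative values:

```hcl
# env.hcl (assumed contents, read by read_terragrunt_config above)
inputs = {
  environment = "production"

  common_tags = {
    Project = "myapp"
    Team    = "platform"
  }
}
```

With these values, local.instance_type resolves to "t3.large" and local.tags gains an Environment = "production" entry alongside the two common tags.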

Multi-Region and Multi-Account Patterns

Organizing infrastructure across regions and accounts:

environments/
├── us-east-1/
│   ├── region.hcl
│   ├── dev/
│   │   ├── terragrunt.hcl
│   │   ├── vpc/
│   │   └── eks/
│   └── prod/
├── eu-west-1/
└── ap-southeast-1/

Each region can have region-specific configurations while inheriting organization-wide defaults.
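
As a sketch of how a module might pick up its region from this layout, assuming region.hcl stores the value in a local named aws_region (a hypothetical name) and the module lives at the path shown in the comment:

```hcl
# environments/us-east-1/region.hcl
locals {
  aws_region = "us-east-1"
}

# environments/us-east-1/dev/vpc/terragrunt.hcl
locals {
  # Parse the nearest region.hcl above this directory
  region = read_terragrunt_config(find_in_parent_folders("region.hcl"))
}

inputs = {
  aws_region = local.region.locals.aws_region
}
```

The two sections belong in separate files; they are shown together only to make the lookup visible. Moving the same module directory under eu-west-1/ would change the value without editing the module's configuration.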

Common Issues and Solutions

1. Performance Regressions (v0.50.15+)

Recent versions show significant slowdowns (15x in some cases) due to O(n²) complexity in locals evaluation:

Symptom: terragrunt run-all plan taking 8+ minutes for 30-50 modules

Solution:

Downgrade to v0.48.x or earlier if using affected versions
Avoid reading parent terragrunt configs repeatedly; cache results in locals
Consider upgrading to latest version when performance patches are released

2. Path Resolution Failures

Incorrect path references are a leading cause of Terragrunt issues:

# Incorrect - breaks when executed from different directories
locals {
  config_path = "./config.yaml"  # ❌ Relative path doesn't work
}

# Correct - uses Terragrunt execution context
locals {
  config_path = "${get_parent_terragrunt_dir()}/config.yaml"  # ✓
}

Key functions:

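get_terragrunt_dir(): Directory containing the terragrunt.hcl currently being evaluated
get_parent_terragrunt_dir(): Directory containing the parent configuration pulled in through include
find_in_parent_folders(): Searches upward from the current directory and returns the absolute path of the first matching file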
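
A short sketch of how these functions combine with file() and yamldecode() to load settings regardless of where the command is run. The file name common.yaml is hypothetical; config.yaml is assumed to sit next to the module's terragrunt.hcl:

```hcl
locals {
  # config.yaml located next to this module's terragrunt.hcl
  module_settings = yamldecode(file("${get_terragrunt_dir()}/config.yaml"))

  # common.yaml (hypothetical) found by walking up the directory tree
  shared_settings = yamldecode(file(find_in_parent_folders("common.yaml")))
}
```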

3. CI/CD Integration Challenges

GitHub Actions:

- uses: hashicorp/setup-terraform@v2
  with:
    terraform_version: 1.5.0
    terraform_wrapper: false  # Critical for Terragrunt

Atlantis integration requires a custom Docker image that includes Terragrunt:

FROM runatlantis/atlantis:latest
# Switch to root for the install, then back to the unprivileged atlantis user
USER root
RUN curl -L https://github.com/gruntwork-io/terragrunt/releases/download/v0.50.0/terragrunt_linux_amd64 \
    -o /usr/local/bin/terragrunt && \
    chmod +x /usr/local/bin/terragrunt
USER atlantis

Note: Terraform Cloud (TFC) doesn't natively support Terragrunt; organizations using TFC must build custom runners or wrapper scripts.

4. State Locking Race Conditions

Parallel execution can trigger state lock failures:

Error: Error locking state: Error acquiring the state lock
ConditionalCheckFailedException: The conditional request failed

Solution: Reduce Terragrunt's parallelism for operations with shared dependencies, or have Terraform wait for a held lock instead of failing immediately:

# Sequential execution for safety
terragrunt run-all apply --terragrunt-parallelism 1

# Or configure a lock timeout in terragrunt.hcl
terraform {
  extra_arguments "retry_lock" {
    commands  = get_terraform_commands_that_need_locking()
    arguments = ["-lock-timeout=20m"]
  }
}

5. Cross-Account AWS Authentication

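Cross-account setups need an explicit role for each account involved. Here the S3 backend assumes a role in the account that holds the state, while the top-level iam_role attribute tells Terragrunt which role to assume before it invokes Terraform:

remote_state {
  backend = "s3"
  config = {
    # bucket, key, and region omitted for brevity
    role_arn = "arn:aws:iam::ACCOUNT_B:role/TerraformRole"
  }
}

# Role Terragrunt assumes for managing resources in account A
iam_role = "arn:aws:iam::ACCOUNT_A:role/ResourceRole"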

6. Memory Usage at Scale

Large deployments can consume excessive RAM due to repeated config parsing:

Symptoms: 16GB+ RAM usage for 50 modules

Solutions:

Use provider cache to avoid re-downloading: --terragrunt-provider-cache
Minimize deep include chains and repeated read_terragrunt_config() calls
Consider reducing module count through composition

Terragrunt vs Atmos Comparison

While both tools enhance Terraform orchestration, they differ significantly in approach:

Aspect	Terragrunt	Atmos
Config Language	HCL	YAML
Structure	File hierarchy-based	Component + Stack-based
Inheritance	include blocks, find_in_parent_folders	Deep YAML merging, imports
Learning Curve	Moderate	Steeper (new paradigm)
Flexibility	High (for experienced teams)	High (structured approach)
Tooling Integration	Minimal; mostly CLI	Broader Cloud Posse ecosystem
Community	Large, mature	Growing, Cloud Posse-centric
Operational Overhead	Self-managed	Self-managed (more extensive)

Atmos is particularly strong for teams invested in Cloud Posse components and infrastructure patterns. Terragrunt is better for teams already using Terraform and wanting minimal tooling overhead.

Best Practices for 2026

1. Keep Configurations Simple and Shallow

Avoid deep include hierarchies. Maintain no more than 3 levels:

root.hcl (remote_state, terraform block)
  └── region.hcl (region-specific settings)
      └── terragrunt.hcl (module instantiation)

2. Use explicit find_in_parent_folders()

Always specify the file name:

include "root" {
  path = find_in_parent_folders("root.hcl")  # Not just find_in_parent_folders()
}

3. Minimize Locals Re-evaluation

Cache file reads to avoid repeated parsing:

locals {
  root_config = read_terragrunt_config(find_in_parent_folders("root.hcl"))
}

inputs = merge(
  local.root_config.inputs,
  { environment = "prod" }
)

4. Implement Provider Caching

For multi-module operations, use provider caching:

terragrunt run-all plan --terragrunt-provider-cache

5. Use Mock Outputs Carefully

Mock outputs help with planning but can mask dependency issues. Validate mocks match reality:

mock_outputs = {
  vpc_id = "vpc-mock123"  # Update when actual IDs change
}

6. Enforce Sequential Execution for Destructive Operations

Prevent race conditions and unexpected cascading destruction:

terragrunt run-all destroy --terragrunt-parallelism 1

7. Document Variable Inheritance

Clearly document where variables originate and can be overridden. Use comments:

# From root: project_name, team
# From region: aws_region, availability_zones
# Override here: instance_count, environment
inputs = {
  instance_count = 5
}

8. Monitor and Alert on Performance

Track terragrunt run-all execution times and alert on regressions. Performance degradation often indicates configuration or version issues.

9. Consider Platform Alternatives for Enterprise Use Cases

For organizations requiring:

Tight RBAC and governance
Integrated cost estimation and policy enforcement
Managed CI/CD and GitOps workflows
Unified environment promotion
Built-in team collaboration

Platforms like Scalr, Terraform Cloud, or Env0 provide these capabilities out-of-the-box, reducing operational overhead compared to self-managed Terragrunt.

10. Test Module Compositions Independently

While Terragrunt orchestrates modules, test each module in isolation to catch issues early:

cd components/vpc
terraform init
terraform plan

This ensures modules remain reusable and independently testable.

Conclusion

Terragrunt is a powerful tool for managing Terraform at scale, offering DRY configuration patterns, dependency orchestration, and environment management that vanilla Terraform struggles with. However, it introduces its own complexity layer that requires careful management.

The best outcomes come from teams that:

Keep configurations simple and well-documented
Understand the performance characteristics of their setup
Implement gradual, measured adoption rather than applying Terragrunt to all projects immediately
Monitor for performance regressions and stay current with releases
Consider platform alternatives if operational overhead becomes unsustainable

For many organizations, Terragrunt strikes the right balance between flexibility and structure. For others, managed IaC platforms provide a more streamlined path forward. Evaluate your team's needs, existing expertise, and operational constraints when deciding whether Terragrunt is the right fit.