Structuring Terraform and OpenTofu: A Platform Engineer's Four-Part Guide

Part 3 of the Platform Engineer’s guide shows how to layer Terraform/OpenTofu stacks across envs, link modules via remote state, and keep configs lean.

Part 3: Practical Code Organization and Environmental Strategies

In the previous parts of this series, we laid the conceptual groundwork for robust Infrastructure as Code (IaC). Part 1 explained why structure matters and introduced the foundational elements. Part 2 covered module design and the strategic choice between monorepos and polyrepos. Part 3 turns those concepts into practical approaches for organizing Terraform and OpenTofu code on the filesystem and for managing multiple environments and their state.

1. Organizing Your Code: Practical Folder Structures

How you arrange your directories and files significantly impacts the clarity, maintainability, and scalability of your IaC. While there's no one-size-fits-all solution, several common and effective patterns have emerged. The ideal structure often depends on your chosen repository strategy (monorepo vs. polyrepo), the complexity of your infrastructure, and your team's workflow.

  • Common Folder Structure Patterns (directory layouts for both patterns follow this list):
    • Environment-First: one top-level directory per environment, each with its own root configuration and state.
      • Pros: Clear separation of concerns per environment; easy to manage environment-specific configurations and state.
      • Cons: Can lead to some duplication of main.tf structure if not carefully managed with modules; deploying a single component across all environments requires changes in multiple directories.
    • Component-First: one top-level directory per service or infrastructure component, with environments handled inside it.
      • Pros: Good for service-oriented architectures; promotes component ownership.
      • Cons: Managing environment-specific nuances within each component can become complex if not handled with clear variable strategies or workspace configurations.
    • Hybrid Approaches: Many organizations adopt a hybrid, for instance, organizing by business unit, then by application, then by environment. The key is consistency and clarity.
  • Structuring Module Sources:
    • Local Modules: Often placed in a top-level modules/ directory within the same repository. Root configurations then reference these using relative paths (e.g., source = "../../modules/vpc").
    • Remote Modules: If modules are in separate repositories (polyrepo for modules) or a module registry, root configurations will reference them using appropriate source strings (e.g., Git URLs, registry addresses).
  • Interaction with Repository Strategy:
    • Monorepo: All the above structures (environment-first, component-first) can exist within a single monorepo. The modules/ directory would also reside here.
    • Polyrepo:
      • Each environment might be its own repository.
      • Each component/service might be its own repository.
      • Reusable modules would typically each reside in their own dedicated repository to allow independent versioning and release cycles.
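
The module source styles described above might look like this in a root configuration. This is a minimal sketch: the Git URL, the tag, and the input values are placeholders, and the inputs of the local module are omitted because they depend on how that module is written.

```hcl
# Local module: relative path from environments/dev/ to the shared modules/ directory.
module "vpc_local" {
  source = "../../modules/vpc"
  # Inputs omitted; they depend on the variables the local module declares.
}

# Remote module from a Git repository, pinned to a tag.
# The URL and tag are hypothetical placeholders.
module "vpc_git" {
  source = "git::https://example.com/platform/terraform-modules.git//vpc?ref=v1.2.0"
}

# Remote module from the public registry, pinned with a version constraint.
# The "version" argument is only valid for registry sources.
module "vpc_registry" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "example-vpc"
  cidr = "10.0.0.0/16"
}
```

Pinning remote modules to a tag or version constraint is what makes the independent release cycles of a polyrepo module setup safe to consume.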

Component-First (Top-Level Components/Services): Organizes code by logical service or infrastructure component, with environments as subdirectories or managed via workspaces/variable files.

├── components/
│   ├── networking/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   ├── dev.tfvars
│   │   ├── prod.tfvars
│   │   └── backend-configs/ # Or manage backend per workspace
│   │       ├── dev.s3.tfbackend
│   │       └── prod.s3.tfbackend
│   ├── application_A/
│   │   └── ... (similar structure)
│   └── database_cluster/
│       └── ...
└── modules/
    └── ...
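
With this layout, one way to keep a single root configuration per component is to leave the backend block empty and supply its settings at init time. The following is a sketch that assumes an S3 backend; the bucket name, key, and region are placeholders.

```hcl
# components/networking/main.tf
terraform {
  # Partial configuration: the settings are supplied at init time, e.g.
  #   terraform init -backend-config=backend-configs/dev.s3.tfbackend
  backend "s3" {}
}

# backend-configs/dev.s3.tfbackend would then hold only the settings
# (all values below are hypothetical placeholders):
#
#   bucket = "example-terraform-state"
#   key    = "networking/dev/terraform.tfstate"
#   region = "us-east-1"
```

Environment values are selected separately, for example with -var-file=dev.tfvars on plan and apply. Switching environments in the same working directory means re-running init with the other backend file, typically with -reconfigure.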

Environment-First (Top-Level Environments): A common approach, especially for managing distinct deployment environments.

├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   ├── terraform.tfvars
│   │   └── backend.tf  # Specific backend config for dev state
│   ├── staging/
│   │   └── ... (similar structure)
│   └── prod/
│       └── ... (similar structure)
├── modules/
│   ├── vpc/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   └── ec2_instance/
│       └── ...
└── global/ # Optional: for resources shared across all environments
    ├── iam_roles/
    │   └── ...
    └── s3_buckets_shared/
        └── ...
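
In this layout each environment's main.tf can stay thin, mostly wiring shared modules together. Below is a sketch of what environments/dev/main.tf might contain. The module input and output names (cidr_block, private_subnet_ids, subnet_id, instance_type) are assumptions about how the vpc and ec2_instance modules could be written, and the variable declarations would normally live in variables.tf.

```hcl
# environments/dev/main.tf (variables shown inline for brevity)
variable "vpc_cidr" {
  type        = string
  description = "CIDR range for the dev VPC; the value comes from terraform.tfvars"
}

variable "instance_type" {
  type    = string
  default = "t3.micro"
}

module "vpc" {
  source = "../../modules/vpc"

  cidr_block = var.vpc_cidr # hypothetical module input
}

module "app_server" {
  source = "../../modules/ec2_instance"

  # Hypothetical module inputs; private_subnet_ids is an assumed output of the vpc module.
  subnet_id     = module.vpc.private_subnet_ids[0]
  instance_type = var.instance_type
}
```

The staging and prod directories would hold the same wiring with different terraform.tfvars values, which keeps the duplication limited to a few module calls.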

2. Environment Parity and Multi-Region/Account Deployments

Managing infrastructure consistently across multiple environments (e.g., development, staging, production) and potentially across multiple cloud regions or accounts is a core challenge that good structure helps address.

  • Strategies for Managing Configurations Across Environments:
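    • Input Variables & .tfvars Files: The most common method. Define variables in variables.tf and provide environment-specific values in separate .tfvars files (e.g., dev.tfvars, prod.tfvars). Use terraform apply -var-file="dev.tfvars" to apply an environment's values. Note that -var-file only selects input values; which state is modified is still determined by the backend configuration or the selected workspace.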
    • Workspaces (Terraform CLI Workspaces): Terraform CLI workspaces allow you to manage multiple states for the same configuration in the same backend. You can use terraform.workspace in your code to introduce conditional logic or naming differences (e.g., name = "my-resource-${terraform.workspace}"). This is often suitable for simpler environment distinctions within a single configuration directory.
    • Directory-Based Separation: As shown in the "Environment-First" folder structure, having separate directories per environment provides strong isolation but requires careful use of modules to keep code DRY.
  • Achieving Environment Parity:
    • The goal is to make non-production environments as similar to production as possible to catch issues early.
    • Use the Same Modules: Deploy the same versioned modules across all environments.
    • Configuration Over Code: Differences between environments (e.g., instance sizes, counts, feature flags) should be driven by input variables, not by forking the HCL code itself.
    • Automated Promotion: Implement CI/CD pipelines that promote the exact same codebase (with different variable files) through environments.
  • Considerations for Multi-Region and Multi-Account Deployments:
    • Provider Aliases: Use provider aliases if you need to manage resources in multiple regions or accounts within a single Terraform configuration/apply.
    • Separate Configurations/States: Often, it's cleaner to have separate root configurations (and thus separate state files) for each region or account, especially if they are largely independent. These configurations can still consume the same shared modules.
    • Organizational Units/Landing Zones: Leverage cloud provider constructs (like AWS Organizations, Azure Management Groups, Google Cloud Folders) to manage accounts and apply baseline policies, then use Terraform to deploy resources within these structures.
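
Two of the techniques above, workspace-driven differences and provider aliases, can be sketched together. This is an illustration under assumptions: workspaces named dev and prod, illustrative instance sizes and regions, and a vpc module that accepts a single aws provider.

```hcl
# Assumption: workspaces named "dev" and "prod" exist; sizes are illustrative.
locals {
  instance_type_by_env = {
    dev  = "t3.micro"
    prod = "m5.large"
  }

  # Fall back to the dev size for any other workspace (including "default").
  instance_type = lookup(local.instance_type_by_env, terraform.workspace, "t3.micro")
  name_prefix   = "my-resource-${terraform.workspace}"
}

# Default provider plus an aliased provider for a second region.
provider "aws" {
  region = "us-east-1"
}

provider "aws" {
  alias  = "west"
  region = "us-west-2"
}

# A resource opts in to the aliased provider explicitly.
resource "aws_sns_topic" "alerts_west" {
  provider = aws.west
  name     = "${local.name_prefix}-alerts"
}

# A module receives the aliased provider through the providers map.
module "vpc_west" {
  source    = "../../modules/vpc"
  providers = { aws = aws.west }
  # Inputs omitted; they depend on the variables the module declares.
}
```

The same lookup could be replaced by plain input variables set in per-environment .tfvars files, which keeps environment differences out of the HCL entirely.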

3. Robust State Management in Diverse Environments

Terraform state is critical. How you manage it across different environments, regions, and components is fundamental to safe and reliable IaC operations.

  • Separate State Files are Non-Negotiable:
    • Minimize Blast Radius: Each distinct environment (dev, staging, prod) and ideally each major, independently deployable component or region within an environment, should have its own isolated state file. This ensures that an error or corruption in one state file (e.g., during an apply) does not impact other unrelated infrastructure.
    • Avoid Monolithic State: A single state file for all your infrastructure is a significant risk and operational bottleneck.
  • Utilize Remote Backends:
    • As a general best practice, always store Terraform state files remotely, not on local developer machines or in version control.
    • Common Backends: AWS S3, Azure Blob Storage, Google Cloud Storage, HashiCorp Consul, HCP Terraform (formerly Terraform Cloud) and Terraform Enterprise, or other TACO (Terraform Automation and Collaboration) platforms.
    • Benefits: Enables collaboration, essential for CI/CD automation, provides durability and often versioning of state.
  • Implement State Locking:
    • Critical for team environments and automated pipelines to prevent concurrent terraform apply operations from corrupting the state file.
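    • Most remote backends offer native or companion locking mechanisms (e.g., a DynamoDB table for S3, with newer Terraform and OpenTofu releases also supporting an S3-native lock file; blob leases for Azure Blob Storage; Consul's own key/value locks). Ensure locking is configured and working.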
  • Design Logical Backend Keys/Paths:
    • The key (in S3/Azure) or path/prefix (in GCS/Consul) within your remote backend determines where the state file is stored. Structure these paths logically and consistently.
    • Example Naming Convention for S3 Keys: terraform-state/<PROJECT_NAME>/<ENVIRONMENT>/<REGION>/<COMPONENT>/terraform.tfstate (e.g., terraform-state/my-app/prod/us-east-1/vpc/terraform.tfstate)
    • This makes state files easy to locate, manage, and reason about.
  • Enable Versioning on Remote Backend Storage:
    • Most cloud object storage services (S3, Azure Blob, GCS) support object versioning. Enable it for your state bucket.
    • This allows you to roll back to a previous version of your state file after accidental deletion or corruption, which is a crucial safety net.
  • Secure Your State:
    • State files store resource attributes in plain text, which can include sensitive values such as database passwords and access keys.
    • Encrypt state at rest (most remote backends support this).
    • Strictly control access to the state storage (e.g., using IAM permissions). Only authorized users and CI/CD service principals should have read/write access.
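
Several of these practices come together in the backend block itself. The sketch below assumes an S3 backend with DynamoDB locking; the bucket and table names are hypothetical, and the key follows the naming convention shown earlier. The versioning resources would live in a separate bootstrap configuration that owns the state bucket, not in the configuration that uses it.

```hcl
# backend.tf for the prod VPC component (bucket and table names are hypothetical).
terraform {
  backend "s3" {
    bucket = "example-terraform-state"
    # terraform-state/<PROJECT_NAME>/<ENVIRONMENT>/<REGION>/<COMPONENT>/terraform.tfstate
    key     = "terraform-state/my-app/prod/us-east-1/vpc/terraform.tfstate"
    region  = "us-east-1"
    encrypt = true # server-side encryption of the state object at rest

    # State locking; the table needs a string partition key named "LockID".
    dynamodb_table = "example-terraform-locks"
  }
}

# In a separate bootstrap configuration that owns the state bucket:
resource "aws_s3_bucket" "state" {
  bucket = "example-terraform-state"
}

# Keeps prior versions of each state object so they can be restored.
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}
```

Newer Terraform and OpenTofu releases also support S3-native locking through the use_lockfile argument, so it is worth checking the backend documentation for the version you run before adding a DynamoDB table.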

By implementing practical folder structures, thoughtfully managing environmental differences, and establishing robust state management practices, you create an IaC foundation that is not only organized but also resilient and adaptable to the evolving needs of your platform.

Next in the Series (Part 4): Scaling Structures and Advanced IaC Patterns.