Custom Atlantis Workflows: Advanced Terraform Automation Guide 2025
Discover how custom Atlantis workflows unlock powerful Terraform automation—setup steps, best practices, and real-world use cases to streamline DevOps.
1. Introduction: Beyond Basic Atlantis
Terraform is the dominant language for Infrastructure as Code (IaC), and tools that smooth its day-to-day use are prized. Atlantis stands out by embedding Terraform execution directly in the version control workflow: teams run plan and apply by commenting on pull requests. This GitOps style improves collaboration and leaves an audit trail.
As infrastructure grows more complex, so does the need for more refined automation. This is where Atlantis's atlantis.yaml configuration file comes in. It lets you define custom workflows that manage complex deployment pipelines, integrate third-party tools, and enforce specific operational policies. That flexibility comes with a layer of configuration management and operational overhead that teams must plan for. For organizations that want to run Terraform at scale without going deep into YAML or maintaining the underlying automation engine, platforms that offer more structured, opinionated CI/CD for Terraform, such as Scalr, may be a compelling alternative with built-in governance and environment management.
2. The Engine Room: Understanding atlantis.yaml
The atlantis.yaml file, placed at the root of your repository, is the core of Atlantis customization. It tells Atlantis how to find projects, which commands to run, and under what conditions.
Core Structure and Top-Level Keys
Every atlantis.yaml file begins with version: 3. The main top-level keys are:
- projects: An array of the Terraform projects Atlantis should manage. Once it is set, Atlantis relies only on these entries and stops autodiscovering projects.
- workflows: A map of custom command sequences for planning and applying.
- automerge: A Boolean that enables automatic PR merging after successful applies.
- parallel_plan / parallel_apply: Booleans that run plans or applies concurrently across multiple projects.
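As a rough illustration of how these top-level keys fit together, here is a minimal skeleton. The project name, directory, and workflow name are hypothetical placeholders, not values from any real repository.

```yaml
version: 3
automerge: true        # merge the PR after successful applies
parallel_plan: true    # plan multiple projects concurrently
parallel_apply: true   # apply multiple projects concurrently
projects:
- name: example-app            # hypothetical project name
  dir: terraform/example       # hypothetical path
  workflow: example-workflow
workflows:
  example-workflow:            # hypothetical workflow name
    plan:
      steps:
      - init
      - plan
    apply:
      steps:
      - apply
```

Because projects is set here, Atlantis would work only from that list rather than autodiscovering directories.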
Defining Projects for Granular Control
Each item in the projects array describes a Terraform directory and its associated settings:
- dir: Path to the project directory, relative to the repository root.
- workspace: The Terraform workspace to use (defaults to default).
- name: A unique identifier, required when several projects share the same dir (e.g., one per workspace).
- workflow: The custom workflow to use for this project.
- autoplan: Configures automatic planning, typically through when_modified patterns.
- apply_requirements: Conditions such as approved or mergeable that must hold before an apply is allowed.
version: 3
projects:
- name: myapp-prod
  dir: terraform/myapp
  workspace: prod
  workflow: prod-workflow
  autoplan:
    when_modified:
    - "**/*.tf"
    - "prod.tfvars"
    enabled: true
  apply_requirements: [approved, mergeable, undiverged]
Crafting Custom Workflows: Plan, Apply, and Beyond
Custom workflows let you replace the default plan and apply behavior with your own stages and steps.
- Stages: Typically plan and apply.
- Steps: Either built-in Atlantis commands (init, plan, apply, show) or custom run commands.
The run step is especially powerful: it runs arbitrary shell commands and has access to environment variables such as $PLANFILE, $WORKSPACE, $PROJECT_NAME, and $PULL_NUM.
workflows:
  prod-workflow:
    plan:
      steps:
      - init
      - run: echo "Running pre-production checks..."
      - plan:
          extra_args: ["-var-file=prod.tfvars", "-out=$PLANFILE"]
    apply:
      steps:
      - run: echo "Awaiting final sign-off for $PROJECT_NAME..."
      # Script for sign-off logic
      - run: ./scripts/await_manual_approval.sh $PULL_NUM
      - apply
      - run: echo "Production deployment of $PROJECT_NAME complete."
Managing the dependencies and execution environment for these run scripts (making sure tools are installed in the Atlantis container, handling permissions) becomes a significant concern as workflow complexity grows.
3. Advanced Use Cases in Action
Let's see how these configurations facilitate advanced automation.
Multi-Environment Deployments (Dev, Staging, Prod)
A common setup deploys a single Terraform codebase to multiple environments using different workspaces and variable files. Each environment can be a separate Atlantis project with its own workflow and apply requirements.
atlantis.yaml for Multi-Environment:
version: 3
projects:
- name: myapp-dev
  dir: terraform/myapp
  workspace: dev
  workflow: dev-workflow
  autoplan:
    when_modified: ["**/*.tf", "dev.tfvars", "../../modules/shared/**/*.tf"]
    enabled: true
- name: myapp-staging
  dir: terraform/myapp
  workspace: staging
  workflow: staging-workflow
  apply_requirements: [approved]
  autoplan:
    when_modified: ["**/*.tf", "staging.tfvars", "../../modules/shared/**/*.tf"]
    enabled: true
- name: myapp-prod
  dir: terraform/myapp
  workspace: prod
  workflow: prod-workflow # Possibly with manual approval steps
  apply_requirements: [approved, mergeable, undiverged]
  autoplan:
    when_modified: ["**/*.tf", "prod.tfvars", "../../modules/shared/**/*.tf"]
    enabled: true
workflows:
  dev-workflow:
    plan:
      steps:
      - init
      # The var file lives in the project dir, matching the when_modified entry above
      - plan: {extra_args: ["-var-file=dev.tfvars", "-out=$PLANFILE"]}
    apply:
      steps: [apply]
  # staging-workflow would be defined similarly,
  # potentially with more steps (e.g., security scans, notifications).
  prod-workflow:
    plan:
      steps:
      - init
      - run: ./scripts/tfsec_scan.sh .
      - plan: {extra_args: ["-var-file=prod.tfvars", "-out=$PLANFILE"]}
    apply:
      steps:
      - run: ./scripts/prod_approval_gate.sh $PULL_NUM
      - apply
      - run: ./scripts/notify_slack.sh "PROD apply for $PROJECT_NAME complete."
This approach is flexible, but defining and maintaining distinct yet similar workflows leads to boilerplate. Tools with hierarchical configuration models, where environment-specific settings are inherited and overridden more cleanly, can simplify this. Scalr, for example, offers an environment hierarchy that can make managing variables and policies across dev, staging, and prod more straightforward without extensive YAML repetition.
Integrating Custom Scripts: Linters and Security Scanners (tfsec, Checkov)
run steps are well suited to embedding quality gates and security checks. A non-zero exit code from a script stops the workflow.
Example: tfsec Pre-Plan Scan
workflows:
  secure-workflow:
    plan:
      steps:
      - init
      # tfsec security scan
      - run:
          command: |
            echo "Running tfsec scan..."
            # tfsec will exit non-zero if issues are found, halting the workflow
            tfsec .
      - plan: {extra_args: ["-out=$PLANFILE"]}
    apply:
      steps: [apply]
To use tools like tfsec or Checkov, they must be present in the Atlantis execution environment (e.g., your Atlantis Docker image). Managing these dependencies and keeping the scripts robust is important.
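The heading also mentions Checkov, so here is a comparable sketch for it. It assumes the checkov CLI is installed in the Atlantis image; the workflow name checkov-workflow is hypothetical.

```yaml
workflows:
  checkov-workflow:            # hypothetical workflow name
    plan:
      steps:
      - init
      - run:
          command: |
            echo "Running Checkov scan..."
            # checkov exits non-zero when a check fails, which halts the workflow
            checkov -d . --quiet
      - plan
    apply:
      steps: [apply]
```

Checkov's --soft-fail flag makes the scan report findings without failing the step, which can help while a team is still triaging existing issues.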
Conditional Logic: when_modified and PR Label Strategies
PR Label Logic (via Custom Scripts): Atlantis doesn't natively react to PR labels. A run step can, however, call a script that queries your VCS (e.g., the GitHub API) for labels and exits based on them, effectively gating the workflow. This requires securely managing a VCS token with suitable permissions.
Conceptual Script (check_label.sh):
#!/bin/bash
# Needs GITHUB_TOKEN, PULL_NUM, BASE_REPO_OWNER, BASE_REPO_NAME env vars
REQUIRED_LABEL="ready-for-deploy"
# ... (curl GitHub API to fetch labels for $PULL_NUM) ...
# LABELS is expected to be set by the elided step above
if [[ $LABELS == *"$REQUIRED_LABEL"* ]]; then
  echo "Label '$REQUIRED_LABEL' found. Proceeding."
  exit 0
else
  echo "Label '$REQUIRED_LABEL' not found. Halting."
  exit 1
fi
Workflow Integration:
workflows:
  label-gated-workflow:
    plan:
      steps:
      - run: ./scripts/check_label.sh # Gate based on label
      - init
      - plan: {extra_args: ["-out=$PLANFILE"]}
This approach is powerful, but it adds considerable scripting complexity and external dependencies. Platforms with integrated policy engines (such as OPA support in Scalr) can provide more robust and governable ways to enforce such conditional logic without custom scripting against VCS APIs.
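One way the label gate could look when written inline, rather than as a separate script, is sketched below. It is an illustration only: it assumes curl and jq are installed in the Atlantis image, that GITHUB_TOKEN is exported in the Atlantis server environment, and that the repository is hosted on github.com. The workflow name is hypothetical; the label name is the one used in the conceptual script.

```yaml
workflows:
  label-gated-inline:          # hypothetical workflow name
    plan:
      steps:
      - run:
          command: |
            # List the PR's labels via the GitHub REST API; jq -e exits
            # non-zero unless one of them is named "ready-for-deploy".
            curl -fsS \
              -H "Authorization: Bearer $GITHUB_TOKEN" \
              "https://api.github.com/repos/$BASE_REPO_OWNER/$BASE_REPO_NAME/issues/$PULL_NUM/labels" \
              | jq -e 'any(.[]; .name == "ready-for-deploy")' > /dev/null
      - init
      - plan
```

If the API call itself fails, curl produces no output and jq also exits non-zero, so the gate fails closed instead of letting the plan proceed.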
when_modified: The autoplan.when_modified key uses glob patterns to trigger plans only when relevant files change. This is especially useful in monorepos to avoid unnecessary plan runs.
autoplan:
  when_modified:
  - "**/*.tf" # Files in this project's dir
  - "../../modules/network/**/*.tf" # Files in a shared module
  - ".terraform.lock.hcl"
  enabled: true
4. Navigating the Labyrinth: Debugging Custom Workflows
Troubleshooting custom Atlantis workflows involves checking several areas:
- Atlantis Server Logs: Enable debug logging (--log-level debug) for detailed output on config parsing, project discovery, and step execution.
- atlantis.yaml Syntax: Use a YAML linter. Common problems include indentation (spaces, not tabs) and incorrect nesting.
- Custom Script Failures: Check exit codes. Scripts need the correct permissions, and all dependencies (linters, CLIs) must be present in the Atlantis environment. Test scripts in an environment that mirrors Atlantis.
- when_modified Issues: Confirm that glob patterns are accurate and that relative paths to shared modules are correct.
- Server-Side Overrides: Settings in atlantis.yaml may be rejected or ignored if they are not permitted by allowed_overrides in the server-side repos.yaml.
The feedback loop for debugging complex run steps inside Atlantis can be slow. An environment that offers better visibility into execution steps, or makes automation scripts easier to test, can help.
5. Taming Complexity: Best Practices for atlantis.yaml
As configurations expand, maintainability becomes a chief concern.
Monorepo Management Strategies
- Granular Projects: Define each independently deployable component as its own project.
- Optimized when_modified: Essential for avoiding plan storms. Consider the server-side --autoplan-modules flag for automatic module dependency tracking, but understand how it behaves before relying on it.
- execution_order_group: Controls plan/apply ordering between dependent projects when a command runs across multiple projects.
- Parallelism: Use parallel_plan: true and parallel_apply: true to speed up operations.
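As a sketch of how ordering and parallelism might be combined, the fragment below places a network project ahead of an application project. The project names and directories are hypothetical, and the sketch assumes Atlantis processes lower execution_order_group values first.

```yaml
version: 3
parallel_plan: true
parallel_apply: true
projects:
- name: network                  # hypothetical: must exist before the app
  dir: terraform/network
  execution_order_group: 1
- name: app                      # hypothetical: depends on the network project
  dir: terraform/app
  execution_order_group: 2
```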
The sheer number of project definitions and intricate when_modified paths in a large monorepo can make the atlantis.yaml file hard to manage. Some teams choose to generate this file, which adds another layer of tooling.
YAML Anchors for Readability
YAML anchors (&) and aliases (*) can reduce repetition for common step sequences or apply_requirements.
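workflows:
  first-workflow:
    plan:
      # Anchor the list at its first use; a separate top-level key
      # may be rejected as unknown when Atlantis parses the file.
      steps: &common_plan_steps
      - init
      - run: ./scripts/lint.sh
      - plan: {extra_args: ["-out=$PLANFILE"]}
  my-workflow:
    plan:
      steps: *common_plan_steps # Reuse anchored steps (the alias is the whole list)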
While helpful, heavy use of anchors can obscure the final configuration, making a project's exact behavior harder to trace.
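The same technique can be applied to apply_requirements. The sketch below reuses the staging and prod project names from the multi-environment example; the anchor name strict_requirements is hypothetical.

```yaml
version: 3
projects:
- name: myapp-staging
  dir: terraform/myapp
  workspace: staging
  # Define the list once, at its first use
  apply_requirements: &strict_requirements [approved, mergeable, undiverged]
- name: myapp-prod
  dir: terraform/myapp
  workspace: prod
  # Reuse the same list by alias
  apply_requirements: *strict_requirements
```

Changing the anchored list then changes the requirements for both projects at once, which is convenient but also easy to overlook in review.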
Server-Side Governance with repos.yaml
For larger setups, the server-side repos.yaml (passed via --repo-config) is essential for centralized control.
- allowed_overrides: Specifies which atlantis.yaml keys can be set at the repo level.
- allowed_workflows: Restricts which server-defined workflows repos can select.
- allow_custom_workflows: true/false: A significant security setting. If true, repos can define arbitrary run steps. Keep it at false and manage workflows centrally unless you have very strong trust and review processes.
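A minimal sketch of such a server-side file might look like the following. The workflow names standard and secure are hypothetical, and the regex /.*/ is assumed to match every repository.

```yaml
repos:
- id: /.*/                          # apply to all repositories
  workflow: standard                # default workflow for every project
  apply_requirements: [approved, mergeable]
  allowed_overrides: [workflow]     # repos may pick a workflow, nothing else
  allowed_workflows: [standard, secure]
  allow_custom_workflows: false     # repos cannot define their own run steps
workflows:
  standard:
    plan:
      steps: [init, plan]
    apply:
      steps: [apply]
  secure:
    plan:
      steps:
      - init
      - run: tfsec .                # assumes tfsec is installed in the Atlantis image
      - plan
    apply:
      steps: [apply]
```

With this in place, a repo-level atlantis.yaml could set workflow: secure on a project but could not change apply_requirements or introduce new run steps.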
Effectively, repos.yaml lets a platform team provide a "paved road" for Terraform automation. Managing this central configuration and the interplay between server-side and repo-side settings, however, requires careful planning. This is another area where a platform like Scalr, with built-in role-based access control (RBAC) and policy enforcement (e.g., via OPA), can deliver a more integrated governance model without depending on multiple layers of YAML configuration.
6. Summary: Key Atlantis Workflow Capabilities
Feature/Component | Advanced Capability | Key Configuration | Considerations/Complexity |
---|---|---|---|
Projects | Granular control per environment/component | projects, dir, workspace, name | Can lead to verbose atlantis.yaml files. |
Workflows | Custom plan/apply stages and steps | workflows | Managing script dependencies, permissions, and execution environment. |
run steps | Integrate any CLI tool (linters, scanners, notifiers) | run | Tooling must be in Atlantis image; script robustness; exit code handling. |
apply_requirements | Enforce PR approvals, mergeability | apply_requirements | Relies on VCS integration and PR states. |
when_modified | Conditional planning based on file changes | autoplan.when_modified | Crafting accurate globs, especially for shared modules; can be complex in large monorepos. |
PR Label Logic | Gate workflows based on PR labels (via custom script) | run step calling a custom script | Requires scripting, VCS API interaction, and secure token management. |
YAML Anchors | Reduce duplication in atlantis.yaml | & and * | Can reduce readability if overused or deeply nested. |
repos.yaml | Centralized server-side governance | allowed_overrides, allowed_workflows, allow_custom_workflows | Requires careful policy planning; managing interaction between server and repo configs. |
7. Conclusion: Scaling Your Terraform Automation
Atlantis provides a potent, open-source base for Terraform automation within the PR workflow. Its custom workflow features allow teams to shape automation pipelines to advanced needs, incorporating everything from security checks to multi-environment deployment plans.
Nevertheless, as this guide shows, using this advanced functionality brings substantial configuration management, scripting, and operational responsibilities. Maintaining complex atlantis.yaml files, managing the execution environment for custom scripts, debugging elaborate workflows, and ensuring consistent governance across many repositories or a large monorepo can become major undertakings. This is no small feat.
For organizations finding themselves investing considerable effort in building and maintaining these advanced Atlantis setups, or those seeking more built-in governance, security, and scalability attributes from the start, it might be wise to assess managed Terraform automation platforms. Solutions like Scalr aim to tackle these issues by offering a more structured method with features like hierarchical environment management, integrated OPA policy enforcement, RBAC, and a focus on the end-to-end Terraform lifecycle, potentially lessening the need for extensive custom YAML and scripting.
The correct path hinges on your team's scale, expertise, and readiness to manage the underlying automation infrastructure versus utilizing a more opinionated platform.