Understanding Atlantis 101 Guide
Learn how Atlantis automates Terraform workflows—from pull-request plans to secure applies. Get setup steps, best practices, and tips in this 101 guide.
1. Introduction: The Drive for Terraform Automation
Terraform has become a cornerstone of modern infrastructure as code (IaC) practices, enabling teams to define and manage infrastructure with unparalleled consistency and version control. But as infrastructure complexity grows and team collaboration intensifies, the manual execution of Terraform commands can become a bottleneck, prone to errors and inconsistencies. This is where automation tools step in, promising to streamline workflows, enhance governance, and improve operational efficiency. One such popular open-source tool is Atlantis, designed to bring Terraform automation directly into your version control system's pull request (PR) process.
2. What is Atlantis? Core Functionality
Atlantis is an automation tool specifically built to integrate Terraform operations into pull request workflows. It acts as a server that listens for webhook events from your chosen Version Control System (VCS) – like GitHub, GitLab, or Bitbucket. When a developer opens or updates a PR with changes to Terraform code, Atlantis can automatically run `terraform plan` and post the output as a comment in the PR. This provides immediate visibility into the proposed changes. Once reviewed and approved, a simple PR comment like `atlantis apply` can trigger Atlantis to execute `terraform apply`, implementing the infrastructure modifications. Being self-hosted, Atlantis runs on infrastructure you provision and manage, offering a high degree of control over its environment.
3. Problems Atlantis Aims to Solve
Without a dedicated automation layer, Terraform collaboration can encounter several common pain points:
- Decentralized Execution: Terraform run from individual machines can lead to environment drift and inconsistent results.
- Manual Processes: Running plans, sharing outputs, and applying changes manually is time-consuming and error-prone.
- Limited Visibility: Tracking who planned what, and what the expected outcome was, can be challenging.
- State Management Conflicts: Concurrent operations without proper locking can jeopardize Terraform state integrity.
- Onboarding Overhead: Requiring every contributor to have a fully configured local Terraform setup can be a barrier.
Atlantis seeks to address these by centralizing Terraform execution within a consistent, PR-driven workflow.
4. Key Benefits of an Atlantis-Driven Workflow
Integrating Atlantis can bring several advantages to a Terraform practice:
- Enhanced Collaboration: Plan outputs directly in PRs facilitate focused discussions.
- Increased Efficiency: Automation of `plan` and `apply` speeds up deployment cycles.
- Improved Consistency: Centralized execution minimizes "works on my machine" issues.
- Better Governance & Auditability: PRs provide a natural audit trail for all infrastructure changes.
- Effective State Locking: Atlantis implements its own locking to prevent conflicting operations on the same project/workspace.
- VCS Integration: Leverages familiar developer workflows without needing a separate UI.
5. The Self-Managed Journey: Setting Up Your Atlantis Server
Implementing Atlantis involves setting up and maintaining the server yourself. This provides flexibility but also means taking on the operational responsibility.
Prerequisites: Server, Git, Terraform, VCS
- Server Infrastructure: You'll need a server (VM, container host) to run Atlantis. This requires careful consideration of OS (typically Linux), CPU, RAM (e.g., 1-2 vCPUs, 2-8GB RAM as a starting point), and disk space (5-50GB for Git clones, plan files). Network configuration is also key: the server must be accessible from your VCS for webhooks and be able to call out to the VCS API.
- Git and Terraform: Git and the desired Terraform versions must be installed and accessible on the Atlantis server. Atlantis can manage Terraform binary downloads, which is a helpful feature.
- Version Control System (VCS): A configured Git repository (e.g., on GitHub) is essential. Atlantis needs credentials (a Personal Access Token or, preferably, a GitHub App) to interact with your repositories.
- Terraform State Backend: Atlantis mandates the use of a remote state backend (like S3, Azure Blob, GCS). `local` state is not supported due to Atlantis's operational model.
Deployment: Docker and Kubernetes Options
Atlantis is commonly deployed using Docker or Kubernetes.
- Kubernetes: A Helm chart is available for Kubernetes deployments, often preferred for scalability and resilience. This involves managing Kubernetes manifests, secrets, services, and potentially ingresses.
- Docker: The official `ghcr.io/runatlantis/atlantis` image simplifies deployment. A typical `docker run` command involves setting several environment variables for configuration:
# Example using a GitHub personal access token. For a GitHub App, replace
# ATLANTIS_GH_USER/ATLANTIS_GH_TOKEN with ATLANTIS_GH_APP_ID and ATLANTIS_GH_APP_KEY.
# To keep plans and locks across restarts, also add:
#   -v /path/to/atlantis-data:/atlantis-data -e ATLANTIS_DATA_DIR=/atlantis-data
docker run --name atlantis -d -p 4141:4141 \
  -e ATLANTIS_ATLANTIS_URL="<YOUR_ATLANTIS_PUBLIC_URL>" \
  -e ATLANTIS_GH_USER="<YOUR_GITHUB_USERNAME>" \
  -e ATLANTIS_GH_TOKEN="<YOUR_GITHUB_PAT>" \
  -e ATLANTIS_GH_WEBHOOK_SECRET="<YOUR_WEBHOOK_SECRET>" \
  -e ATLANTIS_REPO_ALLOWLIST="github.com/your-org/*" \
  ghcr.io/runatlantis/atlantis:latest server
Managing this container, its updates, and persistent data (if needed for plans to survive restarts) falls to the operations team.
VCS Integration: GitHub Authentication and Webhooks
Connecting Atlantis to your VCS (e.g., GitHub) is a critical step that requires careful attention to detail.
- Authentication:
  - GitHub Personal Access Token (PAT): Simpler to generate but typically grants broad permissions (e.g., `repo` scope).
  - GitHub App (Recommended): Offers more granular permissions and enhanced security. Atlantis can even guide you through creating one via its `/github-app/setup` endpoint. This process involves defining specific repository permissions (Contents, Pull Requests, Commit Statuses, etc.) and handling an App ID and private key. Ensuring these permissions are correct and the private key is securely managed is paramount.
- Webhook Configuration: GitHub (or your VCS) notifies Atlantis of PR events via webhooks. Manual configuration involves:
  - Payload URL: Must be the public URL of your Atlantis server, crucially ending in `/events` (e.g., `https://atlantis.yourdomain.com/events`). A missing `/events` is a common setup error.
  - Content Type: `application/json`.
  - Secret: A shared secret to verify webhook authenticity, configured in both GitHub and Atlantis.
  - Events: Subscribe to "Pull requests," "Issue comments," "Pushes," and "Pull request reviews."

Correctly configuring and securing this communication channel is vital for Atlantis to function.
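To make the role of the shared webhook secret concrete, here is a minimal Python sketch of how a webhook receiver can check GitHub's `X-Hub-Signature-256` header, which carries an HMAC-SHA256 of the request body. This is not Atlantis's own code (Atlantis is written in Go); the function names `sign_payload` and `is_valid_signature` are hypothetical and only illustrate the technique.

```python
import hashlib
import hmac
import secrets


def sign_payload(secret: str, body: bytes) -> str:
    """Return the header value a sender would compute for this body and secret."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


def is_valid_signature(secret: str, body: bytes, header_value: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected, header_value)


if __name__ == "__main__":
    # A random value to paste into both the GitHub webhook form and
    # ATLANTIS_GH_WEBHOOK_SECRET (the 32-byte length is an arbitrary choice).
    webhook_secret = secrets.token_hex(32)
    body = b'{"action": "opened"}'
    header = sign_payload(webhook_secret, body)
    print(is_valid_signature(webhook_secret, body, header))   # True
    print(is_valid_signature("wrong-secret", body, header))   # False
```

If the secret configured in GitHub and the one given to the server differ, every delivery fails this kind of check, which is why a mismatch shows up as rejected webhooks rather than as a Terraform error.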
Cloud Credentials and Server Configuration
Atlantis itself doesn't handle cloud provider credentials directly. It relies on the execution environment of the `terraform` binary (i.e., the Atlantis server/container) having the necessary credentials.
- Methods: IAM Roles (for cloud VMs/pods), environment variables (e.g., `AWS_ACCESS_KEY_ID`), or shared credential files are common. The security and management of these credentials on the Atlantis host are your responsibility.
- Core Server Settings: Atlantis is configured via CLI flags or environment variables (e.g., `ATLANTIS_ATLANTIS_URL`, `ATLANTIS_REPO_ALLOWLIST`, `ATLANTIS_DATA_DIR`, `ATLANTIS_DEFAULT_TF_VERSION`).
The setup process, while well-documented, involves multiple components that need to be correctly configured and maintained by the user.
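As an illustration of what a setting like `ATLANTIS_REPO_ALLOWLIST` expresses, the sketch below checks a repository against a comma-separated list in which `*` stands for any run of characters. It is a simplified model written for this guide, not Atlantis's implementation: the function name `repo_allowed` is hypothetical, and details such as exclusions or case handling are deliberately left out.

```python
import re


def repo_allowed(repo: str, allowlist: str) -> bool:
    """Return True if repo matches any comma-separated allowlist entry."""
    for entry in allowlist.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Treat "*" as "any run of characters"; everything else is literal.
        pattern = ".*".join(re.escape(part) for part in entry.split("*"))
        if re.fullmatch(pattern, repo):
            return True
    return False


if __name__ == "__main__":
    allowlist = "github.com/your-org/*"
    print(repo_allowed("github.com/your-org/infra", allowlist))   # True
    print(repo_allowed("github.com/other-org/infra", allowlist))  # False
```

A repository that falls outside the list is simply ignored by the server, so an overly narrow pattern tends to look like "Atlantis never responds" from the PR side.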
6. Defining Your Infrastructure: The atlantis.yaml File
To tell Atlantis how to handle Terraform projects within a repository, you use an `atlantis.yaml` file at the repo's root. Its key functions are:
- Project Definition: Specifying directories (`dir`) and Terraform workspaces (`workspace`) for distinct projects.
- Autoplan Control: Defining when `terraform plan` runs automatically (`when_modified` file patterns).
- Terraform Versioning: Pinning projects to specific Terraform versions.
- Workflow Customization: Optionally defining custom plan/apply steps (often restricted by server-side config for security).
- Apply Requirements: Enforcing conditions like PR approval (`apply_requirements: [approved]`).
A simple `atlantis.yaml` might look like this:
version: 3
projects:
  - name: my-app-staging
    dir: infra/staging
    workspace: staging
    autoplan:
      when_modified: ["**/*.tf", "**/*.tfvars", ".terraform.lock.hcl"]
      enabled: true
    terraform_version: v1.5.0
    # apply_requirements: [approved] # Server-side config may need to allow this
While `atlantis.yaml` offers project-level flexibility, managing these files across many repositories and coordinating them with server-side configurations (if used for central governance) adds another layer of configuration management.
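To show how `when_modified` patterns interact with a project's `dir`, the following Python sketch decides whether a set of changed files should trigger an autoplan. It models the patterns as relative to the project directory by joining each pattern onto `dir` and matching against repo-root paths. The helpers `glob_to_regex` and `should_autoplan` are hypothetical names, and the glob handling is an approximation that may differ from Atlantis's real matcher in edge cases.

```python
import posixpath
import re
from typing import List


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a glob using `**`, `*`, and `?` into a regex."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")  # zero or more whole directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")  # anything within one path segment
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def should_autoplan(project_dir: str, when_modified: List[str],
                    changed_files: List[str]) -> bool:
    """Return True if any changed file (repo-root relative) matches a pattern."""
    regexes = [
        glob_to_regex(posixpath.normpath(posixpath.join(project_dir, p)))
        for p in when_modified
    ]
    return any(r.fullmatch(path) for path in changed_files for r in regexes)


if __name__ == "__main__":
    patterns = ["**/*.tf", "**/*.tfvars", ".terraform.lock.hcl"]
    print(should_autoplan("infra/staging", patterns, ["infra/staging/main.tf"]))  # True
    print(should_autoplan("infra/staging", patterns, ["infra/prod/main.tf"]))     # False
```

Under this model a pattern such as `../modules/**/*.tf` reaches outside the project directory, which is how a shared module change can be made to trigger plans for the projects that use it.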
7. The Pull Request Lifecycle with Atlantis
Once set up, the Atlantis workflow is quite intuitive:
- PR Creation/Update: A developer pushes Terraform changes and opens a PR.
- Automated `terraform plan`: Atlantis detects changes (based on `atlantis.yaml`) and runs `terraform plan`, posting the output as a PR comment.
- Review and Collaboration: The team reviews the plan directly in the PR. If changes are needed, new commits trigger a new plan.
- Executing `terraform apply` via Comments: An authorized user comments `atlantis apply` (optionally with flags like `-p project-name` or `-d dir -w workspace`) to apply the approved plan. Atlantis posts the `apply` output.
- State Locking: Atlantis locks a project (directory and workspace) when a plan runs and holds that lock until the PR is merged or closed, or until the lock is released manually. This prevents conflicting changes from other PRs and complements Terraform's backend locking.
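The locking step above can be pictured with a small in-memory model. This is only a sketch of the idea of a lock keyed by repository, directory, and workspace and owned by a PR; the class `ProjectLocks` and its methods are hypothetical and say nothing about how Atlantis actually stores its locks.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

LockKey = Tuple[str, str, str]  # (repo, dir, workspace)


@dataclass
class ProjectLocks:
    """In-memory stand-in for a lock table keyed by project and workspace."""
    held_by: Dict[LockKey, int] = field(default_factory=dict)

    def try_lock(self, repo: str, directory: str, workspace: str, pr: int) -> bool:
        """Take the lock for a PR, or report whether that PR already holds it."""
        owner = self.held_by.setdefault((repo, directory, workspace), pr)
        return owner == pr

    def unlock_pr(self, pr: int) -> int:
        """Release every lock held by a PR; return how many were released."""
        keys = [key for key, owner in self.held_by.items() if owner == pr]
        for key in keys:
            del self.held_by[key]
        return len(keys)


if __name__ == "__main__":
    locks = ProjectLocks()
    print(locks.try_lock("your-org/infra", "infra/staging", "staging", pr=1))  # True
    print(locks.try_lock("your-org/infra", "infra/staging", "staging", pr=2))  # False
    locks.unlock_pr(1)
    print(locks.try_lock("your-org/infra", "infra/staging", "staging", pr=2))  # True
```

Because the key includes the workspace, two PRs can work on different workspaces of the same directory without blocking each other in this model.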
Essential PR Commands:
- `atlantis plan [-d dir] [-w workspace] [-p project_name] [-- <tf_flags>]`: Manually trigger a plan.
- `atlantis apply [-d dir] [-w workspace] [-p project_name]`: Apply a plan.
- `atlantis unlock`: Discard the PR's plans and release its locks (useful for clearing a stuck lock).
- `atlantis help`: Show available commands.
This PR-centric flow is a significant strength, keeping infrastructure operations tied to version control and review processes.
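The comment syntax listed above can be read as a tiny command language. The Python sketch below parses only the forms shown in this guide (`-d`, `-w`, `-p`, and pass-through flags after `--`); the names `AtlantisCommand` and `parse_comment` are hypothetical, and the real server accepts more options than this simplified model.

```python
import shlex
from dataclasses import dataclass, field
from typing import List, Optional

COMMANDS = {"plan", "apply", "unlock", "help"}
FLAGS = {"-d": "dir", "-w": "workspace", "-p": "project"}


@dataclass
class AtlantisCommand:
    name: str
    dir: Optional[str] = None
    workspace: Optional[str] = None
    project: Optional[str] = None
    tf_flags: List[str] = field(default_factory=list)


def parse_comment(comment: str) -> Optional[AtlantisCommand]:
    """Parse a PR comment; return None if it is not an Atlantis command."""
    text = comment.strip()
    if not text.startswith("atlantis "):
        return None  # ordinary review comments are ignored
    tokens = shlex.split(text)
    if len(tokens) < 2 or tokens[1] not in COMMANDS:
        return None
    cmd = AtlantisCommand(name=tokens[1])
    rest = tokens[2:]
    i = 0
    while i < len(rest):
        tok = rest[i]
        if tok == "--":
            # Everything after a bare "--" is passed through to Terraform.
            cmd.tf_flags = rest[i + 1:]
            break
        if tok in FLAGS and i + 1 < len(rest):
            setattr(cmd, FLAGS[tok], rest[i + 1])
            i += 2
        else:
            raise ValueError(f"unexpected token: {tok!r}")
    return cmd


if __name__ == "__main__":
    print(parse_comment("atlantis plan -d infra/staging -w staging -- -target=aws_s3_bucket.logs"))
```

Keeping Terraform flags behind `--` means the comment parser never has to understand them; it only forwards them.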
8. Navigating Initial Hurdles: Common Atlantis Troubleshooting
As with any self-hosted system, initial setup can present challenges:
- Webhook Issues: Incorrect Payload URL (especially the `/events` suffix), mismatched webhook secrets, or network connectivity blocking VCS calls to Atlantis. VCS webhook delivery logs are the first place to check.
- Authentication/Permission Errors: Invalid or expired VCS tokens/GitHub App credentials, insufficient scopes/permissions for the token/App, or the target repository not being in Atlantis's `--repo-allowlist`.
- Plan/Apply Failures: These are often Terraform-related (code errors, incorrect provider credentials on the Atlantis server, state lock issues) rather than Atlantis issues per se. Atlantis server logs (ideally at `debug` level) become crucial here.
- `atlantis.yaml` Misconfigurations: YAML syntax errors, incorrect `dir` paths (they are relative to repo root), or `when_modified` patterns not matching files (patterns are relative to the project `dir`). Restricted features like custom workflows or `apply_requirements` might be disabled by server-side policy.
Troubleshooting often involves checking configurations across multiple systems: the VCS, the Atlantis server, network firewalls, and the Terraform code itself.
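Several of the webhook mistakes listed above can be caught with a quick preflight check before digging into logs. The sketch below is a hypothetical helper (`check_webhook` is not part of Atlantis or of any GitHub tooling) that takes the webhook settings as plain values and reports the problems this guide calls out. It assumes events are given by their GitHub API names and accepts the content type either as `application/json` or as `json`, the form GitHub's REST API uses.

```python
from typing import List
from urllib.parse import urlparse

# GitHub API names for "Pull requests", "Issue comments", "Pushes",
# and "Pull request reviews".
REQUIRED_EVENTS = {"pull_request", "issue_comment", "push", "pull_request_review"}


def check_webhook(url: str, content_type: str, events: List[str],
                  has_secret: bool) -> List[str]:
    """Return a list of human-readable problems; empty means nothing obvious."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("payload URL is not an absolute http(s) URL")
    elif not parsed.path.endswith("/events"):
        problems.append("payload URL does not end in /events")
    if content_type not in ("application/json", "json"):
        problems.append("content type is not application/json")
    missing = sorted(REQUIRED_EVENTS - set(events))
    if missing:
        problems.append("missing events: " + ", ".join(missing))
    if not has_secret:
        problems.append("no webhook secret configured")
    return problems


if __name__ == "__main__":
    for problem in check_webhook("https://atlantis.yourdomain.com", "form", ["push"], False):
        print(problem)
```

A check like this only covers configuration values; network reachability and secret mismatches still need the VCS delivery logs and the server logs.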
9. Atlantis at a Glance: Summary
Feature | Description | Key Consideration / Management Aspect |
---|---|---|
Core Function | Terraform PR Automation | Open-source, self-hosted |
Workflow Trigger | VCS Webhooks (PR events, comments) | Requires careful webhook setup & public accessibility |
`terraform plan` | Automated on PR, output as comment | |
`terraform apply` | Triggered by PR comment (`atlantis apply`) | Ensures intentional application after review |
State Locking | PR-level project locking | Complements Terraform backend locking |
Configuration | Server flags/env vars; repo-level `atlantis.yaml` | User manages all configuration layers |
Deployment | Docker, Kubernetes, binary | User responsible for provisioning & server maintenance |
VCS Authentication | PAT or GitHub App | Secure credential management is crucial |
Cloud Credentials | Relies on server's environment (IAM roles, env vars) | User responsible for secure credential provisioning to host |
Customization | Custom workflows, apply requirements (may be server-restricted) | Balances flexibility with potential complexity |
Scalability | Depends on server resources & deployment strategy (e.g., K8s) | User manages scaling aspects |
Support | Community-driven (GitHub issues, Slack) | No dedicated enterprise support |
10. Conclusion: Weighing Control vs. Operational Overhead
Atlantis offers a powerful, community-supported solution for teams looking to automate their Terraform workflows within the familiar confines of their version control system. Its PR-centric approach enhances collaboration, consistency, and auditability for infrastructure changes. The level of control afforded by its self-hosted nature and extensive configuration options is a significant draw for many.
However, this control comes with the inherent operational responsibilities of deploying, managing, securing, and troubleshooting a critical piece of infrastructure automation tooling. Teams adopting Atlantis should be prepared for the initial setup effort and ongoing maintenance. For organizations that value deep control and have the resources to manage such a system, Atlantis is a very capable choice. For those who might prioritize a more managed experience, reduced setup and operational burden, or integrated enterprise features like advanced policy enforcement, role-based access control, and dedicated support, exploring specialized commercial IaC platforms could present a compelling alternative to the self-managed path. The decision often hinges on balancing the desire for granular control with the total cost of ownership and operational capacity.