What Is a Terraform TACO?

Learn what a Terraform TACO is, who the main players are, and what to look for when evaluating a TACO.

For many, Terraform, and now OpenTofu, has become the de facto standard for defining infrastructure as code (IaC). However, as organizations scale, the challenges of collaboration, governance, and automation around Terraform can become significant. This is where the concept of "Terraform TACO" comes into play.

What is a TACO?

TACO stands for Terraform Automation and COllaboration Software. It refers to a category of tools specifically designed to enhance and streamline Terraform workflows, addressing the complexities that arise when multiple teams and environments are involved. Think of it as a specialized CI/CD pipeline built from the ground up for infrastructure as code, providing a comprehensive platform for managing your Terraform deployments.

Why Are They Beneficial?

Without a TACO solution, managing Terraform at scale can quickly devolve into a chaotic mess. Teams might struggle with:

  • State Management: Securely storing, locking, and versioning Terraform state files across distributed teams is a major hurdle. Manual intervention can lead to conflicts and inconsistencies.
  • Collaboration: Coordinating changes, reviewing plans, and approving deployments across different engineers and teams becomes cumbersome.
  • Governance and Security: Enforcing consistent policies, ensuring compliance, and preventing misconfigurations manually is nearly impossible.
  • Self-Service: Empowering developers to provision infrastructure safely and efficiently, without becoming Terraform experts, is a common pain point.
  • Visibility and Auditing: Tracking who did what, when, and why across all infrastructure changes is crucial for accountability and debugging.

TACO solutions directly address these challenges by providing a centralized platform that brings automation, collaboration, and governance to your Terraform operations.

Core Features of a Terraform TACO

A robust Terraform TACO solution typically offers a suite of features that transform how organizations manage their IaC:

  • Remote State Management: Securely stores Terraform state files, providing locking mechanisms to prevent concurrent modifications and maintaining a history of changes.
  • Version Control System (VCS) Integration: Tightly integrates with popular VCS platforms like GitHub, GitLab, and Bitbucket, enabling GitOps-driven workflows where infrastructure changes are triggered by pull requests.
  • Role-Based Access Control (RBAC): Granular control over who can perform which actions (e.g., plan, apply) on specific environments or projects, ensuring adherence to the principle of least privilege.
  • Policy as Code: Allows organizations to define and enforce policies using engines like Open Policy Agent (OPA), whose policies are written in the Rego language, or proprietary engines. These policies can block non-compliant deployments, enforce security standards, and help manage costs.
  • Automated Runs: Automates Terraform plan and apply operations, often triggered by VCS events or on a schedule, reducing manual errors and speeding up deployments.
  • Drift Detection: Continuously monitors your deployed infrastructure against your Terraform code, alerting you to any discrepancies and even offering automated or guided remediation options.
  • Private Module Registry: Provides a centralized repository for reusable Terraform modules, promoting standardization, reducing duplication, and enabling self-service for developers.
  • Cost Estimation: Integrates with cost analysis tools (e.g., Infracost) to provide cost estimates during the planning phase, helping teams make informed decisions and manage budgets.
  • Audit Trails and Observability: Offers comprehensive logs and dashboards to track all Terraform activities, providing visibility into infrastructure changes and aiding in troubleshooting.
  • Custom Hooks and Integrations: Allows for the injection of custom scripts or integration with external tools (e.g., Slack, ITSM) at various stages of the Terraform workflow.
  • Self-Hosted Agents: Lets users execute runs on their own infrastructure when networking or security requirements call for it.
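
To make the policy-as-code idea more concrete, here is a minimal sketch in Python. A TACO would typically evaluate policies with an engine such as OPA; the Python function below is only a stand-in for that step. It assumes the plan has been exported as JSON (for example with `terraform show -json`), which contains a `resource_changes` list. The rule itself (never delete an `aws_s3_bucket`) and the names `PROTECTED_TYPES` and `find_violations` are hypothetical and chosen only for illustration.

```python
import json

# Hypothetical policy: resources of these types must never be destroyed.
PROTECTED_TYPES = {"aws_s3_bucket"}


def find_violations(plan: dict) -> list[str]:
    """Return addresses of protected resources that the plan would delete."""
    violations = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if rc.get("type") in PROTECTED_TYPES and "delete" in actions:
            violations.append(rc["address"])
    return violations


# A trimmed, hand-written stand-in for real plan JSON (assumption:
# only the fields used by the check are included).
sample_plan = json.loads("""
{
  "resource_changes": [
    {"address": "aws_s3_bucket.logs", "type": "aws_s3_bucket",
     "change": {"actions": ["delete"]}},
    {"address": "aws_instance.web", "type": "aws_instance",
     "change": {"actions": ["update"]}}
  ]
}
""")

violations = find_violations(sample_plan)
if violations:
    # A pipeline would fail the run here instead of proceeding to apply.
    print("Policy check failed:", ", ".join(violations))
```

The point of the sketch is the shape of the workflow: the plan is produced first, the policy inspects it as data, and the apply step only proceeds when no violations are returned.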

Terraform TACOs vs. DIY

When it comes to automating Terraform workflows, many organizations initially lean towards a DIY approach, leveraging general-purpose CI/CD tools like GitHub Actions, GitLab CI/CD, Jenkins, or similar. While these tools are powerful and offer a decent starting point for basic automation, it's important to understand their limitations when compared to dedicated TACO solutions.

The DIY Approach: Pros and Cons

Pros:

  • Familiarity: Teams already using these CI/CD platforms for application deployments might find it easy to adapt them for Terraform.
  • Cost-Effective (initially): Utilizing existing licenses or open-source solutions can seem cheaper upfront.
  • Flexibility: General-purpose CI/CD tools offer immense flexibility to script almost any workflow.

Cons:

  • Maintenance Overhead: Building and maintaining a robust Terraform automation pipeline from scratch requires significant effort. You're responsible for managing state locking, credential rotation, policy enforcement, drift detection, and audit logging – all of which become complex at scale.
  • Lack of Terraform-Specific Features: These tools are not inherently "aware" of Terraform's nuances. Features like visual plan reviews, cost estimation, private module registries, and a clear audit trail of infrastructure changes are either absent or require extensive custom development.
  • Security Concerns: Managing sensitive cloud credentials and ensuring secure execution environments within a general-purpose CI/CD can be challenging and error-prone without specialized features.
  • Scalability Challenges: As the number of Terraform configurations, environments, and teams grows, a DIY setup often struggles with performance, state management conflicts, and a lack of centralized governance.
  • Feature Richness Deficit: DIY solutions often fall short in providing the out-of-the-box, infrastructure-centric features that TACOs excel at, such as:
    • Built-in Policy Enforcement: Applying and enforcing governance policies across all Terraform runs is cumbersome.
    • Drift Detection and Remediation: Automatically identifying and addressing configuration drift is rarely built-in.
    • Self-Service: Empowering developers to provision predefined Terraform modules securely is difficult to implement.
  • Cognitive Load: Engineers need to understand not only Terraform but also the intricacies of the CI/CD platform and the custom scripts built around it, increasing cognitive load and potential for errors.
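
As a rough illustration of what "building it yourself" involves, the sketch below shows one way a DIY pipeline might check for pending changes on a schedule. It relies on the `-detailed-exitcode` flag of `terraform plan`, which exits with 0 when the plan is empty, 1 on error, and 2 when changes are present. The function names and status strings are hypothetical, and the sketch assumes the `terraform` binary is on the PATH and the working directory is already initialized.

```python
import subprocess


def interpret_plan_exit_code(code: int) -> str:
    """Map the exit code of `terraform plan -detailed-exitcode` to a status."""
    if code == 0:
        return "in_sync"          # plan succeeded, no changes
    if code == 2:
        return "changes_pending"  # plan succeeded, changes are present
    return "error"                # 1 (or anything unexpected)


def check_for_changes(workdir: str) -> str:
    """Run a plan in `workdir` and report whether changes are pending.

    When the code in `workdir` has already been applied, pending changes
    suggest that the real infrastructure has drifted.
    """
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return interpret_plan_exit_code(result.returncode)
```

Even this small piece leaves the scheduling, credentials, alerting, and remediation to be built and maintained separately, which is the overhead described in the list above.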

The TACO Advantage

Dedicated TACO solutions overcome these drawbacks by providing a purpose-built platform for Terraform automation and collaboration. They abstract away the complexities of managing state, applying policies, and orchestrating runs, allowing teams to focus on defining infrastructure, not managing the tools to deploy it.

While DIY solutions can provide basic automation, TACOs offer a complete, enterprise-grade solution that significantly reduces operational overhead, enhances security and governance, and accelerates the delivery of infrastructure as code. They provide the "missing layer" that elevates Terraform from a powerful CLI tool to a scalable and governable enterprise solution.

Major Players in the Space

Several key players dominate the Terraform TACO landscape, each with its own strengths and nuances:

  • Scalr: Commercial. Paid plans and a free tier (the entire product is included, but limited to 50 runs per month).
  • Terraform Cloud/Enterprise (HashiCorp): Commercial. Paid plans and a $500 trial credit.
  • Spacelift: Commercial. Paid plans and a free tier (limited to 2 users and 1 API key).
  • Env0: Commercial. Paid plans and a free trial (their website provides no further details about the trial).
  • Atlantis: An open-source tool that pioneered the concept of automating Terraform via pull requests, providing plan and apply functionality directly within PRs.

Checklist When Selecting a TACO

  • How much concurrency is included on the paid plan, and does it cost extra for more?
  • How many self-hosted agents are included on the paid plan, and does it cost extra for more?
  • Where is state stored? Can you store it in your own bucket if needed?
  • How granular is the RBAC? Can you create custom roles, or are you forced into "system" roles?
  • Is there a hierarchical model in place so that objects like credentials, variables, and more can be applied to lower-level workspaces?
  • Can you use the Terraform or Tofu CLI by setting the TACO as the remote backend?
  • Is the TACO focused on Terraform/OpenTofu, or is it a one-size-fits-all CI/CD?
  • How easy are the integrations? Is it as simple as dropping in an API key to integrate with Datadog or Slack, or do you have to deal with a "plug-in"?
  • Is there reporting on the usage of modules, providers, versions, and so on to help manage operations at scale?
  • For GitOps workflows, does it post back the plan details to the pull request?
  • Will the TACO satisfy the various workflows your customers use, such as GitOps, the Terraform CLI, and No Code?
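
For the checklist item about posting plan details back to the pull request, the following sketch shows the kind of summary a tool might generate. It assumes the plan is available as JSON with a `resource_changes` list, where each entry carries a `change.actions` array such as `["create"]`, `["update"]`, `["delete"]`, or a delete/create pair for a replacement. The function name `summarize_plan` and the exact wording of the summary are hypothetical; how a given TACO formats its pull request comments will differ.

```python
def summarize_plan(plan: dict) -> str:
    """Build a one-line summary of a plan, suitable for a PR comment."""
    add = change = destroy = 0
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        # A replacement lists both actions, so it counts as an add and a destroy.
        if "create" in actions:
            add += 1
        if "delete" in actions:
            destroy += 1
        if "update" in actions:
            change += 1
    return f"Plan: {add} to add, {change} to change, {destroy} to destroy."


# Hand-written sample data (assumption: only the fields used are included).
sample_plan = {
    "resource_changes": [
        {"address": "aws_instance.web", "change": {"actions": ["create"]}},
        {"address": "aws_instance.api", "change": {"actions": ["update"]}},
        {"address": "aws_instance.db", "change": {"actions": ["delete", "create"]}},
        {"address": "aws_vpc.main", "change": {"actions": ["no-op"]}},
    ]
}

print(summarize_plan(sample_plan))
```

When evaluating a TACO, it is worth checking whether the posted comment includes only a summary like this or the full resource-level detail.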

Why Scalr is a Leader

While each TACO solution has its merits, Scalr stands out as a leader in the space for several compelling reasons.

Scalr's unique value proposition lies in its ability to centralize administration while decentralizing operations. This allows platform teams to maintain strong control and governance, while empowering development teams with the autonomy to provision infrastructure efficiently.

Key factors contributing to Scalr's leadership include:

  • Usage-Based Pricing Model: Scalr offers a transparent pricing model based on "runs" (Terraform/OpenTofu apply or dry run), with no hidden costs for support, new features, or increasing most quotas. This allows for predictable billing and scalability.
  • Flexible Workflows: GitOps, CLI, No Code: Scalr accommodates diverse operational models, supporting native Terraform/OpenTofu CLI workflows, GitOps-driven infrastructure management, and "No Code" provisioning for non-technical users via a module catalog.
  • Extremely Granular RBAC: Scalr provides robust Role-Based Access Control (RBAC) that allows organizations to define granular permissions for users and teams across different environments and workspaces, crucial for enterprise governance.
  • Inheritance Model for All Objects: Scalr's hierarchical structure (organizations, environments, workspaces) enables object inheritance. Admins can create and assign objects like variables, credentials, and modules at higher scopes, which are then inherited by lower scopes, simplifying management and ensuring consistency.
  • State Storage Options: Store in Scalr or Bring Your Own Bucket: Users have the flexibility to store Terraform/OpenTofu state files in Scalr's managed storage or in their own cloud storage buckets (AWS S3, Google Cloud Storage, Azure Storage). This caters to data residency and corporate policy requirements.
  • Increased Concurrency and Agents Quotas for Free: Scalr offers generous concurrency limits, with the ability to increase them for free by opening a support ticket or by deploying self-hosted agents, which add additional concurrent runs.
  • Centralized Credential Management with OIDC: Scalr supports OpenID Connect (OIDC) authentication for providers like AWS, Azure, and GCP. This centralizes credential management and enhances security.
  • Best-in-Class Integrations: Scalr integrates with a variety of popular tools for enhanced functionality, such as monitoring and observability, including Datadog, AWS EventBridge, Slack, Teams, and more.
  • Top Security Tools with OPA and Checkov: Scalr incorporates robust security and policy enforcement:
    • Open Policy Agent (OPA): For policy-as-code enforcement, allowing organizations to define and apply custom policies.
    • Checkov: For static code analysis to scan Terraform deployments for vulnerabilities and compliance violations before runs are executed.
  • Best-in-Class Reporting: Scalr provides comprehensive reporting and dashboards for:
    • Centralized Operations Dashboard: Scalr offers a unified dashboard that shows all Terraform/OpenTofu operations across your organization, with filtering capabilities for quick issue identification.
    • Terraform & OpenTofu Providers Report: This report, available at the account or environment scope, shows which providers and their versions are being used across all workspaces. This is crucial for standardizing provider usage and ensuring all workspaces are pulling from approved sources.
    • Terraform & OpenTofu Modules Report: This report displays the modules and their versions from remote sources (module registry or Git) being used across all workspaces in the account.
    • Terraform & OpenTofu Resources Report: This centralized report, located at the account scope, aggregates all resources managed by Terraform/OpenTofu across all state files into a single dashboard.
    • Drift Detection Reports: Automated scanning to identify discrepancies between actual infrastructure state and Terraform code, with detailed reports and remediation options.
    • Stale Workspace Reports: Reports to identify workspaces that haven't had a run executed recently, helping to pinpoint potential drift or unmanaged infrastructure.
    • Security Reports: Via OPA and Checkov policy reports.
  • Compatibility with Terragrunt and Ability to Use the Atlantis Workflow: Scalr supports Terragrunt for managing complex Terraform configurations and provides an "Atlantis-style" workflow, offering a familiar experience for users accustomed to pull-request-driven infrastructure deployments.
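
The inheritance model mentioned in the list above can be pictured with a small sketch. This is not Scalr's implementation; it only illustrates the general idea of resolving a workspace's effective variables from several scopes. The scope names, the variable values, and the rule that a more specific scope overrides a broader one are all assumptions made for the example.

```python
from collections import ChainMap

# Hypothetical variables defined at three scopes, broadest to most specific.
account_vars = {"region": "us-east-1", "owner": "platform-team"}
environment_vars = {"region": "eu-west-1"}
workspace_vars = {"instance_type": "t3.small"}


def effective_variables(workspace: dict, environment: dict, account: dict) -> dict:
    """Merge scopes so that the most specific definition of a key wins."""
    # ChainMap looks keys up left to right, so the workspace is checked
    # first, then the environment, then the account.
    return dict(ChainMap(workspace, environment, account))


resolved = effective_variables(workspace_vars, environment_vars, account_vars)
print(resolved)
```

In this sketch the workspace ends up with `region` from the environment, `owner` from the account, and its own `instance_type`, so an admin only has to define shared values once at the higher scope.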

Scalr empowers organizations to scale their IaC adoption with confidence, ensuring that infrastructure provisioning is secure, compliant, cost-effective, and highly collaborative.