Blog Series: Enforcing Policy as Code in Terraform (Part 3 of 5)

Part 3 shows how to author and enforce policy-as-code in Terraform with OPA, automating security and compliance checks in every deployment.

Open Policy Agent (OPA) and conftest for Terraform Validation

Welcome back! In Part 1, we laid the groundwork for Policy as Code (PaC) and why it's indispensable for modern infrastructure. In Part 2, we explored Terraform's native validation features and acknowledged their limits, leading us to the need for dedicated PaC tools.

Today, we're taking a deep dive into one of the most powerful and versatile tools in the PaC arsenal: Open Policy Agent (OPA). We'll also see how conftest, a handy utility, helps us use OPA to validate our Terraform plans.

What is Open Policy Agent (OPA)?

Open Policy Agent (often referred to as OPA, pronounced "oh-pa") is an open-source, general-purpose policy engine. It is a graduated project of the Cloud Native Computing Foundation (CNCF), which is a testament to its maturity and widespread adoption.

The core idea behind OPA is to decouple policy decision-making from policy enforcement.

  • Decision-making: OPA takes your policies (written in a language called Rego) and some structured data (like a JSON file) as input. It then evaluates this data against your policies and produces a decision (e.g., "allow," "deny," or a set of violations).
  • Enforcement: Your application or system (in our case, our Terraform workflow) takes OPA's decision and acts upon it. For example, if OPA denies a Terraform plan, your CI/CD pipeline could halt the deployment.

This decoupling is powerful because it allows you to use a single engine (OPA) and language (Rego) to enforce policies across a diverse range of technologies – Kubernetes, microservices, CI/CD pipelines, and, of course, Terraform.

Rego: The Language of OPA

Policies in OPA are written in a high-level declarative language called Rego. If you're new to it, Rego might seem a bit different at first, but it's specifically designed for querying complex, hierarchical data structures (like JSON) and expressing policies over them.

Here's a tiny conceptual taste of Rego:

# This is a comment in Rego
package main # Policies are organized in packages

import rego.v1 # Makes 'if' and 'contains' available on OPA versions before 1.0

# By default, access is denied
default allow := false

# Allow access if the user is an admin
allow if {
    input.user.role == "admin"
}

# Deny if the request is coming from a blocked IP
deny contains reason if { # 'deny' is a set of reasons
    input.request.ip == "1.2.3.4"
    reason := "Request from a blocked IP address."
}

In Rego:

  • You define rules (like allow or deny).
  • Rules can have conditions ({ ... }). If all conditions inside the curly braces are true, the rule's head is true (or assigned a value).
  • The input document is the JSON data OPA evaluates.
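
For readers more at home in imperative code, the toy policy above can be approximated in Python. This is only a rough analogy, not how OPA evaluates policies internally: the function names allow and deny simply mirror the Rego rules, and the chained dict.get lookups stand in for Rego's habit of treating a missing path as undefined instead of raising an error.

```python
# A rough Python analogy for the toy Rego policy; not OPA's actual implementation.


def allow(input_doc: dict) -> bool:
    """Mirror of the default-false 'allow' rule plus the admin condition."""
    # A missing 'user' or 'role' key simply fails the condition,
    # much like an undefined path in Rego.
    return input_doc.get("user", {}).get("role") == "admin"


def deny(input_doc: dict) -> set:
    """Mirror of the 'deny' rule: a set of reasons (empty means no violations)."""
    reasons = set()
    if input_doc.get("request", {}).get("ip") == "1.2.3.4":
        reasons.add("Request from a blocked IP address.")
    return reasons


if __name__ == "__main__":
    sample = {"user": {"role": "admin"}, "request": {"ip": "1.2.3.4"}}
    print(allow(sample))  # True
    print(deny(sample))   # {'Request from a blocked IP address.'}
```

Note that allow and deny are independent here, just as in the Rego version: the sample input is both allowed and denied, and it is up to the enforcement side to decide which result wins.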

We'll see a more concrete Terraform-related example shortly.

OPA and Terraform: The Workflow

So, how do we use OPA to validate our Terraform configurations? The typical workflow looks like this:

  1. Generate a Terraform Plan: First, you create a Terraform execution plan as usual, but you save it to a binary file:

terraform plan -out=tfplan.binary

  2. Convert the Plan to JSON: OPA works with JSON data. Terraform can convert its binary plan file into a structured JSON format:

terraform show -json tfplan.binary > tfplan.json

This tfplan.json file contains a detailed description of all the changes Terraform intends to make – resources to be created, updated, or deleted, along with their proposed attributes.

  3. Evaluate with OPA: You then feed this tfplan.json file (as the input) to OPA, along with your Rego policies. OPA evaluates the plan against the policies and outputs a decision.
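
To give a feel for what a policy sees in tfplan.json, here is a heavily trimmed, hand-written stand-in for such a file, loaded in Python. The field names (resource_changes, address, mode, type, change.actions, change.after) follow Terraform's JSON plan representation. The bucket name and attribute values are made up for illustration, the summarize helper is hypothetical, and a real plan contains many more fields.

```python
import json

# Hand-written, heavily trimmed stand-in for a real tfplan.json.
# Resource names and attribute values are hypothetical.
SAMPLE_PLAN = json.loads("""
{
  "format_version": "1.2",
  "resource_changes": [
    {
      "address": "aws_s3_bucket.logs",
      "mode": "managed",
      "type": "aws_s3_bucket",
      "name": "logs",
      "change": {
        "actions": ["create"],
        "before": null,
        "after": {"bucket": "example-logs", "versioning": [{"enabled": true}]}
      }
    }
  ]
}
""")


def summarize(plan: dict) -> list:
    """Return one 'address: actions' line per entry in resource_changes."""
    return [
        f"{rc['address']}: {','.join(rc['change']['actions'])}"
        for rc in plan.get("resource_changes", [])
    ]


if __name__ == "__main__":
    for line in summarize(SAMPLE_PLAN):
        print(line)  # aws_s3_bucket.logs: create
```

Policies typically walk the resource_changes array in the same way, filtering on type and actions and then inspecting the proposed attributes under change.after.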

Writing Your First Terraform Policy with Rego

Let's write a simple Rego policy. Imagine we want to ensure that every AWS S3 bucket we create has versioning enabled.

Create a file named s3_versioning.rego with the following content:

package terraform.aws.s3_versioning

import rego.v1 # Makes 'if', 'contains', and 'in' available on OPA versions before 1.0

# 'deny' is a common convention for rules that list violations.
# Each rule below adds a message to the 'deny' set when it finds a problem.
# If 'deny' is empty after evaluation, the policy passes.

# Violation: an S3 bucket is being created or updated
# with no versioning block at all.
deny contains msg if {
    # Iterate over all resource changes in the Terraform plan
    resource := input.resource_changes[_]

    # We only care about aws_s3_bucket resources
    resource.type == "aws_s3_bucket"

    # We only care about resources being managed by Terraform
    resource.mode == "managed"

    # We are interested in resources being created or updated
    # (For "delete", versioning status is less relevant for this policy)
    action := resource.change.actions[_]
    action in {"create", "update"}

    # The 'after' state holds the proposed attributes.
    # The expected structure is: resource.change.after.versioning[0].enabled
    # 'not' succeeds when the path is undefined (or false), so this
    # matches buckets whose 'versioning' attribute is missing.
    not resource.change.after.versioning
    msg := sprintf("S3 bucket '%s' must have versioning enabled.", [resource.address])
}

# Violation: the versioning block exists but 'enabled' is not true.
deny contains msg if {
    resource := input.resource_changes[_]
    resource.type == "aws_s3_bucket"
    resource.mode == "managed"
    action := resource.change.actions[_]
    action in {"create", "update"}

    # The versioning block is present...
    resource.change.after.versioning
    # ...but versioning[0].enabled is false or undefined
    not resource.change.after.versioning[0].enabled
    msg := sprintf("S3 bucket '%s' has versioning block but it is not enabled.", [resource.address])
}

Let's break this down:

  • package terraform.aws.s3_versioning: Organizes the policy.
  • import rego.v1: Makes the if, contains, and in keywords available on OPA versions before 1.0. On OPA 1.0 and later they are available by default.
  • deny contains msg if { ... }: This defines a rule named deny whose value is a set. If the conditions inside the curly braces are met, the message msg is added to that set. If the deny set is empty after evaluation, the policy passes.
  • resource := input.resource_changes[_]: This iterates through each item in the resource_changes array found in the tfplan.json (which is OPA's input). _ is a Rego convention for an iterator variable you don't need to refer to directly.
  • resource.type == "aws_s3_bucket": Filters for S3 buckets.
  • resource.mode == "managed": Ensures we're looking at resources managed by Terraform.
  • action := resource.change.actions[_]; action in {"create", "update"}: Checks whether the action is "create" or "update".
  • The conditions not resource.change.after.versioning and not resource.change.after.versioning[0].enabled check if the versioning block is missing or if it's present but not enabled in the configuration after the change.
  • msg := sprintf(...): Creates the violation message.

This is a basic example. Rego can express much more complex logic!
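
If the Rego control flow is hard to picture, the same check can be approximated in plain Python. The sketch below is an analogy, not a replacement for the policy: the function name s3_versioning_violations, the _bucket helper, and the three sample buckets are all hypothetical, and it assumes both "create" and "update" actions should be checked. It also glosses over some Rego edge cases, such as the exact treatment of null values.

```python
# Python approximation of the S3 versioning policy; sample data is hypothetical.


def s3_versioning_violations(plan: dict, actions=("create", "update")) -> list:
    """Return one message per managed S3 bucket without versioning enabled."""
    messages = []
    for rc in plan.get("resource_changes", []):
        # Same filters as the Rego rules: resource type, mode, and action.
        if rc.get("type") != "aws_s3_bucket" or rc.get("mode") != "managed":
            continue
        change = rc.get("change") or {}
        if not set(change.get("actions", [])) & set(actions):
            continue
        after = change.get("after") or {}
        address = rc.get("address", "<unknown>")
        # Case 1: no versioning attribute at all.
        if "versioning" not in after:
            messages.append(f"S3 bucket '{address}' must have versioning enabled.")
            continue
        # Case 2: versioning attribute present, but the first block is not enabled.
        blocks = after["versioning"] or []
        if not (blocks and blocks[0].get("enabled")):
            messages.append(
                f"S3 bucket '{address}' has versioning block but it is not enabled."
            )
    return messages


def _bucket(name: str, after: dict) -> dict:
    """Build a minimal resource_changes entry for a bucket being created."""
    return {
        "address": f"aws_s3_bucket.{name}",
        "mode": "managed",
        "type": "aws_s3_bucket",
        "change": {"actions": ["create"], "after": after},
    }


EXAMPLE_PLAN = {
    "resource_changes": [
        _bucket("ok", {"versioning": [{"enabled": True}]}),
        _bucket("missing", {}),
        _bucket("disabled", {"versioning": [{"enabled": False}]}),
    ]
}

if __name__ == "__main__":
    for message in s3_versioning_violations(EXAMPLE_PLAN):
        print(message)
```

Running it against the sample plan reports the "missing" and "disabled" buckets and stays silent about "ok", which is the behavior the two deny rules are meant to produce.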

conftest: Your Local Policy Testing Tool

While OPA is the engine, conftest is a fantastic utility that makes it easy to test structured configuration data (like our tfplan.json) against OPA policies. It's a CLI tool that bundles OPA.

Installation: You can download conftest from its GitHub releases page or install it using package managers like Homebrew.

Usage: Assuming you have:

  1. Your Terraform plan JSON: tfplan.json
  2. Your Rego policy file: s3_versioning.rego (let's say it's in a directory called ./policy)

You can test your plan with conftest like this:

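conftest test --policy ./policy/ --namespace terraform.aws.s3_versioning tfplan.json

  • --policy ./policy/: Tells conftest where to find your Rego policy files.
  • --namespace terraform.aws.s3_versioning: Tells conftest which package to evaluate. By default conftest only looks at rules in the main package, so a policy in any other package is silently skipped unless you name it here (or pass --all-namespaces).
  • tfplan.json: The input file to test.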

Output:

  • If the policy passes (e.g., all S3 buckets have versioning): conftest will typically exit with a status code 0 and might show something like 1 test, 1 passed, 0 warnings, 0 failures.
  • If the policy fails (e.g., an S3 bucket is missing versioning): conftest will exit with a non-zero status code and print the violation messages generated by your deny rule(s):

FAIL - tfplan.json - terraform.aws.s3_versioning - S3 bucket 'aws_s3_bucket.my_bucket_without_versioning' must have versioning enabled.

This immediate feedback is invaluable for developers and for integrating PaC into CI/CD pipelines.
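
In a pipeline it is often handier to consume the results programmatically than to scrape the text output. conftest can emit JSON via its --output json option. The Python sketch below assumes that output is a list of per-file objects, each with a filename and a failures list whose entries carry a msg field. Treat that shape as an assumption and check it against the conftest version you run. The collect_failures helper and the sample payload are illustrative only.

```python
import json

# Assumed shape of `conftest test --output json`; verify against your version.
SAMPLE_OUTPUT = """
[
  {
    "filename": "tfplan.json",
    "namespace": "terraform.aws.s3_versioning",
    "successes": 1,
    "failures": [
      {"msg": "S3 bucket 'aws_s3_bucket.my_bucket_without_versioning' must have versioning enabled."}
    ]
  }
]
"""


def collect_failures(conftest_json: str) -> list:
    """Flatten conftest's JSON results into 'filename: message' lines."""
    lines = []
    for result in json.loads(conftest_json):
        # 'failures' may be absent or null when a file has no violations.
        for failure in result.get("failures") or []:
            lines.append(f"{result.get('filename', '?')}: {failure.get('msg', '')}")
    return lines


if __name__ == "__main__":
    for line in collect_failures(SAMPLE_OUTPUT):
        print(line)
```

A CI job could post these lines as a pull request comment, while still relying on conftest's exit code to pass or fail the build.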

Benefits of Using OPA with Terraform

  • Expressive & Flexible: Rego is a powerful language for defining fine-grained policies.
  • Decoupled & Centralized: Policy logic lives separately from your Terraform HCL code, promoting cleaner code and allowing policies to be managed centrally (e.g., in their own Git repo).
  • General Purpose: Learn Rego once, and you can apply it to policy enforcement across many parts of your stack (Kubernetes, APIs, etc.).
  • Testable: OPA and conftest provide excellent support for testing policies.

Considerations

  • Rego Learning Curve: While powerful, Rego has a learning curve, especially if you're new to declarative languages or logic programming concepts.
  • Workflow Setup: You need to script the terraform plan -> terraform show -> conftest test sequence, especially for CI/CD integration.
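
As a starting point for that scripting, here is a minimal Python sketch of the three-step sequence. The file names match the ones used earlier in this post. The function names build_commands and run_pipeline are hypothetical, and the sketch passes --namespace because conftest only evaluates the main package by default. It assumes terraform and conftest are on the PATH and that terraform init has already been run.

```python
import subprocess


def build_commands(plan_file="tfplan.binary", json_file="tfplan.json",
                   policy_dir="./policy",
                   namespace="terraform.aws.s3_versioning"):
    """Return the plan, show, and test commands as argv lists."""
    return [
        ["terraform", "plan", f"-out={plan_file}"],
        ["terraform", "show", "-json", plan_file],
        ["conftest", "test", "--policy", policy_dir,
         "--namespace", namespace, json_file],
    ]


def run_pipeline(run=subprocess.run, **kwargs) -> int:
    """Run plan -> show -> conftest and return conftest's exit code."""
    plan_cmd, show_cmd, test_cmd = build_commands(**kwargs)
    json_file = test_cmd[-1]

    # Step 1: create the binary plan; abort if Terraform itself fails.
    run(plan_cmd, check=True)

    # Step 2: write the JSON rendering of the plan to json_file.
    with open(json_file, "w") as out:
        run(show_cmd, check=True, stdout=out)

    # Step 3: evaluate the policies; a non-zero code means violations
    # (or an error), so the caller can fail the build on it.
    return run(test_cmd).returncode


# Usage in a CI step (not executed here):
#   import sys
#   sys.exit(run_pipeline())
```

Keeping command construction separate from execution makes the sequence easy to inspect or unit test without Terraform installed.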

Conclusion and What's Next

Open Policy Agent, combined with conftest, offers a robust and flexible way to implement Policy as Code for your Terraform projects. By writing policies in Rego and evaluating your Terraform plan JSON, you can catch violations before they hit your infrastructure.

While OPA is a fantastic general-purpose engine, it's not the only player in the PaC for Terraform game. In Part 4, we'll explore HashiCorp Sentinel, a PaC framework tightly integrated into Terraform Cloud and Terraform Enterprise, and also look at popular static analysis tools like tfsec and Checkov that provide out-of-the-box checks for common misconfigurations. Stay tuned!