Terraform Provisioners Guide
Learn how to use Terraform provisioners effectively, pitfalls to avoid, best practices and Scalr's approach for safer, automated infrastructure tasks.
Terraform has revolutionized how we manage infrastructure, championing a declarative approach to define and deploy resources. Yet, sometimes, the real world demands actions that don't neatly fit into this declarative model. This is where Terraform provisioners enter the picture – a feature designed as a "measure of pragmatism" but one that HashiCorp itself advises using only as a "last resort."
This post goes into what Terraform provisioners are, how they work, their inherent challenges, and crucially, the more robust alternatives that modern IaC practices favor. We'll also touch upon best practices for those rare moments when provisioners become unavoidable.
1. What Exactly Are Terraform Provisioners?
Terraform provisioners allow you to execute scripts or specific actions on a local or remote machine during a resource's lifecycle – typically after it's created or before it's destroyed. They exist to bridge the gap for tasks not directly supported by Terraform's declarative language or its vast provider ecosystem, such as bootstrapping servers, running configuration management tools, or performing custom cleanup.
While Terraform providers manage the state and lifecycle of infrastructure components (VMs, networks, etc.), provisioners perform tasks on or because of these components. HashiCorp includes them pragmatically, acknowledging that some imperative actions are occasionally necessary. However, this pragmatism comes with trade-offs. Provisioner actions aren't fully managed by Terraform's planning engine, potentially leading to discrepancies between perceived and actual system states.
This is why the official guidance is clear: use provisioners as a last resort. Their use can introduce:
- Complexity: Mixing imperative scripts with declarative code.
- State Inaccuracy: Actions aren't fully tracked in Terraform state or plans.
- Idempotency Challenges: Scripts must be manually made idempotent.
- Debugging Difficulties: Harder to troubleshoot than declarative configurations.
- Portability Issues: Scripts can be OS-specific.
The shift away from vendor-specific provisioners (like Chef, Puppet) to generic ones (`file`, `local-exec`, `remote-exec`) underscores Terraform's focus on orchestration, delegating complex software configuration to specialized tools or, preferably, more integrated IaC patterns. Effective IaC governance often involves establishing clear policies on when, if ever, provisioners are acceptable, guiding teams towards more maintainable solutions.
2. The Core Trio: `local-exec`, `remote-exec`, and `file`
Three main provisioners form the backbone of this functionality:
`local-exec`
Invokes a command locally on the machine running `terraform apply`, after the resource is created. Useful for local tasks; it does not directly configure the remote resource.
```hcl
resource "aws_instance" "web" {
  ami           = "ami-0c55b31ad2c456998"
  instance_type = "t2.micro"
  # ... other resource arguments ...

  provisioner "local-exec" {
    command = "echo Instance ${self.id} has IP ${self.public_ip} >> instance_ips.txt"

    # Best practice: pass sensitive data or complex variables via environment
    environment = {
      INSTANCE_ID = self.id
      PUBLIC_IP   = self.public_ip
    }
    # Example using the environment variables in the command instead:
    # command = "echo Instance $INSTANCE_ID has IP $PUBLIC_IP >> instance_ips.txt"
    # interpreter = ["/bin/bash", "-c"] # Optional: specify interpreter
  }
}
```
Security Note: Avoid direct variable interpolation in the `command` string to prevent shell injection. Use the `environment` argument instead.
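The risk is easy to see outside Terraform entirely, since `local-exec` ultimately hands its `command` string to a shell. A minimal shell sketch (the input value is hypothetical) contrasting the two patterns:

```shell
# local-exec passes its command string to a shell; simulate that with sh -c.
USER_INPUT='hello; echo INJECTED'

# Risky pattern: interpolating the value into the command string means the
# shell parses attacker-controlled text as syntax -- the `;` starts a new command.
sh -c "echo ${USER_INPUT}"

# Safer pattern: pass the value via the environment; the inner shell expands
# it as data, never parses it as command syntax.
MSG="$USER_INPUT" sh -c 'echo "$MSG"'
```

The first invocation prints a second line because `echo INJECTED` ran as its own command; the second prints the input verbatim, which is why the `environment` argument is preferred.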
`remote-exec`
Invokes a script on the remote resource after it's created. Common for software installation or configuration. Requires a `connection` block.
```hcl
resource "aws_instance" "app_server" {
  ami           = "ami-0c55b31ad2c456998"
  instance_type = "t2.micro"
  # Ensure your security group allows SSH access from your Terraform execution environment
  # key_name = "your-ssh-key-name" # Specify your SSH key
  # ... other resource arguments ...

  connection {
    type        = "ssh"
    user        = "ec2-user" # Or "ubuntu", "admin", etc., depending on the AMI
    private_key = file("~/.ssh/your-private-key.pem")
    host        = self.public_ip
  }

  provisioner "remote-exec" {
    inline = [
      "sudo yum update -y",
      "sudo yum install -y httpd",
      "sudo systemctl start httpd",
      "sudo systemctl enable httpd",
    ]
  }

  # Example of uploading a script file and executing it with arguments:
  # provisioner "file" {
  #   source      = "scripts/setup_app.sh"
  #   destination = "/tmp/setup_app.sh"
  # }
  # provisioner "remote-exec" {
  #   inline = [
  #     "chmod +x /tmp/setup_app.sh",
  #     "/tmp/setup_app.sh some_argument", # Arguments passed directly
  #   ]
  # }
}
```
Note on script arguments: `remote-exec` doesn't natively pass arguments to scripts defined via `script` or `scripts`. The workaround is to first use `file` to upload the script, then `remote-exec` with `inline` to execute it with arguments.
`file`
Copies files or directories from the Terraform machine to the remote resource. Essential for transferring configs, binaries, or scripts.
```hcl
resource "aws_instance" "db_server" {
  ami           = "ami-0c55b31ad2c456998"
  instance_type = "t2.micro"
  # key_name = "your-ssh-key-name"
  # ... other resource arguments ...

  connection {
    type        = "ssh"
    user        = "ec2-user"
    private_key = file("~/.ssh/your-private-key.pem")
    host        = self.public_ip
  }

  provisioner "file" {
    source      = "configs/app.conf"    # Local path
    destination = "/etc/myapp/app.conf" # Remote path
  }

  provisioner "file" {
    content     = "db_host=${self.private_ip}"
    destination = "/etc/myapp/db.ini"
  }
}
```
WinRM Considerations: File transfer with WinRM is more complex and has security implications for the `destination` path. SSH is generally preferred, even for Windows, if OpenSSH is available.
Here's a quick comparison:
Table 1: Core Provisioner Comparison
| Feature | `local-exec` | `remote-exec` | `file` |
|---|---|---|---|
| Execution Locus | Terraform Host (Local Machine) | Remote Resource | Remote Resource (copies from Local) |
| Primary Use Case | Run local scripts/commands post-resource creation | Run remote scripts/commands post-resource creation | Copy files/directories to remote resource |
| Connection Required | No | Yes (SSH or WinRM) | Yes (SSH or WinRM) |
| Key Arguments | `command`, `environment`, `interpreter` | `inline`, `script`, `scripts` | `source`, `content`, `destination` |
| Common Scenario | Triggering local build/notification scripts | Software installation, service configuration on VM | Uploading config files, binaries, or scripts for execution |
3. Understanding Provisioner Mechanics
Provisioners are declared in `provisioner` blocks within resources. They use the `self` object to access attributes of their parent resource (e.g., `self.public_ip`).
Connection Block (`connection`): For `remote-exec` and `file`, a `connection` block (nested in the provisioner or resource) defines how Terraform connects (SSH or WinRM), including host, user, credentials, and timeouts. Securely managing these credentials, especially in automated environments, is paramount. Platforms that integrate with secrets management systems can simplify this, ensuring credentials aren't exposed in code.
Execution Lifecycle:
- Creation-Time (default): Runs when a resource is created; it does not re-run on updates unless the resource is replaced. Failure taints the resource.
- Destroy-Time (`when = "destroy"`): Runs before a resource is destroyed. Useful for cleanup. Scripts should be idempotent, as they might re-run after a failure. They won't run if `create_before_destroy` is true or the resource is tainted.
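A destroy-time provisioner is declared with `when = destroy` on the provisioner block. A minimal sketch (the cleanup script name is hypothetical; note that destroy-time provisioners may only reference `self`, not other resources):

```hcl
resource "aws_instance" "worker" {
  ami           = "ami-0c55b31ad2c456998"
  instance_type = "t2.micro"

  # Runs just before Terraform destroys this instance; skipped if the
  # resource is tainted or uses create_before_destroy.
  provisioner "local-exec" {
    when    = destroy
    command = "./deregister_node.sh ${self.id}" # hypothetical cleanup script
  }
}
```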
Failure Behavior (`on_failure`):
- `fail` (default): Stops the apply and taints the resource (if creation-time).
- `continue`: Ignores the error and continues the apply, but still taints the resource (if creation-time).

A tainted resource is scheduled for destruction and recreation on the next apply.
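In HCL, the setting is an unquoted keyword on the provisioner block. A small sketch (the notification script is hypothetical):

```hcl
resource "aws_instance" "reporting" {
  ami           = "ami-0c55b31ad2c456998"
  instance_type = "t2.micro"

  provisioner "local-exec" {
    command = "./notify_inventory.sh ${self.id}" # hypothetical script
    # The apply continues past a script failure, but the instance is
    # still tainted and will be recreated on the next apply.
    on_failure = continue
  }
}
```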
Table 2: Provisioner Lifecycle and Failure Behavior
| Provisioner Timing | `on_failure` Setting | Outcome on Script Success | Outcome on Script Failure (Terraform Action) | Outcome on Script Failure (Resource State) | Key Considerations |
|---|---|---|---|---|---|
| Creation-Time | `fail` (default) | Resource created, apply continues | Apply stops, error raised | Tainted | Default behavior; ensures problematic resources are recreated. |
| Creation-Time | `continue` | Resource created, apply continues | Apply continues (with warning) | Tainted | Use with caution; can mask issues. Resource is still tainted and will be recreated on next apply. |
| Destroy-Time | `fail` (default) | Resource destroyed (after provisioner), apply continues | Apply stops, error raised. Provisioner reruns on next destroy attempt. | Not destroyed (pending successful provisioner) | Script should be idempotent. Does not run if resource is tainted or `create_before_destroy` is used. |
| Destroy-Time | `continue` | Resource destroyed (after provisioner), apply continues | Apply continues (with warning). Resource destroyed (after provisioner). | Destroyed | Allows destruction even if cleanup fails. Script should ideally be idempotent. Does not run if resource is tainted or `create_before_destroy` is used. |
4. The Catch: Why Provisioners Are a "Last Resort"
The "last resort" mantra isn't arbitrary. Provisioners introduce significant challenges:
- Idempotency Burden: Scripts aren't inherently idempotent. You must write them to be safely re-runnable. Terraform won't manage this for you.
- State and Plan Blind Spots: `terraform plan` cannot predict what scripts will do. Changes made by provisioners aren't accurately reflected in Terraform's state, leading to potential configuration drift and undermining the single source of truth.
- Complexity & Maintenance Overhead: Managing imperative scripts alongside declarative HCL increases cognitive load and maintenance. Each script is an artifact needing its own lifecycle management.
- Dependency & Portability Nightmares: Scripts can create hidden dependencies on tools or OS versions, hindering portability.
- Security Risks:
  - Credential Management: Securely handling SSH keys/passwords for remote provisioners is critical.
  - Injection Vulnerabilities: `local-exec` commands with direct variable interpolation are risky.
  - WinRM PowerShell: The `destination` path in the `file` provisioner for WinRM can be a vector if not handled carefully.
These challenges are amplified in collaborative environments or CI/CD pipelines. Without strong governance and visibility – often provided by an IaC management platform – provisioners can quickly become a source of instability and security holes.
5. Beyond Provisioners: Smarter Alternatives for Configuration
Fortunately, robust alternatives exist for most scenarios where provisioners might seem tempting:
- Custom Machine Images (e.g., Packer): Create "golden images" with software and configurations pre-baked. HashiCorp Packer is excellent for this. Instances launch faster and more consistently. This aligns with immutable infrastructure principles.
- Configuration Management Tools (Ansible, Chef, Puppet, SaltStack): For ongoing, complex configuration management, these dedicated tools are superior. They are designed for idempotency and managing systems at scale. Terraform can provision infrastructure, and then these tools (triggered via CI/CD or even a carefully considered `local-exec`) configure it.
- Native Provider Functionality & Data Sources: Always check if your cloud provider's Terraform provider can achieve the configuration declaratively. Providers are constantly evolving.
- Serverless Functions / Orchestration Services: For asynchronous post-provisioning tasks (health checks, external service registration), consider AWS Lambda, Azure Functions, or tools like Spinnaker.
- Cloud-Init / User Data: Most cloud providers allow passing scripts (`user_data`) to instances at boot. `cloud-init` (Linux) or `cloudbase-init` (Windows) handle tasks like package installs, user setup, or running initial scripts. This is often the preferred method for initial bootstrapping, as it's native and avoids direct Terraform-to-instance network dependencies.
```hcl
resource "aws_instance" "web_via_userdata" {
  ami           = "ami-0c55b31ad2c456998" # Amazon Linux 2
  instance_type = "t2.micro"

  user_data = <<-EOF
    #!/bin/bash
    yum update -y
    yum install -y httpd
    systemctl start httpd
    systemctl enable httpd
    echo "<h1>Hello from User Data!</h1>" > /var/www/html/index.html
  EOF

  tags = {
    Name = "web-via-userdata"
  }
}
```
Managing these diverse alternatives effectively across an organization benefits from a centralized platform that can enforce standards, provide visibility, and integrate with various toolchains. This ensures that even as teams choose the best tool for the job, the overall IaC landscape remains manageable and secure.
Table 3: Comparison of Provisioner Alternatives
| Alternative | How it Works | Pros | Cons | Typical Use Cases | Integration with Terraform |
|---|---|---|---|---|---|
| Cloud-Init / User Data | Scripts/directives executed by the instance at boot time using data passed by Terraform. | Native to cloud platforms; good for initial bootstrapping; avoids direct Terraform-to-instance network dependency for setup. | Limited to boot time; can become complex for extensive configurations; debugging can be tricky. | Initial package installs, SSH key setup, basic service configuration, running first-boot scripts. | Terraform passes user data (e.g., via the `user_data` argument) at instance creation. |
| Custom Machine Images (Packer) | Pre-bake software and configurations into a machine image; Terraform launches instances from this image. | Faster instance startup; promotes immutability; consistent deployments; reduces runtime configuration errors. | Image build pipeline required; image management overhead; updates require rebuilding and re-deploying images. | Creating "golden images" with standardized OS, security hardening, common tools, and application baselines pre-installed. | Terraform references the custom image ID when creating instances. Packer manages the image build process separately. |
| Config. Mgmt. Tools (Ansible, Chef, etc.) | Dedicated tools manage software installation, configuration, and ongoing state of systems. | Robust, scalable, idempotent configuration; mature ecosystems; good for complex application setup and ongoing management. | Adds another tool to the stack; learning curve; can be slower than baked images for initial deployment. | Complex application deployment, ongoing configuration management, compliance enforcement, patching. | Can be triggered by `local-exec`, by CI/CD pipelines after `terraform apply`, or run as a separate workflow step. |
| Native Provider Functionality | Use Terraform resource types and arguments provided by cloud/service providers to manage configurations. | Fully declarative; integrated with Terraform state and plan; often most reliable and efficient way to configure provider services. | Limited by what the provider exposes; may not cover all custom needs or software installation. | Configuring service-specific settings (e.g., database parameters, load balancer rules, IAM policies) directly via HCL. | Core Terraform functionality; define resources and their attributes in HCL. |
| Serverless Functions / Orchestration | External services or functions triggered by events (e.g., resource creation) to perform actions. | Decoupled; event-driven; scalable; good for asynchronous tasks or complex workflows. | Adds architectural complexity; requires managing the serverless functions/orchestration system. | Post-provisioning health checks, registration with external systems, data seeding, complex application initialization steps. | Terraform creates the infrastructure; events from this infrastructure (e.g., via CloudWatch Events, Azure Event Grid) trigger the external logic. |
6. When You Must: Best Practices for Using Provisioners
If, after exhausting all alternatives, a provisioner is deemed absolutely necessary:
- Reinforce "Last Resort": Document why it's needed and why alternatives failed. This scrutiny is crucial for maintaining discipline.
- Ensure Script Idempotency: This is non-negotiable. Scripts must be safe to re-run without unintended side effects.
- Secure Credential Handling: Use environment variables, secrets managers (like HashiCorp Vault), or cloud provider KMS, not hardcoded values.
- Robust Error Handling & Logging in Scripts: Scripts should use `set -e` (or equivalent) and produce clear logs.
- Thorough Testing: Test in isolated environments to verify script behavior and idempotency.
- Keep Logic Simple & Focused: If a provisioner script becomes complex, it's a sign you need a dedicated CM tool or a different approach.
- Use `null_resource` for Decoupled Actions: For provisioner logic not tied to a specific resource, or that should be triggered by changes, `null_resource` with `triggers` can offer more control.
```hcl
resource "null_resource" "run_script_on_change" {
  triggers = {
    # Re-run when this S3 object's version changes
    # (assuming data.aws_s3_object_version.example is defined)
    # script_version = data.aws_s3_object_version.example.version_id

    # Or, run when a local file's content changes
    config_file_sha1 = filesha1("configs/my_local_config.json")
  }

  provisioner "local-exec" {
    command = "echo 'Triggered by change: ${self.triggers.config_file_sha1}' && ./my_script.sh"
  }
}
```
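The idempotency practice above can be sketched in plain shell (paths are hypothetical): the guarded write makes a second run a no-op rather than an error or a duplicate.

```shell
#!/bin/sh
set -e  # abort on any command failure, per the error-handling practice above

# Hypothetical bootstrap step, written so re-running it is always safe.
bootstrap() {
  conf_dir="/tmp/myapp-demo"
  mkdir -p "$conf_dir"                    # -p: no error if it already exists
  if [ ! -f "$conf_dir/app.conf" ]; then  # guard: write the config only once
    echo "initialized" > "$conf_dir/app.conf"
  fi
}

bootstrap  # first run performs the work
bootstrap  # second run is a no-op -- the behavior a provisioner script needs
echo "bootstrap ok"
```

The same guard-before-act pattern applies to package installs, user creation, and service registration inside `remote-exec` scripts.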
Implementing these best practices often requires strong organizational standards and tooling that can enforce policies and provide visibility into how provisioners are being used.
7. The Future of Provisioners
HashiCorp deprecated and removed vendor-specific provisioners (Chef, Puppet, etc.) in Terraform 0.15, signaling a clear direction: Terraform is for infrastructure orchestration, not detailed configuration management.
However, the generic `file`, `local-exec`, and `remote-exec` provisioners remain supported and are part of Terraform's v1.x compatibility promises. They fill a niche for edge cases. The long-term vision, though, is to make providers and integrations so comprehensive that the need for these "last resort" mechanisms diminishes.
8. Conclusion: Declarative Purity
Terraform provisioners are a powerful, yet potentially problematic, tool. While they offer an escape hatch for imperative actions, they come with significant trade-offs in complexity, state accuracy, and security.
The clear industry trend, and HashiCorp's guidance, points towards minimizing their use. Prioritizing native provider features, `cloud-init`, custom images via Packer, and dedicated configuration management tools leads to more robust, maintainable, and scalable Infrastructure as Code.
When provisioners are unavoidable, rigorous adherence to best practices is essential. For organizations looking to scale their IaC adoption, establishing strong governance, promoting declarative patterns, and leveraging platforms that provide visibility and control over their Terraform workflows are key to taming complexity and ensuring that even "last resort" tools are used safely and effectively.