Understanding Terraform file provisioners
Understand Terraform file provisioners: how they copy files to servers, common use cases, pitfalls, and smarter, safer alternatives.
Terraform file provisioners copy files from your local machine to newly created resources, bridging the gap between infrastructure provisioning and configuration. While useful for bootstrapping and initial setup, HashiCorp explicitly recommends using them only as a last resort due to their limitations in Terraform's declarative model.
What file provisioners are and why they exist
File provisioners address a practical challenge in infrastructure automation: transferring configuration files, scripts, or other assets to newly provisioned resources. They're part of Terraform's broader provisioner concept that HashiCorp describes as "a measure of pragmatism, knowing that there are always certain behaviors that cannot be directly represented in Terraform's declarative model."
The file provisioner transfers files or directories from the local machine running Terraform to newly created resources. It operates alongside two other built-in provisioners:
- remote-exec: Executes commands or scripts on the remote resource
- local-exec: Runs commands on the local machine running Terraform
File provisioners occupy a specific place in Terraform's execution model:
- By default, they run only when their parent resource is created (not during updates)
- They can be configured to run during resource destruction with `when = destroy`
- When multiple provisioners exist, they execute in the order specified
- If a creation-time provisioner fails, Terraform marks the resource as "tainted"
- They operate outside Terraform's state management system
This last point is crucial: provisioner actions are not recorded in state files, meaning Terraform cannot detect or remediate drift in files managed through provisioners.
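For example, a destroy-time provisioner can run cleanup just before the resource is removed. A minimal sketch, assuming a hypothetical deregistration script already exists on the instance:

resource "aws_instance" "web" {
  # Resource configuration, including a connection block...

  provisioner "remote-exec" {
    when   = destroy
    inline = ["sudo /opt/app/deregister.sh"]  # hypothetical cleanup script
  }
}

Note that destroy-time provisioners can only reference the resource's own attributes through self, not other resources in the configuration.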
Complete syntax reference with real-world examples
The file provisioner uses this syntax within a resource block:
resource "aws_instance" "web" {
# Resource configuration...
provisioner "file" {
source = "config/app.conf" # Local file/directory to copy
content = "configuration text" # Alternative to source - direct content
destination = "/etc/app/config.conf" # Remote path to place file/content
# Connection information
connection {
type = "ssh" # SSH or WinRM
user = "ec2-user" # Remote username
private_key = file("key.pem") # Authentication
host = self.public_ip # Remote address
# Other connection options...
}
# Meta-arguments
when = "create" # When to run: "create" (default) or "destroy"
on_failure = "fail" # What to do on failure: "fail" (default) or "continue"
}
}
You must specify either `source` or `content` (never both), along with a mandatory `destination`.
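Since `content` accepts any string, it pairs naturally with Terraform's built-in templatefile function. A minimal sketch, assuming a hypothetical local template app.conf.tpl:

provisioner "file" {
  content     = templatefile("${path.module}/app.conf.tpl", { port = 8080 })
  destination = "/etc/app/config.conf"
}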
Connection configuration patterns
You can define connections at the resource level (affecting all provisioners) or inline for each provisioner:
# Resource-level connection (applies to all provisioners)
resource "aws_instance" "web" {
  # Resource configuration...

  connection {
    type        = "ssh"
    user        = "ubuntu"
    private_key = file("${path.module}/id_rsa")
    host        = self.public_ip
  }

  provisioner "file" { ... }
  provisioner "remote-exec" { ... }
}

# Provisioner-specific connection
resource "aws_instance" "web" {
  # Resource configuration...

  provisioner "file" {
    # File provisioner config...

    connection {
      # Connection details specific to this provisioner
    }
  }
}
Directory handling considerations
When copying directories, behavior depends on trailing slashes, as the sketch below illustrates:
- Source `/local/dir` (no trailing slash) to `/remote/path` → contents copied to `/remote/path/dir`
- Source `/local/dir/` (with trailing slash) to `/remote/path` → contents copied directly into `/remote/path`
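A minimal sketch of both forms, assuming a local configs directory:

provisioner "file" {
  source      = "configs"   # no trailing slash: creates /opt/app/configs remotely
  destination = "/opt/app"
}

provisioner "file" {
  source      = "configs/"  # trailing slash: copies the files inside configs/ into /opt/app
  destination = "/opt/app"
}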
With SSH connections, the destination directory must already exist. This often requires creating it first:
resource "aws_instance" "web" {
# Resource configuration...
# First create the directory
provisioner "remote-exec" {
inline = ["mkdir -p /opt/application/config"]
}
# Then copy files to it
provisioner "file" {
source = "configs/"
destination = "/opt/application/config"
}
}
With WinRM connections, the destination directory is created automatically if it doesn't exist.
Security and best practices that can't be ignored
HashiCorp's official stance is clear: use provisioners as a last resort. This recommendation stems from several limitations:
- Provisioners operate outside Terraform's declarative model
- Terraform cannot track provisioner actions in state files
- They require direct network access to resources
- They add complexity with connection details and credentials
Security considerations demand attention
Credential management poses significant risks:
- Credentials in provisioner connection blocks may expose sensitive information
- Credentials are stored in Terraform state files, which may not be properly secured
- SSH keys for provisioners may have broader access than necessary
Network security issues arise:
- File provisioners require direct network access (often SSH/WinRM ports)
- This often means opening ports that would otherwise remain closed
- Creates potential attack vectors if not properly restricted (one mitigation is sketched below)
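For AWS, that restriction might look like scoping the SSH rule to a known management range instead of the whole internet. A hedged sketch, assuming a hypothetical var.admin_cidr variable and an existing aws_security_group.web:

resource "aws_security_group_rule" "ssh_from_admin" {
  type              = "ingress"
  from_port         = 22
  to_port           = 22
  protocol          = "tcp"
  cidr_blocks       = [var.admin_cidr]  # e.g. an office or VPN range, never 0.0.0.0/0
  security_group_id = aws_security_group.web.id
}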
File permissions need careful handling:
- Files inherit permissions from the remote system
- When using SSH, provisioners run with the specific user's permissions
- Sensitive files might become accessible to unauthorized users
Essential best practices to follow
- Use alternatives whenever possible:
  - Cloud-init with user_data for bootstrapping
  - Pre-built images with HashiCorp Packer
  - Configuration management tools for complex needs
  - Provider-specific features (when available)
- Secure credential handling:
  - Use SSH key-based authentication instead of passwords
  - Store private keys securely (not in version control)
  - Use environment variables or secure backends for sensitive data
  - Consider just-in-time or dynamic credential generation
- Optimize for performance and reliability:
  - Set appropriate timeouts to prevent hanging operations
  - Use `on_failure = continue` when appropriate
  - Split large file transfers into smaller chunks
  - Consider connection pooling for multiple resources
- Address idempotency challenges (see the sketch after this list):
  - Implement checks to avoid unnecessary file transfers
  - Use checksums or timestamps to compare files
  - Consider conditional execution based on resource metadata
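A minimal sketch of the checksum approach, assuming the hashicorp/null provider and a hypothetical local config file: the file is re-uploaded only when its hash changes.

resource "null_resource" "push_config" {
  # Re-run the provisioner only when the local file's content changes
  triggers = {
    config_hash = filemd5("${path.module}/configs/app.conf")
  }

  connection {
    type        = "ssh"
    user        = "ubuntu"
    private_key = file("${path.module}/ssh_key.pem")  # hypothetical key path
    host        = aws_instance.web.public_ip
  }

  provisioner "file" {
    source      = "${path.module}/configs/app.conf"
    destination = "/etc/app/app.conf"
  }
}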
Dealing with common issues and effective troubleshooting
File provisioners frequently encounter several types of issues:
Connection problems need systematic diagnosis
Error: timeout - last error: dial tcp x.x.x.x:22: i/o timeout
Common causes include:
- Security groups/firewalls blocking SSH/WinRM traffic
- Incorrect SSH key or credentials
- Instance not fully booted when Terraform attempts to connect
Solutions:
- Ensure security groups allow inbound traffic on port 22 (SSH) or 5985/5986 (WinRM)
- Verify SSH key pair or credentials
- Add a `depends_on` or increase `connection.timeout` to allow more boot time (see the sketch below)
- Check network routing and VPN settings
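A minimal sketch of a more patient connection, assuming an instance that is slow to boot (the connection block's timeout defaults to five minutes):

resource "aws_instance" "web" {
  # Resource configuration...

  connection {
    type        = "ssh"
    user        = "ec2-user"
    private_key = file("key.pem")
    host        = self.public_ip
    timeout     = "10m"  # wait up to ten minutes for SSH to become available
  }

  provisioner "file" {
    source      = "config/app.conf"
    destination = "/tmp/app.conf"
  }
}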
Authentication failures require verification
Error: ssh: handshake failed: ssh: unable to authenticate
Solutions:
- Verify the private key matches the public key on the instance
- Check permissions on the private key file (should be 400 or 600)
- Ensure the SSH user exists on the instance
- Try connecting manually to verify credentials
Destination path issues often require pre-creation
Error: Upload failed: scp: /path/to/dir: No such file or directory
Solutions:
- Use a `remote-exec` provisioner first to create the directory
- Use absolute paths instead of relative paths or `~`
- Use a path where the SSH user has write permissions
Windows requires special attention
Error: PowerShell exited with code 1
Solutions:
- Ensure WinRM is enabled and configured
- Use forward slashes in paths even on Windows
- Configure WinRM in user_data or during image creation (a sketch follows)
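A hedged sketch of enabling WinRM at first boot via user_data on an AWS Windows instance; the exact WinRM and firewall settings here are assumptions and should be hardened for production:

resource "aws_instance" "win_server" {
  ami           = var.windows_ami_id  # hypothetical variable holding a Windows AMI
  instance_type = "t3.medium"

  # EC2 executes the <powershell> block at first boot, before Terraform connects
  user_data = <<-EOT
    <powershell>
    winrm quickconfig -q
    winrm set winrm/config/service/auth '@{Basic="true"}'
    netsh advfirewall firewall add rule name="WinRM 5985" dir=in action=allow protocol=TCP localport=5985
    </powershell>
  EOT
}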
Alternatives that align better with IaC principles
Several approaches offer better alternatives to file provisioners in most scenarios:
Cloud-init / user data excels for initial configuration
resource "aws_instance" "web" {
ami = data.aws_ami.ubuntu.id
instance_type = "t2.micro"
user_data = file("scripts/setup.yaml")
}
Best when:
- Working with major cloud providers
- Using auto-scaling groups
- Bootstrapping instances with initial configuration
- Avoiding direct network access requirements
Template providers enable dynamic configuration
resource "aws_instance" "web" {
ami = data.aws_ami.ubuntu.id
instance_type = "t2.micro"
user_data = templatefile("${path.module}/templates/init.tpl", {
server_name = var.server_name
db_address = aws_db_instance.database.address
})
}
Best when:
- Incorporating dynamic values from Terraform
- Generating configuration files with variables
- Working with auto-scaling groups
Pre-built images with Packer offer immutability
# Packer builds the image with files included;
# Terraform simply references the pre-built image
resource "aws_instance" "web" {
  ami           = "ami-0dbaca5d269497603"  # Pre-built with Packer
  instance_type = "t2.micro"
}
Best when:
- Deploying complex applications with many dependencies
- Optimizing for deployment speed and reliability
- Creating golden images for auto-scaling groups
- Building immutable infrastructure
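In practice, rather than hardcoding the AMI ID, the Packer-built image can be looked up dynamically. A minimal sketch, assuming the Packer build names its images with an app-image-* prefix:

data "aws_ami" "app_image" {
  most_recent = true
  owners      = ["self"]

  filter {
    name   = "name"
    values = ["app-image-*"]  # hypothetical naming convention from the Packer template
  }
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.app_image.id
  instance_type = "t2.micro"
}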
Cloud-init provider enables complex configurations
data "cloudinit_config" "config" {
gzip = true
base64_encode = true
part {
content_type = "text/cloud-config"
content = yamlencode({
write_files = [{
path = "/etc/app/config.json"
content = jsonencode(local.app_config)
permissions = "0644"
}]
})
}
}
resource "aws_instance" "web" {
user_data = data.cloudinit_config.config.rendered
}
Best when:
- Needing multi-part configurations
- Combining cloud-config YAML with shell scripts
- Including binary files with base64 encoding
Practical real-world implementation patterns
Deploying with dynamic template rendering
This pattern demonstrates using template files to generate configuration dynamically:
data "template_file" "app_config" {
template = file("${path.module}/templates/app_config.json.tpl")
vars = {
db_host = aws_db_instance.database.address
db_port = aws_db_instance.database.port
api_key = var.api_key
environment = var.environment
}
}
resource "aws_instance" "app_server" {
# Instance configuration...
connection {
type = "ssh"
user = "ubuntu"
private_key = file("${path.module}/ssh_key.pem")
host = self.public_ip
}
provisioner "file" {
content = data.template_file.app_config.rendered
destination = "/etc/app/config.json"
}
provisioner "remote-exec" {
inline = ["sudo systemctl restart app-service"]
}
}
Using null resources for post-deployment operations
For updating configurations without rebuilding infrastructure:
resource "null_resource" "deploy_config" {
# Trigger when configuration changes
triggers = {
config_contents = data.template_file.service_config.rendered
}
connection {
type = "ssh"
user = "ubuntu"
private_key = file("${path.module}/ssh_key.pem")
host = aws_instance.web_server.public_ip
}
provisioner "file" {
content = data.template_file.service_config.rendered
destination = "/etc/nginx/sites-available/default"
}
provisioner "remote-exec" {
inline = [
"sudo nginx -t",
"sudo systemctl reload nginx"
]
}
}
Multi-stage deployment process for complex applications
For more sophisticated deployments:
resource "aws_instance" "application_server" {
# Instance configuration...
# Stage 1: Set up directories
provisioner "remote-exec" {
inline = [
"mkdir -p /opt/app/config",
"mkdir -p /opt/app/logs",
"mkdir -p /opt/app/data"
]
}
# Stage 2: Deploy configuration files
provisioner "file" {
source = "config/"
destination = "/opt/app/config"
}
# Stage 3: Deploy application binary
provisioner "file" {
source = "builds/application.jar"
destination = "/opt/app/application.jar"
}
# Stage 4: Configure and start the application
provisioner "remote-exec" {
inline = [
"chmod +x /opt/app/config/start.sh",
"sudo systemctl enable application",
"sudo systemctl start application"
]
}
}
Cross-platform deployment with OS detection
For infrastructure supporting both Windows and Linux:
locals {
  is_windows = var.os_type == "windows"
}

resource "aws_instance" "server" {
  # Instance configuration...

  connection {
    type        = local.is_windows ? "winrm" : "ssh"
    user        = local.is_windows ? "Administrator" : "ec2-user"
    password    = local.is_windows ? var.admin_password : null
    private_key = local.is_windows ? null : file("${path.module}/key.pem")
    host        = self.public_ip
  }

  provisioner "file" {
    source      = local.is_windows ? "scripts/windows/" : "scripts/linux/"
    destination = local.is_windows ? "C:/temp" : "/tmp"
  }

  provisioner "remote-exec" {
    inline = [
      local.is_windows ?
      "powershell -Command \"C:/temp/setup.ps1\"" :
      "chmod +x /tmp/setup.sh && /tmp/setup.sh"
    ]
  }
}
Conclusion: use sparingly, but effectively
Terraform file provisioners provide a pragmatic solution for transferring files to newly created resources, addressing use cases that fall outside Terraform's declarative model. While they're powerful tools for bootstrapping and initial configuration, they should be used judiciously due to their significant limitations.
The key takeaway is to follow HashiCorp's guidance: use provisioners as a last resort. For most scenarios, alternatives like cloud-init, templated user data, or pre-built images with Packer offer better solutions that align with infrastructure as code principles.
When file provisioners are necessary, understanding their execution model, behavior on failure, and technical mechanisms enables you to use them effectively while mitigating their limitations. By following the best practices and implementation patterns outlined in this guide, you can leverage file provisioners appropriately within your Terraform workflows.