Understanding Terraform file provisioners

Understand Terraform file provisioners: how they copy files to servers, common use cases, pitfalls, and smarter, safer alternatives.

Terraform file provisioners copy files from your local machine to newly created resources, bridging the gap between infrastructure provisioning and configuration. While useful for bootstrapping and initial setup, HashiCorp explicitly recommends treating them as a last resort because they operate outside Terraform's declarative model.

What file provisioners are and why they exist

File provisioners address a practical challenge in infrastructure automation: transferring configuration files, scripts, or other assets to newly provisioned resources. They're part of Terraform's broader provisioner concept that HashiCorp describes as "a measure of pragmatism, knowing that there are always certain behaviors that cannot be directly represented in Terraform's declarative model."

The file provisioner transfers files or directories from the local machine running Terraform to newly created resources. It operates alongside two other built-in provisioners:

  • remote-exec: Executes commands or scripts on the remote resource
  • local-exec: Runs commands on the local machine running Terraform

File provisioners occupy a specific place in Terraform's execution model:

  • By default, they run only when their parent resource is created (not during updates)
  • They can be configured to run during resource destruction with when = destroy
  • When multiple provisioners exist, they execute in the order specified
  • If a creation-time provisioner fails, Terraform marks the resource as "tainted"
  • They operate outside Terraform's state management system

This last point is crucial: provisioner actions are not recorded in state files, meaning Terraform cannot detect or remediate drift in files managed through provisioners.
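
For example, a destroy-time provisioner can run cleanup before Terraform deletes a resource; a minimal sketch, where the deregistration script is purely illustrative:

resource "aws_instance" "web" {
  # Resource configuration and connection block...

  provisioner "remote-exec" {
    when   = destroy
    inline = ["sudo /opt/app/deregister.sh"] # hypothetical cleanup script
  }
}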

Complete syntax reference with real-world examples

The file provisioner uses this syntax within a resource block:

resource "aws_instance" "web" {
  # Resource configuration...

  provisioner "file" {
    source      = "config/app.conf"      # Local file/directory to copy, OR...
    content     = "configuration text"   # ...literal content (use one, not both)
    destination = "/etc/app/config.conf" # Remote path to place the file/content
    
    # Connection information
    connection {
      type        = "ssh"          # SSH or WinRM
      user        = "ec2-user"     # Remote username
      private_key = file("key.pem") # Authentication
      host        = self.public_ip # Remote address
      # Other connection options...
    }
    
    # Meta-arguments
    when       = create           # When to run: create (default) or destroy
    on_failure = fail             # What to do on failure: fail (default) or continue
  }
}

You must specify either source or content (never both), along with a mandatory destination.
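
When generating files inline, content pairs naturally with Terraform's built-in encoding functions; a brief sketch (the settings shown are illustrative):

provisioner "file" {
  content = jsonencode({
    environment = var.environment
    log_level   = "info"
  })
  destination = "/etc/app/config.json"
}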

Connection configuration patterns

You can define connections at the resource level (affecting all provisioners) or inline for each provisioner:

# Resource-level connection (applies to all provisioners)
resource "aws_instance" "web" {
  # Resource configuration...
  
  connection {
    type        = "ssh"
    user        = "ubuntu"
    private_key = file("${path.module}/id_rsa")
    host        = self.public_ip
  }

  provisioner "file" { ... }
  provisioner "remote-exec" { ... }
}

# Provisioner-specific connection
resource "aws_instance" "web" {
  # Resource configuration...

  provisioner "file" {
    # File provisioner config...
    
    connection {
      # Connection details specific to this provisioner
    }
  }
}
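
For instances in private subnets, the connection block also supports hopping through a bastion host via the bastion_* arguments; a minimal sketch, assuming a separately managed aws_instance.bastion:

resource "aws_instance" "private_web" {
  # Resource configuration...

  connection {
    type        = "ssh"
    user        = "ubuntu"
    private_key = file("${path.module}/id_rsa")
    host        = self.private_ip

    # Hop through a bastion in a public subnet
    bastion_host        = aws_instance.bastion.public_ip
    bastion_user        = "ubuntu"
    bastion_private_key = file("${path.module}/bastion_key.pem")
  }

  provisioner "file" {
    source      = "config/app.conf"
    destination = "/tmp/app.conf"
  }
}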

Directory handling considerations

When copying directories, behavior depends on trailing slashes:

  • Source /local/dir (no trailing slash) to /remote/path → contents copied to /remote/path/dir
  • Source /local/dir/ (with trailing slash) to /remote/path → contents copied directly into /remote/path
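
A side-by-side sketch of the two behaviors:

# Results in /opt/app/configs/... on the remote host
provisioner "file" {
  source      = "configs"
  destination = "/opt/app"
}

# Results in /opt/app/... (contents only, no configs/ directory)
provisioner "file" {
  source      = "configs/"
  destination = "/opt/app"
}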

With SSH connections, the destination directory must already exist. This often requires creating it first:

resource "aws_instance" "web" {
  # Resource configuration...
  
  # First create the directory
  provisioner "remote-exec" {
    inline = ["mkdir -p /opt/application/config"]
  }
  
  # Then copy files to it
  provisioner "file" {
    source      = "configs/"
    destination = "/opt/application/config"
  }
}

With WinRM connections, the destination directory is created automatically if it doesn't exist.

Security and best practices that can't be ignored

HashiCorp's official stance is clear: use provisioners as a last resort. This recommendation stems from several limitations:

  • Provisioners operate outside Terraform's declarative model
  • Terraform cannot track provisioner actions in state files
  • They require direct network access to resources
  • They add complexity with connection details and credentials

Security considerations demand attention

Credential management poses significant risks:

  • Credentials in provisioner connection blocks may expose sensitive information
  • Credentials are stored in Terraform state files, which may not be properly secured
  • SSH keys for provisioners may have broader access than necessary

Network security issues arise:

  • File provisioners require direct network access (often SSH/WinRM ports)
  • This often means opening ports that would otherwise remain closed
  • Creates potential attack vectors if not properly restricted (see the sketch below)
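
A minimal sketch of the mitigation, assuming var.vpc_id and var.admin_cidr are defined elsewhere: restrict inbound SSH to a trusted CIDR rather than 0.0.0.0/0.

resource "aws_security_group" "provisioning" {
  name   = "allow-provisioning-ssh"
  vpc_id = var.vpc_id # illustrative variable

  ingress {
    description = "SSH from the admin/CI network only"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.admin_cidr] # e.g., your office or CI egress range
  }
}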

File permissions need careful handling:

  • Files inherit permissions from the remote system
  • When using SSH, provisioners run with the specific user's permissions
  • Sensitive files might become accessible to unauthorized users

Essential best practices to follow

  1. Use alternatives whenever possible:
    • Cloud-init with user_data for bootstrapping
    • Pre-built images with HashiCorp Packer
    • Configuration management tools for complex needs
    • Provider-specific features (when available)
  2. Secure credential handling:
    • Use SSH key-based authentication instead of passwords
    • Store private keys securely (not in version control)
    • Use environment variables or secure backends for sensitive data
    • Consider just-in-time or dynamic credential generation
  3. Optimize for performance and reliability:
    • Set an explicit connection timeout to prevent hanging operations
    • Use on_failure = continue only for steps that are safe to skip
    • Avoid transferring large files; bake them into the image or fetch them from object storage on the instance instead
    • Remember that each provisioner opens its own SSH/WinRM session, so provisioning many resources in parallel can strain bastions and networks
  4. Address idempotency challenges:
    • Implement checks to avoid unnecessary file transfers
    • Use checksums or timestamps to compare files
    • Consider conditional execution based on resource metadata (see the sketch after this list)
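
One common approach, sketched below against a hypothetical config file: tie a null_resource's triggers to a checksum so the transfer re-runs only when the file actually changes.

resource "null_resource" "sync_config" {
  # Re-run the provisioners only when the local file's checksum changes
  triggers = {
    config_hash = filemd5("${path.module}/config/app.conf")
  }

  connection {
    type        = "ssh"
    user        = "ubuntu"
    private_key = file("${path.module}/ssh_key.pem")
    host        = aws_instance.web.public_ip
  }

  provisioner "file" {
    source      = "${path.module}/config/app.conf"
    destination = "/etc/app/app.conf"
  }
}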

Dealing with common issues and effective troubleshooting

File provisioners frequently encounter several types of issues:

Connection problems need systematic diagnosis

Error: timeout - last error: dial tcp x.x.x.x:22: i/o timeout

Common causes include:

  • Security groups/firewalls blocking SSH/WinRM traffic
  • Incorrect SSH key or credentials
  • Instance not fully booted when Terraform attempts to connect

Solutions:

  • Ensure security groups allow inbound traffic on port 22 (SSH) or 5985/5986 (WinRM)
  • Verify SSH key pair or credentials
  • Increase connection.timeout to give the instance more time to boot (sketch below), or use depends_on so provisioning waits for prerequisite resources
  • Check network routing and VPN settings
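
For example, the timeout lives directly in the connection block (illustrative values):

connection {
  type        = "ssh"
  user        = "ec2-user"
  private_key = file("${path.module}/key.pem")
  host        = self.public_ip
  timeout     = "10m" # default is 5m; allow extra time for slow boots
}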

Authentication failures require verification

Error: ssh: handshake failed: ssh: unable to authenticate

Solutions:

  • Verify the private key matches the public key on the instance
  • Check permissions on the private key file (should be 400 or 600)
  • Ensure the SSH user exists on the instance
  • Try connecting manually to verify credentials

Destination path issues often require pre-creation

Error: Upload failed: scp: /path/to/dir: No such file or directory

Solutions:

  • Use remote-exec provisioner first to create the directory
  • Use absolute paths instead of relative paths or ~
  • Use a path where the SSH user has write permissions

Windows requires special attention

Error: PowerShell exited with code 1

Solutions:

  • Ensure WinRM is enabled and configured
  • Use forward slashes in paths even on Windows
  • Configure WinRM in user_data or during image creation (see the sketch below)
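
On EC2, for instance, WinRM can be enabled at boot through the <powershell> user data block; the following is an illustrative sketch only, since basic auth over unencrypted HTTP should be avoided outside isolated provisioning networks:

resource "aws_instance" "win" {
  # Instance configuration...

  user_data = <<-EOT
    <powershell>
    # Illustrative only: enables basic WinRM over HTTP for provisioning
    winrm quickconfig -q
    winrm set winrm/config/service/auth '@{Basic="true"}'
    winrm set winrm/config/service '@{AllowUnencrypted="true"}'
    </powershell>
  EOT
}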

Alternatives that align better with IaC principles

Several approaches offer better alternatives to file provisioners in most scenarios:

Cloud-init / user data excels for initial configuration

resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"
  user_data     = file("scripts/setup.yaml")
}

Best when:

  • Working with major cloud providers
  • Using auto-scaling groups
  • Bootstrapping instances with initial configuration
  • Avoiding direct network access requirements

The templatefile function enables dynamic configuration

resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"
  user_data     = templatefile("${path.module}/templates/init.tpl", {
    server_name = var.server_name
    db_address  = aws_db_instance.database.address
  })
}

Best when:

  • Incorporating dynamic values from Terraform
  • Generating configuration files with variables
  • Working with auto-scaling groups

Pre-built images with Packer offer immutability

# Packer builds the image with files included
# Terraform simply references the pre-built image
resource "aws_instance" "web" {
  ami           = "ami-0dbaca5d269497603"  # Pre-built with Packer (see lookup sketch below)
  instance_type = "t2.micro"
}

Best when:

  • Deploying complex applications with many dependencies
  • Optimizing for deployment speed and reliability
  • Creating golden images for auto-scaling groups
  • Building immutable infrastructure
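
Rather than hardcoding the AMI ID, a data source can select the most recent Packer-built image; a sketch assuming Packer names its images with an app-* prefix:

data "aws_ami" "app" {
  most_recent = true
  owners      = ["self"]

  filter {
    name   = "name"
    values = ["app-*"] # matches the name Packer assigns at build time
  }
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.app.id
  instance_type = "t2.micro"
}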

Cloud-init provider enables complex configurations

data "cloudinit_config" "config" {
  gzip          = true
  base64_encode = true

  part {
    content_type = "text/cloud-config"
    content      = yamlencode({
      write_files = [{
        path        = "/etc/app/config.json"
        content     = jsonencode(local.app_config)
        permissions = "0644"
      }]
    })
  }
}

resource "aws_instance" "web" {
  user_data = data.cloudinit_config.config.rendered
}

Best when:

  • Needing multi-part configurations
  • Combining cloud-config YAML with shell scripts (see the sketch below)
  • Including binary files with base64 encoding
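
Additional part blocks extend the same data source; a sketch combining a cloud-config part with a shell script part (package and service names are illustrative):

data "cloudinit_config" "multi" {
  gzip          = true
  base64_encode = true

  part {
    content_type = "text/cloud-config"
    content      = yamlencode({ packages = ["nginx"] })
  }

  part {
    content_type = "text/x-shellscript"
    content      = <<-EOT
      #!/bin/bash
      # Shell script parts typically run after cloud-config modules
      systemctl enable --now nginx
    EOT
  }
}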

Practical real-world implementation patterns

Deploying with dynamic template rendering

This pattern uses the built-in templatefile function (which replaces the deprecated template_file data source) to render configuration dynamically:

locals {
  app_config = templatefile("${path.module}/templates/app_config.json.tpl", {
    db_host     = aws_db_instance.database.address
    db_port     = aws_db_instance.database.port
    api_key     = var.api_key
    environment = var.environment
  })
}

resource "aws_instance" "app_server" {
  # Instance configuration...
  
  connection {
    type        = "ssh"
    user        = "ubuntu"
    private_key = file("${path.module}/ssh_key.pem")
    host        = self.public_ip
  }
  
  provisioner "file" {
    content     = local.app_config
    destination = "/etc/app/config.json"
  }
  
  provisioner "remote-exec" {
    inline = ["sudo systemctl restart app-service"]
  }
}

Using null resources for post-deployment operations

For updating configurations without rebuilding infrastructure:

resource "null_resource" "deploy_config" {
  # Trigger when configuration changes
  triggers = {
    config_contents = local.service_config # rendered elsewhere with templatefile()
  }

  connection {
    type        = "ssh"
    user        = "ubuntu"
    private_key = file("${path.module}/ssh_key.pem")
    host        = aws_instance.web_server.public_ip
  }

  provisioner "file" {
    content     = local.service_config
    destination = "/etc/nginx/sites-available/default"
  }
  
  provisioner "remote-exec" {
    inline = [
      "sudo nginx -t",
      "sudo systemctl reload nginx"
    ]
  }
}

Multi-stage deployment process for complex applications

For more sophisticated deployments:

resource "aws_instance" "application_server" {
  # Instance configuration...
  
  # Stage 1: Set up directories
  provisioner "remote-exec" {
    inline = [
      "mkdir -p /opt/app/config",
      "mkdir -p /opt/app/logs",
      "mkdir -p /opt/app/data"
    ]
  }
  
  # Stage 2: Deploy configuration files
  provisioner "file" {
    source      = "config/"
    destination = "/opt/app/config"
  }
  
  # Stage 3: Deploy application binary
  provisioner "file" {
    source      = "builds/application.jar"
    destination = "/opt/app/application.jar"
  }
  
  # Stage 4: Configure and start the application
  provisioner "remote-exec" {
    inline = [
      "chmod +x /opt/app/config/start.sh",
      "sudo systemctl enable application",
      "sudo systemctl start application"
    ]
  }
}

Cross-platform deployment with OS detection

For infrastructure supporting both Windows and Linux:

locals {
  is_windows = var.os_type == "windows"
}

resource "aws_instance" "server" {
  # Instance configuration...
  
  connection {
    type        = local.is_windows ? "winrm" : "ssh"
    user        = local.is_windows ? "Administrator" : "ec2-user"
    password    = local.is_windows ? var.admin_password : null
    private_key = local.is_windows ? null : file("${path.module}/key.pem")
    host        = self.public_ip
  }
  
  provisioner "file" {
    source      = local.is_windows ? "scripts/windows/" : "scripts/linux/"
    destination = local.is_windows ? "C:/temp" : "/tmp"
  }
  
  provisioner "remote-exec" {
    inline = [
      local.is_windows ? 
        "powershell -Command \"C:/temp/setup.ps1\"" : 
        "chmod +x /tmp/setup.sh && /tmp/setup.sh"
    ]
  }
}

Conclusion: use sparingly, but effectively

Terraform file provisioners provide a pragmatic solution for transferring files to newly created resources, addressing use cases that fall outside Terraform's declarative model. While they're powerful tools for bootstrapping and initial configuration, they should be used judiciously due to their significant limitations.

The key takeaway is to follow HashiCorp's guidance: use provisioners as a last resort. For most scenarios, alternatives like cloud-init, templated user data, or pre-built images with Packer offer better solutions that align with infrastructure as code principles.

When file provisioners are necessary, understanding their execution model, behavior on failure, and technical mechanisms enables you to use them effectively while mitigating their limitations. By following the best practices and implementation patterns outlined in this guide, you can leverage file provisioners appropriately within your Terraform workflows.