Understanding Terraform provisioners

Master Terraform provisioners: learn what they do, when to use or avoid them, and best-practice tips for cleaner, more reliable infrastructure code.

Infrastructure as code (IaC) often requires operations that don't map directly to physical infrastructure. Terraform and OpenTofu's "resourceless provisioners" provide a powerful solution for this gap, enabling essential orchestration, configuration, and dependency management beyond what declarative resources alone can achieve.

Provisioners without resources: solving the abstraction gap

Terraform and OpenTofu are fundamentally declarative tools, focusing on what infrastructure should exist rather than how to create it. However, infrastructure management frequently requires imperative actions like running scripts, coordinating multiple resources, or performing operations at specific lifecycle points.

Provisioners without resources address this need through two special resource types:

  • null_resource: A resource from the null provider that implements Terraform's resource lifecycle but creates no actual infrastructure. It serves solely as a container for provisioners, triggered by changes to values in its triggers map.
resource "null_resource" "cluster_setup" {
  triggers = {
    cluster_instance_ids = join(",", aws_instance.cluster[*].id)
  }

  provisioner "local-exec" {
    command = "echo The cluster IPs are ${join(" ", aws_instance.cluster[*].private_ip)}"
  }
}
  • terraform_data: Introduced in Terraform 1.4 (2023) as a built-in alternative to null_resource, eliminating the need for an external provider. It uses triggers_replace to control execution and can store values through its input/output attributes.
resource "terraform_data" "cluster_setup" {
  triggers_replace = aws_instance.cluster[*].id
  
  provisioner "local-exec" {
    command = "echo The cluster IPs are ${join(" ", aws_instance.cluster[*].private_ip)}"
  }
}

Both resources participate in Terraform's dependency graph and state management but don't represent actual infrastructure. They create an "execution hook" within Terraform's apply phase, allowing you to run scripts, manage configurations, or orchestrate processes not directly supported through declarative resources.

Key differences between them:

| Feature | null_resource | terraform_data |
| --- | --- | --- |
| Provider | External null provider | Built-in (terraform.io/builtin/terraform) |
| Trigger mechanism | triggers map | triggers_replace value |
| Data storage | No built-in storage | Can store values with input/output |
| First introduced | Earlier Terraform versions | Terraform 1.4 (2023) |
| Future direction | May eventually be deprecated | Preferred going forward |

When to use them: While HashiCorp emphasizes that provisioners should be used as a "last resort," these resources prove invaluable when you need to execute commands not tied to specific resources, coordinate actions across multiple resources, or create artificial dependencies to control execution order.
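
The "artificial dependency" case can be as small as an empty gate resource that other resources hang their depends_on off. A minimal sketch (the networking resources here are illustrative):

```hcl
# Gate: completes only after all networking prerequisites exist
resource "terraform_data" "network_ready" {
  depends_on = [aws_vpc.main, aws_nat_gateway.main]
}

# Anything that must wait for networking depends on the single gate,
# not on each individual networking resource
resource "aws_instance" "app" {
  # ... configuration ...
  depends_on = [terraform_data.network_ready]
}
```

This keeps the ordering constraint in one place, so adding a new networking prerequisite means updating only the gate.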

Key scenarios where resourceless provisioners shine

Provisioners without resources excel in specific scenarios where Terraform's declarative model falls short. The most valuable use cases include:

Orchestration and dependency management

When resources need coordination beyond what depends_on can provide, resourceless provisioners create additional control points:

# Run cluster initialization only after all nodes are created
resource "terraform_data" "initialize_cluster" {
  triggers_replace = aws_instance.cluster[*].id
  
  provisioner "remote-exec" {
    connection {
      host = aws_instance.cluster[0].public_ip
      # connection details...
    }
    
    inline = [
      "cluster-init.sh ${join(" ", aws_instance.cluster[*].private_ip)}"
    ]
  }
}

Integrating with external systems

When you need to interact with systems outside Terraform's control:

# Register a new server with load balancer after creation
resource "null_resource" "register_with_lb" {
  triggers = {
    server_ip = aws_instance.web.private_ip
  }
  
  provisioner "local-exec" {
    command = "curl -X POST https://loadbalancer-api.example.com/servers -d '{\"ip\":\"${aws_instance.web.private_ip}\"}'"
  }
}

Running local scripts conditionally

When certain scripts should run only when specific resources change:

# Generate configuration files when database changes
resource "terraform_data" "config_generator" {
  triggers_replace = aws_db_instance.main.endpoint
  
  provisioner "local-exec" {
    command = "python generate_config.py --db-host=${aws_db_instance.main.address} --db-name=${aws_db_instance.main.db_name}"
  }
}

Specialized resource cleanup

Executing cleanup operations before resource destruction:

resource "null_resource" "unmount_volume" {
  triggers = {
    volume_id = aws_ebs_volume.data.id
    instance_id = aws_instance.web.id
    instance_ip = aws_instance.web.public_ip
  }
  
  connection {
    type = "ssh"
    host = self.triggers.instance_ip
    # connection details...
  }
  
  # Run only when destroying this resource
  provisioner "remote-exec" {
    when = destroy
    inline = [
      "sudo umount /data",
      "sudo sed -i '/\\/data/d' /etc/fstab"
    ]
  }
}

Tracking state changes with minimal infrastructure impact

Using terraform_data specifically to track a value without creating physical infrastructure:

resource "terraform_data" "deployment_version" {
  input = var.app_version
}

output "current_version" {
  value = terraform_data.deployment_version.output
}

Implementation patterns: practical examples

Let's explore how to implement resourceless provisioners effectively across different scenarios.

Local-exec integration

The local-exec provisioner runs commands on the machine executing Terraform:

resource "null_resource" "database_setup" {
  triggers = {
    db_instance_id = aws_db_instance.main.id
    schema_version = var.schema_version
  }

  provisioner "local-exec" {
    # Pass connection details through the environment argument rather than
    # interpolating them into the command string
    environment = {
      DB_HOST = aws_db_instance.main.address
      DB_PORT = aws_db_instance.main.port
      DB_NAME = aws_db_instance.main.db_name
      DB_USER = aws_db_instance.main.username
      DB_PASS = aws_db_instance.main.password
    }

    # Run database migrations
    command = "./migrate.sh up"
  }
}

Remote-exec integration

The remote-exec provisioner runs commands on a remote resource:

resource "terraform_data" "configure_server" {
  triggers_replace = aws_instance.web.id
  
  connection {
    type        = "ssh"
    user        = "ec2-user"
    private_key = file("~/.ssh/id_rsa")
    host        = aws_instance.web.public_ip
  }
  
  provisioner "remote-exec" {
    inline = [
      "sudo yum update -y",
      "sudo yum install -y nginx",
      "sudo systemctl enable nginx",
      "sudo systemctl start nginx"
    ]
  }
}

File provisioner integration

The file provisioner copies files to a remote resource:

resource "null_resource" "deploy_config" {
  triggers = {
    config_hash = md5(file("${path.module}/config.json"))
    instance_id = aws_instance.app.id
  }
  
  connection {
    type        = "ssh"
    host        = aws_instance.app.public_ip
    # connection details...
  }
  
  # Upload configuration file
  provisioner "file" {
    source      = "${path.module}/config.json"
    destination = "/etc/app/config.json"
  }
  
  # Apply configuration
  provisioner "remote-exec" {
    inline = ["sudo systemctl restart app-service"]
  }
}

Effective trigger patterns

The real power of resourceless provisioners comes from their triggering mechanisms:

# Run on every apply
resource "null_resource" "always_run" {
  triggers = {
    always_run = timestamp()
  }
  # provisioner...
}

# Run when content changes
resource "null_resource" "content_triggered" {
  triggers = {
    content_hash = md5(file("${path.module}/script.sh"))
  }
  # provisioner...
}

# Run when multiple resources change
resource "terraform_data" "multi_resource_trigger" {
  triggers_replace = [
    aws_instance.web.id,
    aws_security_group.web.id,
    aws_db_instance.main.id
  ]
  # provisioner...
}

State persistence patterns

To store and retrieve data from resourceless provisioners:

# Using terraform_data for simple value storage
resource "terraform_data" "app_settings" {
  input = {
    version = var.app_version
    environment = var.environment
    features = var.enabled_features
  }
}

# Reference elsewhere
resource "aws_ssm_parameter" "app_config" {
  name  = "/app/config"
  type  = "String"
  value = jsonencode(terraform_data.app_settings.output)
}

# Alternative approach using files
resource "null_resource" "generate_data" {
  provisioner "local-exec" {
    command = "get-data.sh > ${path.module}/output.json"
  }
}

data "local_file" "generated_data" {
  depends_on = [null_resource.generate_data]
  filename = "${path.module}/output.json"
}

Best practices for resourceless provisioners

Follow these guidelines to use resourceless provisioners effectively:

Design principles

  1. Use as a last resort: Always consider declarative alternatives first
  2. Keep it simple: Move complex logic to external scripts
  3. Isolate side effects: Ensure provisioners have minimal external dependencies
  4. Design for idempotence: Scripts should be safe to run multiple times
  5. Prefer terraform_data: For new projects, use the built-in resource rather than null_resource
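
Principle 4, idempotence, often comes down to a small guard in the command itself so that re-running the provisioner is harmless. A minimal sketch, assuming a hypothetical init-cluster.sh script and marker file:

```hcl
resource "terraform_data" "bootstrap" {
  triggers_replace = aws_instance.web.id

  provisioner "remote-exec" {
    connection {
      host = aws_instance.web.public_ip
      # connection details...
    }

    inline = [
      # Run the (hypothetical) init script only once; the marker file
      # makes repeated applies safe
      "test -f /var/lib/bootstrap.done || { sudo ./init-cluster.sh && sudo touch /var/lib/bootstrap.done; }"
    ]
  }
}
```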

Resource organization

  1. Descriptive naming: Name resources to clearly indicate their purpose
  2. Logical grouping: Keep null resources close to the resources they interact with
  3. Module encapsulation: Consider encapsulating complex patterns in reusable modules
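
The third point can look like this in practice: a hypothetical db-migration module that hides the trigger and provisioner wiring behind a small variable surface:

```hcl
# modules/db-migration/main.tf (hypothetical module layout)
variable "db_endpoint"    { type = string }
variable "schema_version" { type = string }

resource "terraform_data" "migration" {
  # Re-run migrations when the endpoint or schema version changes
  triggers_replace = [var.db_endpoint, var.schema_version]

  provisioner "local-exec" {
    command = "./migrate.sh up"
  }
}
```

Callers then pass in aws_db_instance.main.endpoint and a schema version variable without ever touching provisioner details.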

Trigger design

  1. Precise triggering: Only include attributes that should cause re-execution
  2. Avoid timestamp() abuse: Only use timestamp() when you truly need to run every time
  3. Content-based triggers: Use hashes of files or content when appropriate
# Good: targeted triggers
resource "null_resource" "db_migration" {
  triggers = {
    # Only run when database endpoint or schema version changes
    db_endpoint = aws_db_instance.main.endpoint
    schema_version = var.db_schema_version
  }
  # provisioner...
}

# Bad: overly broad triggers
resource "null_resource" "db_migration" {
  triggers = {
    # Will run whenever any instance attribute changes
    instance = jsonencode(aws_instance.web)
  }
  # provisioner...
}

Lifecycle management

  1. Explicit dependencies: Always use depends_on to ensure correct execution order
  2. Destroy-time provisioners: Use when = destroy for cleanup operations
  3. Error handling: Use on_failure = continue for non-critical operations
resource "null_resource" "example" {
  depends_on = [aws_instance.web]
  
  provisioner "local-exec" {
    on_failure = continue  # Don't fail the whole apply if this fails
    command    = "echo 'Instance created: ${aws_instance.web.id}'"
  }
  
  provisioner "local-exec" {
    when    = destroy
    command = "echo 'Cleaning up instance: ${self.triggers.instance_id}'"
  }
  
  triggers = {
    instance_id = aws_instance.web.id
  }
}

Comparing approaches: null_resource vs terraform_data vs alternatives

Let's compare the different approaches for running operations without direct resource association:

null_resource vs terraform_data

Advantages of terraform_data:

  • Built into Terraform core (no provider needed)
  • Cleaner syntax with triggers_replace instead of triggers map
  • Value storage through input/output attributes
  • Can feed a plain value into the replace_triggered_by lifecycle argument, which only accepts resource references
  • Official HashiCorp recommendation for new projects

Advantages of null_resource:

  • Works on Terraform versions older than 1.4
  • More established patterns and examples
  • More familiar to experienced practitioners

When to use each approach

  • For new projects: Prefer terraform_data as it's the built-in solution
  • For existing projects: Consider gradually migrating from null_resource to terraform_data
  • For OpenTofu projects: OpenTofu forked from Terraform 1.5 and fully supports terraform_data, so the same guidance applies

Declarative alternatives

When possible, consider these alternatives to provisioners:

  1. Cloud-init/user-data scripts: For server initialization
resource "aws_instance" "web" {
  ami           = "ami-a1b2c3d4"
  instance_type = "t2.micro"
  
  user_data = <<-EOF
    #!/bin/bash
    apt-get update
    apt-get install -y nginx
    systemctl enable nginx
    systemctl start nginx
  EOF
}
  1. Custom data sources: For dynamic values and API interactions
data "external" "get_latest_ami" {
  program = ["bash", "${path.module}/scripts/get-ami.sh"]
}

resource "aws_instance" "example" {
  ami = data.external.get_latest_ami.result.id
  # ...
}
  1. replace_triggered_by lifecycle meta-argument: For resource replacement
resource "aws_instance" "example" {
  # ... configuration ...
  
  lifecycle {
    replace_triggered_by = [
      terraform_data.revision.output
    ]
  }
}

resource "terraform_data" "revision" {
  input = var.revision
}

Lifecycle management and dependency handling

Proper lifecycle and dependency management is crucial when using resourceless provisioners.

Creating explicit dependencies

Use depends_on to ensure resources are created in the correct order:

resource "terraform_data" "bootstrap" {
  depends_on = [
    aws_instance.cluster,
    aws_security_group_rule.allow_ssh
  ]
  
  # This ensures the security group rule is applied before 
  # we try to connect to the instance
  connection {
    host = aws_instance.cluster.public_ip
    # ...
  }
  
  provisioner "remote-exec" {
    # ...
  }
}

Managing resource replacement

Control when resources are replaced using the trigger mechanisms:

# Force replacement of a resource when a variable changes
resource "terraform_data" "force_replacement" {
  input = var.force_replacement_version
}

resource "aws_instance" "app" {
  # ... configuration ...
  
  lifecycle {
    replace_triggered_by = [terraform_data.force_replacement]
  }
}

Creation-time vs. destroy-time provisioners

Use the when attribute to control provisioner execution timing:

resource "null_resource" "cleanup" {
  triggers = {
    instance_id = aws_instance.web.id
    instance_ip = aws_instance.web.public_ip
  }
  
  # Runs when the resource is created
  provisioner "remote-exec" {
    connection {
      host = aws_instance.web.public_ip
      # ...
    }
    
    inline = [
      "echo 'Setting up instance'"
    ]
  }
  
  # Runs when the resource is destroyed
  provisioner "remote-exec" {
    when = destroy
    
    connection {
      host = self.triggers.instance_ip
      # Must use self.triggers since the instance might be gone
    }
    
    inline = [
      "echo 'Cleaning up instance ${self.triggers.instance_id}'"
    ]
  }
}

Conditional execution

Use count or for_each to conditionally create resourceless provisioners:

resource "null_resource" "conditional" {
  count = var.enable_provisioning ? 1 : 0
  
  triggers = {
    instance_id = aws_instance.web.id
  }
  
  provisioner "local-exec" {
    command = "echo 'Provisioning is enabled'"
  }
}

Common pitfalls and how to avoid them

Resourceless provisioners introduce several challenges that need careful management.

Overusing provisioners

Pitfall: Using provisioners for tasks better handled by declarative resources or specialized tools.

Solution: Always consider whether your use case can be addressed with:

  • Native provider resources
  • Data sources for dynamic values
  • Cloud-init or similar initialization methods
  • Purpose-built tools like Ansible or Chef

Poor trigger design

Pitfall: Triggers that are too broad cause unnecessary execution; triggers that are too narrow miss necessary executions.

Solution:

  • Include only the specific resource attributes that should trigger execution
  • Test trigger behavior with terraform plan before applying
  • Document trigger design choices for future reference
# Too broad - runs when ANY attribute of the instance changes
resource "null_resource" "bad_trigger" {
  triggers = {
    instance = jsonencode(aws_instance.web)
  }
}

# Just right - runs only when relevant attributes change
resource "null_resource" "good_trigger" {
  triggers = {
    instance_id = aws_instance.web.id
    public_ip = aws_instance.web.public_ip
  }
}

State file pollution

Pitfall: Overuse of resourceless provisioners bloats the state file.

Solution:

  • Consolidate related provisioners into a single resource
  • Use external state storage for large datasets
  • Clean up unnecessary null resources when they're no longer needed
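
For the last point, Terraform 1.7+ (and recent OpenTofu releases) offer a declarative alternative to running terraform state rm by hand: a removed block that forgets the resource without destroying anything:

```hcl
# Drop the null_resource from state on the next apply,
# without running any destroy-time behavior
removed {
  from = null_resource.generate_data

  lifecycle {
    destroy = false
  }
}
```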

Dependency race conditions

Pitfall: Provisioners executing before dependencies are fully ready.

Solution:

  • Use explicit depends_on declarations
  • Add wait conditions or retry logic in scripts
  • Verify resources are ready before taking action
resource "null_resource" "setup_database" {
  depends_on = [aws_db_instance.main]
  
  provisioner "local-exec" {
    command = <<-EOT
      # Wait for database to be ready
      until mysql -h ${aws_db_instance.main.address} -u ${aws_db_instance.main.username} -p${aws_db_instance.main.password} -e "SELECT 1"; do
        echo "Waiting for database connection..."
        sleep 5
      done
      
      echo "Database is ready, proceeding with setup"
      # Setup commands...
    EOT
  }
}

Script error handling

Pitfall: Scripts failing silently or with unclear errors.

Solution:

  • Set -e in bash scripts to exit on errors
  • Return clear error codes
  • Log detailed error information
  • Use the on_failure attribute appropriately
resource "null_resource" "with_error_handling" {
  provisioner "local-exec" {
    # Process substitution below requires bash; local-exec defaults to /bin/sh
    interpreter = ["/bin/bash", "-c"]

    command = <<-EOT
      set -e  # Exit immediately if a command fails
      
      # Redirect output to log file
      exec > >(tee -a /tmp/terraform-script.log) 2>&1
      
      echo "Starting operation at $(date)"
      
      # Your commands here
      
      echo "Operation completed successfully"
    EOT
    
    # Continue Terraform execution even if this fails
    on_failure = continue
  }
}

Managing destroy-time provisioner state

Pitfall: Destroy-time provisioners can't access current resource attributes.

Solution: Store necessary values in the triggers map:

resource "null_resource" "cleanup" {
  triggers = {
    # Store values needed during destroy
    instance_ip = aws_instance.web.public_ip
    bucket_name = aws_s3_bucket.logs.bucket
  }
  
  provisioner "local-exec" {
    when = destroy
    command = "cleanup-script.sh ${self.triggers.instance_ip} ${self.triggers.bucket_name}"
  }
}

Modern alternatives and evolving best practices

The infrastructure as code ecosystem is shifting toward more declarative, cloud-native patterns.

The shift from imperative to declarative

Current trend: Both HashiCorp and the OpenTofu community emphasize that provisioners should be a "last resort" due to their imperative nature in an otherwise declarative system.

Evolving best practices:

  • Use cloud provider-native initialization (user-data, cloud-init)
  • Leverage immutable infrastructure patterns
  • Separate infrastructure provisioning from configuration management
  • Adopt container-based deployment strategies

Cloud-native alternatives

Each major cloud provider offers native initialization mechanisms:

  • AWS: EC2 user data, Systems Manager, Lambda functions
  • Azure: Custom Script Extension, Automation Runbooks
  • GCP: Startup scripts, Cloud Run jobs

Example with AWS Systems Manager:

resource "aws_ssm_document" "web_setup" {
  name          = "web-server-setup"
  document_type = "Command"
  
  content = jsonencode({
    schemaVersion = "2.2"
    description   = "Setup web server"
    mainSteps = [
      {
        action = "aws:runShellScript"
        name   = "InstallNginx"
        inputs = {
          runCommand = [
            "apt-get update",
            "apt-get install -y nginx",
            "systemctl enable nginx",
            "systemctl start nginx"
          ]
        }
      }
    ]
  })
}

resource "aws_ssm_association" "web_setup" {
  name = aws_ssm_document.web_setup.name
  targets {
    key    = "InstanceIds"
    values = [aws_instance.web.id]
  }
}

Configuration management integration

Rather than embedding configuration logic in Terraform:

resource "null_resource" "ansible_config" {
  depends_on = [aws_instance.web]
  
  provisioner "local-exec" {
    command = "ANSIBLE_HOST_KEY_CHECKING=False ansible-playbook -i '${aws_instance.web.public_ip},' playbook.yml"
  }
}

Container-based alternatives

For application deployment, container-based approaches offer better separation of concerns:

resource "aws_ecs_task_definition" "app" {
  family                   = "app"
  container_definitions    = jsonencode([
    {
      name      = "app"
      image     = "app:${var.app_version}"
      essential = true
      # ...
    }
  ])
  # ...
}

HashiCorp ecosystem integration

The broader HashiCorp ecosystem offers better alternatives for many use cases:

  • Packer for building machine images
  • Vault for secrets management
  • Consul for service discovery
  • Nomad for workload orchestration
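
For example, the nginx installation shown earlier can be baked into an AMI with Packer instead of provisioned at boot. A minimal sketch in Packer's HCL2 syntax (plugin version and image details are illustrative):

```hcl
packer {
  required_plugins {
    amazon = {
      source  = "github.com/hashicorp/amazon"
      version = ">= 1.0"
    }
  }
}

source "amazon-ebs" "web" {
  ami_name      = "web-nginx-{{timestamp}}"
  instance_type = "t3.micro"
  region        = "us-east-1"
  source_ami    = "ami-a1b2c3d4" # placeholder base image
  ssh_username  = "ubuntu"
}

build {
  sources = ["source.amazon-ebs.web"]

  # Bake nginx into the image; instances launched from it need no provisioner
  provisioner "shell" {
    inline = [
      "sudo apt-get update",
      "sudo apt-get install -y nginx",
      "sudo systemctl enable nginx"
    ]
  }
}
```

Terraform then only has to reference the resulting AMI ID, keeping the run itself fully declarative.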

Infrastructure modules as a declarative alternative

Well-designed modules can replace many provisioner use cases:

module "web_cluster" {
  source = "./modules/web-cluster"
  
  instance_count    = 3
  instance_type     = "t3.medium"
  app_version       = var.app_version
  bootstrap_enabled = true
}

Real-world scenarios: when provisioners without resources still make sense

Despite the trend toward declarative approaches, several scenarios still benefit from resourceless provisioners.

Case study: Database migration orchestration

A financial services company needed to safely migrate database schemas during infrastructure updates:

resource "aws_db_instance" "main" {
  # ... configuration ...
  
  # Skip final snapshot on destroy to avoid naming conflicts on recreate
  skip_final_snapshot = true
}

resource "null_resource" "db_migration" {
  triggers = {
    schema_version = var.schema_version
    db_endpoint    = aws_db_instance.main.endpoint
  }
  
  provisioner "local-exec" {
    command = <<-EOT
      # Wait for database to be fully available
      until mysql -h ${aws_db_instance.main.address} -u ${aws_db_instance.main.username} -p${aws_db_instance.main.password} -e "SELECT 1"; do
        echo "Waiting for database connection..."
        sleep 5
      done
      
      # Run migrations up to the current schema version
      DB_HOST=${aws_db_instance.main.address} \
      DB_PORT=${aws_db_instance.main.port} \
      DB_NAME=${aws_db_instance.main.db_name} \
      DB_USER=${aws_db_instance.main.username} \
      DB_PASS=${aws_db_instance.main.password} \
      ./migrate.sh up
    EOT
  }
}

Key insights: This approach allowed the team to maintain infrastructure and database schema in the same code, ensuring migrations ran exactly when needed.

Case study: Multi-cloud certificate management

A SaaS provider needed to generate and distribute SSL certificates across multiple cloud providers:

resource "null_resource" "certificate_generator" {
  triggers = {
    domains = join(",", var.domains)
  }
  
  provisioner "local-exec" {
    command = "generate-cert.sh ${join(" ", var.domains)} > ${path.module}/certs/cert.pem"
  }
}

data "local_file" "certificate" {
  depends_on = [null_resource.certificate_generator]
  filename   = "${path.module}/certs/cert.pem"
}

# AWS certificate
resource "aws_acm_certificate" "cert" {
  certificate_body = data.local_file.certificate.content
  # ...
}

# Azure certificate
resource "azurerm_app_service_certificate" "cert" {
  certificate_content = data.local_file.certificate.content
  # ...
}

Key insights: This pattern allowed centralized certificate generation with distribution to multiple cloud providers, maintaining consistency across environments.

Case study: Legacy system integration

A manufacturing company needed to integrate Terraform-managed infrastructure with a legacy ERP system:

resource "aws_instance" "erp_connector" {
  # ... configuration ...
}

resource "null_resource" "erp_registration" {
  triggers = {
    instance_id = aws_instance.erp_connector.id
    instance_ip = aws_instance.erp_connector.private_ip
  }
  
  provisioner "local-exec" {
    command = <<-EOT
      # Custom script to register with legacy ERP system
      python register_with_erp.py \
        --system-name="AWS-Connector-${aws_instance.erp_connector.id}" \
        --system-ip=${aws_instance.erp_connector.private_ip} \
        --system-type=CONNECTOR
    EOT
  }
  
  # Deregister on destroy
  provisioner "local-exec" {
    when    = destroy
    command = "python deregister_from_erp.py --system-id=${self.triggers.instance_id}"
  }
}

Key insights: This approach allowed the team to maintain infrastructure in Terraform while bridging to systems without API-based integration capabilities.

Conclusion: embracing provisioners while moving toward declarative patterns

Provisioners without resources fill an important gap in Terraform and OpenTofu's capabilities, enabling operations that don't fit neatly into the declarative model. While the ecosystem is moving toward more declarative approaches, these tools remain valuable when used judiciously.

For new projects, prefer terraform_data over null_resource as it's built into Terraform core and represents the direction HashiCorp is taking. Always consider whether your use case could be better addressed with cloud-native initialization, specialized tools, or other declarative approaches.

When provisioners are necessary, follow best practices:

  • Use precise, minimal triggers
  • Handle errors gracefully
  • Document your approach clearly
  • Design for idempotence and repeatability
  • Consider the lifecycle implications

By understanding when and how to use these powerful tools appropriately, you can effectively bridge the gap between Terraform's declarative model and the sometimes messy reality of infrastructure management.