Terraform Quick Ops and Troubleshooting

Fast command recall and common failure handling for Terraform and OpenTofu operations.

Core Command Sequence

Terraform

terraform fmt -check
terraform init
terraform validate
terraform plan -out=plan.bin
terraform show -json plan.bin > plan.json

OpenTofu

tofu fmt -check
tofu init
tofu validate
tofu plan -out=plan.bin
tofu show -json plan.bin > plan.json

Common Failures and Fixes

CI Passes Locally but Fails in Runner

Causes:

  • Mismatch in runtime/provider versions
  • Missing lockfile updates
  • Environment variables present locally but missing in CI

Fix:

  • Pin runtime and providers
  • Commit lockfile
  • Make required env vars explicit in pipeline

Large Unexpected Replacements in Plan

Causes:

  • Unstable iteration keys
  • Hidden rename without moved mapping
  • Data source drift feeding identity fields

Fix:

  • Stabilize keys
  • Add moved blocks
  • Separate identity from mutable attributes

AWS RDS Identifier Validation Errors

Symptoms: InvalidParameterValue for identifier or final_snapshot_identifier, names include dots.

Fix: Normalize to lowercase letters, numbers, and hyphens:

locals {
  rds_base = regexreplace(lower("${var.project}-prod"), "[^a-z0-9-]", "-")
}

resource "aws_db_instance" "main" {
  identifier                = local.rds_base
  final_snapshot_identifier = "${local.rds_base}-final"
}

Apply Contention on Shared State

Cause: Concurrent pipelines targeting same backend key.

Fix:

  • Serialize applies for that stack
  • Use lock timeout and per-stack concurrency guard

Tests Are Too Costly

Fix:

  • Tag tests by risk (fast, integration, destructive)
  • Run full suite nightly, risk-tier suite on PRs
  • Auto-clean ephemeral infra with TTL tags

State Lock Stuck

Symptom: Error: Error acquiring the state lock

Fix:

# Identify the lock holder from the error message (lock ID shown)
terraform force-unlock LOCK_ID
# OpenTofu equivalent:
tofu force-unlock LOCK_ID

Only force-unlock when you are certain no other apply is running. Check CI pipelines and team activity first.

State Corruption or Lost State

Fix:

  • Restore from versioned state backend (S3 versioning, GCS versioning)
  • If no backup: re-import resources using import blocks
  • Never manually edit state JSON unless absolutely no alternative and with peer review
# Pull current state for inspection
terraform state pull > state-backup.json
# List all tracked resources
terraform state list

Backend Migration

When changing state backends (e.g., local to S3, or S3 to different bucket):

# Update backend config in code, then:
terraform init -migrate-state
  • Always backup state before migration
  • Verify resource count matches after migration
  • Test plan shows no changes after migration

Provider Authentication Failures in CI

Symptom: Error: No valid credential sources found

Fix:

  • Verify environment variables are set in CI runner
  • Prefer workload identity federation over static keys
  • Check credential expiry for short-lived tokens
  • Ensure CI runner IAM role/service account has required permissions

null_resource vs terraform_data

Use terraform_data (TF 1.4+) instead of null_resource + null provider:

# Prefer this (no extra provider needed):
resource "terraform_data" "bootstrap" {
  triggers_replace = [var.config_hash]

  provisioner "local-exec" {
    command = "bootstrap.sh"
  }
}

results matching ""

    No results matching ""