Terraform Quick Ops and Troubleshooting
Fast command recall and common failure handling for Terraform and OpenTofu operations.
Core Command Sequence
Terraform
terraform fmt -check
terraform init
terraform validate
terraform plan -out=plan.bin
terraform show -json plan.bin > plan.json
OpenTofu
tofu fmt -check
tofu init
tofu validate
tofu plan -out=plan.bin
tofu show -json plan.bin > plan.json
Common Failures and Fixes
Passes Locally but Fails in CI Runner
Causes:
- Mismatch in runtime/provider versions
- Missing lockfile updates
- Environment variables present locally but missing in CI
Fix:
- Pin runtime and providers
- Commit lockfile
- Make required env vars explicit in pipeline
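One way to pin the runtime and providers is a terraform block with version constraints. This is a minimal sketch: the version numbers and the choice of the AWS provider are illustrative assumptions, not values taken from this guide.

```hcl
terraform {
  # Assumed constraint: allow only 1.9.x patch releases of the CLI.
  required_version = "~> 1.9.0"

  required_providers {
    aws = {
      source = "hashicorp/aws"
      # Assumed constraint: any 5.x release; the lockfile records the exact version.
      version = "~> 5.0"
    }
  }
}
```

Running init after changing a constraint updates .terraform.lock.hcl, which is the lockfile to commit.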
Large Unexpected Replacements in Plan
Causes:
- Unstable iteration keys
- Hidden rename without moved mapping
- Data source drift feeding identity fields
Fix:
- Stabilize keys
- Add moved blocks
- Separate identity from mutable attributes
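Stable keys and a recorded rename can be sketched as follows. The resource type, resource names, and bucket keys are hypothetical; only var.project appears elsewhere in this guide.

```hcl
variable "project" {
  type = string
}

# Keys come from a fixed set of names, so removing or reordering one entry
# does not shift the addresses of the others (as it can with count indexes).
resource "aws_s3_bucket" "data" {
  for_each = toset(["logs", "assets"])
  bucket   = "${var.project}-${each.key}"
}

# Records a rename so existing objects are re-addressed instead of being
# destroyed and recreated. Assumes the old resource "storage" used the
# same for_each keys.
moved {
  from = aws_s3_bucket.storage
  to   = aws_s3_bucket.data
}
```

After adding a moved block, the plan should report the affected objects as moved rather than as a destroy and create pair.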
AWS RDS Identifier Validation Errors
Symptoms: InvalidParameterValue for identifier or final_snapshot_identifier, names include dots.
Fix: Normalize to lowercase letters, numbers, and hyphens:
locals {
  # replace() treats a pattern wrapped in forward slashes as a regular expression
  rds_base = replace(lower("${var.project}-prod"), "/[^a-z0-9-]/", "-")
}
resource "aws_db_instance" "main" {
  identifier                = local.rds_base
  final_snapshot_identifier = "${local.rds_base}-final"
}
Apply Contention on Shared State
Cause: Concurrent pipelines targeting same backend key.
Fix:
- Serialize applies for that stack
- Use lock timeout and per-stack concurrency guard
Tests Are Too Costly
Fix:
- Tag tests by risk (fast, integration, destructive)
- Run full suite nightly, risk-tier suite on PRs
- Auto-clean ephemeral infra with TTL tags
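For TTL tagging, one option with the AWS provider is default_tags, which applies the same tags to every taggable resource the provider creates. The tag names and values below are hypothetical, and the cleanup job that reads them is assumed to exist outside Terraform.

```hcl
provider "aws" {
  region = "us-east-1" # assumed region

  default_tags {
    tags = {
      ephemeral = "true" # hypothetical marker for test infrastructure
      ttl_hours = "4"    # hypothetical lifetime read by an external cleanup job
    }
  }
}
```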
State Lock Stuck
Symptom: Error: Error acquiring the state lock
Fix:
# Identify the lock holder from the error message (lock ID shown)
terraform force-unlock LOCK_ID
# OpenTofu equivalent:
tofu force-unlock LOCK_ID
Only force-unlock when you are certain no other apply is running. Check CI pipelines and team activity first.
State Corruption or Lost State
Fix:
- Restore from versioned state backend (S3 versioning, GCS versioning)
- If no backup: re-import resources using import blocks
- Never manually edit state JSON unless absolutely no alternative and with peer review
# Pull current state for inspection
terraform state pull > state-backup.json
# List all tracked resources
terraform state list
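When re-importing, an import block (Terraform 1.5+) declares which existing object belongs to which resource address, so the import appears in plan before anything is written to state. A minimal sketch, reusing the aws_db_instance.main address from the RDS example; the id value is a placeholder.

```hcl
import {
  to = aws_db_instance.main
  # Placeholder: for aws_db_instance the import ID is the DB instance identifier.
  id = "example-prod"
}
```

The matching resource block must already exist in configuration, or it can be drafted with plan -generate-config-out=generated.tf and then reviewed by hand.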
Backend Migration
When changing state backends (e.g., local to S3, or S3 to different bucket):
# Update backend config in code, then:
terraform init -migrate-state
- Always back up state before migration
- Verify resource count matches after migration
- Confirm that plan shows no changes after migration
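The "update backend config in code" step might look like this for a move to S3. The bucket, key, and region are placeholders.

```hcl
terraform {
  backend "s3" {
    bucket = "example-tf-state"          # placeholder bucket name
    key    = "network/terraform.tfstate" # placeholder state path
    region = "us-east-1"                 # placeholder region
  }
}
```

With this block in place, init -migrate-state offers to copy the existing state into the new backend.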
Provider Authentication Failures in CI
Symptom: Error: No valid credential sources found
Fix:
- Verify environment variables are set in CI runner
- Prefer workload identity federation over static keys
- Check credential expiry for short-lived tokens
- Ensure CI runner IAM role/service account has required permissions
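As one illustration of workload identity federation with the AWS provider, the provider can exchange an OIDC token issued by the CI system for short-lived credentials. This is a sketch under assumptions: the role ARN, session name, and token file path are placeholders, and the IAM role is assumed to already trust the CI system's OIDC issuer.

```hcl
provider "aws" {
  region = "us-east-1" # assumed region

  assume_role_with_web_identity {
    # Placeholder role that the CI job is allowed to assume.
    role_arn = "arn:aws:iam::123456789012:role/example-ci-deploy"
    # Placeholder session name, visible in CloudTrail.
    session_name = "ci-terraform"
    # Placeholder path where the CI system is assumed to write its OIDC token.
    web_identity_token_file = "/var/run/secrets/ci/oidc-token"
  }
}
```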
null_resource vs terraform_data
Use terraform_data (TF 1.4+) instead of null_resource + null provider:
# Prefer this (no extra provider needed):
resource "terraform_data" "bootstrap" {
  triggers_replace = [var.config_hash]
  provisioner "local-exec" {
    command = "bootstrap.sh"
  }
}