Terraform Identity Churn: Preventing Resource Address Instability
Identity churn is one of the most common and dangerous Terraform failure modes. It occurs when resource addresses or object identity shift unexpectedly, causing destroy/create cycles that can take down production infrastructure.
Symptoms of Identity Churn
- Plan shows broad replace actions after small list edits
- Renaming resources/modules triggers destroy/create instead of in-place updates
- Refactor from count to for_each causes churn
- Imported resources keep drifting because addressing is unstable
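The first symptom can be sketched as follows. This is an illustrative fragment, not taken from a real configuration: the variable name subnet_cidrs and its values are hypothetical, and aws_vpc.main is assumed to be defined elsewhere.

```hcl
# Hypothetical input: an ordered list used as the source of identity.
variable "subnet_cidrs" {
  type    = list(string)
  default = ["10.40.1.0/24", "10.40.2.0/24", "10.40.3.0/24"]
}

resource "aws_subnet" "app" {
  # Each instance is addressed by its position in the list.
  count      = length(var.subnet_cidrs)
  vpc_id     = aws_vpc.main.id # assumed to exist elsewhere
  cidr_block = var.subnet_cidrs[count.index]
}

# If "10.40.1.0/24" is removed from the front of the list, every index shifts:
#   aws_subnet.app[0] is now configured with 10.40.2.0/24 (was 10.40.1.0/24)
#   aws_subnet.app[1] is now configured with 10.40.3.0/24 (was 10.40.2.0/24)
#   aws_subnet.app[2] no longer exists in configuration
```

Because a subnet's CIDR block generally cannot be changed in place, a one-line list edit like this would be expected to plan replacements for the shifted instances rather than a single destroy.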
Primary Causes
- Index-based identity (count) used for long-lived logical objects
- Unstable keys derived from sorted lists or transient IDs
- Missing moved blocks during refactors
- Hidden dependencies forcing replacement chains
- for_each keys derived from values unknown at plan time
Prevention Rules
- Use for_each for long-lived identities
- Choose stable keys from business identity (e.g., zone-a, payments-api)
- Keep identity attributes separate from mutable attributes
- Add moved blocks before first apply after rename/restructure
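A minimal sketch of the key-selection rules, assuming a hypothetical set of services that each need a log group. The local name services, the ledger-api key, and the retention values are illustrative only.

```hcl
locals {
  # Keys carry the business identity and should never change.
  # Values hold mutable settings that can change without affecting addresses.
  services = {
    "payments-api" = { retention_days = 30 }
    "ledger-api"   = { retention_days = 14 } # hypothetical second service
  }
}

resource "aws_cloudwatch_log_group" "service" {
  for_each = local.services

  # Identity comes from the key; the mutable setting comes from the value.
  name              = "/app/${each.key}"
  retention_in_days = each.value.retention_days
}
```

With this layout, editing retention_days for payments-api should plan an in-place update at aws_cloudwatch_log_group.service["payments-api"], and adding or removing another service leaves that address untouched.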
Decision Matrix: count vs for_each
Use count only when:
- Resource is truly optional singleton (0 or 1)
- No downstream references depend on stable per-item addresses
Use for_each when:
- Multiple logical instances are expected
- Insertion/removal/reordering happens over time
- Downstream references need stable keys
- Keys are fully known during planning
When keys are unknown at plan time:
- Drive for_each from known input keys
- Use count for conditional/singleton creation when key-stable for_each is not possible
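One way to drive for_each from known keys is to put the apply-time value in the map value rather than the key, since only the keys of a for_each map need to be known during planning. The sketch below assumes aws_security_group.ecs is defined elsewhere; the local name egress_targets and the key ecs are hypothetical.

```hcl
locals {
  # The key "ecs" is a literal string, so it is known at plan time.
  # The value references an ID that may only be known after apply.
  egress_targets = {
    ecs = aws_security_group.ecs.id
  }
}

resource "aws_security_group_rule" "egress" {
  for_each = local.egress_targets

  type              = "egress"
  from_port         = 443
  to_port           = 443
  protocol          = "tcp"
  security_group_id = each.value # unknown values are acceptable here
  cidr_blocks       = ["0.0.0.0/0"]
}
```

The resulting address, aws_security_group_rule.egress["ecs"], stays the same even if the security group itself is later replaced and receives a new ID.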
Safe Migration: count to for_each
- Define stable key map
- Refactor resource to for_each
- Add one moved block per old index
- Verify plan reports move operations, not replace
- Apply in lower environment first
Example
locals {
  app_subnets = {
    a = { cidr = "10.40.1.0/24", az = "us-east-1a" }
    b = { cidr = "10.40.2.0/24", az = "us-east-1b" }
  }
}

resource "aws_subnet" "app" {
  for_each          = local.app_subnets
  vpc_id            = aws_vpc.main.id
  cidr_block        = each.value.cidr
  availability_zone = each.value.az

  tags = {
    Name = "app-${each.key}"
  }
}

moved {
  from = aws_subnet.app[0]
  to   = aws_subnet.app["a"]
}

moved {
  from = aws_subnet.app[1]
  to   = aws_subnet.app["b"]
}
Rename Playbook
When renaming resource or module labels, always add a moved block first:
moved {
  from = module.network_core
  to   = module.network_foundation
}
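The same pattern applies to resource labels and to moving a resource into a module. The fragment below is a sketch under assumptions: aws_instance.web and aws_instance.frontend are hypothetical names, and the second block assumes that module.network_foundation declares a resource "aws_subnet" "app" of its own.

```hcl
# Resource rename: the label changes, the underlying object is kept.
moved {
  from = aws_instance.web      # hypothetical old label
  to   = aws_instance.frontend # hypothetical new label
}

# Resource relocated from the root module into a child module.
# All instances (for example ["a"] and ["b"]) move together.
moved {
  from = aws_subnet.app
  to   = module.network_foundation.aws_subnet.app
}
```

In both cases the plan should report the objects as moved rather than scheduling a destroy and create.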
Known-at-Plan Failure Pattern
Bad (key depends on apply-time value):
resource "aws_security_group_rule" "egress" {
  for_each          = toset([aws_security_group.ecs.id])
  type              = "egress"
  from_port         = 443
  to_port           = 443
  protocol          = "tcp"
  security_group_id = each.value
  cidr_blocks       = ["0.0.0.0/0"]
}
Safer fallback for optional singleton behavior:
resource "aws_security_group_rule" "egress" {
  count             = var.enable_egress_rule ? 1 : 0
  type              = "egress"
  from_port         = 443
  to_port           = 443
  protocol          = "tcp"
  security_group_id = aws_security_group.ecs.id
  cidr_blocks       = ["0.0.0.0/0"]
}
LLM Mistake Checklist
Common model mistakes the Terraform skill corrects:
- Defaults to count for every collection
- Omits moved blocks in refactors
- Uses list index as identity key
- Suggests terraform state mv in automation where moved is safer and reviewable
- Builds for_each keys from computed IDs not known until apply
Verification Commands
terraform fmt -check
terraform validate
terraform plan -out=plan.bin
terraform show plan.bin | grep -i moved
OpenTofu equivalent:
tofu fmt -check
tofu validate
tofu plan -out=plan.bin
tofu show plan.bin | grep -i moved