The 7-Step Terraform Skill Workflow

The core of TerraShark is a 7-step operational workflow defined in SKILL.md. Unlike traditional reference manuals that dump information and hope the AI uses it correctly, this workflow forces the AI to diagnose before generating — the single most important pattern for preventing Terraform hallucinations.

Overview

Capture Context → Diagnose Failure Modes → Load References → Propose Fix → Generate Artifacts → Validate → Output Contract

Step 1: Capture Execution Context

Before writing any code, the Terraform skill records:

  • Runtime: terraform or tofu and exact version
  • Providers: which cloud providers and their versions
  • Target platform: AWS, Azure, GCP, etc.
  • State backend: S3, GCS, Azure Blob, HCP Terraform, etc.
  • Execution path: local CLI, CI, HCP Terraform/TFE, Atlantis
  • Environment criticality: dev, shared, or production

If any of these are unknown, the skill states assumptions explicitly in the output contract.

Why this matters: A module targeting Terraform 1.1 cannot use import blocks (requires 1.5+). A CI pipeline needs different patterns than local CLI. Production requires approval gates that dev does not.

Step 2: Diagnose Likely Failure Modes

The skill selects one or more failure modes based on the user's intent and risk level:

Failure Mode When Diagnosed
Identity churn Refactors, collection changes, count/for_each decisions
Secret exposure Credential handling, state access, CI artifact management
Blast radius Stack design, environment isolation, state boundaries
CI drift Pipeline setup, version management, plan/apply separation
Compliance gate gaps Policy setup, framework requirements, approval workflows

This step is what makes TerraShark fundamentally different from static reference skills. The AI must identify what could go wrong before it starts generating code.

Step 3: Load Only Relevant References

Based on the diagnosed failure modes, the skill loads targeted reference files:

Primary references (one per failure mode):

  • references/identity-churn.md
  • references/secret-exposure.md
  • references/blast-radius.md
  • references/ci-drift.md
  • references/compliance-gates.md

Supplemental references (loaded only when needed):

  • Testing, CI delivery, module architecture, coding standards, migration playbooks, security governance, quick ops, examples, and more

This granularity means a query about secret handling never loads CI delivery patterns, and a query about module architecture never loads compliance gates. Only 1-2 small, focused files are loaded per query instead of one massive dump.

Step 4: Propose Fix Path with Explicit Risk Controls

For each proposed fix, the skill includes:

  • Why this addresses the failure mode — direct mapping from diagnosis to solution
  • What could still go wrong — honest risk assessment
  • Guardrails — tests, approvals, and rollback steps to mitigate remaining risk

This forces transparency. The AI cannot silently generate code that might cause damage without disclosing the risks.

Step 5: Generate Implementation Artifacts

When applicable, the output includes:

  • HCL changes: typed variables, stable keys, bounded version constraints
  • Migration blocks: moved blocks, import strategy
  • CI pipeline updates: plan/apply separation, artifact management, policy checks
  • Compliance controls: approval gates, policy rules, evidence paths

Step 6: Validate Before Finalize

The skill provides a command sequence tailored to the runtime and risk tier:

terraform fmt -check
terraform validate
terraform plan -out=plan.bin
terraform show -json plan.bin > plan.json

The skill never recommends direct production apply without a reviewed plan and approval gate.

Step 7: Output Contract

Every response includes a structured contract:

Section Content
Assumptions and version floor What was assumed about the environment
Selected failure modes Which risks were diagnosed
Chosen remediation and tradeoffs What was recommended and what was traded off
Validation/test plan How to verify the output
Rollback/recovery notes How to undo if something goes wrong

This makes every output auditable. A reviewer can check assumptions, verify failure mode coverage, and validate the rollback path — all before applying anything.

Why This Architecture Works

The 7-step workflow prevents the most common failure pattern in LLM-assisted infrastructure-as-code: producing syntactically valid but operationally dangerous code.

A static reference manual tells the AI what good Terraform looks like. The 7-step workflow tells the AI how to think about Terraform problems. This is the difference between giving someone a cookbook and giving them a diagnostic checklist.

results matching ""

    No results matching ""