Terraform Skill Design Philosophy
This page describes the architectural decisions and empirical process behind TerraShark's design.
Failure-Mode-First Architecture
TerraShark is built around a single insight: telling an LLM what good Terraform looks like is less effective than telling it how to think about Terraform problems.
The core SKILL.md is not a reference manual. It is a 7-step operational workflow that forces the model to diagnose before it generates. This prevents the most common failure pattern in LLM-assisted IaC: producing syntactically valid but operationally dangerous code.
Token Efficiency as a Design Constraint
Context window space is a finite resource. Every token spent on skill content is a token unavailable for the user's actual codebase, conversation history, and tool results.
TerraShark is designed for minimal activation cost:
| Metric | TerraShark | Typical Alternative |
|---|---|---|
| Activation cost | ~600 tokens | ~4,400 tokens |
| Reference files | 18 focused files | 6 large files |
| Loaded per query | 1-2 small files | Large reference dumps |
The core SKILL.md is 79 lines containing no HCL examples, no inline code blocks, and no tutorial material. It is purely procedural. Depth lives in 18 granular reference files loaded on demand.
LLM-Aware Guardrails
Every reference file that covers a risk domain includes an LLM mistake checklist — a list of specific errors that language models make when generating Terraform code:
- Defaulting to `count` instead of `for_each` for collections
- Omitting `moved` blocks during refactors, causing destroy/create cycles
- Using `sensitive` and assuming the value is safe from state
- Proposing plaintext credential defaults "for demo purposes"
- Recommending CLI-only `terraform import` instead of declarative import blocks
These checklists exist because the model needs to know what it gets wrong, not just what is correct. A reference that only shows the right pattern still allows the model to hallucinate the wrong one. A reference that explicitly names the hallucination pattern reduces it.
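The first two checklist items can be illustrated side by side. This is a minimal sketch, assuming an AWS provider; the resource and variable names are hypothetical:

```hcl
variable "bucket_names" {
  type    = set(string)
  default = ["logs", "artifacts"]
}

resource "aws_s3_bucket" "this" {
  for_each = var.bucket_names # addresses keyed by name, not by position
  bucket   = each.key
}

# Tells Terraform the old count-indexed address is the same object,
# so the plan shows a move instead of a destroy/create cycle.
moved {
  from = aws_s3_bucket.this[0]
  to   = aws_s3_bucket.this["logs"]
}
```

With `count`, removing the first bucket would re-index every later one; `for_each` plus `moved` keeps addresses stable across the refactor.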
The Feature Guard Table in coding-standards.md maps Terraform features to their minimum version and the specific LLM error pattern associated with each, letting the model check feature availability before emitting code.
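The guard pattern the table encodes can be sketched as pinning a version floor before using a newer feature — for example, declarative `import` blocks require Terraform 1.5, and `moved` blocks require 1.1. The bucket name below is hypothetical:

```hcl
terraform {
  # Version floor: the import block below needs Terraform >= 1.5
  required_version = ">= 1.5.0"
}

# Declarative import instead of CLI-only `terraform import`,
# so the adoption of the existing bucket is visible in the plan.
import {
  to = aws_s3_bucket.logs
  id = "example-logs-bucket" # hypothetical existing bucket
}

resource "aws_s3_bucket" "logs" {
  bucket = "example-logs-bucket"
}
```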
Output Contracts
Every TerraShark response includes a structured output contract:
- Assumptions and version floor — what the model assumed
- Selected failure modes — which risks were diagnosed
- Chosen remediation and tradeoffs — what was recommended and why
- Validation/test plan — how to verify the output
- Rollback/recovery notes — how to undo if something goes wrong
This makes outputs auditable. A reader can check assumptions, verify failure mode coverage, and validate the rollback path before applying anything.
Reference Granularity
The 18 reference files are organized by concern, not by Terraform concept:
| Category | Files | When Loaded |
|---|---|---|
| Primary failure modes | Identity churn, secret exposure, blast radius, CI drift, compliance gates | When that failure mode is diagnosed |
| Structural guidance | Structure/state, module architecture, coding standards | When designing or refactoring |
| Operational references | Migration playbooks, testing matrix, CI delivery, security/governance, quick ops | For specific operational tasks |
| Pattern banks | Good examples, bad examples, neutral examples, do/don't patterns | For review or teaching |
| Integration and meta | MCP integration, token balance rationale | When relevant |
Each file is self-contained. No file depends on another file being loaded simultaneously.
Deep Hierarchy Model
For platform engineering at scale, TerraShark defines a 5-level module hierarchy:
| Level | Role | Scope |
|---|---|---|
| L0 | Primitives | One resource family, strict contract |
| L1 | Composites | Capability units built from primitives |
| L2 | Domain stacks | Bounded business domains |
| L3 | Environment roots | Env-specific wiring and configuration |
| L4 | Org orchestration | Account/project vending and shared policy |
Dependencies flow downward only. Each level owns its state boundary and apply lifecycle.
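A downward dependency might look like an L3 environment root instantiating an L2 domain stack. This is a sketch; the module path and inputs are hypothetical:

```hcl
# envs/prod/main.tf — an L3 environment root
module "payments_domain" {
  source = "../../stacks/payments" # L2 domain stack

  environment   = "prod"
  instance_size = "large"
}

# The L3 root wires environment-specific configuration downward only;
# the L2 stack never references environment roots or sibling domains.
```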
Content Inclusion Rules
Content enters TerraShark only when at least one condition is met:
- It materially lowers the probability of destructive or non-compliant changes
- It prevents common plan/apply surprises
- It encodes organizational guardrails that general model knowledge cannot infer
Content is excluded when:
- It is generic Terraform/OpenTofu knowledge with low failure impact
- It is provider-specific deep design that belongs in project docs
- It duplicates an existing rule without adding a new decision signal
The Token Experiment
The content in TerraShark was empirically tested, not designed by intuition.
Process
- Started large — broader coverage, more examples, more tutorial material
- Built automated test suite — practical Terraform/OpenTofu task patterns
- Measured baseline quality — correctness, safety, completeness, hallucination rate
- Stripped iteratively — removed sections one at a time, re-running the full test suite
- Measured quality impact — if quality dropped, content was restored; if stable, content was permanently removed
- Converged — continued until every remaining section was load-bearing
What Survived (Models Need Help With)
- Module role boundaries and composition rules
- Migration playbooks (moved blocks, count-to-for_each, imports)
- Native test caveats (set indexing, computed values, mocked providers)
- CI delivery templates (policy checks, artifact integrity, env protection)
- Quick troubleshooting (stuck locks, backend migration, provider auth in CI)
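The native-test caveats above concern files like the following — a sketch of a `.tftest.hcl` test with a mocked provider (native tests require Terraform 1.6, `mock_provider` 1.7; the names are hypothetical):

```hcl
# tests/buckets.tftest.hcl
mock_provider "aws" {} # mocked provider: computed values become placeholders

run "creates_expected_buckets" {
  command = plan

  assert {
    # Sets cannot be indexed positionally; assert on length or membership.
    condition     = length(aws_s3_bucket.this) == 2
    error_message = "Expected exactly two buckets."
  }
}
```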
What Was Removed (Models Already Know)
- Generic HCL syntax tutorials
- Provider-specific resource deep dives
- Broad "best practice" prose without failure-mode framing
- Duplicate explanations of concepts covered by multiple rules
Core Design Principle
High signal density. Every line must earn its token cost by preventing a specific failure mode or encoding knowledge the model demonstrably lacks. Content that merely restates what the model already knows is actively harmful — it burns context window space without improving output quality.