What is SAE?
Structured Agent Execution (SAE) is an execution template that breaks agent work into defined phases with checkpoints between them. Instead of “do the whole task in one shot,” the agent follows a structured pipeline: pre-analysis, then plan, then execute.
Why Structure Matters
Without structure, agents make two common mistakes:
- Premature execution: editing code before understanding the codebase
- Thrashing: cycling between build, test, and edit without progress
The Phases
Pre-analysis (deterministic, zero LLM cost)
Parse the project before the LLM sees it:
- Import scanning (what frameworks are used?)
- Dependency analysis (what’s in the POM?)
- File structure mapping (where’s the code?)
- KB routing (which knowledge applies?)
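The import-scanning and KB-routing steps above can be sketched in a few lines. This is a minimal illustration, not the actual pre-analysis tooling: the `KB_ROUTES` table, the KB ids, and both function names are hypothetical, and it handles only Python sources via the standard `ast` module.

```python
import ast

# Hypothetical routing table: top-level module name -> knowledge-base id.
KB_ROUTES = {
    "pytest": "python-testing",
    "_pytest": "python-testing",
    "django": "django-web",
}

def scan_imports(source: str) -> set[str]:
    """Return the top-level module names imported by a Python source string."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports, which say nothing about frameworks
            modules.add(node.module.split(".")[0])
    return modules

def route_kbs(modules: set[str]) -> list[str]:
    """Map detected modules to KB ids; sorted so the result is deterministic."""
    return sorted({KB_ROUTES[m] for m in modules if m in KB_ROUTES})

sample = "import os\nfrom _pytest.fixtures import FixtureRequest\n"
print(route_kbs(scan_imports(sample)))  # ['python-testing']
```

Because nothing here calls a model, the same input always yields the same routing decision, which is what makes this phase free in LLM terms.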
Plan
The agent reads pre-analysis results and relevant knowledge, then produces an execution plan:
- What files to modify
- In what order
- What to verify at each step
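One way to make the plan checkable is to give it an explicit shape. The sketch below is an assumption about how such a plan could be represented, not the format SAE actually uses; `PlanStep`, `ExecutionPlan`, the file paths, and the Maven commands are all illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class PlanStep:
    order: int    # position in the execution sequence
    file: str     # file the step modifies
    change: str   # short description of the intended edit
    verify: str   # check to run after the step, e.g. a build command

@dataclass
class ExecutionPlan:
    steps: list[PlanStep] = field(default_factory=list)

    def ordered(self) -> list[PlanStep]:
        """Steps sorted by their declared order."""
        return sorted(self.steps, key=lambda s: s.order)

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the plan is well-formed."""
        problems = []
        orders = [s.order for s in self.steps]
        if len(orders) != len(set(orders)):
            problems.append("duplicate step order")
        for s in self.steps:
            if not s.verify.strip():
                problems.append(f"step {s.order} has no verification")
        return problems

# Hypothetical plan for a coverage task in a Maven project.
plan = ExecutionPlan([
    PlanStep(2, "src/test/FooTest.java", "add missing test", "mvn -q test"),
    PlanStep(1, "pom.xml", "add coverage plugin", "mvn -q compile"),
])
print([s.file for s in plan.ordered()])  # ['pom.xml', 'src/test/FooTest.java']
print(plan.validate())                   # []
```

Rejecting a plan whose steps lack a verification gives the checkpoint between the plan and execute phases something concrete to test.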
Execute
The agent follows its own plan:
- Tool calls scoped to the current plan step
- Build verification after each modification
- Automatic rollback on critical failures
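The execute phase can be pictured as a loop that verifies after every step and restores the last good state when verification fails. The following is a rough sketch under stated assumptions: `run_plan` and its callback parameters are hypothetical names, every build failure is treated as critical, and the "project" is an in-memory list so the example runs without a real build tool.

```python
from typing import Callable

def run_plan(
    steps: list[str],
    apply_step: Callable[[str], None],
    build_ok: Callable[[], bool],
    snapshot: Callable[[], object],
    restore: Callable[[object], None],
) -> list[tuple[str, str]]:
    """Apply steps in order, verify after each one, roll back and stop on failure."""
    log = []
    for step in steps:
        checkpoint = snapshot()      # state to return to if this step breaks the build
        apply_step(step)
        if build_ok():
            log.append((step, "ok"))
        else:
            restore(checkpoint)      # undo only the failing step
            log.append((step, "rolled_back"))
            break                    # assumption: any build failure is critical
    return log

# In-memory stand-in for a project: a list of applied changes.
state = {"files": []}

def apply_step(step: str) -> None:
    state["files"].append(step)

def build_ok() -> bool:
    return "bad" not in state["files"]

def snapshot() -> object:
    return list(state["files"])

def restore(checkpoint: object) -> None:
    state["files"] = list(checkpoint)

log = run_plan(["a", "bad", "c"], apply_step, build_ok, snapshot, restore)
print(log)             # [('a', 'ok'), ('bad', 'rolled_back')]
print(state["files"])  # ['a']
```

In a real project the callbacks would wrap a version-control checkpoint and a build command; a finer-grained policy could classify failures and roll back only the critical ones.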
SAE vs Unstructured Execution
| Dimension | Unstructured | SAE |
|---|---|---|
| First action | Agent decides (often: start editing) | Read pre-analysis, then plan |
| Knowledge access | Agent searches ad-hoc | Pre-routed by deterministic analysis |
| Build failures | Agent retries blindly | Diagnostic feedback classifies the failure |
| Cost | Variable (thrashing wastes tokens) | Predictable (phases bound token spend) |
| Observability | Opaque tool-call stream | Phase-tagged traces, plan-to-execution diff |
Evidence
Code Coverage v1 — Most Efficient Variant
SAE (variant 5) was the most efficient variant across all 9 configurations:
- 70 expected steps to completion (vs 100+ for unstructured variants)
- $2.84 per run (lowest cost)
- Cleanest Markov fingerprint — fewest thrashing loops
The Pre-Analysis Multiplier
In the issue classification experiment, deterministic pre-analysis (the +pre-analysis variant) routes each item to the correct KB subset at zero LLM cost. The agent starts with the right context instead of spending tokens searching for it.
For the Arize dataset, pre-analysis parses _pytest imports and routes to the Python testing KB. This single deterministic step eliminates an entire exploration phase that would otherwise cost $0.10 to $0.50 per item.
Implementation
SAE is implemented through the collaboration of several projects:
| Component | Project | Role |
|---|---|---|
| Pre-analysis tools | Agent Tools & Skills | Deterministic parsing |
| Execution loop | Agent Workflow | Phase management, guard rails |
| Knowledge routing | Forge | KB structure, index routing |
| Verification | Agent Judge | Post-execution jury evaluation |
| Diagnostic feedback | Agent Experiment | Gap classification, remediation |
How to Define SAE for Your Project
- Identify what’s deterministic — What can you parse, analyze, or route without an LLM?
- Define phase boundaries — What must be true before the agent moves to the next phase?
- Set guard rails — Which tools are allowed in each phase? What’s the cost limit?
- Wire verification — What judges check the output?
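Phase boundaries and guard rails from the checklist above can be written down as data and enforced before each tool call. This is only a sketch: the `Phase` class, the tool names, and the dollar limits are hypothetical placeholders, not values taken from any SAE configuration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Phase:
    name: str
    allowed_tools: frozenset[str]   # tools the agent may call in this phase
    max_cost_usd: float             # spend ceiling for this phase

# Hypothetical phase definitions; tool names and limits are placeholders.
PHASES = {
    "plan": Phase("plan", frozenset({"read_file", "search"}), 0.50),
    "execute": Phase("execute", frozenset({"read_file", "edit_file", "run_build"}), 2.00),
}

def check_tool_call(phase: Phase, tool: str, spent_usd: float) -> tuple[bool, str]:
    """Decide whether a tool call is permitted under the phase's guard rails."""
    if tool not in phase.allowed_tools:
        return False, f"tool '{tool}' not allowed in phase '{phase.name}'"
    if spent_usd >= phase.max_cost_usd:
        return False, f"cost limit reached in phase '{phase.name}'"
    return True, "ok"

print(check_tool_call(PHASES["plan"], "edit_file", 0.10))
# (False, "tool 'edit_file' not allowed in phase 'plan'")
print(check_tool_call(PHASES["execute"], "edit_file", 0.10))
# (True, 'ok')
```

Keeping edit tools out of the plan phase is a direct guard against premature execution, and the per-phase cost ceiling bounds how much thrashing can cost.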
Related
- Forge Pipeline: how SAE templates get created and packaged
- Markov Fingerprinting: how SAE shows up in behavioral traces