Status: Complete (Feb 2026)
## Hypothesis
Progressive knowledge injection (giving agents increasingly structured domain knowledge) improves JUnit test generation quality on Spring Boot projects more than model upgrades do.

## Setup
| Parameter | Value |
|---|---|
| Target | Spring Boot projects (gs-rest-service, spring-petclinic) |
| Variants | 9 (baseline → full forge with SAE) |
| Evaluation | Four-tier jury (T0-T3) |
| Build tool | Maven |
| Agent engine | Agent Workflow |
## The 9 Variants
| # | Variant | Knowledge Level |
|---|---|---|
| 1 | Simple prompt | None |
| 2 | + System prompt | Minimal guidance |
| 3 | + Flat knowledge base | File-based domain knowledge |
| 4 | + Skills (SkillsJars) | Structured, agent-accessible knowledge |
| 5 | + Skills + SAE | Skills + Structured Agent Execution |
| 6 | + Hardened prompt | Defensive instructions |
| 7 | + Hardened + KB | Hardened + flat knowledge |
| 8 | + Hardened + Skills | Hardened + structured knowledge |
| 9 | + Forge (full stack) | Complete knowledge-directed execution |
## Key Findings
- Two independent axes discovered — Knowledge injection and prompt hardening improve quality independently
- Model floor exists — PetClinic achieves 92-94% coverage across all variants (the model already knows PetClinic)
- SAE is most efficient — 70 expected steps, $2.84 per run
- Partial knowledge paradox — injecting some domain knowledge without structure can decrease performance relative to the baseline
- First Markov fingerprints — Tool-call traces reveal distinct behavioral signatures per variant
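A Markov fingerprint of the kind described above can be sketched by estimating first-order transition probabilities between tool calls. This is a minimal illustration, not the experiment's actual analysis code; the tool names and trace format are hypothetical.

```python
from collections import Counter, defaultdict


def markov_fingerprint(trace):
    """Estimate first-order transition probabilities from a tool-call trace.

    Returns a nested dict: fp[src][dst] = P(next tool is dst | current tool is src).
    """
    counts = defaultdict(Counter)
    for src, dst in zip(trace, trace[1:]):
        counts[src][dst] += 1
    return {
        src: {dst: n / sum(dsts.values()) for dst, n in dsts.items()}
        for src, dsts in counts.items()
    }


# Hypothetical trace of one agent run.
trace = ["READ", "EDIT", "BUILD", "TEST", "EDIT", "BUILD", "TEST"]
fp = markov_fingerprint(trace)
print(fp["EDIT"])  # → {'BUILD': 1.0}: every EDIT in this trace is followed by BUILD
```

Comparing these transition matrices across variants (e.g. by a matrix distance) is one way the per-variant behavioral signatures could be made quantitative.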
## Markov Analysis
Agent behavior varies dramatically across variants even when final outcomes are similar. The Markov fingerprint analysis revealed:
- JAR cluster patterns — how much time agents spend in dependency inspection
- Thrashing loops — BUILD→TEST→EDIT cycles that indicate the agent is stuck
- Loop amplification — Quantified via transition probability engineering (TPE)
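The thrashing loops above can be detected directly in a trace by counting repetitions of the cycle. A minimal sketch, assuming the same hypothetical tool names as before; this is not the TPE machinery itself, only the cycle counter it would feed on.

```python
def thrashing_score(trace, cycle=("BUILD", "TEST", "EDIT")):
    """Count non-overlapping occurrences of a tool-call cycle in a trace.

    Many consecutive BUILD -> TEST -> EDIT cycles suggest the agent is stuck
    repeatedly patching tests that still fail to build or pass.
    """
    n, hits, i = len(cycle), 0, 0
    while i + n <= len(trace):
        if tuple(trace[i:i + n]) == cycle:
            hits += 1
            i += n  # skip past the matched cycle
        else:
            i += 1
    return hits


# Hypothetical stuck run: four full cycles, then one last BUILD/TEST.
stuck = ["BUILD", "TEST", "EDIT"] * 4 + ["BUILD", "TEST"]
print(thrashing_score(stuck))  # → 4
```

A threshold on this score (or on the corresponding loop probability in the transition matrix) gives a simple stuck-agent signal per run.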
## Resources
- Experiment repo — full traces, Markov analysis scripts, raw data
- Blog: Agent Fingerprint — narrative walkthrough of the Markov findings