
The Problem

Agents read files. The question is: which files, in what order, with what structure? A flat directory of Markdown files forces the agent to read everything or guess. A well-structured KB lets the agent navigate to exactly what it needs in 1-2 file reads.

The Agent-Consumption Weighting

Not all documentation types are equally useful to agents. Based on Diataxis (Daniele Procida), we weight the four document types for agent consumption:
| Type | Agent Value | Why |
|------|-------------|-----|
| Reference | Highest | Structured, predictable, greppable. Consistent format means the agent can parse reliably |
| How-to | High | Action-oriented recipes map directly to agent tasks. Step-by-step instructions translate into actions |
| Explanation | Medium | Provides context for judgment calls, but costs tokens proportional to discursiveness. Best accessed on-demand |
| Tutorial | Low | Agents don’t build confidence, learn by repetition, or benefit from “we” language. Almost entirely wasted tokens |
This inverts the typical human documentation priority. Humans want tutorials first; agents want reference first.

Directory Layout

A KB serving both human and agent consumers:
knowledge-store/
├── index.md                    # Entry point: routing table
├── reference/                  # Agent-primary
│   ├── api-changes.md
│   ├── configuration.md
│   └── error-codes.md
├── howto/                      # Agent-primary
│   ├── migrate-security.md
│   ├── handle-deprecation.md
│   └── configure-logging.md
├── explanation/                # Agent-secondary (on-demand)
│   ├── why-api-changed.md
│   └── design-rationale.md
└── tutorials/                  # Human-only (agent ignores)
    └── getting-started.md
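
As a rough illustration of how this layout encodes the weighting, the sketch below orders KB-relative paths by their top-level directory. The numeric weights and the function names are assumptions made for the example; the source only ranks the types as Highest, High, Medium, and Low.

```python
from pathlib import PurePosixPath

# Assumed numeric weights mirroring the Highest/High/Medium/Low ranking.
# Tutorials get 0 because the layout marks them as human-only.
AGENT_WEIGHT = {"reference": 3, "howto": 2, "explanation": 1, "tutorials": 0}


def agent_weight(path: str) -> int:
    """Return the weight of the top-level directory a KB-relative path sits in."""
    parts = PurePosixPath(path).parts
    # Files directly under the root (such as index.md) are routed separately.
    if len(parts) < 2:
        return 0
    return AGENT_WEIGHT.get(parts[0], 0)


def reading_order(paths: list[str]) -> list[str]:
    """Drop weight-0 paths and sort the rest by descending weight."""
    candidates = [p for p in paths if agent_weight(p) > 0]
    # sorted() is stable, so paths with equal weight keep their input order.
    return sorted(candidates, key=agent_weight, reverse=True)
```

This is only a tiebreaker for when nothing else is known about a file; the routing conditions in the index remain the primary signal.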

The Index Pattern

The index.md at every directory level is the agent’s entry point. It contains a routing table of pointers, not content:
# Spring Migration Knowledge

| Topic | File | Read when... |
|-------|------|-------------|
| Import changes | reference/javax-to-jakarta.md | Task involves import migration |
| Security config | howto/migrate-security.md | Project uses Spring Security |
| JPA changes | reference/jpa-changes.md | Task involves data access |
| Why APIs changed | explanation/api-rationale.md | Agent needs design context |
The “Read when...” column is critical. It tells the agent under what conditions to read the file. This is more useful than a document type label because it encodes priority and relevance.
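
A minimal sketch of how a consumer might turn such a routing table into structured rows is shown below. The function name `parse_routing_table` is hypothetical, and the parser assumes a single pipe-delimited table with a header row, as in the example index.

```python
def parse_routing_table(markdown: str) -> list[dict[str, str]]:
    """Parse the first pipe table in an index file into one dict per row."""
    rows: list[dict[str, str]] = []
    header: list[str] | None = None
    for raw in markdown.splitlines():
        line = raw.strip()
        if not line.startswith("|"):
            continue  # headings and prose are not part of the table
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first table line names the columns
            continue
        if all(set(cell) <= set("-: ") for cell in cells):
            continue  # separator row such as |------|------|
        rows.append(dict(zip(header, cells)))
    return rows
```

Each row then exposes the file path and its condition by column name, for example `row["File"]` and `row["Read when..."]`.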

Routing precedence

  • “Always read first”: mandatory context
  • “Task involves X”: conditional on the current task
  • “Only when stuck”: fallback for debugging
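
One way to apply this precedence is to sort routing rows by the phrase that opens their condition. The sketch below assumes the three phrases are listed from highest to lowest precedence and that rows are dicts with a "Read when..." key; both the ranking numbers and the function names are illustrative, not part of the source.

```python
def precedence(condition: str) -> int:
    """Map a routing condition to a sort rank (lower is read earlier)."""
    text = condition.strip().lower()
    if text.startswith("always read first"):
        return 0  # mandatory context
    if text.startswith("only when stuck"):
        return 2  # fallback for debugging
    return 1  # "Task involves X" and any other conditional entry


def order_by_precedence(rows: list[dict[str, str]]) -> list[dict[str, str]]:
    """Return rows sorted by precedence, keeping table order within a rank."""
    return sorted(rows, key=lambda row: precedence(row["Read when..."]))
```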

Progressive Disclosure

The agent reads in layers:
  1. Read the root index. ~50 lines. The agent sees what domains exist and which are relevant to its task.
  2. Read the domain index. ~30 lines. The agent sees specific topics and their routing conditions.
  3. Read the relevant file. Full content, but only for the 1-3 files that match the task, not the whole KB.
A well-structured KB turns a 50-file knowledge base into 2-3 file reads. The agent spends tokens on knowledge, not navigation.
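
The layered read can be sketched as a small traversal over routing tables. The example below is an illustration under stated assumptions: the KB is an in-memory dict of path to Markdown text, every routing table has three columns with the file in the second, and relevance is a naive keyword match against the condition column. A real agent would judge relevance itself rather than by substring.

```python
import posixpath


def table_rows(markdown: str):
    """Yield (file, condition) pairs from a three-column routing table."""
    for line in markdown.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Header and separator rows are skipped because their second
        # cell is not a .md path.
        if len(cells) == 3 and cells[1].endswith(".md"):
            yield cells[1], cells[2]


def navigate(kb: dict[str, str], keyword: str, index: str = "index.md") -> list[str]:
    """Return the files read, in order, while following matching routes."""
    reads = [index]
    base = posixpath.dirname(index)
    for file, condition in table_rows(kb[index]):
        if keyword.lower() not in condition.lower():
            continue  # route does not match the task
        target = posixpath.join(base, file)
        if posixpath.basename(target) == "index.md":
            reads += navigate(kb, keyword, target)  # descend one layer
        else:
            reads.append(target)  # content file: the final read
    return reads
```

With a root index that routes to per-domain indexes, a single keyword yields a read list of the root index, one domain index, and the matching content files, which is the 2-3 read pattern described above.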

Two KB Types

The lab uses two distinct KB architectures:

Code-Agent KB (task-driven)

For agents that execute coding tasks. Optimized for lookup and action.
  • Root index.md ≤100 lines
  • VOCABULARY.md: controlled vocabulary for consistent terminology
  • Domain directories with per-domain index.md
  • Cheatsheets and structured reference files
  • Update cadence: when frameworks or tools change
  • Agent roles: Curator (read-write maintenance) + Navigator (read-only consumption)
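
Several of these properties are mechanical enough to check automatically. The following is a hypothetical lint, not the lab's tooling: it assumes the KB lives on disk, that every subdirectory of the root is a domain directory, and that the 100-line limit applies to the root index only.

```python
from pathlib import Path


def check_code_agent_kb(root: Path, max_index_lines: int = 100) -> list[str]:
    """Return a list of structural problems found in a code-agent KB."""
    problems: list[str] = []

    index = root / "index.md"
    if not index.is_file():
        problems.append("missing root index.md")
    else:
        line_count = len(index.read_text(encoding="utf-8").splitlines())
        if line_count > max_index_lines:
            problems.append(f"root index.md has {line_count} lines")

    if not (root / "VOCABULARY.md").is_file():
        problems.append("missing VOCABULARY.md")

    # Assumption: each immediate subdirectory is a domain and needs an index.
    for domain in sorted(p for p in root.iterdir() if p.is_dir()):
        if not (domain / "index.md").is_file():
            problems.append(f"{domain.name}/ has no index.md")

    return problems
```

A check like this could plausibly run as part of the Curator role's maintenance pass, since that role has write access and can fix what it finds.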

Research-Partner KB (question-driven)

For research synthesis and strategic context. Optimized for understanding and connections.
  • CLAUDE.md as session bridge (routing + context)
  • synthesis/ hierarchy with theme index and per-theme docs
  • Immutable source conversations
  • Update cadence: after each research conversation
  • Agent role: session bridge (one agent, dual modes: synthesis intake + Q&A)
Don’t mix them. The same domain can appear in both KB types with different purposes. A code-agent KB about Spring Security has migration recipes. A research-partner KB about Spring Security has strategic analysis of the migration’s impact on the product roadmap.

Design Rules

  1. Index files contain pointers, not content. If you’re putting explanation in the index, it belongs in a separate file.
  2. Reference format should be greppable. Consistent headings, predictable structure, machine-parseable tables. The agent’s first retrieval is typically Grep for a keyword, then Read of the matching file.
  3. One topic per file. A file that covers both “how to migrate security” and “why the security API changed” should be split. The agent might need one without the other.
  4. Negative knowledge is explicit. If something is out of scope, say so in the index. “This KB does NOT cover: deployment, monitoring, performance tuning.” This prevents the agent from searching fruitlessly.
  5. KnowledgeRefs are relative paths. In experiment datasets, knowledgeRefs point to files relative to knowledgeBaseDir. Each item typically carries 1-5 directory refs (usually 2-3). The agent reads the pointed-to index, then drills down.
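
Rule 5 can be illustrated with a small resolver. This is a sketch under assumptions: `knowledgeRefs` and `knowledgeBaseDir` are the names used above, but the function itself and its rejection of absolute or escaping paths are choices made for the example, not documented behavior.

```python
from pathlib import Path


def resolve_refs(knowledge_base_dir: str, knowledge_refs: list[str]) -> list[Path]:
    """Resolve relative knowledgeRefs against the knowledge base directory."""
    base = Path(knowledge_base_dir).resolve()
    resolved: list[Path] = []
    for ref in knowledge_refs:
        if Path(ref).is_absolute():
            raise ValueError(f"knowledgeRef must be relative: {ref}")
        target = (base / ref).resolve()
        # Reject refs such as "../secrets" that climb out of the KB.
        # Path.is_relative_to requires Python 3.9 or newer.
        if not target.is_relative_to(base):
            raise ValueError(f"knowledgeRef escapes the KB: {ref}")
        resolved.append(target)
    return resolved
```

For a directory ref, the agent would then read `index.md` inside the resolved directory before drilling down.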

Evidence

Code Coverage v1

Variant 3 (flat knowledge base) vs Variant 4 (structured skills): identical content, different packaging. Variant 4 outperformed Variant 3 in efficiency metrics. The agent using structured skills showed 0% JAR_INSPECT: it stopped needing to inspect dependencies because the knowledge was delivered proactively.

SkillsBench

SkillsBench confirmed that structure matters: AgentSkillOS found that hierarchically structured skills outperform flat files even with identical content.

Partial Knowledge Paradox

Some knowledge without structure decreases performance (Code Coverage v1, finding #4). An unstructured KB is worse than no KB: the agent wastes tokens navigating and gets confused by contradictory or irrelevant information.

Further Reading

For a narrative walkthrough of how these patterns were discovered and applied across 1,772 files and six federated KBs, see Look Ma, No RAG! on the blog.

Knowledge Base Freshness

How knowledge stays true after it’s written: drift, rituals, and the trust principle

Forge Pipeline

How knowledge gets packaged into agent-ready artifacts

Extending Loopy

Skills, SkillsJars, and progressive disclosure in practice