The Problem
Agents read files. The question is: which files, in what order, with what structure? A flat directory of Markdown files forces the agent to read everything or guess. A well-structured KB lets the agent navigate to exactly what it needs in 1-2 file reads.

The Agent-Consumption Weighting
Not all documentation types are equally useful to agents. Based on Diataxis (Daniele Procida), we weight the four document types for agent consumption:

| Type | Agent Value | Why |
|---|---|---|
| Reference | Highest | Structured, predictable, greppable. Consistent format means the agent can parse reliably |
| How-to | High | Action-oriented recipes map directly to agent tasks. Step-by-step instructions translate into actions |
| Explanation | Medium | Provides context for judgment calls. But costs tokens proportional to discursiveness. Best accessed on-demand |
| Tutorial | Low | Agents don't build confidence, learn by repetition, or benefit from "we" language. Almost entirely wasted tokens |
Directory Layout
A KB serving both human and agent consumers:

The Index Pattern
The `index.md` at every directory level is the agent's entry point. It contains a routing table: not content, but pointers.
Routing precedence
- "Always read first": mandatory context
- "Task involves X": conditional on the current task
- "Only when stuck": fallback for debugging
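
One way to picture this precedence is as a filter followed by a stable sort over routing entries. The sketch below is illustrative only: the `Route` record, the tier names, the numeric ordering, and the file paths are all hypothetical and are not taken from the lab's actual index format.

```python
from dataclasses import dataclass

# Precedence tiers, lowest number read first. The tier names mirror the
# three routing phrases above; the numeric encoding is an assumption.
PRECEDENCE = {"always": 0, "task": 1, "stuck": 2}


@dataclass(frozen=True)
class Route:
    tier: str        # "always", "task", or "stuck"
    target: str      # relative path the index points to (hypothetical)
    topic: str = ""  # only meaningful for "task" routes


def plan_reads(routes, task_topics, stuck=False):
    """Return targets in the order an agent would read them."""
    selected = []
    for route in routes:
        if route.tier == "always":
            selected.append(route)
        elif route.tier == "task" and route.topic in task_topics:
            selected.append(route)
        elif route.tier == "stuck" and stuck:
            selected.append(route)
    # list.sort is stable, so routes within a tier keep their index order.
    selected.sort(key=lambda r: PRECEDENCE[r.tier])
    return [r.target for r in selected]


routes = [
    Route("stuck", "troubleshooting/index.md"),
    Route("task", "security/index.md", topic="security"),
    Route("always", "VOCABULARY.md"),
    Route("task", "testing/index.md", topic="testing"),
]

print(plan_reads(routes, {"security"}))
# ['VOCABULARY.md', 'security/index.md']
print(plan_reads(routes, {"security"}, stuck=True))
# ['VOCABULARY.md', 'security/index.md', 'troubleshooting/index.md']
```

The fallback tier is only included when the caller signals that the agent is stuck, so debugging material costs no tokens on the normal path.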
Progressive Disclosure
The agent reads in layers:

Read the root index
~50 lines. The agent sees what domains exist and which are relevant to its task.
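
Layered reading can be sketched as following one pointer per layer, starting from the root index. This is a toy model under stated assumptions: the in-memory `KB` dict, the file names, and the `keyword -> path` pointer syntax are hypothetical stand-ins, not the format the lab's indexes use.

```python
# A toy in-memory KB: path -> file content. All paths and the
# "keyword -> path" pointer syntax are hypothetical.
KB = {
    "index.md": "security -> security/index.md\ntesting -> testing/index.md\n",
    "security/index.md": "migration -> security/migration-recipes.md\n",
    "security/migration-recipes.md": "Step 1: ...\n",
    "testing/index.md": "coverage -> testing/coverage.md\n",
    "testing/coverage.md": "Run the coverage task ...\n",
}


def parse_pointers(index_text):
    """Parse 'keyword -> path' lines into a dict."""
    pointers = {}
    for line in index_text.splitlines():
        if "->" in line:
            keyword, path = (part.strip() for part in line.split("->", 1))
            pointers[keyword] = path
    return pointers


def navigate(kb, keywords, root="index.md"):
    """Follow one pointer per layer; return (content, files read)."""
    reads = [root]
    current = root
    for keyword in keywords:
        pointers = parse_pointers(kb[current])
        if keyword not in pointers:
            break  # no route for this keyword: stop at the current layer
        current = pointers[keyword]
        reads.append(current)
    return kb[current], reads


content, reads = navigate(KB, ["security", "migration"])
print(reads)
# ['index.md', 'security/index.md', 'security/migration-recipes.md']
```

Files outside the chosen route (here, everything under `testing/`) are never opened, which is the point of disclosing the KB one layer at a time.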
Two KB Types
The lab uses two distinct KB architectures:

Code-Agent KB (task-driven)
For agents that execute coding tasks. Optimized for lookup and action.
- Root `index.md` ≤100 lines
- `VOCABULARY.md`: controlled vocabulary for consistent terminology
- Domain directories with per-domain `index.md`
- Cheatsheets and structured reference files
- Update cadence: when frameworks or tools change
- Agent roles: Curator (read-write maintenance) + Navigator (read-only consumption)
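
The structural rules for a code-agent KB lend themselves to an automated check. The following is a minimal sketch, not the lab's tooling: the function name is hypothetical, and it assumes every non-hidden subdirectory of the KB root counts as a domain directory.

```python
from pathlib import Path
import tempfile

MAX_ROOT_INDEX_LINES = 100  # the line budget for the root index.md


def check_code_agent_kb(root):
    """Return a list of layout problems; an empty list means the checks pass."""
    root = Path(root)
    problems = []

    root_index = root / "index.md"
    if not root_index.is_file():
        problems.append("missing root index.md")
    else:
        line_count = len(root_index.read_text(encoding="utf-8").splitlines())
        if line_count > MAX_ROOT_INDEX_LINES:
            problems.append(f"root index.md has {line_count} lines")

    if not (root / "VOCABULARY.md").is_file():
        problems.append("missing VOCABULARY.md")

    # Assumption: every non-hidden subdirectory is a domain directory.
    for domain in sorted(p for p in root.iterdir() if p.is_dir()):
        if domain.name.startswith("."):
            continue
        if not (domain / "index.md").is_file():
            problems.append(f"{domain.name}/ has no index.md")

    return problems


# Build a small throwaway KB that breaks two of the rules.
with tempfile.TemporaryDirectory() as tmp:
    kb = Path(tmp)
    (kb / "index.md").write_text("security -> security/index.md\n", encoding="utf-8")
    (kb / "security").mkdir()
    (kb / "testing").mkdir()
    (kb / "testing" / "index.md").write_text("coverage\n", encoding="utf-8")
    demo_problems = check_code_agent_kb(kb)

print(demo_problems)
# ['missing VOCABULARY.md', 'security/ has no index.md']
```

A check like this would fit naturally into the Curator role's maintenance pass, since the Navigator only reads.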
Research-Partner KB (question-driven)
For research synthesis and strategic context. Optimized for understanding and connections.
- `CLAUDE.md` as session bridge (routing + context)
- `synthesis/` hierarchy with theme index and per-theme docs
- Immutable source conversations
- Update cadence: after each research conversation
- Agent role: session bridge (one agent, dual modes: synthesis intake + Q&A)

Don't mix them. The same domain can appear in both KB types with different purposes. A code-agent KB about Spring Security has migration recipes. A research-partner KB about Spring Security has strategic analysis of the migration's impact on the product roadmap.
Design Rules
- Index files contain pointers, not content. If you're putting explanation in the index, it belongs in a separate file.
- Reference format should be greppable. Consistent headings, predictable structure, machine-parseable tables. The agent's first retrieval is typically `Grep` for a keyword, then `Read` of the matching file.
- One topic per file. A file that covers both "how to migrate security" and "why the security API changed" should be split. The agent might need one without the other.
- Negative knowledge is explicit. If something is out of scope, say so in the index. "This KB does NOT cover: deployment, monitoring, performance tuning." This prevents the agent from searching fruitlessly.
- KnowledgeRefs are relative paths. In experiment datasets, `knowledgeRefs` point to files relative to `knowledgeBaseDir`. Typically 1-5 directory refs per item (usually 2-3). The agent reads the pointed-to index, then drills down.
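
Resolving `knowledgeRefs` against `knowledgeBaseDir` can be sketched as below. The dataset item shape, the function name, and the rule that a ref without a file suffix names a directory whose `index.md` is the entry point are assumptions made for illustration. The guard against refs that escape the KB root is likewise an added precaution, not something the source specifies.

```python
from pathlib import Path


def resolve_knowledge_refs(knowledge_base_dir, knowledge_refs):
    """Map relative refs to the files an agent would open first."""
    base = Path(knowledge_base_dir).resolve()
    resolved = []
    for ref in knowledge_refs:
        target = (base / ref).resolve()
        # Reject absolute refs or "../" escapes from the KB root.
        if target != base and base not in target.parents:
            raise ValueError(f"ref escapes knowledgeBaseDir: {ref}")
        # Assumption: no file suffix means a directory ref, so the
        # directory's index.md is the entry point.
        entry = target / "index.md" if target.suffix == "" else target
        resolved.append(entry.relative_to(base).as_posix())
    return resolved


# Hypothetical dataset item with one directory ref and one file ref.
item = {"knowledgeRefs": ["security", "testing/coverage.md"]}
print(resolve_knowledge_refs("kb", item["knowledgeRefs"]))
# ['security/index.md', 'testing/coverage.md']
```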
Evidence
Code Coverage v1
Variant 3 (flat knowledge base) vs Variant 4 (structured skills): identical content, different packaging. Variant 4 outperformed Variant 3 in efficiency metrics. The agent using structured skills showed 0% JAR_INSPECT. It stopped needing to inspect dependencies because the knowledge was delivered proactively.

SkillsBench
SkillsBench confirmed that structure matters: AgentSkillOS found that hierarchically structured skills outperform flat files even with identical content.

Partial Knowledge Paradox
Some knowledge without structure decreases performance (Code Coverage v1, finding #4). An unstructured KB is worse than no KB: the agent wastes tokens navigating and gets confused by contradictory or irrelevant information.

Further Reading
For a narrative walkthrough of how these patterns were discovered and applied across 1,772 files and six federated KBs, see "Look Ma, No RAG!" on the blog.

Related
Knowledge Base Freshness
How knowledge stays true after it's written: drift, rituals, and the trust principle
Forge Pipeline
How knowledge gets packaged into agent-ready artifacts
Extending Loopy
Skills, SkillsJars, and progressive disclosure in practice