Grow your agents, don’t just prompt them.
Run. Judge. Read the journal. Fix the hotspot. Run again. That’s the loop that turns unpredictable agents into reliable ones.
Agents Are Workflows
An agent isn’t a magic black box; it’s a workflow. Each step is either deterministic (build, lint, test, measure coverage) or an AI step (reason about an error, generate code, decide what to fix next). What people casually call “an agent” is usually just the AI step: one node in a larger pipeline.

Knowledge + structured execution > model, and the build-measure-improve loop below is how you prove it for your agent.
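To make the workflow framing concrete, here is a minimal sketch in Python. Every name in it (`Step`, `run_tests`, `propose_fix`, `run_workflow`) is hypothetical and not taken from the repo, and the AI step is a hard-coded stand-in rather than a real model call. It only illustrates that the AI step is one node among deterministic ones.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, object]


@dataclass
class Step:
    name: str
    kind: str  # "deterministic" or "ai"
    run: Callable[[State], State]


def run_tests(state: State) -> State:
    # Deterministic step: a real workflow would invoke the test runner here.
    # The failure below is made up for illustration.
    return {**state, "tests_passed": False, "error": "AssertionError in test_parse"}


def propose_fix(state: State) -> State:
    # AI step stand-in: a real workflow would call a model with the error as context.
    return {**state, "proposed_fix": f"investigate: {state['error']}"}


def run_workflow(steps: List[Step], state: State) -> State:
    trace = []
    for step in steps:
        state = step.run(state)
        trace.append(f"{step.kind}:{step.name}")
    # Record which kind of step ran, in order.
    return {**state, "trace": trace}


WORKFLOW = [
    Step("test", "deterministic", run_tests),
    Step("propose_fix", "ai", propose_fix),
]

result = run_workflow(WORKFLOW, {})
print(result["trace"])  # ['deterministic:test', 'ai:propose_fix']
```

Keeping the step kind explicit makes it easy to see how much of a run is plain tooling and how much depends on the model.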
How It Works
- Run the agent on a real task
- Judge the output — did it actually work? (correctness)
- Read the journal — why did it behave that way? (behavior)
- Fix the hotspot — better skill, better prompt, better tool
- Run again — measure if it improved
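The five steps above can be sketched as a small loop. This is an illustrative toy under stated assumptions, not the project's implementation: `run_agent`, `judge`, `find_hotspot`, and the `read_before_edit` skill are all hypothetical, the "agent" is a stub whose behavior depends only on which skills it has, and the hotspot is simply the most frequent tool call in the journal.

```python
from collections import Counter
from typing import List, Set, Tuple


def run_agent(task: str, skills: Set[str]) -> Tuple[str, List[str]]:
    """Hypothetical agent stub: returns (output, journal of tool calls)."""
    if "read_before_edit" in skills:
        return "patched", ["read_file", "edit_file", "run_tests"]
    # Without the skill, the stub thrashes between editing and testing.
    return "broken", ["edit_file", "run_tests", "edit_file", "run_tests", "edit_file"]


def judge(output: str) -> float:
    """Correctness: did the run actually work?"""
    return 1.0 if output == "patched" else 0.0


def find_hotspot(journal: List[str]) -> str:
    """Behavior: the most frequent tool call is treated as the hotspot."""
    return Counter(journal).most_common(1)[0][0]


def improve(task: str, max_rounds: int = 3) -> List[float]:
    skills: Set[str] = set()
    history: List[float] = []
    for _ in range(max_rounds):
        output, journal = run_agent(task, skills)  # Run
        score = judge(output)                      # Judge
        history.append(score)
        if score == 1.0:
            break
        hotspot = find_hotspot(journal)            # Read the journal
        if hotspot == "edit_file":                 # Fix the hotspot
            skills.add("read_before_edit")
    return history                                 # Scores per run, in order


history = improve("fix failing test")
print(history)  # [0.0, 1.0]
```

The score history is what tells you whether a fix helped: here the first run fails, the added skill removes the edit thrashing, and the second run passes.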
What You Can Build
The methodology supports four project variants, each with its own feedback loop:

| Variant | Use when… | Feedback loop |
|---|---|---|
| Eval-Agent | Building an autonomous agent with judge-based evaluation | Loss optimization — judges score each run, journal explains behavior |
| Project | Bootstrapping a library, service, or application | QA review loop with vision → design → roadmap |
| Research | Investigating a question or testing hypotheses | Vision ↔ research iteration, multi-roadmap pattern |
| Steward | Ongoing maintenance of a project or domain | Health monitoring — continuous, not convergent |
Agents Built So Far
| Agent | What it does |
|---|---|
| PR Merge | Automated pull request review, merge, and pipeline orchestration |
| Issue Classification | Categorize and triage issues (tested against SWE-bench) |
| Code Coverage | Generate tests to hit coverage targets across multiple projects |
| Liquibase → Flyway | Migration agent for database schema tooling conversion |
Blog
- I Read My Agent’s Diary — Markov chain analysis of agent tool-call traces across eval-agent experiments
- Look Ma, No RAG! — How the knowledge layer works: routing tables, two KB types, federation, and health checks
Try It
The repo includes slash commands in `.claude/commands/`; they’re available automatically when you launch Claude Code from within the repo: