Grow your agents, don’t just prompt them.
Run. Judge. Read the journal. Fix the hotspot. Run again. That’s the loop that turns unpredictable agents into reliable ones.
Agento Studio is a systematic approach to growing AI agents. A judge tells you if the agent got it right. A journal tells you why it behaved the way it did. You need both: without the judge you’re guessing, without the journal you’re tuning blind.
https://github.com/markpollack/agento-studio

Agents Are Workflows

An agent isn’t a magic black box — it’s a workflow. Each step is either deterministic (build, lint, test, measure coverage) or an AI step (reason about an error, generate code, decide what to fix next). What people casually call “an agent” is usually just the AI step — one node in a larger pipeline.
fetch PR  →  rebase  →  detect conflicts  →  run tests  →  fix & retest  →  cleanup  →  build gate
 [determ.]   [determ.]     [deterministic]    [determ.]       [AI step]      [determ.]    [judge]
                                                                                            │
                                                              ┌────────────────────────┐    │ pass
                                                              │ version-pattern judge  │◄───┘
                                                              │    [deterministic]     │
                                                              └──────────┬─────────────┘
                                                                         │
                                                              ┌──────────┴─────────────┐
                                                              │    parallel AI steps   │
                                                              │  assess-code-quality   │
                                                              │  assess-backport       │
                                                              └──────────┬─────────────┘
                                                                         │
                                                              ┌──────────┴─────────────┐
                                                              │  quality judge → report│
                                                              └────────────────────────┘
This is the actual AgentWorks PR Review pipeline. As drawn, the main line has seven nodes: five deterministic steps, one AI step (fix & retest), and a build-gate judge. Parallel AI assessment runs only if the build passes. Most of the workflow never touches an LLM.

This is the same pattern Stripe arrived at independently with their Minions system (1,300+ PRs per week at $1T scale). They call the pattern “blueprints”: deterministic nodes interleaved with agent nodes. As Stripe’s Alistair Gray put it: “Blueprints combine the determinism of workflows with agents’ flexibility in dealing with the unknown.”

The insight behind Agento Studio: the workflow structure matters more than the model powering the AI steps. Better prompts, targeted knowledge, and deterministic checkpoints consistently outperform model upgrades. That’s the thesis (knowledge + structured execution > model), and the build-measure-improve loop below is how you prove it for your agent.
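The pipeline shape described above can be sketched in a few lines of Python. This is a minimal illustration, not the AgentWorks implementation: the `Step` type, `run_pipeline`, and the stand-ins `good_fix` and `bad_fix` are hypothetical names, every step body is a stub, and the two parallel assessments run sequentially here for brevity. The point it tries to show is that a failing judge stops the workflow before any later AI step runs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Step:
    name: str
    kind: str                    # "determ", "ai", or "judge"
    run: Callable[[dict], bool]  # returns True on success (or pass, for a judge)


def ok(state: dict) -> bool:
    """Stub for a scripted step; a real one would call git, a build tool, etc."""
    return True


def build_gate(state: dict) -> bool:
    """Judge: passes only if the tests ended up green."""
    return state.get("tests_green", False)


def run_pipeline(ai_fix: Callable[[dict], bool]) -> list[str]:
    """Run the steps in order and return a trace of what executed."""
    steps = [
        Step("fetch PR", "determ", ok),
        Step("rebase", "determ", ok),
        Step("detect conflicts", "determ", ok),
        Step("run tests", "determ", ok),
        Step("fix & retest", "ai", ai_fix),       # the only AI step on the main line
        Step("cleanup", "determ", ok),
        Step("build gate", "judge", build_gate),
        Step("version-pattern judge", "judge", ok),
        Step("assess-code-quality", "ai", ok),    # parallel in the diagram
        Step("assess-backport", "ai", ok),        # sequential in this sketch
        Step("quality judge", "judge", ok),
    ]
    state: dict = {}
    trace: list[str] = []
    for step in steps:
        passed = step.run(state)
        trace.append(f"{step.kind}:{step.name}")
        if step.kind == "judge" and not passed:
            trace.append("stopped")  # a failed gate ends the run
            break
    return trace


def good_fix(state: dict) -> bool:
    """Hypothetical AI step that gets the tests green."""
    state["tests_green"] = True
    return True


def bad_fix(state: dict) -> bool:
    """Hypothetical AI step that fails to fix the tests."""
    return False
```

With `good_fix`, the trace runs through to `judge:quality judge`. With `bad_fix`, it ends with `stopped` right after `judge:build gate`, and neither assessment step appears.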

How It Works

  1. Run the agent on a real task
  2. Judge the output — did it actually work? (correctness)
  3. Read the journal — why did it behave that way? (behavior)
  4. Fix the hotspot — better skill, better prompt, better tool
  5. Run again — measure if it improved
Each lever you can turn — skills, knowledge bases, pre-analysis, steering hooks — gets validated through this loop, not assumed to help.
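The five steps above can be sketched as a small loop. This is an illustration under stated assumptions, not Agento Studio's API: `RunResult`, `judge`, `hotspot`, `improve_loop`, and `toy_agent` are hypothetical names, the judge is a plain equality check, and "hotspot" is approximated as the most frequent tool call in the journal.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable


@dataclass
class RunResult:
    output: str
    journal: list[str]  # ordered tool calls the agent made (assumed format)


def judge(result: RunResult, expected: str) -> bool:
    """Correctness: did the run produce the expected output?"""
    return result.output == expected


def hotspot(journal: list[str]) -> str:
    """Behavior: the most frequent tool call, a crude proxy for wasted effort."""
    if not journal:
        return ""
    return Counter(journal).most_common(1)[0][0]


def improve_loop(
    agent: Callable[[set], RunResult],
    expected: str,
    fixes: dict[str, str],
    max_rounds: int = 5,
) -> list[tuple[bool, str]]:
    """Run, judge, read the journal, fix the hotspot, run again."""
    config: set = set()
    history: list[tuple[bool, str]] = []
    for _ in range(max_rounds):
        result = agent(config)            # 1. run
        passed = judge(result, expected)  # 2. judge
        spot = hotspot(result.journal)    # 3. read the journal
        history.append((passed, spot))
        if passed:
            break
        fix = fixes.get(spot)             # 4. fix the hotspot
        if fix is None:
            break                         # no known lever for this hotspot
        config.add(fix)                   # 5. next iteration runs again
    return history


def toy_agent(config: set) -> RunResult:
    """Hypothetical agent that flails with grep until it is given a search skill."""
    if "search-skill" in config:
        return RunResult("merged", ["read", "edit", "test"])
    return RunResult("gave up", ["grep", "grep", "grep", "read"])
```

Calling `improve_loop(toy_agent, "merged", {"grep": "search-skill"})` takes two rounds: the first fails with `grep` as the hotspot, the second passes once the skill is added. The history keeps the judge verdict for every round, so an added lever is measured instead of assumed to help.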

What You Can Build

The methodology supports four project variants — each with its own feedback loop:
Variant | Use when… | Feedback loop
--- | --- | ---
Eval-Agent | Building an autonomous agent with judge-based evaluation | Loss optimization: judges score each run, journal explains behavior
Project | Bootstrapping a library, service, or application | QA review loop with vision → design → roadmap
Research | Investigating a question or testing hypotheses | Vision ↔ research iteration, multi-roadmap pattern
Steward | Ongoing maintenance of a project or domain | Health monitoring: continuous, not convergent

Agents Built So Far

Agent | What it does
--- | ---
PR Merge | Automated pull request review, merge, and pipeline orchestration
Issue Classification | Categorize and triage issues (tested against SWE-bench)
Code Coverage | Generate tests to hit coverage targets across multiple projects
Liquibase → Flyway | Migration agent for database schema tooling conversion
Each one goes through the same cycle: run on real tasks, judge the output, read the journal, fix the hotspots, run again.

Blog

  • I Read My Agent’s Diary — Markov chain analysis of agent tool-call traces across eval-agent experiments
  • Look Ma, No RAG! — How the knowledge layer works: routing tables, two KB types, federation, and health checks
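As a rough illustration of what a Markov chain over tool-call traces means, the sketch below estimates first-order transition probabilities from a few made-up traces. It does not reproduce the blog post's analysis: `transition_probs`, the trace format, and the tool names are assumptions for illustration only.

```python
from collections import Counter, defaultdict


def transition_probs(traces: list[list[str]]) -> dict[str, dict[str, float]]:
    """Estimate P(next tool | current tool) from ordered tool-call traces."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for trace in traces:
        # Count each adjacent pair (current call, next call).
        for current, nxt in zip(trace, trace[1:]):
            counts[current][nxt] += 1
    return {
        current: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for current, c in counts.items()
    }


# Made-up traces, one list of tool calls per agent run.
example_traces = [
    ["read", "edit", "test"],
    ["read", "edit", "edit", "test"],
]
probs = transition_probs(example_traces)
```

Here `probs["edit"]` gives `test` a probability of 2/3 and `edit` a probability of 1/3. A high self-transition probability for one tool is one possible signal of a hotspot worth reading in the journal.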

Try It

The repo includes slash commands in .claude/commands/ — they’re available automatically when you launch Claude Code from within the repo:
git clone https://github.com/markpollack/agento-studio.git ~/agento-studio
cd ~/agento-studio
claude
# then: /forge-research-kb ~/my-research-kb "your topic here"
To use the commands from another project, add the repo as a context directory:
claude --add-dir ~/agento-studio
The getting-started guide walks through building your first research-partner KB end-to-end — five seed papers, 20 minutes, a working research agent at the end.

License

BSL 1.1 — converts to Apache 2.0 on April 1, 2029.