Agento Studio

Grow your agents, don’t just prompt them.

Run. Judge. Read the journal. Fix the hotspot. Run again. That’s the loop that turns unpredictable agents into reliable ones.

Agento Studio is a systematic approach to growing AI agents. A judge tells you if the agent got it right. A journal tells you why it behaved the way it did. You need both: without the judge you’re guessing, and without the journal you’re tuning blind.

Source: github.com/markpollack/agento-studio

Agents Are Workflows

An agent isn’t a magic black box — it’s a workflow. Each step is either deterministic (build, lint, test, measure coverage) or an AI step (reason about an error, generate code, decide what to fix next). What people casually call “an agent” is usually just the AI step — one node in a larger pipeline.

fetch PR  →  rebase  →  detect conflicts  →  run tests  →  fix & retest  →  cleanup  →  build gate
 [determ.]   [determ.]     [deterministic]    [determ.]       [AI step]      [determ.]    [judge]
                                                                                            │
                                                              ┌────────────────────────┐    │ pass
                                                              │ version-pattern judge  │◄───┘
                                                              │    [deterministic]     │
                                                              └──────────┬─────────────┘
                                                                         │
                                                              ┌──────────┴─────────────┐
                                                              │    parallel AI steps   │
                                                              │  assess-code-quality   │
                                                              │  assess-backport       │
                                                              └──────────┬─────────────┘
                                                                         │
                                                              ┌──────────┴─────────────┐
                                                              │  quality judge → report│
                                                              └────────────────────────┘

This is the actual AgentWorks PR Review pipeline: as the diagram shows, five deterministic steps and one AI fix step lead into a judge gate, and parallel AI assessment runs only if the build passes. Most of the workflow never touches an LLM.

Stripe arrived at the same pattern independently with their Minions system (1,300+ PRs per week at $1T scale). They call the pattern “blueprints”: deterministic nodes interleaved with agent nodes. As Stripe’s Alistair Gray put it: “Blueprints combine the determinism of workflows with agents’ flexibility in dealing with the unknown.”

The insight behind Agento Studio: the workflow structure matters more than the model powering the AI steps. Better prompts, targeted knowledge, and deterministic checkpoints consistently outperform model upgrades. That’s the thesis (knowledge + structured execution > model), and the build-measure-improve loop below is how you prove it for your agent.
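To make the "agent as workflow" idea concrete, here is a minimal Python sketch of a pipeline that mixes deterministic steps, an AI step, and a judge gate. It is an illustration under stated assumptions, not the AgentWorks implementation: the `Step` class, `run_pipeline`, and every step function are hypothetical, and the AI step is a stub where a real one would call a model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Step:
    name: str
    kind: str                     # "deterministic", "ai", or "judge"
    run: Callable[[dict], dict]   # takes the state, returns an updated state


def run_pipeline(steps, state):
    """Run steps in order; a failing judge step stops everything after it."""
    executed = []
    for step in steps:
        state = step.run(dict(state))  # pass a copy so steps cannot mutate the caller's state
        executed.append(step.name)
        if step.kind == "judge" and not state["passed"]:
            break
    return state, executed


# Hypothetical stand-ins for real steps.
def run_tests(state):
    return {**state, "tests_ran": True}


def fix_and_retest(state):
    # Stub for the AI step: pretend it fixes at most one failing test.
    return {**state, "failing_tests": max(0, state["failing_tests"] - 1)}


def build_gate(state):
    # Deterministic judge: pass only when nothing is failing.
    return {**state, "passed": state["failing_tests"] == 0}


def assess_code_quality(state):
    return {**state, "assessed": True}


PIPELINE = [
    Step("run tests", "deterministic", run_tests),
    Step("fix & retest", "ai", fix_and_retest),
    Step("build gate", "judge", build_gate),
    Step("assess-code-quality", "ai", assess_code_quality),
]

ok_state, ok_steps = run_pipeline(PIPELINE, {"failing_tests": 1})
bad_state, bad_steps = run_pipeline(PIPELINE, {"failing_tests": 3})
```

In this sketch the second run stops at the build gate, so the assessment step never runs. That mirrors the point above: the expensive AI assessment is only reached when the deterministic checks allow it.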

How It Works

1. Run the agent on a real task
2. Judge the output: did it actually work? (correctness)
3. Read the journal: why did it behave that way? (behavior)
4. Fix the hotspot: better skill, better prompt, better tool
5. Run again: measure if it improved
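The steps above can be sketched as a small loop. This is a toy model, assuming a journal is just a list of (tool, succeeded) events and that "fixing the hotspot" means adding a skill for the tool that failed most often. All names here (`run_agent`, `judge`, `find_hotspot`, `grow`) are hypothetical and are not part of the Agento Studio repo.

```python
from collections import Counter


def run_agent(task, skills):
    """Hypothetical agent run: returns (output, journal).

    The journal is a list of (tool, succeeded) events. In this toy model
    a tool call succeeds only if the agent has a skill for that tool.
    """
    journal = [(tool, tool in skills) for tool in task["tools_needed"]]
    output = "done" if all(ok for _, ok in journal) else "incomplete"
    return output, journal


def judge(output):
    """Correctness: did the run actually work?"""
    return output == "done"


def find_hotspot(journal):
    """Behavior: which tool failed most often in this run?"""
    failures = Counter(tool for tool, ok in journal if not ok)
    return failures.most_common(1)[0][0] if failures else None


def grow(task, skills, max_rounds=5):
    """Run, judge, read the journal, fix the hotspot, run again."""
    skills = set(skills)
    history = []
    for _ in range(max_rounds):
        output, journal = run_agent(task, skills)
        passed = judge(output)
        hotspot = find_hotspot(journal)
        history.append((passed, hotspot))
        if passed:
            break
        skills.add(hotspot)  # stand-in for "better skill, better prompt, better tool"
    return skills, history


task = {"tools_needed": ["git", "maven", "maven", "git", "jacoco"]}
skills, history = grow(task, skills={"git"})
```

The judge and the journal play separate roles here: `judge` only says whether the run passed, while `find_hotspot` reads the journal to say where to intervene next.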

Each lever you can turn — skills, knowledge bases, pre-analysis, steering hooks — gets validated through this loop, not assumed to help.
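One way to read "validated, not assumed" is as a before-and-after comparison on the same task set. The sketch below assumes a lever can be modeled as a second version of the agent and that the judge returns a boolean. The function names, the toy agents, and the `min_gain` threshold are all illustrative assumptions.

```python
def pass_rate(run_fn, judge_fn, tasks):
    """Fraction of tasks whose output the judge accepts."""
    results = [judge_fn(run_fn(task)) for task in tasks]
    return sum(results) / len(results)


def validate_lever(baseline, variant, judge_fn, tasks, min_gain=0.0):
    """Keep a lever only if it beats the baseline by more than min_gain."""
    base = pass_rate(baseline, judge_fn, tasks)
    new = pass_rate(variant, judge_fn, tasks)
    return {"baseline": base, "variant": new, "keep": new - base > min_gain}


# Toy agents: tasks are integers and the output is whether the agent "solved" the task.
def baseline_agent(task):
    return task % 2 == 0                    # solves even-numbered tasks only


def agent_with_kb(task):
    return task % 2 == 0 or task % 3 == 0   # hypothetical knowledge base helps on multiples of 3


def judge(output):
    return output is True


tasks = list(range(1, 13))
helpful = validate_lever(baseline_agent, agent_with_kb, judge, tasks)
useless = validate_lever(baseline_agent, baseline_agent, judge, tasks)
```

A real comparison would need repeated runs per task, since agent output varies between runs; this sketch ignores that variance to keep the idea visible.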

What You Can Build

The methodology supports four project variants — each with its own feedback loop:

Variant	Use when…	Feedback loop
Eval-Agent	Building an autonomous agent with judge-based evaluation	Loss optimization — judges score each run, journal explains behavior
Project	Bootstrapping a library, service, or application	QA review loop with vision → design → roadmap
Research	Investigating a question or testing hypotheses	Vision ↔ research iteration, multi-roadmap pattern
Steward	Ongoing maintenance of a project or domain	Health monitoring — continuous, not convergent

Agents Built So Far

Agent	What it does
PR Merge	Automated pull request review, merge, and pipeline orchestration
Issue Classification	Categorize and triage issues (tested against SWE-bench)
Code Coverage	Generate tests to hit coverage targets across multiple projects
Liquibase → Flyway	Migration agent for database schema tooling conversion

Each one goes through the same cycle: run on real tasks, judge the output, read the journal, fix the hotspots, run again.

Blog

I Read My Agent’s Diary — Markov chain analysis of agent tool-call traces across eval-agent experiments
Look Ma, No RAG! — How the knowledge layer works: routing tables, two KB types, federation, and health checks

Try It

The repo includes slash commands in .claude/commands/ — they’re available automatically when you launch Claude Code from within the repo:

git clone https://github.com/markpollack/agento-studio.git ~/agento-studio
cd ~/agento-studio
claude
# then: /forge-research-kb ~/my-research-kb "your topic here"

To use the commands from another project, add the repo as a context directory:

claude --add-dir ~/agento-studio

The getting-started guide walks through building your first research-partner KB end-to-end — five seed papers, 20 minutes, a working research agent at the end.

License

BSL 1.1 — converts to Apache 2.0 on April 1, 2029.
