
Overview

Agent Workflow helps you build agents that work, understand why they work, and improve them in a controlled, experimental manner. You compose steps into workflows, each step doing one thing: call an LLM, run a function, invoke an external agent. Quality gates evaluate output at each stage. Every step transition is traced, feeding Agent Journal for behavioral analysis, so you can answer questions like: does the agent need better real-time steering? What knowledge is it missing to achieve its goal? Which steps should be deterministic instead of LLM-driven? What new tools should be built?

The philosophy follows what Stripe learned building Minions at scale: “The model does not run the system. The system runs the model.”

A fluent DSL makes workflows easy to define: branching, loops, parallel execution, LLM-driven routing, error recovery. Steps exchange data through a typed context. The workflow compiles to a graph intermediate representation that separates definition from execution, enabling portable runtimes without changing workflow code.
Workflow.define("pr-review")
    .step(fetchDiff)
    .then(analyzeDiff)
    .gate(new JudgeGate(jury, 0.8))
        .onPass(postComment)
        .onFail(revise)
    .end()
    .run(event);
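The gate in the snippet above routes on a judged score. The pattern can be sketched in a few self-contained lines; the Jury and JudgeGate shapes here are illustrative assumptions, not the real Agent Workflow API:

```java
// Hypothetical sketch of a score-threshold gate. In the real library,
// JudgeGate takes a jury and a threshold; everything else here is invented
// for illustration.
interface Jury {
    double score(String output); // 0.0 .. 1.0 quality score
}

final class JudgeGate {
    private final Jury jury;
    private final double threshold;

    JudgeGate(Jury jury, double threshold) {
        this.jury = jury;
        this.threshold = threshold;
    }

    // Pass when the judged score meets the threshold; onPass/onFail
    // branches of the DSL would route on this decision.
    boolean pass(String output) {
        return jury.score(output) >= threshold;
    }
}

public class GateDemo {
    public static void main(String[] args) {
        // Toy jury: scores by length, capped at 1.0.
        Jury lengthJury = out -> Math.min(1.0, out.length() / 100.0);
        JudgeGate gate = new JudgeGate(lengthJury, 0.8);

        System.out.println(gate.pass("short"));         // false: score 0.05
        System.out.println(gate.pass("x".repeat(120))); // true: score capped at 1.0
    }
}
```

In the real DSL, the jury would typically be an LLM-backed judge rather than a deterministic scorer, but the routing decision is the same boolean.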

Core Concepts

Steps are the building blocks. Each step takes input, does work, and produces output. Steps can be:
  • Deterministic — a Java function (GitHub API call, string formatting, file parsing)
  • Single LLM call — ChatClientStep wraps a Spring AI ChatClient call
  • Agentic CLI tools — ClaudeStep uses the Claude Agent SDK for full multi-turn agent sessions with deep tracing. AgentClientStep wraps other agentic CLI tools (Google Gemini, OpenAI Codex, Amazon Q) via Agent Client, giving you a unified interface
A ClaudeStep or AgentClientStep isn’t a single API call: it runs a complete agentic loop internally (dozens of tool calls, minutes of execution) and returns a typed result. The workflow sees it as one step.

Context threads through every step. Steps read input parameters by key, do their work, and write output parameters back. Downstream steps pick up what upstream steps produced, all type-safe via ContextKey<T>.

Compiling to a graph makes the workflow definition pure data: nodes and edges, not opaque lambdas. This enables:
  • Portable runtimes — the graph decouples definition from execution. Ships with LocalStepRunner (in-process, zero overhead). CheckpointingStepRunner (JDBC crash recovery via workflow-batch) and TemporalStepRunner (distributed durable execution) are planned — same workflow code, swap a single @Bean
  • Tracing — every step transition is recorded for observability and behavioral analysis via Agent Journal
  • Steering (planned) — runtime hooks that intercept before/after steps to enforce constraints, redirect behavior, or inject guidance. Deterministic or LLM-powered. Integrates with Spring AI advisors and the Claude Agent SDK hook system
  • Inspection — the graph is pure data (nodes + edges), not opaque lambdas
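The typed-context pattern can be sketched in a few lines. The ContextKey and AgentContext shapes below are illustrative assumptions standing in for the real classes:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a typed context. The real ContextKey<T> and
// AgentContext API may differ; this only illustrates the pattern of
// keyed, type-safe data exchange between steps.
final class ContextKey<T> {
    final String name;
    final Class<T> type;

    private ContextKey(String name, Class<T> type) {
        this.name = name;
        this.type = type;
    }

    static <T> ContextKey<T> of(String name, Class<T> type) {
        return new ContextKey<>(name, type);
    }
}

final class AgentContext {
    private final Map<ContextKey<?>, Object> values = new HashMap<>();

    <T> void put(ContextKey<T> key, T value) { values.put(key, value); }

    // The Class token makes the cast safe: a key can only yield its own type.
    <T> T get(ContextKey<T> key) { return key.type.cast(values.get(key)); }
}

public class ContextDemo {
    static final ContextKey<String> DIFF = ContextKey.of("diff", String.class);
    static final ContextKey<Integer> FILES_CHANGED = ContextKey.of("filesChanged", Integer.class);

    public static void main(String[] args) {
        AgentContext ctx = new AgentContext();
        // An upstream step writes typed outputs...
        ctx.put(DIFF, "--- a/App.java\n+++ b/App.java");
        ctx.put(FILES_CHANGED, 1);
        // ...and a downstream step reads them back, type-checked at compile time.
        String diff = ctx.get(DIFF);
        int filesChanged = ctx.get(FILES_CHANGED);
        System.out.println(filesChanged + " file(s) changed, diff present: " + !diff.isEmpty());
    }
}
```

This is the classic typesafe heterogeneous container idiom: the key carries the type, so mismatched reads fail at compile time rather than at runtime.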

Documentation

AgentLoop

Ready-to-use SWE agent — tools, memory, run/chat API

Getting Started

Steps, context, portable runtimes, first workflow

Parameterization

4 patterns for getting data into steps

DSL Primitives

10+ composable patterns with code

Examples

8 tests validated against GPT-4.1

Spring Batch Mapping

Batch concepts → Agent Workflow

API Reference

Step, AgentContext, Gate, WorkflowGraph, StepRunner

Why Deterministic Steps Matter

The biggest insight from running real agent experiments: the AI shouldn’t do everything. This is the pattern Stripe describes in their Minions system, now shipping 1,300 PRs a week: the model does not run the system — the system runs the model. A real PR merge workflow illustrates the point. The pipeline has 8 steps:
checkoutPR → formatCode → collectContext → compile → squash → rebase → resolveConflicts → review
Six steps are deterministic — git operations, Java formatting, GitHub API calls, Maven compile. Only two need LLM reasoning: resolving merge conflicts and reviewing the diff. The deterministic steps are free, fast, and perfectly reliable. The LLM steps are expensive and variable. By minimizing what the LLM needs to do, you reduce cost, increase reliability, and make the whole pipeline easier to debug.

This isn’t obvious until you measure it. In our code coverage experiments, adding a deterministic pre-analysis step cut agent steps by 27%. The agent still explored the codebase, but it read source files instead of decompiling JARs. Same attention budget, better allocation.
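Because the graph is pure data, the deterministic/LLM split is something you can inspect programmatically. A minimal sketch, modeling the pipeline above as nodes tagged by kind (the Node record and pipeline() helper are invented for illustration, not the real WorkflowGraph API):

```java
import java.util.List;

// Hypothetical sketch: the merge pipeline as pure data, with each node
// tagged by whether it needs LLM reasoning. Step names match the
// pipeline described above.
public class MergePipelineSketch {
    record Node(String name, boolean llm) {}

    static List<Node> pipeline() {
        return List.of(
            new Node("checkoutPR", false),
            new Node("formatCode", false),
            new Node("collectContext", false),
            new Node("compile", false),
            new Node("squash", false),
            new Node("rebase", false),
            new Node("resolveConflicts", true), // merge conflicts need reasoning
            new Node("review", true));          // diff review needs reasoning
    }

    public static void main(String[] args) {
        List<Node> steps = pipeline();
        long llmSteps = steps.stream().filter(Node::llm).count();
        System.out.println(llmSteps + " of " + steps.size() + " steps call an LLM");
        // prints: 2 of 8 steps call an LLM
    }
}
```

A graph-as-data representation is what makes this kind of audit trivial: counting, visualizing, or cost-estimating the LLM surface of a workflow is a query over nodes, not a code review.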

GitHub

Source code (0.2.0 on Maven Central)

Used In