The Graduation Path
Agent Workflow separates workflow definition from execution. The StepRunner interface is the seam: swap the bean, not the workflow.
| Level | Runner | What it adds |
|---|---|---|
| 0 | LocalStepRunner | In-process, zero overhead. Default. |
| 1 | CheckpointingStepRunner | JDBC crash recovery — resume from last completed step |
| 2 | TemporalStepRunner | Distributed durable execution via Temporal activities |
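The idea of the seam can be pictured with a minimal sketch. The real StepRunner signature is not shown in this section, so SketchStepRunner, InProcessRunner, RecordingRunner and GreetingWorkflow below are hypothetical names and shapes chosen for illustration. The point is only that the workflow depends on the interface and nothing else.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical shape of the seam; the real StepRunner signature may differ.
// Strings only, to keep the sketch short.
interface SketchStepRunner {
    String run(String runId, String stepName, UnaryOperator<String> step, String input);
}

// Level 0 analogue: call the step directly, in-process.
class InProcessRunner implements SketchStepRunner {
    @Override
    public String run(String runId, String stepName, UnaryOperator<String> step, String input) {
        return step.apply(input);
    }
}

// Stand-in for a higher level: same interface, extra behaviour wrapped around the call.
class RecordingRunner implements SketchStepRunner {
    private final SketchStepRunner delegate;
    final List<String> seen = new ArrayList<>();

    RecordingRunner(SketchStepRunner delegate) {
        this.delegate = delegate;
    }

    @Override
    public String run(String runId, String stepName, UnaryOperator<String> step, String input) {
        seen.add(runId + "/" + stepName);   // remember which steps passed through
        return delegate.run(runId, stepName, step, input);
    }
}

// The workflow only sees the interface, so it is identical under either runner.
class GreetingWorkflow {
    private final SketchStepRunner runner;

    GreetingWorkflow(SketchStepRunner runner) {
        this.runner = runner;
    }

    String execute(String runId, String name) {
        String trimmed = runner.run(runId, "trim", String::trim, name);
        return runner.run(runId, "greet", s -> "Hello, " + s, trimmed);
    }
}
```

Constructing GreetingWorkflow with either runner yields the same result. In a Spring application the equivalent switch would presumably be a change of bean definition rather than a change to the workflow class.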
CheckpointingStepRunner
Persists step outputs to a JDBC database. On restart with the same runId, completed steps are skipped and their cached output is returned directly.
How it works
- Before executing a step, queries by (runId, stepName), the checkpoint key
- If a COMPLETED record exists, returns the cached outputPayload (skip)
- Otherwise, creates a STARTED record, executes the step, then upgrades it to COMPLETED with the serialized output
- On exception, records FAILED with the error message
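The rules above can be sketched with an in-memory map standing in for the JDBC table. This is not the library's implementation: CheckpointSketch and its members are hypothetical, outputs are plain strings rather than serialized payloads, and rethrowing the exception after recording FAILED is an assumption that the source does not confirm.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// In-memory stand-in for the checkpoint table; the real runner persists via JDBC.
class CheckpointSketch {
    enum Status { STARTED, COMPLETED, FAILED }

    // One checkpoint row: status plus either the cached output or the error message.
    static final class Checkpoint {
        final Status status;
        final String outputPayload;
        final String error;

        Checkpoint(Status status, String outputPayload, String error) {
            this.status = status;
            this.outputPayload = outputPayload;
            this.error = error;
        }
    }

    private final Map<String, Checkpoint> table = new HashMap<>();

    // Builds the (runId, stepName) checkpoint key.
    private static String key(String runId, String stepName) {
        return runId + "::" + stepName;
    }

    String run(String runId, String stepName, Supplier<String> step) {
        String k = key(runId, stepName);
        Checkpoint existing = table.get(k);
        if (existing != null && existing.status == Status.COMPLETED) {
            return existing.outputPayload;              // skip: return the cached output
        }
        table.put(k, new Checkpoint(Status.STARTED, null, null));
        try {
            String output = step.get();
            table.put(k, new Checkpoint(Status.COMPLETED, output, null));
            return output;
        } catch (RuntimeException e) {
            table.put(k, new Checkpoint(Status.FAILED, null, e.getMessage()));
            throw e;                                    // assumption: the failure still propagates
        }
    }

    // Returns the recorded status, or null if the step has never been attempted.
    Status statusOf(String runId, String stepName) {
        Checkpoint c = table.get(key(runId, stepName));
        return c == null ? null : c.status;
    }
}
```

Calling run twice with the same runId and step name invokes the step body once. A FAILED record does not short-circuit, so a failed step runs again on the next attempt.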
Maven coordinates
Requires a DataSource on the classpath. H2 works for development; use Postgres or MySQL for production.
Crash-and-resume example
A 4-step workflow crashes at step 3. On resume with the same runId, steps 1-2 are skipped (cached) and steps 3-4 execute normally.
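The scenario can be simulated without a database or the library itself. In this sketch, a map that outlives the simulated crash plays the role of the checkpoint table. CrashResumeSketch, the step names, and the crashAtStep3 flag are all hypothetical and exist only to make the skip behaviour visible.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CrashResumeSketch {
    // Survives the simulated crash, like rows in the checkpoint table. Key: runId/stepName.
    final Map<String, String> completed = new LinkedHashMap<>();
    // Names of steps whose body actually ran, in order, across all attempts.
    final List<String> executed = new ArrayList<>();

    void runWorkflow(String runId, boolean crashAtStep3) {
        for (int i = 1; i <= 4; i++) {
            String stepName = "step-" + i;
            String key = runId + "/" + stepName;
            if (completed.containsKey(key)) {
                continue;                               // checkpoint hit: body is not re-run
            }
            if (crashAtStep3 && i == 3) {
                throw new IllegalStateException("simulated crash in " + stepName);
            }
            executed.add(stepName);
            completed.put(key, "output-of-" + stepName);
        }
    }

    public static void main(String[] args) {
        CrashResumeSketch sketch = new CrashResumeSketch();
        try {
            sketch.runWorkflow("run-42", true);         // first attempt dies at step 3
        } catch (IllegalStateException e) {
            System.out.println("first attempt: " + e.getMessage());
        }
        sketch.runWorkflow("run-42", false);            // resume with the same runId
        System.out.println("step bodies executed: " + sketch.executed);
    }
}
```

Across both attempts each step body runs exactly once: steps 1-2 during the first attempt, steps 3-4 during the resume.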
A complete runnable example is in workflow-dsl-examples/CrashRecoveryIT (@DataJpaTest + H2, no LLM needed).
JPA entities
Two JPA entities back the checkpoint system:
| Entity | Table | Purpose |
|---|---|---|
| AgentStepExecution | agent_step_executions | Per-step checkpoint. Key: (runId, stepName) unique constraint. Tracks status, output, tokens, cost. |
| AgentFlowExecution | agent_flow_executions | Per-run envelope. Tracks workflow name, steps total/completed, total cost. |
Status is modeled with BatchStatus (a severity-ordered enum ported from Spring Batch) and ExitStatus (an embeddable record with severity-based composition via and()).
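Severity-based composition means that combining two statuses keeps the more severe one. The sketch below shows the idea on a single hypothetical enum, SketchStatus. The constants and their order are assumptions made for illustration, since the actual BatchStatus and ExitStatus definitions are not listed here.

```java
// Hypothetical status enum; declaration order doubles as severity (least to most severe).
enum SketchStatus {
    COMPLETED, STARTED, STOPPED, FAILED, UNKNOWN;

    // Composition keeps whichever status is more severe.
    SketchStatus and(SketchStatus other) {
        return other != null && other.compareTo(this) > 0 ? other : this;
    }
}

class StatusDemo {
    public static void main(String[] args) {
        // Fold three step outcomes into one run-level outcome.
        SketchStatus run = SketchStatus.COMPLETED
                .and(SketchStatus.FAILED)
                .and(SketchStatus.COMPLETED);
        System.out.println(run);   // prints FAILED
    }
}
```

Because composition only ever moves toward the more severe value, the order in which step results are folded does not change the final status.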
JdbcTraceRecorder
Records every step transition to a step_transitions table. Auto-creates the table on first use.
StepTransition record includes: run_id, workflow_name, from_step, to_step, timestamp, duration_ms, tokens_used, cost_usd, node_type, label.
Query traces for a run:
TemporalStepRunner
Dispatches each step as a Temporal Activity. Steps must be registered with StepActivityImpl on the worker side.
Maven coordinates
Activity dispatch
Worker-side step registration
Registered steps are kept in a ConcurrentHashMap registry. The activity creates a fresh AgentContext with the runId for each execution.
Steps dispatched via Temporal must be idempotent — Temporal may retry activities on timeout or failure.
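A registry lookup with a fresh context per execution, combined with an idempotent step, might look like the following sketch. It uses no Temporal classes, and none of it is the StepActivityImpl API. StepRegistrySketch, Context (a stand-in for AgentContext) and IdempotentNotifier are hypothetical, and the retry is simulated by calling execute twice.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BiFunction;

class StepRegistrySketch {
    // Hypothetical stand-in for AgentContext: created fresh per execution, carries the runId.
    static final class Context {
        final String runId;

        Context(String runId) {
            this.runId = runId;
        }
    }

    // Step name -> step body; ConcurrentHashMap so worker threads can look up safely.
    private final Map<String, BiFunction<Context, String, String>> steps = new ConcurrentHashMap<>();

    void register(String stepName, BiFunction<Context, String, String> step) {
        steps.put(stepName, step);
    }

    // What an activity invocation might do: resolve by name, build a fresh context, run.
    String execute(String runId, String stepName, String input) {
        BiFunction<Context, String, String> step = steps.get(stepName);
        if (step == null) {
            throw new IllegalArgumentException("No step registered as " + stepName);
        }
        return step.apply(new Context(runId), input);
    }
}

// A step whose side effect happens at most once per runId, however often it is retried.
class IdempotentNotifier {
    // Dedupe keys already handled; a real system would keep these in durable storage.
    private final Set<String> sent = ConcurrentHashMap.newKeySet();
    final AtomicInteger deliveries = new AtomicInteger();

    String notifyOnce(String runId, String message) {
        if (sent.add(runId + "/notify")) {   // add() returns false if the key was already present
            deliveries.incrementAndGet();    // side effect only on the first attempt
        }
        return "notified:" + message;        // same result on every attempt
    }
}
```

Registering the notifier with registry.register("notify", (ctx, msg) -> notifier.notifyOnce(ctx.runId, msg)) and executing it twice for the same runId returns the same value both times while delivering once. That is the property a retried activity needs.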
Related
API Reference
StepRunner interface, TraceRecorder, WorkflowExecutor
Spring Batch Mapping
How CheckpointingStepRunner maps to JobRepository