
What You’ll Build

A series of workflows that progressively introduce every DSL primitive: sequential pipelines, conditional branching, error recovery, loops, parallel execution, LLM-driven routing, quality gates, and supervised agent delegation. Each step is a real integration test validated against GPT-4.1.

Prerequisites

  • Java 21+
  • An OpenAI API key (OPENAI_API_KEY environment variable)
  • Agent Workflow 0.10.0:
<dependency>
  <groupId>io.github.markpollack</groupId>
  <artifactId>workflow-flows</artifactId>
  <version>0.10.0</version>
</dependency>

Setup

All examples share a ChatClient configured for GPT-4.1 with low temperature for test stability:
String apiKey = System.getenv("OPENAI_API_KEY");
OpenAiApi api = OpenAiApi.builder().apiKey(apiKey).build();
OpenAiChatModel model = OpenAiChatModel.builder()
        .openAiApi(api)
        .defaultOptions(OpenAiChatOptions.builder()
                .model("gpt-4.1")
                .maxTokens(1024)
                .temperature(0.3)
                .build())
        .build();
ChatClient chat = ChatClient.builder(model).build();
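System.getenv returns null when the variable is unset, so a missing key may only surface later, wherever the null value is first used. A minimal sketch of a fail-fast check follows; ApiKeyCheck and requireApiKey are hypothetical names for illustration and are not part of Agent Workflow or Spring AI.

```java
/** Hypothetical helper: validates the API key before any client is built. */
final class ApiKeyCheck {

    /** Returns the key with surrounding whitespace removed, or throws if it is missing or blank. */
    static String requireApiKey(String raw) {
        if (raw == null || raw.isBlank()) {
            throw new IllegalStateException(
                    "OPENAI_API_KEY is not set; export it before running the examples");
        }
        return raw.strip();
    }
}
```

With this in place, the first setup line could read String apiKey = ApiKeyCheck.requireApiKey(System.getenv("OPENAI_API_KEY"));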

Step 1: Define a Step and Chain a Pipeline

A Step is the building block: a named function that takes a context and an input and produces an output. Chain steps with .then() and each step’s output flows into the next.
Step<Object, Object> write = Step.named("write", (ctx, in) ->
        chat.prompt()
                .user("You are a creative writer. Write a 3-sentence story about: " + in)
                .call().content());

Step<Object, Object> editForAudience = Step.named("edit-audience", (ctx, in) ->
        chat.prompt()
                .user("Rewrite this story for young adults. Return only the story: " + in)
                .call().content());

Step<Object, Object> editForStyle = Step.named("edit-style", (ctx, in) ->
        chat.prompt()
                .user("Rewrite this story in a humorous style. Return only the story: " + in)
                .call().content());

String result = (String) Workflow.<String, Object>define("novel-creator")
        .step(write)
        .then(editForAudience)
        .then(editForStyle)
        .run("dragons and wizards");
Three LLM calls in sequence: write a story, rewrite for audience, rewrite for style. The Workflow.define() + .run() pattern compiles the graph and executes it in one call. View source: SequentialDemo.java
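To see the data flow without calling an LLM, the same "output feeds the next input" shape can be sketched with plain java.util.function.Function composition. This is only an analogy under the assumption that .then() behaves like function composition; the stub lambdas stand in for the three LLM calls, and PipelineSketch is a hypothetical name.

```java
import java.util.function.Function;

/** Plain-Java analogy for a three-step sequential pipeline (no LLM, no workflow library). */
final class PipelineSketch {

    static Function<String, String> pipeline() {
        // Each stub tags its input so the order of application is visible in the result.
        Function<String, String> write = topic -> "A story about " + topic + ".";
        Function<String, String> editForAudience = story -> "[young adult] " + story;
        Function<String, String> editForStyle = story -> "[humorous] " + story;

        // andThen feeds each function's output into the next one.
        return write.andThen(editForAudience).andThen(editForStyle);
    }
}
```

Calling PipelineSketch.pipeline().apply("dragons and wizards") returns "[humorous] [young adult] A story about dragons and wizards.", which shows that the last step in the chain is the outermost transformation.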

Step 2: Branch on Classification

Route to different steps based on a predicate applied to the previous step’s output.
Step<Object, Object> classify = Step.named("classify", (ctx, in) ->
        chat.prompt()
                .user("Classify this as either 'medical' or 'legal'. " +
                      "Reply with exactly one word: " + in)
                .call().content().strip().toLowerCase());

Step<Object, Object> medicalExpert = Step.named("medical", (ctx, in) ->
        chat.prompt()
                .user("You are a medical expert. Briefly advise on: " + in)
                .call().content());

Step<Object, Object> legalExpert = Step.named("legal", (ctx, in) ->
        chat.prompt()
                .user("You are a legal expert. Briefly advise on: " + in)
                .call().content());

String result = (String) Workflow.<String, Object>define("category-router")
        .step(classify)
        .branch(output -> "medical".equals(output))
            .then(medicalExpert)
            .otherwise(legalExpert)
        .run("I broke my leg, what should I do?");
The .strip().toLowerCase() on the classify output matters — LLMs sometimes return trailing whitespace or mixed case. The branch() predicate is a plain Java Predicate<Object>, so you can test any condition. View source: BranchDemo.java
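Whitespace and case are not the only ways a one-word reply can drift: a model may also add a trailing period or quotes. A slightly more defensive normalizer could look like the sketch below; LabelNormalizer is a hypothetical helper, and falling back to a default label is an assumption about what a given workflow wants.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper: reduces an LLM reply to one of a fixed set of labels. */
final class LabelNormalizer {

    /** Returns the cleaned label if it is in the allowed set, otherwise the fallback. */
    static String normalize(String raw, Set<String> allowed, String fallback) {
        if (raw == null) {
            return fallback;
        }
        // Lowercase, then drop everything that is not a letter (quotes, periods, newlines).
        String cleaned = raw.strip().toLowerCase(Locale.ROOT).replaceAll("[^a-z]", "");
        return allowed.contains(cleaned) ? cleaned : fallback;
    }
}
```

For example, normalize("Medical.\n", Set.of("medical", "legal"), "legal") yields "medical", while a full-sentence reply that is not a single label yields the fallback.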

Step 3: Handle Errors

Route exceptions to a recovery step instead of crashing the workflow. The recovery step’s output flows into the next step as if the risky step had succeeded.
Step<Object, Object> riskyStep = Step.named("risky", (ctx, in) -> {
    if (((String) in).contains("bad")) {
        throw new IllegalArgumentException("Bad input detected");
    }
    return chat.prompt()
            .user("Process this: " + in)
            .call().content();
});

Step<Object, Object> recovery = Step.named("recovery", (ctx, in) ->
        chat.prompt()
                .user("The previous step failed. " +
                      "Generate a safe default response for: " + in)
                .call().content());

Step<Object, Object> finalStep = Step.named("finalize", (ctx, in) ->
        "Final: " + in);

String result = (String) Workflow.<String, Object>define("error-recovery")
        .step(riskyStep)
            .onError(IllegalArgumentException.class, recovery)
        .then(finalStep)
        .run("bad input");

// result starts with "Final:" — the workflow continued through recovery
The .onError() clause is type-specific, so you can attach different recovery steps for different exception types. The recovery path is wired into the workflow graph when the graph is compiled, rather than being resolved when the exception is caught. View source: ErrorRecoveryDemo.java
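The idea of type-specific recovery can be illustrated in plain Java as a table of handlers that is registered up front and consulted only when something is thrown. This is a sketch of the concept, not the library's implementation; RecoverySketch is a hypothetical class, and passing the original input to the recovery function is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Conceptual sketch: handlers are registered before execution, keyed by exception type. */
final class RecoverySketch {

    private final Map<Class<? extends Exception>, Function<String, String>> handlers =
            new LinkedHashMap<>();

    RecoverySketch onError(Class<? extends Exception> type, Function<String, String> recovery) {
        handlers.put(type, recovery);
        return this;
    }

    /** Runs the risky function; on failure, the first handler whose type matches takes over. */
    String run(Function<String, String> risky, String input) {
        try {
            return risky.apply(input);
        } catch (RuntimeException e) {
            for (var entry : handlers.entrySet()) {
                if (entry.getKey().isInstance(e)) {
                    return entry.getValue().apply(input);
                }
            }
            throw e; // no matching handler: the failure propagates unchanged
        }
    }
}
```

An exception with no registered handler still propagates, which matches the intuition that recovery is opt-in per exception type.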

Step 4: Loop Until Quality Converges

Iterate a block of steps until a predicate on the output is satisfied. This is the most complex primitive — LLM score parsing needs care.
AtomicInteger iterations = new AtomicInteger(0);

Step<Object, Object> editor = Step.named("editor", (ctx, in) ->
        chat.prompt()
                .user("Write a very short (2-sentence) extremely funny joke about dragons. " +
                      "Be hilarious.")
                .call().content());

Step<Object, Object> scorer = Step.named("scorer", (ctx, in) -> {
    iterations.incrementAndGet();
    String response = chat.prompt()
            .user("Rate this text for humor on a scale of 0.0 to 1.0. " +
                  "Reply with ONLY a decimal number, nothing else: " + in)
            .call().content().strip();

    try {
        return Double.parseDouble(response);
    } catch (NumberFormatException e) {
        var matcher = java.util.regex.Pattern.compile("\\d+\\.\\d+").matcher(response);
        return matcher.find() ? Double.parseDouble(matcher.group()) : 0.0;
    }
});

Object result = Workflow.<String, Object>define("humor-loop")
        .repeatUntilOutput(score -> score instanceof Double d && d >= 0.6)
            .step(editor)
            .step(scorer)
        .end()
        .run("A dragon walked into a bar.");
The loop alternates between editor (generate) and scorer (evaluate) until the score reaches the 0.6 threshold. With the “Reply with ONLY a decimal number” prompt, GPT-4.1 returns clean decimal numbers; the regex fallback is there for other models. View source: LoopDemo.java
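Since the same parse-with-fallback logic is needed wherever an LLM returns a score, it can be pulled into one helper. The sketch below is one possible shape, not code from the examples repository: ScoreParser is a hypothetical name, and treating anything outside 0.0 to 1.0 as a failed parse (so that a reply like "8 out of 10" does not pass a threshold by accident) is a design assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts a score in [0.0, 1.0] from an LLM reply. */
final class ScoreParser {

    // First run of digits, with an optional fractional part ("1", "0.85").
    private static final Pattern NUMBER = Pattern.compile("\\d+(?:\\.\\d+)?");

    /** Returns the first number found if it lies in [0.0, 1.0]; otherwise 0.0. */
    static double parseScore(String response) {
        if (response == null) {
            return 0.0;
        }
        Matcher matcher = NUMBER.matcher(response);
        if (!matcher.find()) {
            return 0.0;
        }
        double value = Double.parseDouble(matcher.group());
        // Out-of-range values are treated as unparseable rather than clamped.
        return (value >= 0.0 && value <= 1.0) ? value : 0.0;
    }
}
```

The scorer step could then end with return ScoreParser.parseScore(response); in place of the try/catch block.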

Step 5: Fan Out in Parallel

Run steps concurrently and collect results into a list, ordered to match step order.
Step<Object, Object> findMeals = Step.named("find-meals", (ctx, in) ->
        chat.prompt()
                .user("Suggest 3 meals for a " + in + " evening. " +
                      "Just list the meal names, one per line.")
                .call().content());

Step<Object, Object> findMovies = Step.named("find-movies", (ctx, in) ->
        chat.prompt()
                .user("Suggest 3 movies for a " + in + " evening. " +
                      "Just list the movie titles, one per line.")
                .call().content());

@SuppressWarnings("unchecked")
List<Object> results = (List<Object>) Workflow.<String, Object>define("evening-planner")
        .parallel(findMeals, findMovies)
        .run("romantic");

// results.get(0) = meal suggestions
// results.get(1) = movie suggestions
Both LLM calls execute concurrently. You can pass any number of steps to .parallel() — they all receive the same input and their outputs are collected in order. View source: ParallelDemo.java
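The "same input, results in step order" contract can be sketched with CompletableFuture from the JDK. This illustrates the pattern only and makes no claim about how .parallel() is implemented; FanOutSketch is a hypothetical name, and the sketch uses the default common pool for simplicity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

/** Conceptual sketch: run every step on the same input and collect results in step order. */
final class FanOutSketch {

    static <I, O> List<O> runAll(I input, List<Function<I, O>> steps) {
        // Start all steps first so they can overlap in time.
        List<CompletableFuture<O>> futures = new ArrayList<>();
        for (Function<I, O> step : steps) {
            futures.add(CompletableFuture.supplyAsync(() -> step.apply(input)));
        }
        // Join in submission order, so the result order matches the step order
        // regardless of which step finishes first.
        List<O> results = new ArrayList<>();
        for (CompletableFuture<O> future : futures) {
            results.add(future.join());
        }
        return results;
    }
}
```

Ordering comes from joining the futures in the order they were created, not from the order in which they complete.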

Step 6: LLM-Driven Routing

Let the LLM choose which step to execute. Unlike branch() (predicate-based), decision() gives the LLM a menu of labeled options and it picks one.
Step<Object, Object> summarize = Step.named("summarize", (ctx, in) ->
        chat.prompt()
                .user("Summarize this in one sentence: " + in)
                .call().content());

Step<Object, Object> translate = Step.named("translate", (ctx, in) ->
        chat.prompt()
                .user("Translate this to French: " + in)
                .call().content());

String result = (String) Workflow.<String, Object>define("decision-router")
        .decision(chat)
            .option("summarize", summarize)
            .option("translate", translate)
        .end()
        .run("The quick brown fox jumps over the lazy dog. " +
             "This is a classic English pangram used for testing.");
The DSL generates a routing prompt from the option names. The LLM returns a clean single-word label and the corresponding step executes. Use decision() when the routing logic is semantic (the LLM needs to understand the input to choose) rather than structural (a simple predicate suffices). View source: DecisionDemo.java
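As a rough mental model, label-based routing has two halves: build a prompt that lists the option names, then look up the step that matches the reply. The sketch below shows that shape in plain Java. The prompt wording, the DecisionSketch class, and the choice to throw on an unknown label are all assumptions made for illustration; the actual prompt generated by the DSL is not shown in this tutorial.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

/** Conceptual sketch of label-based routing (prompt text is invented for illustration). */
final class DecisionSketch {

    private final Map<String, Function<String, String>> options = new LinkedHashMap<>();

    DecisionSketch option(String label, Function<String, String> step) {
        options.put(label, step);
        return this;
    }

    /** Builds a routing prompt that lists the option labels in registration order. */
    String routingPrompt(String input) {
        return "Pick the single best action for the input. Reply with exactly one of: "
                + String.join(", ", options.keySet()) + "\nInput: " + input;
    }

    /** Runs the step whose label matches the reply; an unknown label is an error. */
    String dispatch(String llmReply, String input) {
        Function<String, String> step = options.get(llmReply.strip().toLowerCase(Locale.ROOT));
        if (step == null) {
            throw new IllegalStateException("Unknown option: " + llmReply);
        }
        return step.apply(input);
    }
}
```

Short, distinct option names matter in a design like this, because the label is both what the model sees and the lookup key.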

Step 7: Quality Gate

Evaluate output quality and route to pass or fail paths. The gate is a function that returns GateDecision.PASS or GateDecision.FAIL.
Gate<Object> qualityGate = (ctx, output) -> {
    String response = chat.prompt()
            .user("Rate this text for quality on a scale of 0.0 to 1.0. " +
                  "Reply with ONLY a decimal number: " + output)
            .call().content().strip();

    double score;
    try {
        score = Double.parseDouble(response);
    } catch (NumberFormatException e) {
        var matcher = java.util.regex.Pattern.compile("\\d+\\.\\d+").matcher(response);
        score = matcher.find() ? Double.parseDouble(matcher.group()) : 0.0;
    }

    return score >= 0.7 ? GateDecision.PASS : GateDecision.FAIL;
};

Step<Object, Object> generate = Step.named("generate", (ctx, in) ->
        chat.prompt()
                .user("Write a well-crafted 2-sentence story about: " + in)
                .call().content());

Step<Object, Object> approve = Step.named("approve", (ctx, in) ->
        "APPROVED: " + in);

Step<Object, Object> reject = Step.named("reject", (ctx, in) ->
        "REJECTED: " + in);

String result = (String) Workflow.<String, Object>define("gated-pipeline")
        .step(generate)
        .gate(qualityGate)
            .onPass(approve)
            .onFail(reject)
        .end()
        .run("a heroic knight");
Gates are the integration point for Agent Judge — replace the lambda gate with a JudgeGate backed by a jury for multi-judge evaluation with voting strategies. View source: GateDemo.java
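To make "multi-judge evaluation with voting strategies" concrete, here is a plain-Java sketch of one such strategy, a strict majority vote. It does not use Agent Judge or JudgeGate, whose APIs are not shown in this tutorial; MajorityGateSketch and its local Decision enum are hypothetical stand-ins.

```java
import java.util.List;
import java.util.function.Predicate;

/** Conceptual sketch: several independent judges, combined by strict majority. */
final class MajorityGateSketch {

    /** Local stand-in for a pass/fail outcome (not the library's GateDecision). */
    enum Decision { PASS, FAIL }

    /** Passes only if more than half of the judges accept the output; ties fail. */
    static <T> Decision vote(T output, List<Predicate<T>> judges) {
        long passes = judges.stream().filter(judge -> judge.test(output)).count();
        return passes * 2 > judges.size() ? Decision.PASS : Decision.FAIL;
    }
}
```

In a real gate each predicate would wrap its own LLM-backed scorer, and the tie-breaking rule is a policy choice rather than a given.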

Step 8: Supervisor

The LLM autonomously selects which sub-agent to invoke each iteration, terminating when a condition is met.
Step<Object, Object> review = Step.named("review", (ctx, in) ->
        chat.prompt()
                .user("Review this text and suggest one improvement: " + in)
                .call().content());

Step<Object, Object> edit = Step.named("edit", (ctx, in) ->
        chat.prompt()
                .user("Edit this text to be more concise: " + in)
                .call().content());

Object result = Workflow.<String, Object>supervisor("text-improver", chat)
        .agents(review, edit)
        .until(ctx -> ctx.get(AgentContext.ITERATION_COUNT).orElse(0) >= 3)
        .run("The very big and extremely large dragon was flying very high " +
             "up in the sky above the tall mountains.");
The supervisor generates a routing prompt from agent names. Each iteration, the LLM picks the most appropriate agent for the current state. The .until() predicate reads from context — ITERATION_COUNT is automatically maintained, but you can use any context key. View source: SupervisorDemo.java
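The control flow described above (pick an agent, run it, feed the result back, stop when a predicate holds) can be sketched as a small loop. This is an illustration under assumptions, not the supervisor's implementation: SupervisorSketch is a hypothetical class, the selector function stands in for the LLM's choice, and the stop predicate sees only the iteration count.

```java
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.IntPredicate;

/** Conceptual sketch of a supervisor loop with a stubbed selector in place of the LLM. */
final class SupervisorSketch {

    static String run(String input,
                      Map<String, Function<String, String>> agents,
                      BiFunction<String, Integer, String> selector,
                      IntPredicate until) {
        String current = input;
        int iteration = 0;
        // The stop predicate is checked before each iteration, including the first.
        while (!until.test(iteration)) {
            String chosen = selector.apply(current, iteration);
            Function<String, String> agent = agents.get(chosen);
            if (agent == null) {
                throw new IllegalStateException("Selector chose unknown agent: " + chosen);
            }
            current = agent.apply(current); // each agent's output becomes the next input
            iteration++;
        }
        return current;
    }
}
```

With a stop condition of iteration >= 3, exactly three agent invocations happen, whichever agents the selector picks.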

Runnable Code

Every step in this tutorial has a corresponding runnable module in the workflow-dsl-examples repository. Clone it and run any module with ./mvnw exec:java -pl module-NN-name.
git clone https://github.com/markpollack/workflow-dsl-examples.git
cd workflow-dsl-examples
export OPENAI_API_KEY=sk-...
./mvnw exec:java -pl module-01-sequential

What’s Next

  • Annotation Model: declarative workflows with @Agent, @ExceptionHandler, and AgentRegistry
  • Parameterization: 4 patterns for getting data into steps
  • Durability: crash recovery with CheckpointingStepRunner and Temporal
  • All Examples: complete reference with all 9 patterns plus sub-workflow composition