This project now publishes new releases under the Maven groupId
io.github.markpollack. If you previously used org.springaicommunity,
update your dependency coordinates to the current values shown below.
Every runtime is adapted into a JudgmentContext; the same judges and juries then evaluate the result.
The core module (agent-judge-core) has zero external dependencies and contains the evaluation abstractions plus basic deterministic judges. Four additional judge-family modules add command execution, file comparison, LLM evaluation, and RAG assessment. Four framework bridge modules connect to Spring AI, LangChain4j, Koog, and AgentClient.
The agent-judge-ai-core module provides framework-neutral AI-backed judge infrastructure. ModelBackedJudge composes a prompt template, model backend, and response classifier into a judge, with no subclassing needed. JudgeModel implementations in agent-judge-llm (SpringAiJudgeModel) and agent-judge-agent-client (AgentClientJudgeModel) connect to specific AI backends.
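The composition idea described above can be illustrated with plain functions. This is a minimal sketch, not the library's API: Verdict, ComposedJudgeSketch, compose, and demoJudge are hypothetical names, and the model backend is a stub that never calls a real model.

```java
import java.util.function.Function;

// Hypothetical result type for this sketch; the real Judgment type is richer.
record Verdict(boolean pass, String reasoning) {}

final class ComposedJudgeSketch {

    // Chain three parts into one judge: render the prompt, call the model, classify the reply.
    static Function<String, Verdict> compose(
            Function<String, String> promptTemplate,
            Function<String, String> modelBackend,
            Function<String, Verdict> classifier) {
        return promptTemplate.andThen(modelBackend).andThen(classifier);
    }

    // Builds a judge from three lambdas; no subclass is defined anywhere.
    static Function<String, Verdict> demoJudge() {
        Function<String, String> template =
                output -> "Is this answer correct? Reply YES or NO.\n" + output;
        // Stub backend (assumption): stands in for a real model call.
        Function<String, String> stubModel =
                prompt -> prompt.contains("4") ? "YES" : "NO";
        Function<String, Verdict> classifier =
                reply -> new Verdict(reply.trim().equalsIgnoreCase("YES"), "Model replied: " + reply);
        return compose(template, stubModel, classifier);
    }

    public static void main(String[] args) {
        System.out.println(demoJudge().apply("2 + 2 = 4"));
    }
}
```

Swapping the stub for a real backend changes only one of the three functions, which is the point of composing rather than subclassing.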
Quick Start
Every runtime is adapted into a JudgmentContext, then evaluated by judges and juries.
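The adapt-then-judge flow might look like the sketch below. It uses simplified stand-in types (SketchContext, SketchJudgment, SketchJudge, QuickStartSketch), all of which are assumptions made for illustration; the library's actual JudgmentContext and Judgment types and their constructors are documented in Getting Started.

```java
// Stand-in for JudgmentContext (assumption): the goal plus the runtime's output as text.
record SketchContext(String goal, String output) {}

// Stand-in for Judgment (assumption): a score, a pass/fail flag, and reasoning.
record SketchJudgment(double score, boolean pass, String reasoning) {}

// A judge is a single function from context to judgment, so a lambda is enough.
@FunctionalInterface
interface SketchJudge {
    SketchJudgment judge(SketchContext context);
}

final class QuickStartSketch {

    // Adapter step: whatever the runtime returned is reduced to one context shape.
    static SketchContext adapt(String goal, Object runtimeResult) {
        return new SketchContext(goal, String.valueOf(runtimeResult));
    }

    // A deterministic judge written as a lambda: passes when the output is not blank.
    static final SketchJudge NON_EMPTY = ctx -> {
        boolean ok = !ctx.output().isBlank();
        return new SketchJudgment(ok ? 1.0 : 0.0, ok, ok ? "Output is non-empty" : "Output is blank");
    };

    public static void main(String[] args) {
        SketchContext ctx = adapt("Summarize the README", "Agent Judge evaluates agent output.");
        System.out.println(NON_EMPTY.judge(ctx));
    }
}
```

Because the judge only sees the adapted context, the same NON_EMPTY judge would work unchanged for output from any runtime.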
Core Abstractions
Judge
Functional interface: takes a JudgmentContext and returns a Judgment with score, status, reasoning, and granular checks
Jury
Multi-judge aggregation with voting strategies: majority, consensus, weighted average, median
ModelBackedJudge
Composable AI-backed judge built from a prompt template, a model backend, and a response classifier
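The four voting strategies named under Jury can be sketched as plain aggregation functions. This is an illustration under stated assumptions, not the library's implementation: VotingSketch and its method names are hypothetical, "majority" is taken to mean strictly more than half, and "consensus" is taken to mean unanimous agreement.

```java
import java.util.Arrays;
import java.util.List;

final class VotingSketch {

    // Majority (assumed: strictly more than half of the votes pass).
    static boolean majority(List<Boolean> votes) {
        long passes = votes.stream().filter(v -> v).count();
        return passes * 2 > votes.size();
    }

    // Consensus (assumed: every judge passes; an empty jury does not pass).
    static boolean consensus(List<Boolean> votes) {
        return !votes.isEmpty() && votes.stream().allMatch(v -> v);
    }

    // Weighted average: each score contributes in proportion to its judge's weight.
    static double weightedAverage(double[] scores, double[] weights) {
        if (scores.length == 0 || scores.length != weights.length) {
            throw new IllegalArgumentException("scores and weights must be non-empty and equal in length");
        }
        double weightedSum = 0.0;
        double totalWeight = 0.0;
        for (int i = 0; i < scores.length; i++) {
            weightedSum += scores[i] * weights[i];
            totalWeight += weights[i];
        }
        if (totalWeight <= 0.0) {
            throw new IllegalArgumentException("total weight must be positive");
        }
        return weightedSum / totalWeight;
    }

    // Median: middle score after sorting, or the mean of the two middle scores.
    static double median(double[] scores) {
        if (scores.length == 0) {
            throw new IllegalArgumentException("scores must be non-empty");
        }
        double[] sorted = scores.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(majority(List.of(true, true, false)));
        System.out.println(median(new double[] {0.2, 0.9, 0.5}));
    }
}
```

Majority and consensus operate on pass/fail votes, while weighted average and median operate on numeric scores, so a jury needs to pick a strategy that matches what its judges return.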
Judge Families
| Type | Module | Cost | Example |
|---|---|---|---|
| Deterministic | agent-judge-core | Free | FileExistsJudge, FileContentJudge, custom rules |
| Command | agent-judge-exec | Compute only | BuildSuccessJudge (Maven/Gradle), CommandJudge |
| File Comparison | agent-judge-file | Free | JavaSemanticJudge, MavenSemanticJudge, AST-based diffs |
| LLM | agent-judge-llm | Token cost | CorrectnessJudge, custom LLM evaluation |
| RAG | agent-judge-rag | Token cost | FaithfulnessJudge, HallucinationJudge, ContextualRelevanceJudge |
Framework Bridges
| Framework | Module | Evaluator |
|---|---|---|
| Spring AI | agent-judge-spring-ai | SpringAiEvaluator adapts ChatResponse |
| LangChain4j | agent-judge-langchain4j | LangChain4jEvaluator adapts Result<T> |
| Koog | agent-judge-koog | KoogEvaluator adapts AIAgent output |
| AgentClient | agent-judge-agent-client | AgentClientEvaluator adapts CLI-agent responses |
Installation
Each module is published separately under the groupId io.github.markpollack.
Start with agent-judge-core, then add only the judge-family or bridge modules you need.
See Getting Started for the full dependency list.
Documentation
Getting Started
Add evaluation to your agent pipeline
Tutorial: Evaluation Pipeline
Build a multi-judge jury step by step
Writing Custom Judges
Lambda judges, DeterministicJudge, LLMJudge template method
Built-in Judges
Complete catalog of every judge across all modules
Jury System
SimpleJury, CascadedJury, voting strategies, composition
API Reference
All public types, interfaces, and records
License
Agent Judge is source-available under BSL 1.1. Internal enterprise use is welcome; commercial redistribution or competing hosted/managed offerings require permission.
Resources
Source Code
10 modules: 5 judge families + AI core + 4 framework bridges
Tutorial Code
8 runnable modules, from single judge to ModelBackedJudge
Design Philosophy
Why zero deps, functional interface, sealed scores, cascaded cost