Why a Jury?
A single judge gives you one judgment.
A jury aggregates multiple judgments into a verdict with diagnostic information — when something fails, you can see which checks failed, which tier failed, and why.
Agent Judge provides two jury types:
| Jury | Use when |
|------|----------|
| SimpleJury | All judges run as peers; aggregate with voting |
| CascadedJury | Judges are organized in cost tiers; fail fast on cheap checks, escalate to expensive ones |
The snippets below omit imports for brevity. See API Reference for package names.
SimpleJury
Run multiple judges and aggregate results with a voting strategy:
SimpleJury jury = SimpleJury.builder()
    .judge(new FileExistsJudge("output.txt"))
    .judge(BuildSuccessJudge.maven("compile"), 2.0) // Weight of 2.0
    .judge(new FileContentJudge("output.txt", "expected", FileContentJudge.MatchMode.CONTAINS))
    .votingStrategy(new MajorityVotingStrategy())
    .parallel(true) // Execute judges concurrently (default)
    .build();

Verdict verdict = jury.vote(context);
Builder API
| Method | Description | Default |
|--------|-------------|---------|
| .judge(Judge) | Add judge with weight 1.0 | none |
| .judge(Judge, double) | Add judge with custom weight | none |
| .votingStrategy(VotingStrategy) | Aggregation method (required) | none |
| .parallel(boolean) | Concurrent execution | true |
| .executor(Executor) | Custom thread pool for parallel execution | Common pool |
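To make the parallel and executor options more concrete, here is a minimal standalone sketch of running independent checks concurrently on a caller-supplied Executor. It uses only JDK classes and is not Agent Judge's implementation; the class name ParallelChecksSketch, the runAll method, and the use of Supplier<Boolean> as a stand-in for a judge are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

/** Hypothetical sketch: run named checks concurrently on a custom executor. */
class ParallelChecksSketch {

    /** Submits every check to the executor, then waits for all results in insertion order. */
    static Map<String, Boolean> runAll(Map<String, Supplier<Boolean>> checks, Executor executor) {
        Map<String, CompletableFuture<Boolean>> futures = new LinkedHashMap<>();
        for (Map.Entry<String, Supplier<Boolean>> entry : checks.entrySet()) {
            // supplyAsync with an explicit executor avoids the shared common pool.
            futures.put(entry.getKey(), CompletableFuture.supplyAsync(entry.getValue(), executor));
        }
        Map<String, Boolean> results = new LinkedHashMap<>();
        for (Map.Entry<String, CompletableFuture<Boolean>> entry : futures.entrySet()) {
            results.put(entry.getKey(), entry.getValue().join()); // blocks until this check finishes
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, Supplier<Boolean>> checks = new LinkedHashMap<>();
        checks.put("file-exists", () -> true);   // placeholder checks, not real judges
        checks.put("build-success", () -> false);

        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println(runAll(checks, pool));
        } finally {
            pool.shutdown(); // the caller owns the pool and is responsible for closing it
        }
    }
}
```

A dedicated pool is mainly useful when judges block on I/O or network calls, since long-running tasks on a shared pool can starve unrelated work.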
Reading the Verdict
Verdict verdict = jury.vote(context);

// Aggregated result
Judgment overall = verdict.aggregated();
System.out.println(overall.status());    // PASS or FAIL
System.out.println(overall.reasoning()); // "Majority passed (2/3)"

// Individual results
verdict.individualByName().forEach((name, judgment) ->
    System.out.println(name + " -> " + judgment.status()));

// Weights used
Map<String, Double> weights = verdict.weights();
SimpleJury aggregates peers. It does not provide fail-fast cost control. Use CascadedJury when you want cheap checks to prevent expensive judges from running.
Voting Strategies
| Strategy | Pass condition | Best for |
|----------|----------------|----------|
| MajorityVotingStrategy | passCount > failCount | General purpose |
| ConsensusStrategy | All judges agree | High-stakes evaluation |
| AverageVotingStrategy | average(scores) >= 0.5 | Continuous scores |
| WeightedAverageStrategy | weightedAvg(scores) >= 0.5 | Judges with different importance |
| MedianVotingStrategy | median(scores) >= 0.5 | Outlier-resistant scoring |
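The score-based pass conditions can be illustrated with a small standalone sketch. This is plain arithmetic written for this page, not the library's code: the class ScoreAggregationSketch and its methods are hypothetical, and only the 0.5 threshold comes from the table. It shows how the same three scores can pass or fail depending on the aggregation chosen.

```java
import java.util.Arrays;

/** Hypothetical sketch of the average, weighted-average, and median pass conditions. */
class ScoreAggregationSketch {
    static final double THRESHOLD = 0.5; // pass when the aggregate is >= 0.5

    static double average(double[] scores) {
        return Arrays.stream(scores).average().orElse(0.0);
    }

    static double weightedAverage(double[] scores, double[] weights) {
        if (scores.length != weights.length) {
            throw new IllegalArgumentException("scores and weights must have the same length");
        }
        double weightedSum = 0.0;
        double totalWeight = 0.0;
        for (int i = 0; i < scores.length; i++) {
            weightedSum += scores[i] * weights[i];
            totalWeight += weights[i];
        }
        return totalWeight == 0.0 ? 0.0 : weightedSum / totalWeight;
    }

    static double median(double[] scores) {
        if (scores.length == 0) {
            return 0.0;
        }
        double[] sorted = scores.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    static boolean passes(double aggregate) {
        return aggregate >= THRESHOLD;
    }

    public static void main(String[] args) {
        double[] scores = {0.6, 0.7, 0.0};  // one judge is a low outlier
        double[] weights = {2.0, 2.0, 1.0}; // the outlier carries the least weight
        System.out.println("average:  " + passes(average(scores)));                  // false
        System.out.println("median:   " + passes(median(scores)));                   // true
        System.out.println("weighted: " + passes(weightedAverage(scores, weights))); // true
    }
}
```

Here a single zero score pulls the plain average to about 0.43, below the threshold, while the median (0.6) and the weighted average (about 0.52) still pass.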
Configuring MajorityVotingStrategy
MajorityVotingStrategy strategy = new MajorityVotingStrategy(
    TiePolicy.FAIL,            // What to do on a tie
    ErrorPolicy.TREAT_AS_FAIL  // How to handle ERROR judgments
);
TiePolicy — when pass count equals fail count:
| Policy | Behavior |
|--------|----------|
| TiePolicy.PASS | Optimistic: resolve ties as PASS |
| TiePolicy.FAIL | Pessimistic: resolve ties as FAIL (default) |
| TiePolicy.ABSTAIN | Neutral: no verdict |
ErrorPolicy — when a judge returns JudgmentStatus.ERROR:
| Policy | Behavior |
|--------|----------|
| ErrorPolicy.TREAT_AS_FAIL | Count errors as failures (default) |
| ErrorPolicy.TREAT_AS_ABSTAIN | Count the judge as having abstained |
| ErrorPolicy.IGNORE | Exclude the errored judge from the vote count and diagnostics |
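The interaction between the two policies is easiest to see in a tally. The following standalone sketch models it with its own enums; it is an illustration under assumptions, not the library's implementation. In particular, the Status enum and the decide method are hypothetical, and the sketch treats TREAT_AS_ABSTAIN and IGNORE identically for counting purposes, since the tables above distinguish them only in how the errored judge is reported.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of majority voting with tie and error policies. */
class MajorityVoteSketch {
    enum Status { PASS, FAIL, ERROR, ABSTAIN }
    enum TiePolicy { PASS, FAIL, ABSTAIN }
    enum ErrorPolicy { TREAT_AS_FAIL, TREAT_AS_ABSTAIN, IGNORE }

    static Status decide(List<Status> votes, TiePolicy tiePolicy, ErrorPolicy errorPolicy) {
        int passCount = 0;
        int failCount = 0;
        for (Status vote : votes) {
            if (vote == Status.PASS) {
                passCount++;
            } else if (vote == Status.FAIL) {
                failCount++;
            } else if (vote == Status.ERROR && errorPolicy == ErrorPolicy.TREAT_AS_FAIL) {
                failCount++; // only this policy lets an error move the tally
            }
            // ABSTAIN votes, and errors under the other two policies, count for neither side.
        }
        if (passCount > failCount) {
            return Status.PASS;
        }
        if (failCount > passCount) {
            return Status.FAIL;
        }
        // Equal counts: the tie policy decides.
        switch (tiePolicy) {
            case PASS: return Status.PASS;
            case FAIL: return Status.FAIL;
            default:   return Status.ABSTAIN;
        }
    }

    public static void main(String[] args) {
        List<Status> votes = Arrays.asList(Status.PASS, Status.FAIL, Status.ERROR);
        // Error counted as a failure: 1 pass vs 2 fails.
        System.out.println(decide(votes, TiePolicy.PASS, ErrorPolicy.TREAT_AS_FAIL)); // FAIL
        // Error left out: 1 pass vs 1 fail, so the tie policy applies.
        System.out.println(decide(votes, TiePolicy.PASS, ErrorPolicy.IGNORE));        // PASS
        System.out.println(decide(votes, TiePolicy.FAIL, ErrorPolicy.IGNORE));        // FAIL
    }
}
```

The same three votes produce different outcomes depending on the policies, which is why the defaults (TiePolicy.FAIL and ErrorPolicy.TREAT_AS_FAIL) are the conservative choice.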
CascadedJury
A cascaded jury organizes judges into tiers.
Each tier is itself a jury (typically a SimpleJury).
Tiers execute sequentially — if a cheap tier already has a verdict, expensive tiers never run.
// Tier 1: Deterministic guardrails
Jury deterministic = SimpleJury.builder()
    .judge(new FileExistsJudge("src/main/java/App.java"))
    .judge(BuildSuccessJudge.maven("compile"))
    .votingStrategy(new MajorityVotingStrategy())
    .build();

// Tier 2: Structural (cheap, compares against reference)
// Requires context.metadata().get("expectedDir") to point at the reference directory
Jury structural = SimpleJury.builder()
    .judge(new FileComparisonJudge())
    .votingStrategy(new ConsensusStrategy())
    .build();

// Tier 3: Semantic (LLM cost)
Jury semantic = SimpleJury.builder()
    .judge(new CorrectnessJudge(chatClientBuilder))
    .votingStrategy(new MajorityVotingStrategy())
    .build();

CascadedJury jury = CascadedJury.builder()
    .tier("deterministic", deterministic, TierPolicy.REJECT_ON_ANY_FAIL)
    .tier("structural", structural, TierPolicy.ACCEPT_ON_ALL_PASS)
    .tier("semantic", semantic, TierPolicy.FINAL_TIER)
    .build();

Verdict verdict = jury.vote(context);
Tier Policies
| Policy | Behavior | Typical use |
|--------|----------|-------------|
| REJECT_ON_ANY_FAIL | Stop immediately if any judge in this tier fails | Guardrails: must compile, files must exist |
| ACCEPT_ON_ALL_PASS | Stop if all judges pass; accept without escalating to later tiers | Consensus gate when this tier is strong enough on its own |
| FINAL_TIER | Runs when reached and produces the final verdict | Last tier (required) |
The last tier in a CascadedJury must use TierPolicy.FINAL_TIER.
The builder validates this at build time.
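The control flow implied by these policies can be sketched in a few lines of plain Java. This is a simplified model, not the library's implementation: the CascadeSketch class, its Tier and Result types, and the use of Supplier<Boolean> as a stand-in for a judge are hypothetical. It also assumes the final tier passes only when all of its checks pass, whereas a real final tier would use its own voting strategy, and it validates the last tier at run time rather than at build time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical sketch of tiered, fail-fast evaluation. */
class CascadeSketch {
    enum TierPolicy { REJECT_ON_ANY_FAIL, ACCEPT_ON_ALL_PASS, FINAL_TIER }

    static final class Tier {
        final String name;
        final TierPolicy policy;
        final List<Supplier<Boolean>> checks;

        @SafeVarargs
        Tier(String name, TierPolicy policy, Supplier<Boolean>... checks) {
            this.name = name;
            this.policy = policy;
            this.checks = Arrays.asList(checks);
        }
    }

    static final class Result {
        final boolean passed;
        final List<String> tiersRun; // names of the tiers that actually executed

        Result(boolean passed, List<String> tiersRun) {
            this.passed = passed;
            this.tiersRun = tiersRun;
        }
    }

    static Result run(List<Tier> tiers) {
        if (tiers.isEmpty() || tiers.get(tiers.size() - 1).policy != TierPolicy.FINAL_TIER) {
            throw new IllegalArgumentException("The last tier must use FINAL_TIER");
        }
        List<String> tiersRun = new ArrayList<>();
        for (Tier tier : tiers) {
            tiersRun.add(tier.name);
            boolean allPass = true;
            for (Supplier<Boolean> check : tier.checks) {
                if (!check.get()) {
                    allPass = false;
                }
            }
            if (tier.policy == TierPolicy.REJECT_ON_ANY_FAIL && !allPass) {
                return new Result(false, tiersRun); // guardrail failed: later tiers never run
            }
            if (tier.policy == TierPolicy.ACCEPT_ON_ALL_PASS && allPass) {
                return new Result(true, tiersRun); // strong enough: accept without escalating
            }
            if (tier.policy == TierPolicy.FINAL_TIER) {
                return new Result(allPass, tiersRun); // simplified final verdict
            }
            // Otherwise this tier was inconclusive; escalate to the next one.
        }
        throw new IllegalStateException("Unreachable: the last tier is FINAL_TIER");
    }

    public static void main(String[] args) {
        List<Tier> tiers = Arrays.asList(
            new Tier("deterministic", TierPolicy.REJECT_ON_ANY_FAIL, () -> false),
            new Tier("semantic", TierPolicy.FINAL_TIER, () -> true));
        Result result = run(tiers);
        System.out.println("passed=" + result.passed + ", tiersRun=" + result.tiersRun);
        // prints: passed=false, tiersRun=[deterministic]
    }
}
```

Because the guardrail tier fails, the expensive tier's check is never invoked, which is the cost saving a cascade is meant to provide.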
Inspecting Tier Results
Verdict verdict = jury.vote(context);

// Overall result
System.out.println(verdict.aggregated().status());

// Per-tier sub-verdicts
for (Verdict tierVerdict : verdict.subVerdicts()) {
    System.out.println("Tier: " + tierVerdict.aggregated().reasoning());
    tierVerdict.individualByName().forEach((name, j) ->
        System.out.println("  " + name + " -> " + j.status()));
}
Jury Composition
Named Judges
Wrap any judge with a name for readable verdict output:
Judge named = Judges.named(myJudge, "build-check", "Verifies compilation");
Without names, judges get auto-generated identifiers in the verdict.
Combining Juries
The Juries utility class provides shortcuts:
import io.github.markpollack.judge.jury.Juries;

// Quick jury from judges
Jury quick = Juries.fromJudges(new MajorityVotingStrategy(), judge1, judge2, judge3);

// Meta-jury: combine two juries
Jury meta = Juries.combine(jury1, jury2, new ConsensusStrategy());

// Multiple juries with a shared strategy
Jury combined = Juries.allOf(new AverageVotingStrategy(), jury1, jury2, jury3);
Choosing a Pattern
| Scenario | Use |
|----------|-----|
| Same-tier judges, single vote | SimpleJury with majority or consensus |
| Weighted importance among judges | SimpleJury with WeightedAverageStrategy |
| Cheap-then-expensive evaluation | CascadedJury with 2-3 tiers |
| Multiple evaluation dimensions | Juries.combine() to merge sub-juries |
| Quick one-off check | Judges.and() or Judges.allOf() (no jury overhead) |
Built-in Judges: Catalog of judges to wire into juries
Writing Custom Judges: Build domain-specific judges for your evaluation criteria