| Example | Description |
| --- | --- |
| Basic Agent-as-Judge Evaluation | Demonstrates synchronous and asynchronous agent-as-judge evaluations. |
| Post-Hook Agent-as-Judge Evaluation | Demonstrates synchronous and asynchronous post-hook judging. |
| Batch Agent-as-Judge Evaluation | Demonstrates evaluating multiple cases in one run. |
| Binary Agent-as-Judge Evaluation | Demonstrates pass/fail response quality evaluation. |
| Agent As Judge Custom Evaluator | Demonstrates using a custom evaluator agent for judging. |
| Team Agent-as-Judge Evaluation | Demonstrates response quality evaluation for team outputs. |
| Team Post-Hook Agent-as-Judge Evaluation | Demonstrates a post-hook judge running on team responses. |
| Agent As Judge With Guidelines | Demonstrates agent-as-judge scoring with additional guidelines. |
| Tool-Using Agent-as-Judge Evaluation | Demonstrates judging responses generated by an agent using tools. |
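
For readers new to the pattern, the binary and batch entries can be pictured with a small library-independent sketch. It is not the framework's API: `Verdict`, `keyword_judge`, `evaluate_batch`, and `pass_rate` are hypothetical names, and the judge here is a simple keyword check standing in for a real evaluator agent, which would normally call a model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Verdict:
    """Pass/fail result from a judge, with a short justification."""
    passed: bool
    reason: str


# A judge is any callable mapping (prompt, response, criteria) to a Verdict.
# In a real setup this would be an evaluator agent backed by a model.
Judge = Callable[[str, str, str], Verdict]


def keyword_judge(prompt: str, response: str, criteria: str) -> Verdict:
    """Hypothetical stand-in judge: passes only if every comma-separated
    term in `criteria` appears in the response (case-insensitive)."""
    required = [term.strip().lower() for term in criteria.split(",") if term.strip()]
    missing = [term for term in required if term not in response.lower()]
    if missing:
        return Verdict(False, "missing: " + ", ".join(missing))
    return Verdict(True, "all required terms present")


def evaluate_batch(
    cases: List[Tuple[str, str]], judge: Judge, criteria: str
) -> List[Verdict]:
    """Judge several (prompt, response) pairs against the same criteria."""
    return [judge(prompt, response, criteria) for prompt, response in cases]


def pass_rate(verdicts: List[Verdict]) -> float:
    """Fraction of verdicts that passed; 0.0 for an empty batch."""
    if not verdicts:
        return 0.0
    return sum(v.passed for v in verdicts) / len(verdicts)


if __name__ == "__main__":
    cases = [
        ("What is 2 + 2?", "The answer is 4."),
        ("What is 2 + 2?", "I am not sure."),
    ]
    verdicts = evaluate_batch(cases, keyword_judge, criteria="4")
    for (prompt, response), verdict in zip(cases, verdicts):
        print(f"{response!r}: passed={verdict.passed} ({verdict.reason})")
    print("pass rate:", pass_rate(verdicts))
```

Because the judge is just a callable, swapping in a custom evaluator or adding extra guidelines to `criteria` leaves the batch and pass-rate logic unchanged.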