| Basic Agent-as-Judge Evaluation | Demonstrates synchronous and asynchronous agent-as-judge evaluations. |
| Post-Hook Agent-as-Judge Evaluation | Demonstrates synchronous and asynchronous post-hook judging. |
| Batch Agent-as-Judge Evaluation | Demonstrates evaluating multiple cases in one run. |
| Binary Agent-as-Judge Evaluation | Demonstrates pass/fail response quality evaluation. |
| Agent-as-Judge Custom Evaluator | Demonstrates using a custom evaluator agent for judging. |
| Team Agent-as-Judge Evaluation | Demonstrates response quality evaluation for team outputs. |
| Team Post-Hook Agent-as-Judge Evaluation | Demonstrates a post-hook judge running on team responses. |
| Agent-as-Judge With Guidelines | Demonstrates agent-as-judge scoring with additional guidelines. |
| Tool-Using Agent-as-Judge Evaluation | Demonstrates judging responses generated by an agent using tools. |