Learn how to evaluate your Agno Agents and Teams across three key dimensions: accuracy (using LLM-as-a-judge), performance (runtime and memory), and reliability (tool calls).
`AccuracyEval` will run the Agent with the input, then use a different model (`o4-mini`) to score the Agent's response according to the guidelines provided.
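The run-then-judge flow described above can be sketched in plain Python. This is a minimal illustration of the LLM-as-a-judge pattern only, not Agno's API: `evaluate_accuracy`, `JudgeResult`, `stub_agent`, and `stub_judge` are hypothetical names, and the stubs stand in for real model calls so the snippet runs offline.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class JudgeResult:
    """Hypothetical container for a judge's verdict."""
    score: int   # 1 (poor) to 10 (excellent), an assumed scale
    reason: str


def evaluate_accuracy(
    agent: Callable[[str], str],
    judge: Callable[[str, str, str, str], JudgeResult],
    prompt: str,
    expected_output: str,
    guidelines: str = "",
) -> JudgeResult:
    """Run the agent on the prompt, then let a separate judge score the response."""
    output = agent(prompt)
    return judge(prompt, output, expected_output, guidelines)


def stub_agent(prompt: str) -> str:
    # Stand-in for a real Agent run (assumption: no model is called here).
    return "10 * 5 = 50, then 50 ** 2 = 2500"


def stub_judge(prompt: str, output: str, expected: str, guidelines: str) -> JudgeResult:
    # Stand-in for the judge model: a real judge would be an LLM prompted
    # with the expected output and the guidelines.
    if expected in output:
        return JudgeResult(score=10, reason="response contains the expected answer")
    return JudgeResult(score=1, reason="expected answer is missing")


result = evaluate_accuracy(
    stub_agent,
    stub_judge,
    prompt="What is 10*5 then to the power of 2? Do it step by step.",
    expected_output="2500",
    guidelines="The response should include the steps and the final answer.",
)
print(result.score, result.reason)
```

Keeping the judge separate from the agent is the point of the pattern: the model that scores the response is not the one that produced it.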
You can also run `AccuracyEval` on an existing output (without running the Agent).
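Scoring an existing output skips the Agent run and passes a stored response straight to the judge. The sketch below illustrates that idea under stated assumptions: `score_existing_output` and `keyword_judge` are hypothetical names, not part of Agno, and the judge is a simple stub rather than a model call.

```python
from typing import Callable, Tuple


def keyword_judge(output: str, expected: str) -> Tuple[int, str]:
    # Stub judge (assumption): a real setup would ask a judge model instead.
    if expected in output:
        return 10, "output contains the expected answer"
    return 1, "output does not contain the expected answer"


def score_existing_output(
    judge: Callable[[str, str], Tuple[int, str]],
    output: str,
    expected_output: str,
) -> Tuple[int, str]:
    """Score a response that was produced earlier; no agent is invoked."""
    return judge(output, expected_output)


# A response captured from a previous run, for example from logs.
stored_output = "The final answer is 2500."
score, reason = score_existing_output(keyword_judge, stored_output, "2500")
print(score, reason)
```

This shape is useful for re-scoring logged responses when the guidelines change, since the Agent does not have to be re-run.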
beta
.