Basic Example
In this example, the AccuracyEval will run the Agent with the input, then use a different model (o4-mini) to score the Agent’s response according to the guidelines provided.
accuracy.py
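The snippet below is a minimal sketch of this setup. It assumes the AccuracyEval interface from agno.eval.accuracy with model, agent, input, expected_output and additional_guidelines parameters; exact names may differ between Agno versions, so check the API reference.

```python
from typing import Optional

from agno.agent import Agent
from agno.eval.accuracy import AccuracyEval, AccuracyResult
from agno.models.openai import OpenAIChat

# The agent being evaluated
agent = Agent(model=OpenAIChat(id="gpt-4o"))

evaluation = AccuracyEval(
    model=OpenAIChat(id="o4-mini"),  # model used to score the response
    agent=agent,
    input="What is 10*5 then to the power of 2? Do it step by step.",
    expected_output="2500",
    additional_guidelines="The answer should show the intermediate steps.",
)

result: Optional[AccuracyResult] = evaluation.run(print_results=True)
# Assumption: the result exposes an avg_score field on a 1-10 scale
assert result is not None and result.avg_score >= 8
```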
Evaluator Agent
You can use another agent to evaluate the accuracy of the Agent’s response. This strategy is usually referred to as “LLM-as-a-judge”. You can adjust the evaluator Agent to make it fit the criteria you want to evaluate:
accuracy_with_evaluator_agent.py
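A sketch of this pattern is below. It assumes AccuracyEval accepts an evaluator_agent parameter; that name is an assumption, so consult the API reference.

```python
from agno.agent import Agent
from agno.eval.accuracy import AccuracyEval
from agno.models.openai import OpenAIChat

# Agent whose answers we want to evaluate
agent = Agent(model=OpenAIChat(id="gpt-4o"))

# A separate "judge" agent with its own evaluation instructions
evaluator_agent = Agent(
    model=OpenAIChat(id="o4-mini"),
    instructions="Be strict: only award a high score if every step of the reasoning is correct.",
)

evaluation = AccuracyEval(
    agent=agent,
    evaluator_agent=evaluator_agent,  # assumption: parameter name may differ
    input="What is 10*5 then to the power of 2? Do it step by step.",
    expected_output="2500",
)
result = evaluation.run(print_results=True)
```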

Accuracy with Tools
You can also run the AccuracyEval with tools.
accuracy_with_tools.py
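For instance, the agent under evaluation can be given calculator tools. This is a sketch assuming CalculatorTools from agno.tools.calculator:

```python
from agno.agent import Agent
from agno.eval.accuracy import AccuracyEval
from agno.models.openai import OpenAIChat
from agno.tools.calculator import CalculatorTools

evaluation = AccuracyEval(
    model=OpenAIChat(id="o4-mini"),
    agent=Agent(
        model=OpenAIChat(id="gpt-4o"),
        tools=[CalculatorTools(enable_all=True)],  # assumption: enables all calculator operations
    ),
    input="What is 10*5 then to the power of 2? Do it step by step.",
    expected_output="2500",
    additional_guidelines="The agent should use the calculator tools for the arithmetic.",
)
result = evaluation.run(print_results=True)
```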
Accuracy with given output
You can also evaluate a given output directly, without running the Agent:
accuracy_with_given_answer.py
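A sketch of this mode is below. It assumes a helper such as run_with_given_answer; the method name is inferred from the example filename and may differ, so check the API reference.

```python
from agno.agent import Agent
from agno.eval.accuracy import AccuracyEval
from agno.models.openai import OpenAIChat

evaluation = AccuracyEval(
    model=OpenAIChat(id="o4-mini"),
    agent=Agent(model=OpenAIChat(id="gpt-4o")),
    input="What is 10*5 then to the power of 2?",
    expected_output="2500",
)

# Score a pre-generated answer instead of running the agent.
# Assumption: the method name and signature may differ across versions.
result = evaluation.run_with_given_answer(answer="2500", print_results=True)
```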
Accuracy with asynchronous functions
Evaluate accuracy with asynchronous functions:
async_accuracy.py
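A sketch, assuming arun is the asynchronous counterpart of run:

```python
import asyncio

from agno.agent import Agent
from agno.eval.accuracy import AccuracyEval
from agno.models.openai import OpenAIChat

evaluation = AccuracyEval(
    model=OpenAIChat(id="o4-mini"),
    agent=Agent(model=OpenAIChat(id="gpt-4o")),
    input="What is 10*5 then to the power of 2?",
    expected_output="2500",
)

# Assumption: arun is the async counterpart of run()
result = asyncio.run(evaluation.arun(print_results=True))
```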
Accuracy with Teams
Evaluate accuracy with a team:
accuracy_with_team.py
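A sketch, assuming a Team built from agno.team and that AccuracyEval can take a team in place of an agent (the team parameter is an assumption):

```python
from agno.agent import Agent
from agno.eval.accuracy import AccuracyEval
from agno.models.openai import OpenAIChat
from agno.team import Team

researcher = Agent(name="Researcher", model=OpenAIChat(id="gpt-4o"))
writer = Agent(name="Writer", model=OpenAIChat(id="gpt-4o"))

team = Team(
    model=OpenAIChat(id="gpt-4o"),
    members=[researcher, writer],
)

evaluation = AccuracyEval(
    model=OpenAIChat(id="o4-mini"),
    team=team,  # assumption: pass the team instead of a single agent
    input="Work together to answer: what is 10*5 then to the power of 2?",
    expected_output="2500",
)
result = evaluation.run(print_results=True)
```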
Accuracy with Number Comparison
This example demonstrates evaluating an agent’s ability to make correct numerical comparisons, which can be tricky for LLMs when dealing with decimal numbers:
accuracy_comparison.py
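A sketch of such a check; num_iterations repeats the run so inconsistent answers show up in the average score (parameter name assumed; check the reference):

```python
from agno.agent import Agent
from agno.eval.accuracy import AccuracyEval
from agno.models.openai import OpenAIChat

evaluation = AccuracyEval(
    model=OpenAIChat(id="o4-mini"),
    agent=Agent(model=OpenAIChat(id="gpt-4o")),
    input="Which number is bigger: 9.11 or 9.9?",
    expected_output="9.9 is bigger than 9.11.",
    num_iterations=3,  # assumption: repeat the eval to surface inconsistent answers
)
result = evaluation.run(print_results=True)
```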
Usage
1. Create a virtual environment
Open the terminal and create a Python virtual environment.
2. Install libraries
Install the required libraries (typically agno and your model provider’s SDK, e.g. openai).
3. Run your Accuracy Eval example
Track Evals in your AgentOS
The best way to track your Agno Evals is with the AgentOS platform.
evals_demo.py
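Below is a sketch of an AgentOS app that persists eval results so they appear in the platform. It assumes AgentOS from agno.os, SqliteDb from agno.db.sqlite, and that AccuracyEval accepts a db parameter; these names are assumptions, so check the current API reference.

```python
from agno.agent import Agent
from agno.db.sqlite import SqliteDb
from agno.eval.accuracy import AccuracyEval
from agno.models.openai import OpenAIChat
from agno.os import AgentOS

# Shared database so eval runs are persisted and visible in AgentOS
db = SqliteDb(db_file="agno.db")

agent = Agent(name="Basic Agent", model=OpenAIChat(id="gpt-4o"), db=db)

evaluation = AccuracyEval(
    db=db,  # assumption: persist results for AgentOS to display
    model=OpenAIChat(id="o4-mini"),
    agent=agent,
    input="What is 10*5 then to the power of 2?",
    expected_output="2500",
)
evaluation.run(print_results=True)

agent_os = AgentOS(agents=[agent])
app = agent_os.get_app()

if __name__ == "__main__":
    agent_os.serve(app="evals_demo:app", reload=True)
```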
For more details, see the Evaluation API Reference.
1. Run the Evals Demo
Run the evals_demo.py file shown above.
2. View the Evals Demo
Head over to https://os.agno.com/evaluation to view the evals.