# What is Evals

> Evals is a way to measure the quality of your Agents and Teams.
Learn how to evaluate your Agno Agents and Teams across four dimensions: **accuracy** (simple correctness checks), **agent as judge** (custom quality criteria), **performance** (runtime and memory), and **reliability** (tool calls).

## Evaluation Dimensions

* **Accuracy:** The accuracy of the Agent's response, judged using LLM-as-a-judge methodology.
* **Agent as Judge:** Custom quality criteria, evaluated using LLM-as-a-judge with scoring.
* **Performance:** The performance of the Agent's response, including latency and memory footprint.
* **Reliability:** The reliability of the Agent's response, including tool calls and error handling.

## Quick Start

Here's a simple example of running an accuracy evaluation:

```python quick_eval.py
from typing import Optional

from agno.agent import Agent
from agno.eval.accuracy import AccuracyEval, AccuracyResult
from agno.models.openai import OpenAIResponses
from agno.tools.calculator import CalculatorTools

# Create an evaluation
evaluation = AccuracyEval(
    model=OpenAIResponses(id="gpt-5.2"),
    agent=Agent(model=OpenAIResponses(id="gpt-5.2"), tools=[CalculatorTools()]),
    input="What is 10*5 then to the power of 2? do it step by step",
    expected_output="2500",
    additional_guidelines="Agent output should include the steps and the final answer.",
)

# Run the evaluation
result: Optional[AccuracyResult] = evaluation.run(print_results=True)
```

## Best Practices

* **Start Simple:** Begin with basic accuracy tests before progressing to complex performance and reliability evaluations
* **Use Multiple Test Cases:** Don't rely on a single test case; build comprehensive test suites that cover edge cases
* **Track Over Time:** Monitor your eval metrics continuously as you iterate on your agents
* **Combine Dimensions:** Evaluate across all four dimensions for a holistic view of agent quality

## Guides

Dive deeper into each evaluation dimension:

1. **[Accuracy Evals](/evals/accuracy/overview)** - Learn LLM-as-a-judge techniques and multiple test case strategies
2. **[Agent as Judge Evals](/evals/agent-as-judge/overview)** - Define custom quality criteria with flexible scoring strategies
3. **[Performance Evals](/evals/performance/overview)** - Measure latency, memory usage, and compare different configurations
4. **[Reliability Evals](/evals/reliability/overview)** - Test tool calls, error handling, and rate limiting behavior
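
The "Use Multiple Test Cases" practice can be sketched as a small suite runner. This is a minimal illustration, not part of Agno: `EvalCase`, `run_suite`, `stub_scorer`, and the pass threshold of 8.0 (assuming a 1 to 10 score scale) are all hypothetical names and values chosen for the example. The stub scorer stands in for a real evaluation call so the sketch runs offline without an API key.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List


@dataclass
class EvalCase:
    """One test case: a prompt plus the answer a judge should expect."""
    name: str
    input: str
    expected_output: str


def run_suite(
    cases: List[EvalCase],
    score_case: Callable[[EvalCase], float],
    pass_threshold: float = 8.0,  # assumed threshold on an assumed 1-10 scale
) -> Dict[str, object]:
    """Score every case and summarize scores, average, and failures."""
    scores = {case.name: score_case(case) for case in cases}
    failed = [name for name, score in scores.items() if score < pass_threshold]
    avg_score = mean(list(scores.values())) if scores else 0.0
    return {"scores": scores, "avg_score": avg_score, "failed": failed}


# Canned answers standing in for a real agent, so the sketch runs offline.
FAKE_ANSWERS = {
    "What is 10*5 then to the power of 2?": "10*5 = 50, then 50^2 = 2500",
    "What is 7*6?": "The answer is 41",
}


def stub_scorer(case: EvalCase) -> float:
    """Hypothetical scorer: 10.0 if the expected output appears, else 1.0.

    In a real suite this function would run an evaluation for the case
    and return its score instead of consulting FAKE_ANSWERS.
    """
    answer = FAKE_ANSWERS.get(case.input, "")
    return 10.0 if case.expected_output in answer else 1.0


CASES = [
    EvalCase("power", "What is 10*5 then to the power of 2?", "2500"),
    EvalCase("product", "What is 7*6?", "42"),
]

summary = run_suite(CASES, stub_scorer)
print(summary)
```

Keeping the scorer as a plain callable means the same suite can be rerun on every agent change, and the summaries can be stored and compared to track metrics over time.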