Agent workloads spawn hundreds of instances and long-running tasks, so stateless, horizontal scalability isn't optional; it's the baseline. Agno optimizes performance across three dimensions:
  1. Agent performance: Instantiation time, memory footprint, tool calls, history management.
  2. System performance: Async-first API, minimal memory, parallel execution, background threads.
  3. Reliability and accuracy: Monitored through evals (covered separately).

Benchmarks

We measured instantiation time and memory footprint for an Agent with 1 tool, run 1,000 times, on an Apple M4 MacBook Pro (October 2025).
| Metric        | Agno    | LangGraph        | PydanticAI    | CrewAI        |
| ------------- | ------- | ---------------- | ------------- | ------------- |
| Instantiation | 3 μs    | 1,587 μs (529×)  | 170 μs (57×)  | 210 μs (70×)  |
| Memory        | 6.6 KiB | 161 KiB (24×)    | 29 KiB (4×)   | 66 KiB (10×)  |
Runtime performance is bottlenecked by model inference and is hard to benchmark accurately, so we focus on minimizing overhead, reducing memory usage, and parallelizing tool calls.
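To make the setup concrete, here is a minimal sketch of the kind of timing loop behind the instantiation numbers. It assumes Agno's `Agent` (import path `agno.agent`) accepts a plain Python function as a tool and can be constructed without a model; the actual benchmark is the cookbook script listed in the next section.

```python
import time

from agno.agent import Agent  # assumed import path; adjust to your Agno version

def get_weather(city: str) -> str:
    """Trivial stand-in for the single tool used in the benchmark."""
    return f"It is sunny in {city}."

RUNS = 1000
start = time.perf_counter()
for _ in range(RUNS):
    Agent(tools=[get_weather])  # construct and discard; measures __init__ cost only
elapsed = time.perf_counter() - start
print(f"Average instantiation: {elapsed / RUNS * 1e6:.1f} μs")
```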

Run it yourself

The benchmark code is available in this cookbook. Run it on your own machine; don't take these numbers at face value.
# Setup virtual environment
./scripts/perf_setup.sh
source .venvs/perfenv/bin/activate

# Run benchmarks
python cookbook/evals/performance/instantiate_agent_with_tool.py           # Agno
python cookbook/evals/performance/comparison/langgraph_instantiation.py    # LangGraph
python cookbook/evals/performance/comparison/crewai_instantiation.py       # CrewAI
python cookbook/evals/performance/comparison/pydantic_ai_instantiation.py  # PydanticAI

How we measure memory

We use Python's tracemalloc module. First, we calculate baseline memory with an empty function; then we run the Agent 1000× and measure the difference. This isolates the Agent's memory usage. If you spot an error in our methodology, let us know.
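As an illustration, here is a minimal sketch of that baseline-then-subtract approach. The `Agent(tools=[...])` construction is an assumption, as above; the real script is cookbook/evals/performance/instantiate_agent_with_tool.py.

```python
import tracemalloc

from agno.agent import Agent  # assumed import path; adjust to your Agno version

def get_weather(city: str) -> str:
    """Trivial stand-in for the single tool used in the benchmark."""
    return f"It is sunny in {city}."

def average_bytes(factory, runs: int = 1000) -> float:
    """Average bytes allocated per call of `factory`. Results are kept
    alive in a list so tracemalloc counts their allocations."""
    tracemalloc.start()
    results = [factory() for _ in range(runs)]
    allocated, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    del results
    return allocated / runs

# Baseline with an empty factory isolates loop and interpreter overhead,
# which is then subtracted from the agent measurement.
baseline = average_bytes(lambda: None)
agent = average_bytes(lambda: Agent(tools=[get_weather]))
print(f"Approx. per-agent footprint: {(agent - baseline) / 1024:.2f} KiB")
```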
While we share benchmarks against other frameworks, accuracy and reliability matter more than speed.