When you run an agent in Agno, the response you get (RunOutput) includes detailed metrics about the run. These metrics help you understand resource usage (like token usage and time), performance, and other aspects of the model and tool calls. Metrics are available at multiple levels:
  • Per message: Each message (assistant, tool, etc.) has its own metrics.
  • Per run: Each RunOutput has its own metrics.
  • Per session: The AgentSession contains aggregated session_metrics that are the sum of all RunOutput.metrics for the session.
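The per-session aggregation described above can be pictured as a field-by-field sum of the run-level metrics. The sketch below is a minimal illustration using a hypothetical `TokenMetrics` dataclass and made-up numbers. It is not Agno's own metrics class, and the real object carries more fields.

```python
from dataclasses import dataclass, fields


@dataclass
class TokenMetrics:
    """Hypothetical stand-in for a metrics object (not Agno's own class)."""

    input_tokens: int = 0
    output_tokens: int = 0
    total_tokens: int = 0

    def __add__(self, other: "TokenMetrics") -> "TokenMetrics":
        # Sum every field pairwise, in declaration order.
        return TokenMetrics(
            *(getattr(self, f.name) + getattr(other, f.name) for f in fields(self))
        )


def aggregate(run_metrics: list) -> TokenMetrics:
    """Fold a list of per-run metrics into one session-level total."""
    total = TokenMetrics()
    for metrics in run_metrics:
        total = total + metrics
    return total


# Illustrative values for two runs in the same session.
runs = [
    TokenMetrics(input_tokens=120, output_tokens=30, total_tokens=150),
    TokenMetrics(input_tokens=200, output_tokens=80, total_tokens=280),
]
session_metrics = aggregate(runs)
print(session_metrics)
```

Under this model, a session total is never smaller than any single run's total, which makes it a quick sanity check when comparing the two levels.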

Example Usage

Suppose you have an agent that performs some tasks and you want to analyze the metrics after running it. The following code creates an agent, runs it, and prints the metrics per message, per run, and per session:
from agno.agent import Agent
from agno.models.google import Gemini
from agno.tools.duckduckgo import DuckDuckGoTools
from agno.db.in_memory import InMemoryDb
from rich.pretty import pprint

db = InMemoryDb()

agent = Agent(
    model=Gemini(id="gemini-2.0-flash-001"),
    tools=[DuckDuckGoTools()],
    db=db,
    markdown=True,
)

run_response = agent.run(
    "What is currently happening in the world?"
)

# Print metrics per message
if run_response.messages:
    for message in run_response.messages:
        if message.role == "assistant":
            if message.content:
                print(f"Message: {message.content}")
            elif message.tool_calls:
                print(f"Tool calls: {message.tool_calls}")
            print("---" * 5, "Metrics", "---" * 5)
            pprint(message.metrics.to_dict())
            print("---" * 20)

# Print the aggregated metrics for the whole run
print("---" * 5, "Run Metrics", "---" * 5)
pprint(run_response.metrics.to_dict())
# Print the aggregated metrics for the whole session
print("---" * 5, "Session Metrics", "---" * 5)
pprint(agent.get_session_metrics().to_dict())
The output includes the following fields:
  • input_tokens: The number of tokens sent to the model.
  • output_tokens: The number of tokens received from the model.
  • total_tokens: The sum of input_tokens and output_tokens.
  • audio_input_tokens: The number of tokens sent to the model for audio input.
  • audio_output_tokens: The number of tokens received from the model for audio output.
  • audio_total_tokens: The sum of audio_input_tokens and audio_output_tokens.
  • cache_read_tokens: The number of tokens read from the cache.
  • cache_write_tokens: The number of tokens written to the cache.
  • reasoning_tokens: The number of tokens used for reasoning.
  • duration: The duration of the run in seconds.
  • time_to_first_token: The time taken until the first token was generated.
  • provider_metrics: Any provider-specific metrics.
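Once the metrics are available as a dictionary, a few derived figures are easy to compute. The sketch below is only an illustration: `summarize` is a hypothetical helper, the sample values are made up, and it assumes some keys may be absent from the dictionary, so it falls back to defaults.

```python
def summarize(metrics: dict) -> dict:
    """Derive a few figures from a metrics dictionary (hypothetical helper)."""
    # Assumption: keys may be missing, so default to 0 or None.
    input_tokens = metrics.get("input_tokens", 0)
    output_tokens = metrics.get("output_tokens", 0)
    duration = metrics.get("duration")  # seconds

    return {
        # total_tokens is defined as input_tokens + output_tokens.
        "total_tokens": input_tokens + output_tokens,
        # Output throughput; None when no usable duration was recorded.
        "output_tokens_per_second": output_tokens / duration if duration else None,
    }


# Illustrative values, not real output from a run.
sample = {"input_tokens": 1000, "output_tokens": 250, "duration": 2.5}
summary = summarize(sample)
print(summary)
```

In the example above, the same helper could be applied to `run_response.metrics.to_dict()` to compare runs against each other.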

Developer Resources