Skip to main content
The RunOutput from an agent run includes detailed metrics about token usage, cost, timing, and per-model breakdowns.
from agno.agent import Agent
from agno.models.openai import OpenAIResponses
from agno.tools.hackernews import HackerNewsTools
from agno.db.sqlite import SqliteDb
from rich.pretty import pprint

agent = Agent(
    model=OpenAIResponses(id="gpt-5.2"),
    tools=[HackerNewsTools()],
    db=SqliteDb(db_file="tmp/agents.db"),
    markdown=True,
)

run_response = agent.run("What are the top stories on HackerNews?")

# Message metrics (MessageMetrics)
for message in run_response.messages:
    if message.role == "assistant":
        pprint(message.metrics.to_dict())

# Run metrics (RunMetrics)
pprint(run_response.metrics.to_dict())

# Per-model breakdown
if run_response.metrics.details:
    for model_type, model_metrics_list in run_response.metrics.details.items():
        for m in model_metrics_list:
            print(f"{model_type}: {m.provider}/{m.id} - {m.total_tokens} tokens")

# Session metrics (SessionMetrics)
pprint(agent.get_session_metrics().to_dict())
Metrics are available at multiple levels:
  • Per message: Each assistant message has MessageMetrics with per-API-call token counts and timing.
  • Per run: Each RunOutput has RunMetrics with aggregated totals and a details breakdown by model type.
  • Per session: agent.get_session_metrics() returns SessionMetrics aggregated across all runs.
LevelTypeAccess
Per messageMessageMetricsmessage.metrics
Per runRunMetricsrun_response.metrics
Per sessionSessionMetricsagent.get_session_metrics()

Run fields (RunMetrics)

FieldDescription
input_tokensTokens sent to the model.
output_tokensTokens generated by the model.
total_tokensSum of input_tokens and output_tokens.
audio_input_tokensAudio tokens in the input.
audio_output_tokensAudio tokens in the output.
audio_total_tokensSum of audio_input_tokens and audio_output_tokens.
cache_read_tokensTokens read from cache.
cache_write_tokensTokens written to cache.
reasoning_tokensTokens used for reasoning.
costCost of the run.
durationRun duration in seconds.
time_to_first_tokenTime from run start to first token (seconds).
detailsPer-model breakdown by model type. See Metrics reference.
additional_metricsExtra metrics (e.g., eval_duration).

Message fields (MessageMetrics)

FieldDescription
input_tokensTokens sent to the model.
output_tokensTokens generated by the model.
total_tokensSum of input_tokens and output_tokens.
audio_input_tokensAudio tokens in the input.
audio_output_tokensAudio tokens in the output.
audio_total_tokensTotal audio tokens.
cache_read_tokensTokens served from cache.
cache_write_tokensTokens written to cache.
reasoning_tokensTokens used for reasoning.
costCost of this API call.
durationDuration of this API call (seconds).
time_to_first_tokenTime to first token for this API call (seconds).
provider_metricsProvider-specific metrics (e.g., Ollama timing, Groq timing, Cerebras timing).

Developer Resources