| Class | Used On | Purpose |
|---|---|---|
| RunMetrics | `RunOutput.metrics`, `TeamRunOutput.metrics` | Run-level totals with per-model breakdown |
| MessageMetrics | `Message.metrics` | Per-API-call tokens and timing |
| SessionMetrics | `agent.get_session_metrics()` | Aggregated across all runs in a session |
| ModelMetrics | Inside `RunMetrics.details` | Per-model aggregate by `(provider, id)` |
| ToolCallMetrics | `ToolExecution.metrics` | Tool execution timing |

All metric classes except `ToolCallMetrics` inherit from `BaseMetrics`, which provides the shared token and cost fields.
## BaseMetrics

Shared token and cost fields inherited by `RunMetrics`, `MessageMetrics`, `SessionMetrics`, and `ModelMetrics`.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `input_tokens` | `int` | `0` | Tokens in the prompt/input. |
| `output_tokens` | `int` | `0` | Tokens generated by the model. |
| `total_tokens` | `int` | `0` | Total tokens used (input + output). |
| `audio_input_tokens` | `int` | `0` | Audio tokens in the input. |
| `audio_output_tokens` | `int` | `0` | Audio tokens in the output. |
| `audio_total_tokens` | `int` | `0` | Total audio tokens. |
| `cache_read_tokens` | `int` | `0` | Tokens served from cache. |
| `cache_write_tokens` | `int` | `0` | Tokens written to cache. |
| `reasoning_tokens` | `int` | `0` | Tokens used for reasoning steps. |
| `cost` | `Optional[float]` | `None` | Cost of the API call(s). |
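The field list above can be sketched as a plain dataclass. This is an illustration of the shape only (the class name `BaseMetricsSketch` is made up; it is not the library's actual implementation):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class BaseMetricsSketch:
    """Illustrative shape of the shared token and cost fields."""
    input_tokens: int = 0
    output_tokens: int = 0
    total_tokens: int = 0
    audio_input_tokens: int = 0
    audio_output_tokens: int = 0
    audio_total_tokens: int = 0
    cache_read_tokens: int = 0
    cache_write_tokens: int = 0
    reasoning_tokens: int = 0
    cost: Optional[float] = None  # None when the provider does not report cost

m = BaseMetricsSketch(input_tokens=120, output_tokens=40)
m.total_tokens = m.input_tokens + m.output_tokens
print(m.total_tokens)  # → 160
```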
## RunMetrics

Used on `RunOutput.metrics` and `TeamRunOutput.metrics`. Includes all `BaseMetrics` fields plus:
| Parameter | Type | Default | Description |
|---|---|---|---|
| `time_to_first_token` | `Optional[float]` | `None` | Time from run start to first token (seconds). Set once per run. |
| `duration` | `Optional[float]` | `None` | Total run time (seconds). |
| `details` | `Optional[Dict[str, List[ModelMetrics]]]` | `None` | Per-model breakdown keyed by model type (e.g., `"model"`, `"output_model"`, `"memory_model"`). |
| `additional_metrics` | `Optional[Dict[str, Any]]` | `None` | Extra metrics (e.g., `eval_duration`). |
### `details` keys

The `details` dictionary uses model type strings as keys. Each key maps to a list of `ModelMetrics` objects, one per unique `(provider, id)` pair.
| Key | Description |
|---|---|
| `"model"` | Primary agent model |
| `"output_model"` | Output model for structured output |
| `"parser_model"` | Parser model |
| `"memory_model"` | Memory model |
| `"reasoning_model"` | Reasoning model |
| `"session_summary_model"` | Session summary model |
| `"culture_model"` | Culture model |
| `"learning_model"` | Learning model |
| `"compression_model"` | Compression model |
| `"followup_model"` | Followup model |
Eval agent metrics are prefixed with `eval_`. If an eval agent uses model types `"model"` and `"output_model"`, the `details` keys become `"eval_model"` and `"eval_output_model"`.

`Metrics` is preserved as a backward-compatible alias for `RunMetrics`. Both `from agno.metrics import Metrics` and `from agno.models.metrics import Metrics` still work.
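A minimal sketch of consuming a `details`-shaped dictionary and separating eval keys by the `eval_` prefix. The payload here is fabricated for illustration; only the key names follow the tables above:

```python
from typing import Dict, List

# Hypothetical details payload: model-type key -> list of per-model entries.
details: Dict[str, List[dict]] = {
    "model": [{"provider": "OpenAI Chat", "id": "gpt-4o", "total_tokens": 300}],
    "eval_model": [{"provider": "Anthropic", "id": "claude-sonnet", "total_tokens": 120}],
}

# Split primary-agent keys from eval-agent keys by the "eval_" prefix.
eval_keys = [k for k in details if k.startswith("eval_")]
agent_keys = [k for k in details if not k.startswith("eval_")]
print(agent_keys, eval_keys)  # → ['model'] ['eval_model']
```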
## MessageMetrics

Used on `Message.metrics`. Per-API-call metrics. Includes all `BaseMetrics` fields plus:
| Parameter | Type | Default | Description |
|---|---|---|---|
| `duration` | `Optional[float]` | `None` | Duration of this API call (seconds). |
| `time_to_first_token` | `Optional[float]` | `None` | Time to first token for this API call (seconds). |
| `provider_metrics` | `Optional[Dict[str, Any]]` | `None` | Provider-specific metrics (e.g., Ollama timing, Cerebras timing). |
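Because `MessageMetrics` is recorded per API call, a run-level token total is effectively a sum over the run's messages. A hedged sketch of that aggregation (field names come from the tables above; the class and data are illustrative, not the library's internals):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class MessageMetricsSketch:
    """Illustrative per-API-call record."""
    input_tokens: int = 0
    output_tokens: int = 0
    duration: Optional[float] = None

# Two hypothetical API calls within one run.
calls = [
    MessageMetricsSketch(input_tokens=100, output_tokens=20, duration=0.8),
    MessageMetricsSketch(input_tokens=150, output_tokens=60, duration=1.4),
]

total_in = sum(c.input_tokens for c in calls)
total_out = sum(c.output_tokens for c in calls)
print(total_in, total_out)  # → 250 80
```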
## SessionMetrics

Returned by `agent.get_session_metrics()`. Aggregated across all runs in a session. Includes all `BaseMetrics` fields plus:
| Parameter | Type | Default | Description |
|---|---|---|---|
| `details` | `Optional[Dict[str, List[ModelMetrics]]]` | `None` | Per-model breakdown, same structure as `RunMetrics.details`. Tokens summed across runs. |
| `additional_metrics` | `Optional[Dict[str, Any]]` | `None` | Aggregated additional metrics from all runs. |
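One wrinkle in session-level aggregation is that `cost` is `Optional[float]`: some runs may carry `None` when the provider did not report a cost. A small sketch of summing across runs while skipping unreported values (the run data is made up):

```python
from typing import List, Optional

# Hypothetical per-run totals; cost is None when a provider does not report it.
run_tokens: List[int] = [300, 180, 220]
run_costs: List[Optional[float]] = [0.0042, None, 0.0013]

session_tokens = sum(run_tokens)
# Sum only the runs that reported a cost, mirroring the Optional[float] field.
session_cost = sum(c for c in run_costs if c is not None)
print(session_tokens, round(session_cost, 4))  # → 700 0.0055
```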
## ModelMetrics

Per-model aggregate stored inside `RunMetrics.details[model_type]`. Includes all `BaseMetrics` fields plus:
| Parameter | Type | Default | Description |
|---|---|---|---|
| `id` | `str` | `""` | Model ID (e.g., `"gpt-4o"`). |
| `provider` | `str` | `""` | Provider name (e.g., `"OpenAI Chat"`, `"OpenAI Responses"`, `"Anthropic"`). |
| `provider_metrics` | `Optional[Dict[str, Any]]` | `None` | Provider-specific data for this model. |
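"One `ModelMetrics` per unique `(provider, id)` pair" amounts to grouping per-call records by that composite key. A hedged sketch of such a grouping (the call records are fabricated for illustration):

```python
from collections import defaultdict
from typing import Dict, Tuple

# Hypothetical per-call records: (provider, model_id, total_tokens).
calls = [
    ("OpenAI Chat", "gpt-4o", 120),
    ("OpenAI Chat", "gpt-4o", 80),
    ("Anthropic", "claude-sonnet", 200),
]

# Aggregate token totals per unique (provider, id) pair.
per_model: Dict[Tuple[str, str], int] = defaultdict(int)
for provider, model_id, tokens in calls:
    per_model[(provider, model_id)] += tokens

print(per_model[("OpenAI Chat", "gpt-4o")])  # → 200
```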
## ToolCallMetrics

Used on `ToolExecution.metrics`. Standalone class (does not inherit `BaseMetrics`).
| Parameter | Type | Default | Description |
|---|---|---|---|
| `start_time` | `Optional[float]` | `None` | Unix timestamp when the tool call started. |
| `end_time` | `Optional[float]` | `None` | Unix timestamp when the tool call ended. |
| `duration` | `Optional[float]` | `None` | Tool execution time (seconds). |
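The three fields relate in the obvious way: `duration` is the gap between the two Unix timestamps when both are present. A minimal sketch, guarding against unset values (class name and timestamps are illustrative):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ToolCallMetricsSketch:
    """Illustrative timing record for one tool call."""
    start_time: Optional[float] = None
    end_time: Optional[float] = None
    duration: Optional[float] = None

t = ToolCallMetricsSketch(start_time=1700000000.0, end_time=1700000001.25)
# Derive duration only when both timestamps were recorded.
if t.start_time is not None and t.end_time is not None:
    t.duration = t.end_time - t.start_time
print(t.duration)  # → 1.25
```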
## Provider field availability

Not all providers populate all fields. The table below shows which `BaseMetrics` fields each provider sets on `MessageMetrics`.
| Provider | Standard tokens | cache_read | cache_write | reasoning | Audio tokens | cost | provider_metrics |
|---|---|---|---|---|---|---|---|
| OpenAI Chat | ✅ | ✅ | | ✅ | ✅ | ✅ | |
| OpenAI Responses | ✅ | ✅ | | ✅ | | | |
| Anthropic Claude | ✅ | ✅ | ✅ | | | | server_tool_use, service_tier |
| Google Gemini | ✅ | ✅ | | ✅ | | | traffic_type |
| Groq | ✅ | | | | | | completion_time, prompt_time, queue_time, total_time |
| Mistral | ✅ | ✅ | | ✅ | | | |
| AWS Bedrock | ✅ | ✅ | ✅ | | | | |
| LiteLLM | ✅ | ✅ | | ✅ | ✅ | | |
| Perplexity | ✅ | ✅ | | ✅ | ✅ | | |
| Azure AI Foundry | ✅ | ✅ | | ✅ | | | |
| Cerebras | ✅ | | | | | | time_system, time_prompt |
| Ollama | ✅ | | | | | | total_duration, load_duration, prompt_eval_duration, eval_duration |
| Cohere | ✅ | | | | | | |
| Meta Llama | ✅ | | | | | | |
| Meta Llama (OpenAI) | ✅ | ✅ | | ✅ | ✅ | ✅ | |
| HuggingFace | ✅ | | | | | | |
| IBM WatsonX | ✅ | | | | | | |
“Standard tokens” = `input_tokens`, `output_tokens`, `total_tokens`. “Audio tokens” = `audio_input_tokens`, `audio_output_tokens`.
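Since per the table many providers never set `cost`, cache, or reasoning fields, code that consumes `MessageMetrics` should read them defensively. A small hedged sketch of one way to render a possibly-unreported cost (the helper name is made up):

```python
from typing import Optional

def describe_cost(cost: Optional[float]) -> str:
    """Render a cost value that a provider may not have reported."""
    return f"${cost:.4f}" if cost is not None else "cost not reported"

print(describe_cost(0.0021))  # → $0.0021
print(describe_cost(None))    # → cost not reported
```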