Add fallback_models to any Agent or Team. If the primary model fails after exhausting its retries, each fallback is tried in order until one succeeds.
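The ordering described above can be sketched in plain Python. This is a minimal simulation, not the library's implementation: ModelError, call_with_retries, run_with_fallbacks, and the stub model functions are hypothetical names introduced only for illustration, and a "model" is reduced to an id plus a callable.

```python
from typing import Callable, List, Tuple


class ModelError(Exception):
    """Hypothetical stand-in for a retryable provider failure."""


# A "model" here is just (model_id, callable); real model objects are richer.
Model = Tuple[str, Callable[[str], str]]


def call_with_retries(model: Model, prompt: str, retries: int) -> str:
    """Inner layer: retry a single model until it succeeds or retries run out."""
    model_id, fn = model
    last_error: Exception = ModelError(f"{model_id}: no attempts made")
    for _ in range(retries + 1):  # first attempt plus `retries` retries
        try:
            return fn(prompt)
        except ModelError as exc:
            last_error = exc
    raise last_error


def run_with_fallbacks(
    primary: Model, fallbacks: List[Model], prompt: str, retries: int = 2
) -> Tuple[str, str]:
    """Outer layer: try the next model only after the current one is exhausted.

    Returns (model_id_that_answered, response).
    """
    last_error: Exception = ModelError("no models configured")
    for model in [primary, *fallbacks]:
        try:
            return model[0], call_with_retries(model, prompt, retries)
        except ModelError as exc:
            last_error = exc
    raise last_error  # every model failed


# Simulated models: the primary always fails, the fallback always answers.
attempts = {"gpt-4o": 0, "claude": 0}


def failing_primary(prompt: str) -> str:
    attempts["gpt-4o"] += 1
    raise ModelError("simulated outage")


def working_fallback(prompt: str) -> str:
    attempts["claude"] += 1
    return f"echo: {prompt}"


used, reply = run_with_fallbacks(
    ("gpt-4o", failing_primary), [("claude", working_fallback)], "hi"
)
```

In this simulation the primary is attempted three times (one call plus two retries) before the fallback is called once.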
If gpt-4o fails after exhausting its own retries, Claude is tried automatically.
Model strings work too:
Usage with Teams
Fallback models apply to the team leader’s model calls. Member agents keep their own models and are not affected by the leader’s fallback config.
Error-Specific Fallbacks
FallbackConfig lets you route different error types to different fallback models. Instead of a flat list, you specify which models to try for rate limits, context window overflows, and general errors separately.
Error routing
When the primary model fails, the error is classified and routed to the matching fallback list:

| Error Type | Fallback List | Example |
|---|---|---|
| Rate limit (429/529) | on_rate_limit | Provider throttling, Anthropic overloaded |
| Context window exceeded | on_context_overflow | Input too long for model’s context window |
| Other retryable errors | on_error | Server errors (5xx), network failures |
If a specific list (for example on_rate_limit) is empty, on_error is used as a catch-all.
Non-retryable client errors like 400, 401, 403, 404, and 422 are not caught by fallback. These indicate configuration problems (bad API key, invalid request) that need to be fixed rather than masked by switching models.
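The routing rules above can be sketched as a small standalone function. This is an illustration under stated assumptions, not the library's classifier: RoutingConfig and select_fallbacks are hypothetical names, models are represented by plain strings, and the context_overflow flag stands in for whatever detection the library actually performs, which this page does not describe.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RoutingConfig:
    """Hypothetical mirror of the three fallback lists described above."""
    on_error: List[str] = field(default_factory=list)
    on_rate_limit: List[str] = field(default_factory=list)
    on_context_overflow: List[str] = field(default_factory=list)


# Client errors that indicate a configuration problem; no fallback is tried.
NON_RETRYABLE = {400, 401, 403, 404, 422}


def select_fallbacks(
    config: RoutingConfig,
    status_code: Optional[int],
    context_overflow: bool = False,
) -> List[str]:
    """Return the fallback list to try for a failure, or [] if none applies."""
    # Checked first (an assumption): providers may report overflow with a 4xx.
    if context_overflow:
        return config.on_context_overflow or config.on_error
    if status_code in NON_RETRYABLE:
        return []  # the error propagates instead of being masked
    if status_code in (429, 529):
        return config.on_rate_limit or config.on_error
    # 5xx responses and network failures (status_code is None) land here.
    return config.on_error


cfg = RoutingConfig(on_error=["model-c"], on_rate_limit=["model-a"])
rate_limited = select_fallbacks(cfg, 429)
overflowed = select_fallbacks(cfg, None, context_overflow=True)
bad_api_key = select_fallbacks(cfg, 401)
```

Because on_context_overflow is left empty in this example, the overflow case resolves to the on_error list, which matches the catch-all rule.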
Fallback Callback
Use the callback parameter to get notified whenever a fallback model is activated. This is useful for logging, metrics, or alerting.
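A callback for this purpose might look like the sketch below. It assumes only the signature given in the FallbackConfig field table, (primary_model_id, fallback_model_id, error). The names on_fallback and fallback_counts are hypothetical, and the final call simulates an activation by hand rather than going through the library.

```python
import logging
from collections import Counter

logger = logging.getLogger("fallback")

# In-process counter keyed by (primary, fallback); a real deployment would
# more likely push this to a metrics system.
fallback_counts: Counter = Counter()


def on_fallback(primary_model_id: str, fallback_model_id: str, error: Exception) -> None:
    """Record and log each fallback activation."""
    fallback_counts[(primary_model_id, fallback_model_id)] += 1
    logger.warning(
        "Fallback activated: %s -> %s (%s: %s)",
        primary_model_id,
        fallback_model_id,
        type(error).__name__,
        error,
    )


# Simulated activation, invoked directly for illustration.
on_fallback("gpt-4o", "claude", TimeoutError("simulated timeout"))
```

Keeping the callback free of blocking work is a reasonable design choice here, since it presumably runs in the request path before the fallback model is called.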
Retry vs. Fallback
Retry and fallback are separate layers. Retry happens inside each model. Fallback only triggers after the primary model’s retry loop is fully exhausted.
Streaming
Fallback works with streaming responses. If the primary model fails mid-stream, the fallback model takes over and the response content is reset so the consumer receives a clean response from the fallback model only.
Parameters
Available on both Agent and Team:
| Parameter | Type | Description |
|---|---|---|
| fallback_models | List[Model \| str] | Models tried in order on any failure. Shorthand for FallbackConfig(on_error=...). |
| fallback_config | FallbackConfig | Error-specific routing. Takes precedence over fallback_models if both are set. |
FallbackConfig
| Field | Type | Description |
|---|---|---|
| on_error | List[Model \| str] | General fallback for any retryable error. |
| on_rate_limit | List[Model \| str] | Fallback for rate-limit (429/529) errors. Falls back to on_error if empty. |
| on_context_overflow | List[Model \| str] | Fallback for context-window-exceeded errors. Falls back to on_error if empty. |
| callback | Callable[[str, str, Exception], None] | Called when a fallback model is activated. Receives (primary_model_id, fallback_model_id, error). |