LiteLLM

LiteLLM provides a unified interface for various LLM providers, allowing you to use different models with the same code.

Agno integrates with LiteLLM in two ways:

  1. Direct SDK integration - Using the LiteLLM Python SDK
  2. Proxy Server integration - Using LiteLLM as an OpenAI-compatible proxy (sketched below)
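
The proxy route is covered in its own setup, but as a quick orientation here is a minimal sketch. It assumes a LiteLLM proxy already running locally on LiteLLM's default port 4000 (started with, for example, litellm --model gpt-4o after pip install 'litellm[proxy]') and points Agno's OpenAILike model class at it; the port and key below are placeholders for illustration, not the documented proxy setup.

from agno.agent import Agent
from agno.models.openai import OpenAILike

# Point an OpenAI-compatible model class at the local LiteLLM proxy
agent = Agent(
    model=OpenAILike(
        id="gpt-4o",  # model name the proxy routes to
        base_url="http://localhost:4000",  # proxy endpoint (default port; adjust if changed)
        api_key="your_proxy_key",  # placeholder; use whatever key the proxy is configured with
    ),
    markdown=True,
)

agent.print_response("Share a 2 sentence horror story")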

Prerequisites

For both integration methods, you’ll need:

# Install required packages
pip install agno litellm

Set up your API key: regardless of the provider you use (OpenAI, Hugging Face, or xAI), the API key is referenced as LITELLM_API_KEY.

export LITELLM_API_KEY=your_api_key_here

SDK Integration

The LiteLLM class provides direct integration with the LiteLLM Python SDK.

Basic Usage

from agno.agent import Agent
from agno.models.litellm import LiteLLM

# Create an agent with GPT-4o
agent = Agent(
    model=LiteLLM(
        id="gpt-4o",  # Model ID to use
        name="LiteLLM",  # Optional display name
    ),
    markdown=True,
)

# Get a response
agent.print_response("Share a 2 sentence horror story")
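
print_response streams the answer straight to stdout. If you need the text programmatically instead, a minimal sketch, assuming Agent.run() returns a response object with a content attribute as in Agno's other model integrations:

# Capture the response instead of printing it
response = agent.run("Share a 2 sentence horror story")
print(response.content)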

Using Hugging Face Models

LiteLLM can also work with Hugging Face models:

from agno.agent import Agent
from agno.models.litellm import LiteLLM

agent = Agent(
    model=LiteLLM(
        id="huggingface/mistralai/Mistral-7B-Instruct-v0.2",
        top_p=0.95,
    ),
    markdown=True,
)

agent.print_response("What's happening in France?")

Configuration Options

The LiteLLM class accepts the following parameters:

| Parameter      | Type                     | Description                                                                              | Default   |
|----------------|--------------------------|------------------------------------------------------------------------------------------|-----------|
| id             | str                      | Model identifier (e.g., "gpt-4o" or "huggingface/mistralai/Mistral-7B-Instruct-v0.2")     | "gpt-4o"  |
| name           | str                      | Display name for the model                                                                | "LiteLLM" |
| provider       | str                      | Provider name                                                                             | "LiteLLM" |
| api_key        | Optional[str]            | API key (falls back to LITELLM_API_KEY environment variable)                              | None      |
| api_base       | Optional[str]            | Base URL for API requests                                                                 | None      |
| max_tokens     | Optional[int]            | Maximum tokens in the response                                                            | None      |
| temperature    | float                    | Sampling temperature                                                                      | 0.7       |
| top_p          | float                    | Top-p sampling value                                                                      | 1.0       |
| request_params | Optional[Dict[str, Any]] | Additional request parameters                                                             | None      |
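
Putting a few of these together, a minimal sketch; the values are illustrative rather than recommendations, and the stop sequence in request_params is an assumed pass-through to LiteLLM:

from agno.agent import Agent
from agno.models.litellm import LiteLLM

agent = Agent(
    model=LiteLLM(
        id="gpt-4o",
        api_key="your_api_key_here",  # overrides the LITELLM_API_KEY environment variable
        max_tokens=256,  # cap the response length
        temperature=0.2,  # lower values give more deterministic output
        request_params={"stop": ["\n\n"]},  # extra kwargs forwarded to LiteLLM (illustrative)
    ),
    markdown=True,
)

agent.print_response("Summarize LiteLLM in one sentence")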

SDK Examples

View more examples here.