Meta offers a suite of powerful multi-modal language models known for strong performance across a wide range of tasks, including text understanding and visual intelligence.

We recommend experimenting to find the model best suited to your use case. Here are some general recommendations:

  • Llama-4-Scout-17B: Excellent performance for most general tasks, including multi-modal scenarios.
  • Llama-3.3-70B: Powerful instruction-following model for complex reasoning tasks.

Explore all the models here.

Authentication

Set your LLAMA_API_KEY environment variable:

export LLAMA_API_KEY=YOUR_API_KEY
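
If you prefer not to rely on the environment variable, the api_key parameter listed in the Parameters table below can be passed directly to the model. A minimal sketch, reusing the model id from the example in the next section (replace the placeholder with your real key):

from agno.models.meta import Llama

# Passing api_key explicitly overrides the LLAMA_API_KEY environment variable.
llama = Llama(
    id="Llama-4-Maverick-17B-128E-Instruct-FP8",
    api_key="YOUR_API_KEY",
)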

Example

Use Llama with your Agent:

from agno.agent import Agent
from agno.models.meta import Llama

agent = Agent(
    model=Llama(
        id="Llama-4-Maverick-17B-128E-Instruct-FP8",
    ),
    markdown=True
)

agent.print_response("Share a 2 sentence horror story.")

View more examples here.
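
Streaming is often useful for longer responses. A minimal sketch, assuming the Agent API above also accepts a stream flag on print_response (check your installed agno version):

from agno.agent import Agent
from agno.models.meta import Llama

agent = Agent(
    model=Llama(id="Llama-4-Maverick-17B-128E-Instruct-FP8"),
    markdown=True,
)

# Print the response incrementally as tokens arrive instead of waiting for the full completion.
agent.print_response("Share a 2 sentence horror story.", stream=True)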

Parameters

Parameter | Type | Description | Default
max_completion_tokens | Optional[int] | Maximum tokens in the completion | None
repetition_penalty | Optional[float] | Penalty for token repetition | None
temperature | Optional[float] | Sampling temperature | None
top_p | Optional[float] | Nucleus sampling parameter | None
top_k | Optional[int] | Top-k sampling parameter | None
extra_headers | Optional[Any] | Additional HTTP headers to include in the API request | None
extra_query | Optional[Any] | Additional query parameters to include in the API request | None
extra_body | Optional[Any] | Additional body parameters to include in the API request | None
request_params | Optional[Dict[str, Any]] | Custom request parameters dictionary, merged into the API request | None
api_key | Optional[str] | Llama API key (overrides the LLAMA_API_KEY environment variable) | None
base_url | Optional[Union[str, httpx.URL]] | Base URL for the Llama API | None
timeout | Optional[float] | Timeout for API requests (seconds) | None
max_retries | Optional[int] | Maximum number of retries for API calls | None
default_headers | Optional[Any] | Default HTTP headers for the API client | None
default_query | Optional[Any] | Default query parameters for the API client | None
http_client | Optional[httpx.Client] | Custom synchronous HTTP client instance | None
client_params | Optional[Dict[str, Any]] | Additional parameters for the HTTP client constructor | None
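
As an illustration of how these fit together, a minimal sketch that sets a few of the sampling parameters from the table above (the values are arbitrary, not recommendations):

from agno.agent import Agent
from agno.models.meta import Llama

agent = Agent(
    model=Llama(
        id="Llama-4-Maverick-17B-128E-Instruct-FP8",
        temperature=0.4,             # lower temperature for more deterministic output
        top_p=0.9,                   # nucleus sampling threshold
        max_completion_tokens=512,   # cap the length of the completion
    ),
    markdown=True,
)

agent.print_response("Summarize the plot of Macbeth in three sentences.")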

OpenAI-like Parameters

LlamaOpenAI supports all parameters from OpenAI Like.
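
A minimal sketch of using the OpenAI-compatible class, assuming LlamaOpenAI is importable from the same module as Llama (verify the import path in your installed agno version):

from agno.agent import Agent
from agno.models.meta import LlamaOpenAI  # assumed to live alongside Llama

agent = Agent(
    model=LlamaOpenAI(
        id="Llama-4-Maverick-17B-128E-Instruct-FP8",
    ),
    markdown=True,
)

agent.print_response("Share a 2 sentence horror story.")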

Resources