HuggingFace
Hugging Face provides a wide range of state-of-the-art language models tailored to diverse NLP tasks, including text generation, summarization, translation, and question answering. These models are available through the Hugging Face Transformers library and are widely adopted due to their ease of use, flexibility, and comprehensive documentation.
Explore HuggingFace’s language models here.
Authentication
Set your `HF_TOKEN` environment variable. You can get one from HuggingFace here.
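For example, the token is typically exported in your shell before running your script; the snippet below is an illustrative sketch of setting it programmatically (the placeholder token value is not real):

```python
import os

# Illustrative only: the access token is usually exported in your shell as
# HF_TOKEN; it can also be set in the process environment before the model
# client is created.
os.environ["HF_TOKEN"] = "hf_xxxxxxxxxxxxxxxx"  # replace with your own token
```

Alternatively, the token can be passed directly via the api_key parameter listed under Params.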
Example
Use HuggingFace with your Agent:
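A minimal sketch is shown below. The import paths assume an Agno-style layout and are illustrative; adjust them to match your installation.

```python
# Import paths are assumptions; substitute the ones from your framework's docs.
from agno.agent import Agent
from agno.models.huggingface import HuggingFace

# Create an agent backed by a HuggingFace-hosted model.
agent = Agent(
    model=HuggingFace(id="meta-llama/Meta-Llama-3-8B-Instruct", max_tokens=1024),
    markdown=True,
)

# Ask the agent a question and print the streamed/plain response.
agent.print_response("Summarize the plot of Dune in two sentences.")
```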
View more examples here.
Params
Parameter | Type | Default | Description |
---|---|---|---|
id | str | "meta-llama/Meta-Llama-3-8B-Instruct" | The id of the HuggingFace model to use. |
name | str | "HuggingFace" | The name of this chat model instance. |
provider | str | "HuggingFace" | The provider of the model. |
store | Optional[bool] | None | Whether to store the output of this chat completion request for use in model distillation or evals products. |
frequency_penalty | Optional[float] | None | Penalizes new tokens based on their frequency in the text so far. |
logit_bias | Optional[Any] | None | Modifies the likelihood of specified tokens appearing in the completion. |
logprobs | Optional[bool] | None | Whether to return log probabilities of the output tokens. |
max_tokens | Optional[int] | None | The maximum number of tokens to generate in the chat completion. |
presence_penalty | Optional[float] | None | Penalizes new tokens based on whether they appear in the text so far. |
response_format | Optional[Any] | None | An object specifying the format that the model must output. |
seed | Optional[int] | None | A seed for deterministic sampling. |
stop | Optional[Union[str, List[str]]] | None | Up to 4 sequences where the API will stop generating further tokens. |
temperature | Optional[float] | None | Controls randomness in the model's output. |
top_logprobs | Optional[int] | None | How many log probability results to return per token. |
top_p | Optional[float] | None | Controls diversity via nucleus sampling. |
request_params | Optional[Dict[str, Any]] | None | Additional parameters to include in the request. |
api_key | Optional[str] | None | The Access Token for authenticating with HuggingFace. |
base_url | Optional[Union[str, httpx.URL]] | None | The base URL for API requests. |
timeout | Optional[float] | None | The timeout for API requests. |
max_retries | Optional[int] | None | The maximum number of retries for failed requests. |
default_headers | Optional[Any] | None | Default headers to include in all requests. |
default_query | Optional[Any] | None | Default query parameters to include in all requests. |
http_client | Optional[httpx.Client] | None | An optional pre-configured HTTP client. |
client_params | Optional[Dict[str, Any]] | None | Additional parameters for client configuration. |
client | Optional[InferenceClient] | None | The HuggingFace Hub Inference client instance. |
async_client | Optional[AsyncInferenceClient] | None | The asynchronous HuggingFace Hub client instance. |
HuggingFace is a subclass of the Model class and has access to the same params.
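As an illustration, several of the request params above can be passed directly to the model constructor; the import path is an assumption and the values are examples only:

```python
# Import path is an assumption; match it to your installation.
from agno.models.huggingface import HuggingFace

model = HuggingFace(
    id="meta-llama/Meta-Llama-3-8B-Instruct",
    max_tokens=1024,                  # cap on generated tokens
    temperature=0.4,                  # lower values -> less random output
    top_p=0.9,                        # nucleus sampling threshold
    stop=["Observation:"],            # up to 4 stop sequences
    request_params={"seed": 42},      # extra fields merged into the request
)
```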