The HuggingFace model provides access to models hosted on the HuggingFace Hub.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `str` | `"meta-llama/Meta-Llama-3-8B-Instruct"` | The id of the HuggingFace model to use. |
| `name` | `str` | `"HuggingFace"` | The name of this chat model instance. |
| `provider` | `str` | `"HuggingFace"` | The provider of the model. |
| `store` | `Optional[bool]` | `None` | Whether or not to store the output of this chat completion request for use in model distillation or evals products. |
| `frequency_penalty` | `Optional[float]` | `None` | Penalizes new tokens based on their frequency in the text so far. |
| `logit_bias` | `Optional[Any]` | `None` | Modifies the likelihood of specified tokens appearing in the completion. |
| `logprobs` | `Optional[bool]` | `None` | Whether to include the log probabilities of the most likely tokens. |
| `max_tokens` | `Optional[int]` | `None` | The maximum number of tokens to generate in the chat completion. |
| `presence_penalty` | `Optional[float]` | `None` | Penalizes new tokens based on whether they appear in the text so far. |
| `response_format` | `Optional[Any]` | `None` | An object specifying the format that the model must output. |
| `seed` | `Optional[int]` | `None` | A seed for deterministic sampling. |
| `stop` | `Optional[Union[str, List[str]]]` | `None` | Up to 4 sequences where the API will stop generating further tokens. |
| `temperature` | `Optional[float]` | `None` | Controls randomness in the model's output. |
| `top_logprobs` | `Optional[int]` | `None` | How many log probability results to return per token. |
| `top_p` | `Optional[float]` | `None` | Controls diversity via nucleus sampling. |
| `request_params` | `Optional[Dict[str, Any]]` | `None` | Additional parameters to include in the request. |
| `api_key` | `Optional[str]` | `None` | The Access Token for authenticating with HuggingFace. |
| `base_url` | `Optional[Union[str, httpx.URL]]` | `None` | The base URL for API requests. |
| `timeout` | `Optional[float]` | `None` | The timeout for API requests. |
| `max_retries` | `Optional[int]` | `None` | The maximum number of retries for failed requests. |
| `default_headers` | `Optional[Any]` | `None` | Default headers to include in all requests. |
| `default_query` | `Optional[Any]` | `None` | Default query parameters to include in all requests. |
| `http_client` | `Optional[httpx.Client]` | `None` | An optional pre-configured HTTP client. |
| `client_params` | `Optional[Dict[str, Any]]` | `None` | Additional parameters for client configuration. |
| `client` | `Optional[InferenceClient]` | `None` | The HuggingFace Hub Inference client instance. |
| `async_client` | `Optional[AsyncInferenceClient]` | `None` | The asynchronous HuggingFace Hub client instance. |
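
A minimal configuration sketch using a few of the parameters above. The import path is an assumption and may differ in your installation; the parameter names follow the table.

```python
import os

# Assumed import path for the HuggingFace model wrapper -- adjust to match
# the package you are using.
from agno.models.huggingface import HuggingFace

model = HuggingFace(
    id="meta-llama/Meta-Llama-3-8B-Instruct",  # HuggingFace Hub model id (the default)
    api_key=os.getenv("HF_TOKEN"),             # Access Token for the HuggingFace Hub
    max_tokens=1024,                           # cap on generated tokens
    temperature=0.7,                           # sampling randomness
    top_p=0.9,                                 # nucleus sampling cutoff
)
```

Parameters not set explicitly (e.g. `client`, `async_client`) are created internally from `api_key`, `base_url`, and the other client settings.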