
# Ollama

The Ollama model provides access to open-source models, either hosted locally or through **Ollama Cloud**.

**Local Usage**: Run models on your own hardware using the Ollama client. This suits development, privacy-sensitive workloads, and cases where you want full control over your infrastructure.

**Cloud Usage**: Access cloud-hosted models via [Ollama Cloud](https://ollama.com) with an API key for scalable, production-ready deployments. No local setup is required: set your `OLLAMA_API_KEY` and start using the hosted models.

## Key Features

* **Dual Deployment Options**: Choose between local hosting for privacy and control, or cloud hosting for scalability
* **Seamless Switching**: Easy transition between local and cloud deployments with minimal code changes
* **Auto-configuration**: When an API key is set, the host defaults to Ollama Cloud
* **Wide Model Support**: Access to an extensive library of open-source models, including GPT-OSS, Llama, Qwen, DeepSeek, and Phi models
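
The auto-configuration behavior described above can be pictured with a small, self-contained sketch. It is not the library's actual implementation: `resolve_host` is a hypothetical helper, the cloud endpoint is assumed to be `https://ollama.com` (the Ollama Cloud link above), and the rule that an explicitly passed host takes precedence over the API key is also an assumption.

```python
import os
from typing import Mapping, Optional

LOCAL_HOST = "http://localhost:11434"  # default `host` from the Parameters table
CLOUD_HOST = "https://ollama.com"      # assumption: Ollama Cloud endpoint


def resolve_host(host: Optional[str] = None,
                 env: Mapping[str, str] = os.environ) -> str:
    """Hypothetical helper illustrating the auto-configuration rule."""
    if host:
        # Assumption: an explicitly configured host takes precedence.
        return host
    if env.get("OLLAMA_API_KEY"):
        # An API key is set, so default to Ollama Cloud.
        return CLOUD_HOST
    # No key and no explicit host: use the local Ollama server.
    return LOCAL_HOST


print(resolve_host(env={}))                                # local default
print(resolve_host(env={"OLLAMA_API_KEY": "example-key"}))  # cloud default
```

Under these assumptions, switching between local and cloud deployments comes down to whether `OLLAMA_API_KEY` is present in the environment, which is what keeps the code changes minimal.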

## Parameters

| Parameter               | Type                          | Default                    | Description                                                      |
| ----------------------- | ----------------------------- | -------------------------- | ---------------------------------------------------------------- |
| `id`                    | `str`                         | `"llama3.2"`               | The name of the Ollama model to use                              |
| `name`                  | `str`                         | `"Ollama"`                 | The name of the model                                            |
| `provider`              | `str`                         | `"Ollama"`                 | The provider of the model                                        |
| `host`                  | `str`                         | `"http://localhost:11434"` | The host URL for the Ollama server                               |
| `timeout`               | `Optional[int]`               | `None`                     | Request timeout in seconds                                       |
| `format`                | `Optional[str]`               | `None`                     | The format to return the response in (e.g., "json")              |
| `options`               | `Optional[Dict[str, Any]]`    | `None`                     | Additional model options (temperature, top\_p, etc.)             |
| `keep_alive`            | `Optional[Union[float, str]]` | `None`                     | How long to keep the model loaded (e.g., "5m", 3600 seconds)     |
| `template`              | `Optional[str]`               | `None`                     | The prompt template to use                                       |
| `system`                | `Optional[str]`               | `None`                     | System message to use                                            |
| `raw`                   | `Optional[bool]`              | `None`                     | Whether to return raw response without formatting                |
| `stream`                | `bool`                        | `True`                     | Whether to stream the response                                   |
| `retries`               | `int`                         | `0`                        | Number of retries to attempt before raising a ModelProviderError |
| `delay_between_retries` | `int`                         | `1`                        | Delay between retries, in seconds                                |
| `exponential_backoff`   | `bool`                        | `False`                    | If True, the delay between retries is doubled each time          |
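
To make the interaction between `retries`, `delay_between_retries`, and `exponential_backoff` concrete, here is a minimal sketch of the wait schedule those three parameters imply. `retry_delays` is a hypothetical helper written for illustration, not part of the library, and it assumes the first retry waits the base delay and each later wait doubles when backoff is enabled; the library's internal timing may differ.

```python
from typing import List


def retry_delays(retries: int = 0,
                 delay_between_retries: int = 1,
                 exponential_backoff: bool = False) -> List[int]:
    """Return the wait, in seconds, before each retry attempt (illustrative only)."""
    delays: List[int] = []
    delay = delay_between_retries
    for _ in range(retries):
        delays.append(delay)
        if exponential_backoff:
            # Double the wait for the next retry.
            delay *= 2
    return delays


print(retry_delays(retries=3))
# [1, 1, 1]
print(retry_delays(retries=4, delay_between_retries=2, exponential_backoff=True))
# [2, 4, 8, 16]
```

With the defaults (`retries=0`), the schedule is empty, so a failed request raises immediately without waiting.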