Vision-capable Ollama models, such as llama3.2-vision, accept both text and images as input.

1. Set up your virtual environment

python3 -m venv .venv
source .venv/bin/activate

2. Install dependencies

pip install -U ollama agno

3. Pull the model

ollama pull llama3.2-vision

4. Run the agent

python agent.py

Example: Image Agent (agent.py)

from agno.agent import Agent
from agno.media import Image
from agno.models.ollama import Ollama

agent = Agent(
    model=Ollama(id="llama3.2-vision"),
    markdown=True,
)

agent.print_response(
    "Tell me about this image",
    images=[
        Image(
            url="https://upload.wikimedia.org/wikipedia/commons/b/bf/Krakow_-_Kosciol_Mariacki.jpg"
        )
    ],
)
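
The example above loads the image from a URL. A local file can be passed in a similar way. The following is a minimal sketch, assuming the Image class accepts a filepath argument as it does in Agno's cookbook examples. The helper validate_image_path, the function describe_local_image, and the list of allowed extensions are hypothetical names and choices made for illustration; they are not part of Agno or Ollama.

```python
from pathlib import Path

# Extensions this sketch accepts. This set is an assumption made for
# illustration, not a list taken from the Ollama or Agno documentation.
ALLOWED_SUFFIXES = {".jpg", ".jpeg", ".png"}


def validate_image_path(path: Path) -> Path:
    """Return the resolved path, or raise if the file is unusable.

    Hypothetical helper: checks the extension first, then that the file exists.
    """
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"Unsupported image type: {path.suffix!r}")
    if not path.is_file():
        raise FileNotFoundError(f"Image not found: {path}")
    return path.resolve()


def describe_local_image(path: Path) -> None:
    """Ask the vision model to describe a local image file."""
    # Imported here so validate_image_path can be used without agno installed.
    from agno.agent import Agent
    from agno.media import Image
    from agno.models.ollama import Ollama

    agent = Agent(
        model=Ollama(id="llama3.2-vision"),
        markdown=True,
    )
    agent.print_response(
        "Tell me about this image",
        # Assumption: Image takes a local path via the filepath argument.
        images=[Image(filepath=validate_image_path(path))],
    )
```

With the Ollama server running and the model pulled, calling describe_local_image(Path("sample.jpg")) should print a description of the file, where sample.jpg is a placeholder for an image on disk. Checking the path before building the agent surfaces a missing or mistyped file name early, before any request is sent to the model.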
