Skip to main content
This example demonstrates how to create an agent that can analyze images and generate creative text content based on the visual content.

Code

image_to_text.py
from pathlib import Path

from agno.agent import Agent
from agno.media import Image
from agno.models.openai import OpenAIResponses

agent = Agent(
    model=OpenAIResponses(id="gpt-5.2"),
    markdown=True,
)

image_path = Path(__file__).parent.joinpath("sample.jpg")
agent.print_response(
    "Write a 3 sentence fiction story about the image",
    images=[Image(filepath=image_path)],
)

Usage

1

Set up your virtual environment

uv venv --python 3.12
source .venv/bin/activate
2

Install dependencies

uv pip install -U agno openai
3

Export your OpenAI API key

  export OPENAI_API_KEY="your_openai_api_key_here"
4

Run Agent

python image_to_text.py