This example demonstrates how a team can collaborate to analyze images and create engaging fiction stories using an image analyst and creative writer.

Code

cookbook/examples/teams/multimodal/image_to_text.py
from pathlib import Path

from agno.agent import Agent
from agno.media import Image
from agno.models.openai import OpenAIChat
from agno.team import Team

image_analyzer = Agent(
    name="Image Analyst",
    role="Analyze and describe images in detail",
    model=OpenAIChat(id="gpt-5-mini"),
    instructions=[
        "Analyze images carefully and provide detailed descriptions",
        "Focus on visual elements, composition, and key details",
    ],
)

creative_writer = Agent(
    name="Creative Writer",
    role="Create engaging stories and narratives",
    model=OpenAIChat(id="gpt-5-mini"),
    instructions=[
        "Transform image descriptions into compelling fiction stories",
        "Use vivid language and creative storytelling techniques",
    ],
)

# Create a team for collaborative image-to-text processing
image_team = Team(
    name="Image Story Team",
    model=OpenAIChat(id="gpt-5-mini"),
    members=[image_analyzer, creative_writer],
    instructions=[
        "Work together to create compelling fiction stories from images.",
        "Image Analyst: First analyze the image for visual details and context.",
        "Creative Writer: Transform the analysis into engaging fiction narratives.",
        "Ensure the story captures the essence and mood of the image.",
    ],
    markdown=True,
)

image_path = Path(__file__).parent.joinpath("sample.jpg")
image_team.print_response(
    "Write a 3 sentence fiction story about the image",
    images=[Image(filepath=image_path)],
)

Usage

1

Create a virtual environment

Open the Terminal and create a python virtual environment.
python3 -m venv .venv
source .venv/bin/activate
2

Install required libraries

pip install agno
3

Set environment variables

export OPENAI_API_KEY=****
4

Add sample image

# Add a sample.jpg image file in the same directory as the script
5

Run the agent

python cookbook/examples/teams/multimodal/image_to_text.py