This example demonstrates how to pass input to an agent as a dictionary, which is useful for multimodal inputs that combine text and images.

Code

cookbook/agents/input_and_output/input_as_dict.py
from agno.agent import Agent

# Pass one user message as a dict: "content" holds a text part and an image part.
Agent().print_response(
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
                },
            },
        ],
    },
    stream=True,
    markdown=True,
)
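
When the same message shape is needed for several prompts or images, the dictionary can be assembled by a small helper. The following is a minimal sketch, not part of the cookbook: `build_image_message` is a hypothetical name, and the URL is a placeholder. It only mirrors the dictionary structure shown in the example above.

```python
from typing import Any


def build_image_message(prompt: str, image_url: str) -> dict[str, Any]:
    """Return a user message dict with one text part and one image part.

    Hypothetical helper: it reproduces the dict layout used in the example,
    it is not an agno API.
    """
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


message = build_image_message(
    "What's in this image?",
    "https://example.com/boardwalk.jpg",  # placeholder URL (assumption)
)
print(message["content"][0]["text"])
```

The resulting `message` could then be passed in place of the inline dictionary, for instance `Agent().print_response(message, stream=True, markdown=True)`.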

Usage

1. Create a virtual environment

Open the terminal and create a Python virtual environment.
python3 -m venv .venv
source .venv/bin/activate

2. Install libraries

pip install -U agno

3. Run the agent

python cookbook/agents/input_and_output/input_as_dict.py