This agent uses Mistral's Pixtral vision model (pixtral-12b-2409) to transcribe the text in a document image.

transcribe_image.py
from agno.agent import Agent
from agno.media import Image
from agno.models.mistral.mistral import MistralChat

agent = Agent(
    model=MistralChat(id="pixtral-12b-2409"),
    markdown=True,
)

agent.print_response(
    "Transcribe this document.",
    images=[
        Image(url="https://ciir.cs.umass.edu/irdemo/hw-demo/page_example.jpg"),
    ],
)
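
The example above reads the image from a URL. The sketch below shows one way the same agent might be pointed at either a URL or a local file. The helper names image_kwargs and transcribe are hypothetical, and the sketch assumes that agno.media.Image accepts a filepath argument alongside url; check this against the Agno version you have installed.

```python
from pathlib import Path
from urllib.parse import urlparse


def image_kwargs(source: str) -> dict:
    """Map a URL or a local path to keyword arguments for agno.media.Image.

    Hypothetical helper: http(s) sources become {"url": ...}; anything else
    is treated as a local file and becomes {"filepath": Path(...)}.
    """
    if urlparse(source).scheme in ("http", "https"):
        return {"url": source}
    return {"filepath": Path(source)}


def transcribe(source: str) -> None:
    """Print a transcription of the image at `source` (URL or local path)."""
    # Imported lazily so image_kwargs can be used without agno installed.
    from agno.agent import Agent
    from agno.media import Image
    from agno.models.mistral.mistral import MistralChat

    agent = Agent(
        model=MistralChat(id="pixtral-12b-2409"),
        markdown=True,
    )
    # Assumption: Image(filepath=...) is supported in addition to Image(url=...).
    agent.print_response(
        "Transcribe this document.",
        images=[Image(**image_kwargs(source))],
    )
```

With this in place, transcribe("scan.jpg") would send a local file (scan.jpg is a placeholder name), while passing the URL from the example above would behave like the original script.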

Usage

1. Install libraries

pip install -U agno mistralai

2. Export API keys

export MISTRAL_API_KEY=***

3. Run the agent

python transcribe_image.py
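
If the API key was not exported, the run is likely to fail only once the request is made. A minimal pre-flight check, using only the standard library, could surface the problem earlier. The function name require_env is hypothetical and not part of Agno.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`.

    Raises RuntimeError if the variable is unset or contains only whitespace.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before running the agent.")
    return value
```

Calling require_env("MISTRAL_API_KEY") at the top of transcribe_image.py, before the agent is created, would stop the script with a clear message when the key is missing.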