Gemini
Audio Input (Upload the file)
Code
cookbook/models/google/gemini/audio_input_file_upload.py
from pathlib import Path

from agno.agent import Agent
from agno.media import Audio
from agno.models.google import Gemini

model = Gemini(id="gemini-2.0-flash-exp")
agent = Agent(
    model=model,
    markdown=True,
)

# Download a sample audio file and save it next to this script as sample.mp3.
audio_path = Path(__file__).parent.joinpath("sample.mp3")
audio_file = None
remote_file_name = f"files/{audio_path.stem.lower()}"

# First, check whether the file was already uploaded under this name.
try:
    audio_file = model.get_client().files.get(name=remote_file_name)
except Exception as e:
    print(f"Error getting file {audio_path.stem}: {e}")

# If the lookup failed, upload the local file.
if not audio_file:
    try:
        audio_file = model.get_client().files.upload(
            file=audio_path,
            config=dict(name=audio_path.stem, display_name=audio_path.stem),
        )
        print(f"Uploaded audio: {audio_file}")
    except Exception as e:
        print(f"Error uploading audio: {e}")

# Pass the uploaded file reference to the agent as audio input.
agent.print_response(
    "Tell me about this audio",
    audio=[Audio(content=audio_file)],
    stream=True,
)
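The lookup-then-upload steps in the code above can be factored into a small helper. This is only a minimal sketch: it assumes a client object exposing `files.get(name=...)` and `files.upload(file=..., config=...)` exactly as they are called in the example, and `get_or_upload` is a hypothetical name, not part of agno or google-genai.

```python
from pathlib import Path
from typing import Any, Optional


def get_or_upload(client: Any, path: Path) -> Optional[Any]:
    """Return the remote file for `path`, uploading only if the lookup fails.

    Hypothetical helper: `client` is assumed to behave like the object
    returned by `model.get_client()` in the example above.
    """
    remote_name = f"files/{path.stem.lower()}"

    # Try to reuse a previously uploaded file.
    try:
        return client.files.get(name=remote_name)
    except Exception as e:
        print(f"Error getting file {path.stem}: {e}")

    # Fall back to uploading the local file.
    try:
        return client.files.upload(
            file=path,
            config=dict(name=path.stem, display_name=path.stem),
        )
    except Exception as e:
        print(f"Error uploading audio: {e}")
        return None
```

With this helper, the example would reduce to `audio_file = get_or_upload(model.get_client(), audio_path)`. A `None` result means both the lookup and the upload failed, so the caller may want to check for it before building the `Audio` input.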
Usage
1. Create a virtual environment
   Open the Terminal and create a Python virtual environment.
   python3 -m venv .venv
   source .venv/bin/activate
2. Set your API key
   export GOOGLE_API_KEY=xxx
3. Install libraries
   pip install -U google-genai agno
4. Run Agent
   python cookbook/models/google/gemini/audio_input_file_upload.py