Audio Input Output
This agent takes an audio input and responds with an audio output.
import httpx
from agno.agent import Agent
from agno.media import Audio
from agno.models.openai import OpenAIChat
from agno.utils.audio import write_audio_to_file

# Fetch the audio file as raw WAV bytes
url = "https://openaiassets.blob.core.windows.net/$web/API/docs/audio/alloy.wav"
response = httpx.get(url)
response.raise_for_status()
wav_data = response.content

# Configure the model to return both text and audio
agent = Agent(
    model=OpenAIChat(
        id="gpt-4o-audio-preview",
        modalities=["text", "audio"],
        audio={"voice": "alloy", "format": "wav"},
    ),
    markdown=True,
)

# Send the recording to the agent along with a text prompt
agent.run(
    "What's in this recording?",
    audio=[Audio(content=wav_data, format="wav")],
)

# Save the audio reply, if the model returned one
if agent.run_response.response_audio is not None:
    write_audio_to_file(
        audio=agent.run_response.response_audio.content, filename="tmp/result.wav"
    )
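
The last step hands the response audio to write_audio_to_file. As a rough illustration of what saving such a payload involves, here is a minimal standard-library sketch. It assumes the audio content arrives as a base64-encoded string, which this page does not confirm. The helpers make_silent_wav and save_base64_audio and the path tmp/sketch.wav are hypothetical names used only for this illustration; they are not part of agno.

```python
import base64
import io
import wave
from pathlib import Path


def make_silent_wav(seconds: float = 0.1, sample_rate: int = 16000) -> bytes:
    """Build a small mono 16-bit PCM WAV in memory (stand-in for real audio)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    return buffer.getvalue()


def save_base64_audio(encoded: str, filename: str) -> Path:
    """Decode a base64 string and write the bytes, creating parent folders."""
    path = Path(filename)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(base64.b64decode(encoded))
    return path


# Simulate a base64 audio payload (assumed format), then save it to disk
raw = make_silent_wav()
encoded = base64.b64encode(raw).decode("ascii")
saved = save_base64_audio(encoded, "tmp/sketch.wav")
print(saved, saved.stat().st_size, "bytes")
```

The round trip (encode, decode, write) leaves the WAV bytes unchanged, so the saved file can be opened by any player that handles PCM WAV.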
Usage
1. Install libraries
    pip install -U agno openai
2. Export API keys
    export OPENAI_API_KEY=***
3. Run the agent
    python audio_input_output.py
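
After the run, it can be useful to confirm that tmp/result.wav is a playable file. The sketch below reads the WAV header with Python's standard wave module. It assumes the output is standard PCM WAV, which this page does not state; describe_wav is a hypothetical helper name, not an agno function.

```python
import wave
from pathlib import Path


def describe_wav(path: str) -> dict:
    """Return basic header fields of a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_width_bytes": wav_file.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_seconds": frames / rate,
        }


result_path = Path("tmp/result.wav")  # output path used by the agent script
if result_path.exists():
    print(describe_wav(str(result_path)))
else:
    print(f"{result_path} not found; run the agent first")
```

If the wave module raises an error on the file, the output may use a header layout or encoding that this simple check does not cover.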