Common ways to transcribe audio to text include:
Using OpenAI Whisper (Cloud)
The agent in cookbook/tools/models/openai_tools.py uses the OpenAI Whisper API for audio transcription.
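As a rough illustration of the underlying API call (not the cookbook agent itself), the sketch below sends a local audio file to OpenAI's transcription endpoint. It assumes the official `openai` Python package, the `whisper-1` model, and an `OPENAI_API_KEY` in the environment. The helper name `transcribe_file` and its `client` parameter are hypothetical and exist only for this example.

```python
from pathlib import Path


def transcribe_file(path, client=None, model="whisper-1"):
    """Send an audio file to the OpenAI transcription endpoint and return its text."""
    if client is None:
        # Imported lazily so the helper can also be exercised with a stub client.
        from openai import OpenAI

        client = OpenAI()  # assumption: OPENAI_API_KEY is set in the environment
    with Path(path).open("rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)
    return result.text
```

Calling `transcribe_file("meeting.mp3")` would return the transcript as a plain string; the file name here is a placeholder.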
Using Multimodal Models
Multimodal models like Gemini can transcribe audio directly without additional tools; see cookbook/agents/multimodal/audio_to_text.py.
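A minimal sketch of this approach, written independently of the cookbook file, might pass the raw audio bytes to a Gemini-backed agent along with a transcription prompt. The imports `agno.agent.Agent`, `agno.media.Audio`, and `agno.models.google.Gemini`, the model id `gemini-2.0-flash`, and the `audio=` argument are assumptions about the installed Agno version. The helper names `load_audio` and `transcribe_with_gemini` are hypothetical.

```python
from pathlib import Path


def load_audio(path):
    """Return the raw bytes and the extension-derived format of an audio file."""
    p = Path(path)
    return p.read_bytes(), p.suffix.lstrip(".").lower()


def transcribe_with_gemini(path, model_id="gemini-2.0-flash"):
    """Ask a multimodal model to transcribe the audio; no extra tools involved."""
    # Assumed Agno imports, kept lazy so load_audio works without Agno installed.
    from agno.agent import Agent
    from agno.media import Audio
    from agno.models.google import Gemini

    content, fmt = load_audio(path)
    agent = Agent(model=Gemini(id=model_id))
    response = agent.run(
        "Transcribe this audio verbatim.",
        audio=[Audio(content=content, format=fmt)],
    )
    return response.content
```

The design point is that the audio travels as model input rather than through a tool call, so the agent needs no `tools` list.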
Team-Based Transcription
Teams can handle complex audio processing workflows with multiple specialized agents; see cookbook/teams/multimodal/audio_to_text.py.
Developer Resources
- View Multimodal Examples
- View Team Examples
- View OpenAI Toolkit