This example demonstrates how to perform sentiment analysis on audio conversations using Agno agents with multimodal capabilities.
import requests
from agno.agent import Agent
from agno.db.sqlite import SqliteDb
from agno.media import Audio
from agno.models.google import Gemini

# Session history is stored in a local SQLite file so follow-up questions can reuse earlier turns.
agent = Agent(
    model=Gemini(id="gemini-2.0-flash-exp"),
    add_history_to_context=True,
    markdown=True,
    db=SqliteDb(
        session_table="audio_sentiment_analysis_sessions",
        db_file="tmp/audio_sentiment_analysis.db",
    ),
)

url = "https://agno-public.s3.amazonaws.com/demo_data/sample_conversation.wav"

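# Download the audio file and fail early if the request did not succeed.
response = requests.get(url)
response.raise_for_status()
audio_content = response.content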

# First turn: attach the audio and ask for a per-speaker sentiment analysis.
agent.print_response(
    "Give a sentiment analysis of this audio conversation. Use speaker A, speaker B to identify speakers.",
    audio=[Audio(content=audio_content)],
    stream=True,
)

# Follow-up turn: no audio is attached; the agent draws on the session history.
agent.print_response(
    "What else can you tell me about this audio conversation?",
    stream=True,
)

Key Features

  • Audio Processing: Downloads an audio file from a remote URL and passes its raw bytes to the agent
  • Sentiment Analysis: Analyzes the emotional tone and sentiment of the conversation
  • Speaker Identification: Labels the speakers as Speaker A and Speaker B, as requested in the prompt
  • Persistent Sessions: Maintains conversation history in a SQLite database
  • Streaming Response: Streams the response as it is generated instead of waiting for the full answer
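
Since the example sends whatever bytes the download returns, it can be useful to check that they really are WAV audio before calling the agent. The following is a minimal sketch using only Python's standard `wave` module; `describe_wav` and `make_silent_wav` are hypothetical helper names introduced here for illustration and are not part of Agno.

```python
import io
import wave


def describe_wav(audio_bytes: bytes) -> dict:
    """Return basic properties of WAV data held in memory.

    Raises wave.Error (or EOFError for very short input) if the
    bytes are not a readable WAV file.
    """
    with wave.open(io.BytesIO(audio_bytes), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate if rate else 0.0,
        }


def make_silent_wav(seconds: float = 1.0, rate: int = 8000) -> bytes:
    """Build a mono, 16-bit silent WAV in memory (a stand-in for a real download)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    return buffer.getvalue()


if __name__ == "__main__":
    # In the example above, audio_content would be passed in place of the stand-in.
    print(describe_wav(make_silent_wav(seconds=2.0)))
```

In the example above, calling `describe_wav(audio_content)` right after the download would raise an error for a non-WAV payload (such as an HTML error page) before any model call is made.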

Use Cases

  • Customer service call analysis
  • Meeting sentiment tracking
  • Interview evaluation
  • Call center quality monitoring