Document Summarizer

An intelligent document summarization agent that processes various document types (PDF, text, web pages) and produces structured summaries with key points, entities, and action items.

What You’ll Learn

Concept | Description
Structured Output | Using Pydantic schemas for consistent agent responses
Document Loading | Reading PDFs, text files, and web pages
Entity Extraction | Identifying people, organizations, dates, and technologies
Action Items | Finding tasks and next steps in documents

Prerequisites

  • Python 3.12+
  • OpenAI API key

Setup

1. Clone the repository

     git clone https://github.com/agno-agi/agno.git
     cd agno

2. Create and activate a virtual environment

     uv venv --python 3.12
     source .venv/bin/activate

3. Install dependencies

     uv pip install -r cookbook/01_showcase/01_agents/document_summarizer/requirements.in

4. Set environment variables

     export OPENAI_API_KEY=your-openai-key

Run the Agent

Basic Summary

Summarize a document and access structured fields:
python cookbook/01_showcase/01_agents/document_summarizer/examples/basic_summary.py
Demonstrates:
  • Summarizing a text file
  • Accessing structured summary fields
  • Displaying key points and entities

Entity Extraction

Focus on extracting and categorizing entities:
python cookbook/01_showcase/01_agents/document_summarizer/examples/extract_entities.py
Demonstrates:
  • Entity extraction from documents
  • Categorizing by type (person, organization, date, etc.)
  • Contextual information for each entity
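The categorization step can be sketched as a simple group-by over extracted entities. The `Entity` dataclass here is a hypothetical stand-in for the cookbook's actual model, assuming field names consistent with this page:

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Entity:
    name: str
    type: str      # person, organization, date, technology, ...
    context: str   # short note on where/how the entity appears

def group_by_type(entities: list[Entity]) -> dict[str, list[Entity]]:
    """Bucket extracted entities by their category."""
    grouped: dict[str, list[Entity]] = defaultdict(list)
    for entity in entities:
        grouped[entity.type].append(entity)
    return dict(grouped)
```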

Batch Processing

Process multiple documents:
python cookbook/01_showcase/01_agents/document_summarizer/examples/batch_processing.py
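Conceptually, a batch run is a loop over paths that collects each structured result and skips unreadable files. The `summarize` function below is a hypothetical stand-in for a call into the agent, not the cookbook's implementation:

```python
from pathlib import Path

def summarize(path: Path) -> dict:
    # Placeholder: a real implementation would invoke the summarizer agent here.
    return {"title": path.stem, "word_count": len(path.read_text().split())}

def summarize_batch(paths: list[Path]) -> list[dict]:
    """Process documents one by one, skipping files that cannot be read."""
    results = []
    for path in paths:
        try:
            results.append(summarize(path))
        except OSError as exc:
            print(f"Skipping {path}: {exc}")
    return results
```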

Agent Configuration

# Imports assumed from the agno library (check the cookbook source for exact paths):
from agno.agent import Agent
from agno.models.openai import OpenAIResponses
from agno.tools.reasoning import ReasoningTools

# SYSTEM_MESSAGE, DocumentSummary, read_pdf, read_text_file, and fetch_url
# are defined alongside this agent in the cookbook module.

summarizer_agent = Agent(
    name="Document Summarizer",
    model=OpenAIResponses(id="gpt-5.2"),
    system_message=SYSTEM_MESSAGE,
    output_schema=DocumentSummary,
    tools=[
        ReasoningTools(add_instructions=True),
        read_pdf,
        read_text_file,
        fetch_url,
    ],
    add_datetime_to_context=True,
    add_history_to_context=True,
    num_history_runs=5,
    enable_agentic_memory=True,
    markdown=True,
)
Parameter | Purpose
model | GPT-5.2 for document understanding
output_schema | Pydantic model for structured output
read_pdf | Extract text from PDF files
read_text_file | Read .txt and .md files
fetch_url | Fetch and clean web page content
ReasoningTools | Plan extraction approach before summarizing

How It Works

Document Processing

1. Detect source type (PDF, URL, or text file)
2. Extract content using appropriate tool
3. Analyze document structure and type
4. Generate structured summary
5. Extract entities and action items
6. Return validated output with confidence score
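Step 1 can be sketched as a small dispatcher on the source string. This is illustrative only; the cookbook's actual detection happens through the agent's tool selection:

```python
from urllib.parse import urlparse

def detect_source_type(source: str) -> str:
    """Classify a source string as 'url', 'pdf', or 'text'."""
    if urlparse(source).scheme in ("http", "https"):
        return "url"
    if source.lower().endswith(".pdf"):
        return "pdf"
    return "text"  # .txt, .md, and anything else readable as plain text
```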

Output Schema

class DocumentSummary(BaseModel):
    title: str                    # Document title or inferred subject
    document_type: str            # report, article, meeting_notes, etc.
    summary: str                  # Concise summary in 1-3 paragraphs
    key_points: list[str]         # 3-7 main takeaways
    entities: list[Entity]        # People, organizations, dates
    action_items: list[ActionItem] # Tasks with owner, deadline, priority
    word_count: int               # Original document word count
    confidence: float             # 0-1 confidence score

Document Type Detection

Type | Indicators
report | Formal structure, findings, conclusions
article | News format, byline, publication date
meeting_notes | Attendees, agenda, discussion items
research_paper | Abstract, methodology, citations
email | To/From headers, subject line
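To make the indicators concrete, here is a toy keyword-scoring heuristic. This is not the agent's actual detection, which is performed by the model, but it illustrates how the indicators distinguish document types:

```python
INDICATORS = {
    "report": ["findings", "conclusion", "executive summary"],
    "meeting_notes": ["attendees", "agenda", "action items"],
    "research_paper": ["abstract", "methodology", "references"],
    "email": ["subject:", "from:", "to:"],
}

def guess_document_type(text: str) -> str:
    """Return the type whose indicator keywords appear most often."""
    lowered = text.lower()
    scores = {
        doc_type: sum(lowered.count(word) for word in words)
        for doc_type, words in INDICATORS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "article"  # fall back to a generic type
```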

Troubleshooting

  • Empty output from a PDF: The PDF may be image-based (scanned). This agent handles text-based PDFs only; for scanned documents, use the Invoice Analyst, which has vision capabilities.
  • Incomplete web page content: JavaScript-heavy pages may not extract fully, since the agent uses BeautifulSoup for static HTML extraction.
  • Low confidence score: This indicates poor document quality, ambiguous content, or missing context. Review the document manually and provide additional context if needed.
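Callers can route low-confidence results to manual review with a simple threshold check. The 0.6 cutoff below is an arbitrary example, not a cookbook default:

```python
def needs_review(confidence: float, threshold: float = 0.6) -> bool:
    """Flag summaries whose confidence falls below the threshold."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {confidence}")
    return confidence < threshold
```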

Source Code