What You’ll Learn
| Concept | Description |
|---|---|
| Vision Extraction | Using LLM vision to understand document layouts |
| Structured Output | Extracting complex nested data (vendor, line items, totals) |
| Data Validation | Verifying extracted data for accuracy |
| Document Processing | Handling PDFs and images |
Prerequisites
- Python 3.12+
- OpenAI API key
- poppler (system dependency for PDF to image conversion)
Setup
1. Clone the repository
2. Create and activate virtual environment
3. Install system dependencies
4. Install Python dependencies
5. Set environment variables
Run the Agent
Single Invoice Extraction
Extract data from a single invoice. This example covers:
- Loading an invoice document
- Extracting structured data
- Accessing vendor, line items, and totals
Data Validation
Validate extracted invoice data. This example covers:
- Validating line item math
- Checking subtotal and total calculations
- Identifying data quality issues
Batch Processing
Process multiple invoices.
Agent Configuration
| Parameter | Purpose |
|---|---|
| `model` | GPT-5.2 with vision capabilities |
| `output_schema` | Pydantic model for structured invoice data |
| `ReasoningTools` | Plan extraction approach and validate data |
How It Works
Extraction Workflow
Output Schema
Validation Rules
| Check | Formula |
|---|---|
| Line item math | quantity × unit_price = amount |
| Subtotal | Sum of line items ≈ subtotal |
| Total | subtotal + tax - discount + shipping ≈ total |
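The three checks in the table can be sketched as a small function. This is an assumption-laden illustration, not the agent's actual validator: the function name `validate_invoice`, the tuple layout for line items, and the 0.01 tolerance used for the ≈ comparisons are all hypothetical choices.

```python
from decimal import Decimal

# Assumed rounding tolerance for the "approximately equal" checks.
TOLERANCE = Decimal("0.01")


def validate_invoice(line_items, subtotal, tax, discount, shipping, total,
                     tolerance=TOLERANCE):
    """Return a list of warning strings; an empty list means all checks passed.

    line_items is a list of (quantity, unit_price, amount) Decimal tuples.
    """
    warnings = []

    # Check 1: quantity x unit_price = amount for every line item.
    for index, (quantity, unit_price, amount) in enumerate(line_items, start=1):
        expected = quantity * unit_price
        if abs(expected - amount) > tolerance:
            warnings.append(
                f"Line {index}: expected amount {expected}, found {amount}"
            )

    # Check 2: sum of line item amounts is close to the subtotal.
    items_sum = sum((amount for _, _, amount in line_items), Decimal("0"))
    if abs(items_sum - subtotal) > tolerance:
        warnings.append(f"Subtotal: line items sum to {items_sum}, found {subtotal}")

    # Check 3: subtotal + tax - discount + shipping is close to the total.
    expected_total = subtotal + tax - discount + shipping
    if abs(expected_total - total) > tolerance:
        warnings.append(f"Total: expected {expected_total}, found {total}")

    return warnings


# Made-up sample data for illustration.
items = [
    (Decimal("2"), Decimal("10.00"), Decimal("20.00")),
    (Decimal("1"), Decimal("5.50"), Decimal("5.50")),
]
clean = validate_invoice(items, Decimal("25.50"), Decimal("2.04"),
                         Decimal("0"), Decimal("5.00"), Decimal("32.54"))
wrong_total = validate_invoice(items, Decimal("25.50"), Decimal("2.04"),
                               Decimal("0"), Decimal("5.00"), Decimal("33.00"))
print(clean)
print(wrong_total)
```

`Decimal` is used instead of `float` so that currency arithmetic does not introduce its own rounding noise; the tolerance then only has to absorb rounding that is already present on the invoice.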
Troubleshooting
PDF conversion fails
Ensure poppler is installed:
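One quick way to confirm the installation is to check that poppler's command-line tools are reachable from Python. The sketch below assumes the conversion step relies on binaries such as `pdftoppm` and `pdfinfo`, which ship with poppler; the helper name `find_poppler_tools` is hypothetical.

```python
import shutil


def find_poppler_tools(tools=("pdftoppm", "pdfinfo")):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


found = find_poppler_tools()
missing = [tool for tool, path in found.items() if path is None]

if missing:
    # Typical install commands: `brew install poppler` on macOS,
    # `apt-get install poppler-utils` on Debian or Ubuntu.
    print("Missing poppler tools:", ", ".join(missing))
else:
    print("poppler tools found:", found)
```

If the tools are installed but still reported missing, the directory containing them is probably not on the `PATH` of the environment running the agent.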
Low confidence on scanned invoices
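Scanned documents may have poor image quality, skewed alignment, or background noise. Try improving the scan before extraction: rescan at a higher resolution, straighten the page, and reduce background noise where possible.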
Math validation warnings
Common causes: rounding differences, hidden fees, or multi-page invoices with partial totals. Review warnings and verify manually if needed.