The S3PDFKnowledgeBase reads PDF files from an S3 bucket, converts them into vector embeddings, and loads them into a vector database.

Usage

This example uses a local PgVector database. Make sure it is running before you run the code.

from agno.knowledge.s3.pdf import S3PDFKnowledgeBase
from agno.vectordb.pgvector import PgVector

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"

knowledge_base = S3PDFKnowledgeBase(
    bucket_name="agno-public",
    key="recipes/ThaiRecipes.pdf",
    vector_db=PgVector(table_name="recipes", db_url=db_url),
)

Then use the knowledge_base with an Agent. The import below assumes the snippet above is saved as knowledge_base.py:

from agno.agent import Agent
from knowledge_base import knowledge_base

agent = Agent(
    knowledge=knowledge_base,
    search_knowledge=True,
)
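# Load the PDF into the vector database; recreate=False keeps the existing table
agent.knowledge.load(recreate=False)

# Ask a question; the agent searches the knowledge base for relevant passages
agent.print_response("How to make Thai curry?")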
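
To make the load-then-search flow above more concrete, here is a minimal conceptual sketch in plain Python. It is not Agno's implementation: the names chunk, embed, cosine, and ToyVectorStore are hypothetical, the bag-of-words "embedding" stands in for a real embedding model, an in-memory list stands in for PgVector, and the two hard-coded strings stand in for text extracted from a PDF in S3.

```python
import math
import re
from collections import Counter


def chunk(text, size=20):
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding: word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector database table (hypothetical)."""

    def __init__(self):
        self.rows = []  # list of (chunk_text, vector) pairs

    def load(self, documents):
        # Chunk each document, embed each chunk, and store both together.
        for doc in documents:
            for piece in chunk(doc):
                self.rows.append((piece, embed(piece)))

    def search(self, query, limit=1):
        # Rank stored chunks by similarity to the embedded query.
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[1]), reverse=True)
        return [text for text, _ in ranked[:limit]]


# Stand-ins for text that a PDF reader would extract from the file.
pages = [
    "Green curry paste is fried in coconut cream before the chicken is added.",
    "Mango sticky rice is served with sweetened coconut milk.",
]

store = ToyVectorStore()
store.load(pages)
top = store.search("How to make Thai curry?")
print(top[0])
```

In this sketch the query shares the word "curry" with the first page only, so that page is ranked first. A real knowledge base performs the same steps with a learned embedding model and a persistent vector database, and the agent passes the retrieved passages to the model as context.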

Params

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| bucket_name | str | None | The name of the S3 bucket that contains the PDFs. |
| key | str | None | The key of the PDF file in the bucket. |
| reader | S3PDFReader | S3PDFReader() | An S3PDFReader that converts the PDFs into Documents for the vector database. |

S3PDFKnowledgeBase is a subclass of the AgentKnowledge class and accepts the same parameters.

Developer Resources