Skip to main content
The ArXiv Reader allows you to search and read academic papers from the ArXiv preprint repository, converting them into vector embeddings for your knowledge base.

Code

examples/concepts/knowledge/readers/arxiv_reader.py
from agno.agent import Agent
from agno.knowledge.knowledge import Knowledge
from agno.knowledge.reader.arxiv_reader import ArxivReader
from agno.vectordb.pgvector import PgVector

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"

# Create a knowledge base with the ArXiv documents
knowledge = Knowledge(
    # Table name: ai.arxiv_documents
    vector_db=PgVector(
        table_name="arxiv_documents",
        db_url=db_url,
    ),
)
# Load the knowledge
knowledge.add_content(
    topics=["Generative AI", "Machine Learning"],
    reader=ArxivReader(),
)

# Create an agent with the knowledge
agent = Agent(
    knowledge=knowledge,
    search_knowledge=True,
)

# Ask the agent about the knowledge
agent.print_response("What can you tell me about Generative AI?", markdown=True)

Usage

1

Create a virtual environment

Open the Terminal and create a python virtual environment.
python3 -m venv .venv
source .venv/bin/activate
2

Install libraries

pip install -U arxiv sqlalchemy psycopg pgvector agno
3

Run PgVector

docker run -d \
  -e POSTGRES_DB=ai \
  -e POSTGRES_USER=ai \
  -e POSTGRES_PASSWORD=ai \
  -e PGDATA=/var/lib/postgresql/data/pgdata \
  -v pgvolume:/var/lib/postgresql/data \
  -p 5532:5432 \
  --name pgvector \
  agno/pgvector:16
4

Run Agent

python examples/concepts/knowledge/readers/arxiv_reader.py

Params

ParameterTypeDefaultDescription
max_resultsint5Top articles
sort_byarxiv.SortCriterionarxiv.SortCriterion.RelevanceSort criterion for Arxiv search results
I