This reference guide defines the key concepts and terminology you’ll encounter when working with Knowledge in Agno.

Key Terminology

Knowledge Base

A structured repository of information that agents can search and retrieve from at runtime. It holds content that has already been parsed, chunked, and embedded so that it can be retrieved efficiently.

Agentic RAG

Retrieval Augmented Generation in which the agent actively decides when to search, what to search for, and how to use the retrieved information. Unlike traditional RAG, which runs a fixed retrieval step before every response, the agent controls the search process itself.

Vector Embeddings

Numeric vectors that represent text and capture its semantic meaning. Words and phrases with similar meanings map to nearby vectors, so search can match on meaning rather than on exact keywords.
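To make "similar meanings have similar embeddings" concrete, here is a minimal sketch that ranks words by cosine similarity. The three-dimensional vectors are hand-made for illustration only; real embeddings come from an embedding model and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


# Toy vectors invented for this example (assumption, not model output).
toy_embeddings = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.85, 0.15, 0.05],
    "banana": [0.0, 0.2, 0.95],
}

query = toy_embeddings["car"]
# Rank every other word by how close its vector is to the query vector.
ranked = sorted(
    (word for word in toy_embeddings if word != "car"),
    key=lambda word: cosine_similarity(query, toy_embeddings[word]),
    reverse=True,
)
print(ranked)
```

With these toy values, "automobile" ranks ahead of "banana" even though neither shares a keyword with "car".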

Chunking

The process of breaking large documents into smaller, manageable pieces that are optimal for search and retrieval while preserving context.

Dynamic Few-Shot Learning

The pattern where agents retrieve specific examples or context at runtime to improve their performance on tasks, rather than having all information provided upfront.
Scenario: Building a Text-to-SQL Agent
Instead of cramming all table schemas, column names, and example queries into the system prompt, you store this information in a knowledge base.
When a user asks for data, the agent:
  1. Analyzes the request
  2. Searches for relevant schema information and example queries
  3. Uses the retrieved context to generate the best possible SQL query
This is “dynamic” because the agent gets exactly the information it needs for each specific query, and “few-shot” because it learns from examples retrieved at runtime.
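The Text-to-SQL scenario can be sketched in a few lines. This is a simplified illustration, not Agno's implementation: the example store, table names, and the keyword-overlap ranking are all assumptions, and a real knowledge base would rank by embedding similarity instead.

```python
import re

# Hypothetical example store; the tables and queries are invented for illustration.
EXAMPLES = [
    {"question": "Count orders per customer",
     "sql": "SELECT customer_id, COUNT(*) FROM orders GROUP BY customer_id;"},
    {"question": "List products that are out of stock",
     "sql": "SELECT name FROM products WHERE stock = 0;"},
    {"question": "Total revenue from all orders",
     "sql": "SELECT SUM(total) FROM orders;"},
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def retrieve_examples(question: str, examples: list[dict], k: int = 2) -> list[dict]:
    """Return the k examples whose questions share the most words with the query."""
    query_terms = tokenize(question)
    ranked = sorted(
        examples,
        key=lambda ex: len(query_terms & tokenize(ex["question"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, examples: list[dict]) -> str:
    """Place the retrieved examples ahead of the new question as few-shot context."""
    shots = "\n\n".join(
        f"Question: {ex['question']}\nSQL: {ex['sql']}" for ex in examples
    )
    return f"{shots}\n\nQuestion: {question}\nSQL:"


question = "How many orders did each customer place?"
prompt = build_prompt(question, retrieve_examples(question, EXAMPLES))
print(prompt)
```

Only the two most relevant examples reach the prompt, so the unrelated products query never takes up context space.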

Core Knowledge Components

Content Sources

The raw information you want your agents to access:
  • Documents: PDFs, Word files, text files
  • Websites: URLs, web pages, documentation sites
  • Databases: SQL databases, APIs, structured data
  • Text: Direct text content, notes, policies

Readers

Specialized components that parse different content types and extract meaningful text:
  • PDFReader: Extracts text from PDF files, handles encryption
  • WebsiteReader: Crawls web pages and extracts content
  • CSVReader: Processes tabular data from CSV files
  • Custom Readers: Build your own for specialized data sources

Chunking Strategies

Methods for breaking content into optimal pieces:
  • Semantic Chunking: Respects natural content boundaries
  • Fixed Size: Uniform chunk sizes with overlap
  • Document Chunking: Preserves document structure
  • Recursive Chunking: Hierarchical splitting with multiple separators
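As an illustration of the Fixed Size strategy, the sketch below splits text into character windows that overlap by a set amount. The function name and parameters are hypothetical and are not Agno's chunking API; production chunkers usually count tokens rather than characters.

```python
def chunk_fixed_size(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into windows of chunk_size characters that overlap by `overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far each new window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


print(chunk_fixed_size("abcdefghij", chunk_size=4, overlap=2))
```

The overlap means a sentence cut at a chunk boundary still appears intact in at least one neighbouring chunk, at the cost of storing some text twice.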

Vector Databases

Storage systems optimized for similarity search:
  • PgVector: PostgreSQL extension for vector storage
  • LanceDB: Fast, embedded vector database
  • Pinecone: Managed vector database service
  • Qdrant: High-performance vector search engine

Component Relationships

The Knowledge system combines these components in a coordinated pipeline: Readers → Chunking → Embedders → Vector Databases → Agent Retrieval.
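The flow of data through the pipeline can be sketched end to end with toy stand-ins for each stage. Everything here is an assumption made for illustration: the word-count "embedder" and the in-memory store only mimic the roles of a real embedding model and vector database, and none of these names are Agno classes.

```python
import math
import re
from collections import Counter


def read(raw: str) -> str:
    """Reader stand-in: normalise whitespace in raw text."""
    return " ".join(raw.split())


def chunk(text: str) -> list[str]:
    """Chunking stand-in: one chunk per sentence."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def embed(text: str) -> Counter:
    """Embedder stand-in: word counts instead of a learned dense vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Vector database stand-in: keeps (vector, text) rows in a list."""

    def __init__(self) -> None:
        self.rows: list[tuple[Counter, str]] = []

    def add(self, text: str) -> None:
        self.rows.append((embed(text), text))

    def search(self, query: str, limit: int = 1) -> list[str]:
        query_vector = embed(query)
        ranked = sorted(
            self.rows, key=lambda row: similarity(query_vector, row[0]), reverse=True
        )
        return [text for _, text in ranked[:limit]]


raw = (
    "Refunds are issued within 14 days.   Shipping takes 3 to 5 business days. "
    "Support is open on weekdays."
)
store = ToyVectorStore()
for piece in chunk(read(raw)):  # Readers -> Chunking
    store.add(piece)            # Embedders -> Vector Databases

# Agent Retrieval: the agent would pass the result to the model as context.
print(store.search("How long does shipping take?"))
```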

Advanced Knowledge Features

Custom Knowledge Retrievers

For complete control over how agents search your knowledge:
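from typing import Optional

from agno.agent import Agent


def custom_retriever(
    agent: Agent,
    query: str,
    num_documents: Optional[int] = 5,
    **kwargs
) -> Optional[list[dict]]:
    # Custom search logic
    # Filter by user permissions
    # Apply business rules
    # Return curated results (or None when nothing matches)
    pass

# Register the function so the agent uses it for knowledge search
agent = Agent(knowledge_retriever=custom_retriever)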
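To show what the commented steps might look like once filled in, here is a self-contained sketch of a retriever with the same signature. It is an assumption-laden illustration: the in-memory `DOCUMENTS` list, the `allowed_departments` keyword argument, and the word-overlap scoring are all hypothetical, and `agent` is typed as `Any` so the example runs without Agno installed.

```python
from typing import Any, Optional

# Hypothetical corpus standing in for a real vector database.
DOCUMENTS = [
    {"content": "Deploy the API with the blue-green strategy.",
     "metadata": {"department": "engineering"}},
    {"content": "Salary bands are reviewed every January.",
     "metadata": {"department": "hr"}},
    {"content": "The API rate limit is 100 requests per minute.",
     "metadata": {"department": "engineering"}},
]


def permission_aware_retriever(
    agent: Any,
    query: str,
    num_documents: Optional[int] = 5,
    **kwargs: Any,
) -> Optional[list[dict]]:
    # Filter by user permissions (allowed_departments is a made-up kwarg).
    allowed = set(kwargs.get("allowed_departments", []))
    visible = [d for d in DOCUMENTS if d["metadata"]["department"] in allowed]

    # Custom search logic: score each visible document by shared words.
    terms = set(query.lower().split())
    scored = [(len(terms & set(d["content"].lower().split())), d) for d in visible]

    # Return curated results: best matches first, dropping zero-score documents.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    matches = [doc for score, doc in ranked if score > 0]
    return matches[:num_documents] or None


hits = permission_aware_retriever(
    None, "api rate limit", num_documents=1, allowed_departments=["engineering"]
)
print(hits)
```

Filtering before scoring means documents the caller may not see never enter the ranking at all.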

Asynchronous Operations

Optimize performance with async knowledge operations:
# Run these calls inside an async function (for example, one started with asyncio.run)

# Async content loading for better performance
await knowledge.add_content_async(path="large_dataset/")

# Async agent responses
response = await agent.arun("What's in the dataset?")
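The benefit of async operations comes from overlapping I/O-bound work. The sketch below uses only the standard library to show the pattern; `load_source` is a hypothetical stand-in for a slow load such as reading or embedding one file, not an Agno function.

```python
import asyncio


async def load_source(name: str, delay: float) -> str:
    """Stand-in for an I/O-bound load; the sleep simulates waiting on disk or network."""
    await asyncio.sleep(delay)
    return f"loaded {name}"


async def load_all(names: list[str]) -> list[str]:
    # gather runs every load concurrently and returns results in input order.
    return await asyncio.gather(*(load_source(name, 0.05) for name in names))


results = asyncio.run(load_all(["a.pdf", "b.pdf", "c.pdf"]))
print(results)
```

Because the three loads wait at the same time, the total runtime is close to one delay rather than the sum of all three.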

Knowledge Filtering

Control what information agents can access:
# Filter by metadata
knowledge.add_content(
    path="docs/",
    filters={"department": "engineering", "clearance": "public"}
)
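Conceptually, metadata filtering keeps only the chunks whose metadata matches every requested key and value. The following is a minimal sketch of that matching rule under simple assumptions (exact equality, all keys required); the chunk data and function name are invented, and real vector databases typically apply such filters inside the query itself.

```python
def matches_filters(metadata: dict, filters: dict) -> bool:
    """True when the metadata contains every filter key with an equal value."""
    return all(metadata.get(key) == value for key, value in filters.items())


# Hypothetical stored chunks with attached metadata.
chunks = [
    {"text": "Build pipeline overview",
     "metadata": {"department": "engineering", "clearance": "public"}},
    {"text": "Incident postmortem",
     "metadata": {"department": "engineering", "clearance": "internal"}},
    {"text": "Holiday schedule",
     "metadata": {"department": "hr", "clearance": "public"}},
]

filters = {"department": "engineering", "clearance": "public"}
visible = [c["text"] for c in chunks if matches_filters(c["metadata"], filters)]
print(visible)
```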

Next Steps

I