Knowledge is stored in a vector database, and this search-on-demand pattern is called Agentic RAG.
Agno Agents use Agentic RAG by default: if you add knowledge to an Agent, it searches that knowledge at runtime for the specific information it needs to complete its task. The basic steps for adding knowledge to an Agent are:
from agno.agent import Agent
from agno.knowledge.knowledge import Knowledge
import asyncio

# Create a knowledge instance for the Agent
knowledge = Knowledge(vector_db=...)

# Add some knowledge content
asyncio.run(
    knowledge.add_content_async(
        text_content="The sky is blue",
    )
)

# Add the Knowledge to the Agent and
# give it a tool to search its knowledge as needed
agent = Agent(knowledge=knowledge, search_knowledge=True)
If you need complete control over the knowledge base search, you can pass your own knowledge_retriever function with the following signature:
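from typing import Optional

from agno.agent import Agent


def knowledge_retriever(
    agent: Agent, query: str, num_documents: Optional[int] = 5, **kwargs
) -> Optional[list[dict]]:
    # Return the documents that match `query` (or None)
    ...


# Pass the function itself; do not call it here
agent = Agent(
    knowledge_retriever=knowledge_retriever,
)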
This function is called during search_knowledge_base() and is used by the Agent to retrieve references from the knowledge. For more details check out the Custom Retriever page.
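As an illustration of the signature above, here is a minimal sketch of a retriever body. It is not Agno's implementation: the in-memory DOCUMENTS list, the "name" and "content" keys, and the word-overlap scoring are all assumptions made for the example, and the agent parameter is typed as Any so the snippet runs without Agno installed.

```python
from typing import Any, Optional

# Hypothetical in-memory corpus standing in for a real knowledge store.
# The "name" and "content" keys are assumptions, not a required schema.
DOCUMENTS = [
    {"name": "sky", "content": "The sky is blue"},
    {"name": "curry", "content": "Thai curry uses coconut milk and curry paste"},
]


def knowledge_retriever(
    agent: Any, query: str, num_documents: Optional[int] = 5, **kwargs
) -> Optional[list[dict]]:
    """Rank documents by how many query words they share, best match first."""
    query_words = set(query.lower().split())
    scored = []
    for doc in DOCUMENTS:
        overlap = len(query_words & set(doc["content"].lower().split()))
        if overlap:
            scored.append((overlap, doc))
    if not scored:
        # Nothing matched the query
        return None
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Slicing with None returns every match, so num_documents=None is safe
    return [doc for _, doc in scored[:num_documents]]


results = knowledge_retriever(agent=None, query="what color is the sky")
print(results)
```

A real retriever would typically query a vector database or another search backend here; the point of the sketch is only the shape of the inputs and the return value.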

Vector Databases

While any type of storage can be used for knowledge, vector databases offer the best solution for retrieving relevant results from dense information quickly. Here’s how vector databases are used with Agents:
1. Chunk the information
Break down the knowledge content into smaller chunks to ensure our search query returns only relevant results.

2. Load the knowledge
Convert the chunks into embedding vectors and store them in a vector database.

3. Search the knowledge
When the user sends a message, we convert the input message into an embedding and “search” for nearest neighbors in the vector database.
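The three steps can be sketched in plain Python. This is a toy illustration, not how Agno or any vector database works internally: the word-count "embedding", the fixed-size word chunker, and the list used as a store are all assumptions chosen so the example runs with only the standard library.

```python
import math
from collections import Counter


def chunk_text(text: str, chunk_size: int = 10) -> list[str]:
    """Step 1: split text into chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text: str) -> Counter:
    """Toy embedding: a sparse word-count vector (stand-in for a real embedder)."""
    return Counter(word.strip(".,?!").lower() for word in text.split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(store: list[tuple[str, Counter]], query: str, limit: int = 1) -> list[str]:
    """Step 3: embed the query and return the nearest chunks."""
    query_vector = embed(query)
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:limit]]


document = (
    "Thai curry starts with a paste of chilies and lemongrass. "
    "Simmer the paste in coconut milk before adding vegetables. "
    "The sky is blue because air scatters short wavelengths of light."
)

# Step 2: embed every chunk and keep (chunk, vector) pairs in a list
store = [(chunk, embed(chunk)) for chunk in chunk_text(document)]

top_chunks = search(store, "Why is the sky blue?")
print(top_chunks)
```

In practice the embedder is a learned model and the store is a vector database with an approximate nearest-neighbor index, but the flow of chunk, embed, store, and search is the same.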

Loading the Knowledge

Before you can use knowledge, it needs to be loaded with embeddings that will be used for retrieval.

Asynchronous Loading

Many vector databases support asynchronous operations, which can significantly improve performance when loading large amounts of content into knowledge. You can leverage this capability using the add_content_async() method:
import asyncio

from agno.agent import Agent
from agno.knowledge.knowledge import Knowledge
from agno.vectordb.qdrant import Qdrant

COLLECTION_NAME = "pdf-documents"

vector_db = Qdrant(collection=COLLECTION_NAME, url="http://localhost:6333")

# Create a knowledge instance using Qdrant vector storage
knowledge = Knowledge(
    vector_db=vector_db,
)


# Create an agent with the knowledge
agent = Agent(
    knowledge=knowledge,
    search_knowledge=True,
)

if __name__ == "__main__":
    # Asynchronously add the content under data/pdf to the knowledge.
    asyncio.run(
        knowledge.add_content_async(
            path="data/pdf",
        )
    )

    # Create and use the agent
    asyncio.run(agent.aprint_response("How to make Thai curry?", markdown=True))
add_content() is synchronous; add_content_async() is its asynchronous counterpart. Using add_content_async() lets you take full advantage of the non-blocking operations, concurrent processing, and reduced latency that async vector database operations offer. We recommend this approach, especially in production environments with high throughput requirements. For more details on vector database async capabilities, see the Vector Database Introduction.

Knowledge content needs to be read before it can be passed to any VectorDB for chunking, embedding, and storage. For more details on readers, auto selection, and content types, see the Content Types page.
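To see why non-blocking writes help, consider this minimal asyncio sketch. It does not use Agno or a real vector database: fake_insert is a hypothetical stand-in for an async write, and the 0.01 second sleep simulates network latency.

```python
import asyncio


async def fake_insert(chunk: str, store: list[str]) -> None:
    """Hypothetical stand-in for an async vector database write."""
    await asyncio.sleep(0.01)  # simulated network latency
    store.append(chunk)


async def load_concurrently(chunks: list[str]) -> list[str]:
    """Start every write at once and wait for all of them to finish."""
    store: list[str] = []
    # All writes are in flight together, so the total wait is roughly one
    # round trip rather than one round trip per chunk
    await asyncio.gather(*(fake_insert(chunk, store) for chunk in chunks))
    return store


loaded = asyncio.run(load_concurrently(["chunk-1", "chunk-2", "chunk-3"]))
print(sorted(loaded))
```

A synchronous loader would wait for each write to complete before starting the next, so its total time grows with the number of chunks.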