Skip to main content
Markdown chunking is a method of splitting documents into smaller chunks of a specified size, with optional overlap between chunks. This is useful when you want to process large documents in smaller, manageable pieces.
1

Create a Python file

touch markdown_chunking.py
2

Add the following code to your Python file

markdown_chunking.py
import asyncio
from agno.agent import Agent
from agno.knowledge.chunking.markdown import MarkdownChunking
from agno.knowledge.knowledge import Knowledge
from agno.knowledge.reader.markdown_reader import MarkdownReader
from agno.vectordb.pgvector import PgVector

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"

knowledge = Knowledge(
    vector_db=PgVector(table_name="recipes_markdown_chunking", db_url=db_url),
)

asyncio.run(knowledge.add_content_async(
    url="https://github.com/agno-agi/agno/blob/main/README.md",
    reader=MarkdownReader(
        name="Markdown Chunking Reader",
        chunking_strategy=MarkdownChunking(),
    ),
))
agent = Agent(
    knowledge=knowledge,
    search_knowledge=True,
)

agent.print_response("What is Agno?", markdown=True)
3

Create a virtual environment

Open the Terminal and create a python virtual environment.
python3 -m venv .venv
source .venv/bin/activate
4

Install libraries

pip install -U sqlalchemy psycopg pgvector agno
5

Run PgVector

docker run -d \
  -e POSTGRES_DB=ai \
  -e POSTGRES_USER=ai \
  -e POSTGRES_PASSWORD=ai \
  -e PGDATA=/var/lib/postgresql/data/pgdata \
  -v pgvolume:/var/lib/postgresql/data \
  -p 5532:5432 \
  --name pgvector \
  agno/pgvector:16
6

Run the script

python markdown_chunking.py

Markdown Chunking Params

ParameterTypeDefaultDescription
chunk_sizeint5000The maximum size of each chunk.
overlapint0The number of characters to overlap between chunks.