Semantic chunking splits documents into smaller chunks by analyzing the semantic similarity between text segments using embeddings. It uses the Chonkie library to identify natural breakpoints where the semantic meaning changes significantly, based on a configurable similarity threshold. This preserves context and meaning better than fixed-size chunking: semantically related content stays together in the same chunk, and splits occur at meaningful topic transitions.

Semantic chunking supports three embedder configurations: an Agno Embedder (e.g., OpenAIEmbedder), one of Chonkie's built-in embeddings handlers, or Chonkie's AutoEmbeddings, which selects an embeddings handler automatically from a model string.
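As a quick illustration, here is a minimal sketch of the three configurations. The model names are examples, and the chonkie import path is an assumption based on Chonkie's embeddings module:

from agno.knowledge.chunking.semantic import SemanticChunking
from agno.knowledge.embedder.openai import OpenAIEmbedder

# 1. An Agno Embedder instance
chunking = SemanticChunking(embedder=OpenAIEmbedder(id="text-embedding-3-small"))

# 2. A Chonkie BaseEmbeddings instance (import path assumed)
from chonkie.embeddings import OpenAIEmbeddings
chunking = SemanticChunking(embedder=OpenAIEmbeddings(model="text-embedding-3-small"))

# 3. A model string, resolved automatically via Chonkie's AutoEmbeddings
chunking = SemanticChunking(embedder="text-embedding-3-small")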
1. Create a Python file

touch semantic_chunking.py
2. Add the following code to your Python file

from agno.agent import Agent
from agno.knowledge.chunking.semantic import SemanticChunking
from agno.knowledge.embedder.openai import OpenAIEmbedder
from agno.knowledge.knowledge import Knowledge
from agno.knowledge.reader.pdf_reader import PDFReader
from agno.vectordb.pgvector import PgVector

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"

embedder = OpenAIEmbedder(id="text-embedding-3-small")

knowledge = Knowledge(
    vector_db=PgVector(
        table_name="recipes_semantic_chunking", db_url=db_url, embedder=embedder
    ),
)
knowledge.add_content(
    url="https://agno-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf",
    reader=PDFReader(
        name="Semantic Chunking Reader",
        chunking_strategy=SemanticChunking(
            embedder=embedder,  # Use same Agno embedder for chunking
            chunk_size=500,
            similarity_threshold=0.5,
            similarity_window=3,
            min_sentences_per_chunk=1,
            min_characters_per_sentence=24,
            delimiters=[". ", "! ", "? ", "\n"],
            include_delimiters="prev",
            skip_window=0,
            filter_window=5,
            filter_polyorder=3,
            filter_tolerance=0.2,
        ),
    ),
)

agent = Agent(
    knowledge=knowledge,
    search_knowledge=True,
)

agent.print_response("How to make Thai curry?", markdown=True)
3. Create a virtual environment

Open the Terminal and create a Python virtual environment.
python3 -m venv .venv
source .venv/bin/activate
4. Install libraries

pip install -U sqlalchemy psycopg pgvector agno chonkie openai
5. Set OpenAI Key

Set your OPENAI_API_KEY as an environment variable. You can get one from OpenAI.

Mac

export OPENAI_API_KEY=sk-***

Windows

setx OPENAI_API_KEY sk-***
6. Run PgVector

docker run -d \
  -e POSTGRES_DB=ai \
  -e POSTGRES_USER=ai \
  -e POSTGRES_PASSWORD=ai \
  -e PGDATA=/var/lib/postgresql/data/pgdata \
  -v pgvolume:/var/lib/postgresql/data \
  -p 5532:5432 \
  --name pgvector \
  agno/pgvector:16
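Optionally, verify the database is reachable before running the script. This is a minimal sketch using psycopg directly, assuming the credentials from the docker run command above:

import psycopg

# Same credentials and port as the docker run command above
with psycopg.connect("postgresql://ai:ai@localhost:5532/ai") as conn:
    with conn.cursor() as cur:
        cur.execute("SELECT 1")
        print("PgVector is reachable:", cur.fetchone())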
7. Run the script

python semantic_chunking.py

Semantic Chunking Params

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| embedder | Union[str, Embedder, BaseEmbeddings] | OpenAIEmbedder | The embedder configuration. Can be an Agno Embedder (e.g., OpenAIEmbedder, GeminiEmbedder), a Chonkie BaseEmbeddings instance (e.g., OpenAIEmbeddings), or a string model identifier (e.g., "text-embedding-3-small") for Chonkie's AutoEmbeddings. |
| chunk_size | int | 5000 | Maximum tokens allowed per chunk. |
| similarity_threshold | float | 0.5 | Similarity threshold for grouping sentences (0-1). Lower values create larger groups (fewer chunks). |
| similarity_window | int | 3 | Number of sentences to consider for similarity calculation. |
| min_sentences_per_chunk | int | 1 | Minimum number of sentences per chunk. |
| min_characters_per_sentence | int | 24 | Minimum number of characters per sentence. |
| delimiters | List[str] | [". ", "! ", "? ", "\n"] | Delimiters to split sentences on. |
| include_delimiters | Literal["prev", "next", None] | "prev" | Include delimiters in the chunk text, attached to either the previous or the next sentence. |
| skip_window | int | 0 | Number of groups to skip when looking for similar content to merge. 0 (the default) uses standard semantic grouping; higher values enable merging of non-consecutive semantically similar groups. |
| filter_window | int | 5 | Window length for the Savitzky-Golay filter used in boundary detection. |
| filter_polyorder | int | 3 | Polynomial order for the Savitzky-Golay filter. |
| filter_tolerance | float | 0.2 | Tolerance for the Savitzky-Golay filter boundary detection. |
| chunker_params | Dict[str, Any] | None | Additional parameters to pass directly to Chonkie's SemanticChunker. |
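As a rough tuning guide, similarity_threshold is the main lever for granularity. A minimal sketch based on the table above (the threshold values are illustrative):

from agno.knowledge.chunking.semantic import SemanticChunking
from agno.knowledge.embedder.openai import OpenAIEmbedder

embedder = OpenAIEmbedder(id="text-embedding-3-small")

# Lower threshold: sentences group more readily, yielding fewer, larger chunks
coarse = SemanticChunking(embedder=embedder, chunk_size=500, similarity_threshold=0.3)

# Higher threshold: groups break apart sooner, yielding more, smaller chunks
fine = SemanticChunking(embedder=embedder, chunk_size=500, similarity_threshold=0.7)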