- Fixed Size: Splits text every 500 characters (which may break recipes mid-instruction)
- Semantic: Keeps complete recipes together based on meaning
- Document: Each page becomes a chunk
Available Chunking Strategies
Fixed Size Chunking
Split content into uniform chunks with specified size and overlap.
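As a rough illustration of size and overlap, the sketch below slices a string into character windows. The function name `fixed_size_chunks` and its defaults are assumptions made for this example, not the library's implementation.

```python
def fixed_size_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text,
        # so the tail is not emitted again as a redundant chunk.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Because the window ignores content, a boundary can fall in the middle of a sentence; the overlap only softens this by repeating the last few characters at the start of the next chunk.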
Semantic Chunking
Use semantic similarity to identify natural breakpoints in content.
Recursive Chunking
Recursively split content using multiple separators for hierarchical processing.
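One common way to read "multiple separators" is to try the coarsest separator first and fall back to finer ones only for pieces that are still too large. The sketch below follows that idea; the function name, the separator order, and the hard-split fallback are assumptions for illustration and may differ from the library's behavior.

```python
def recursive_chunks(text: str, max_size: int = 500,
                     separators: tuple[str, ...] = ("\n\n", "\n", " ")) -> list[str]:
    """Split on the first separator, recursing with finer ones for oversized pieces."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    if len(text) <= max_size:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to a hard character split.
        return [text[i:i + max_size] for i in range(0, len(text), max_size)]

    sep, finer = separators[0], separators[1:]
    chunks: list[str] = []
    current = ""
    for part in text.split(sep):
        candidate = part if not current else current + sep + part
        if len(candidate) <= max_size:
            current = candidate  # keep merging neighbours while they fit
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(part) <= max_size:
            current = part
        else:
            # This piece is too large on its own: retry with finer separators.
            chunks.extend(recursive_chunks(part, max_size, finer))
    if current:
        chunks.append(current)
    return chunks
```

Paragraphs stay intact when they fit, and only oversized paragraphs are broken down by line and then by word.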
Document Chunking
Preserve document structure by treating sections as individual chunks.
CSV Row Chunking
Split CSV files by treating each row as an individual chunk. Only compatible with CSV files.
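Row-level chunking can be sketched with Python's standard `csv` module. Repeating the column names inside each chunk is a design choice assumed here so that every row stays interpretable on its own; the function name `csv_row_chunks` is hypothetical.

```python
import csv
import io


def csv_row_chunks(csv_text: str) -> list[str]:
    """Return one chunk per data row, each labelled with its column names."""
    reader = csv.DictReader(io.StringIO(csv_text))  # first line is read as the header
    chunks = []
    for row in reader:
        chunks.append(", ".join(f"{key}: {value}" for key, value in row.items()))
    return chunks


sample = "name,minutes\nPad Thai,30\nMassaman Curry,90\n"
for chunk in csv_row_chunks(sample):
    print(chunk)
```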
Markdown Chunking
Split markdown content while preserving heading structure and hierarchy. Only compatible with Markdown files.
Agentic Chunking
Use AI to intelligently determine optimal chunk boundaries.
Custom Chunking
Build your own chunking strategy for specialized use cases.
Using Chunking Strategies
Chunking strategies are configured when setting up readers for your knowledge base.
Choosing a Strategy
The choice of chunking strategy depends on your content type and use case:
- Text documents: Semantic chunking maintains context and meaning
- Structured documents: Document or Markdown chunking preserves hierarchy
- Tabular data: CSV Row chunking treats each row as a separate entity
- Mixed content: Recursive chunking provides flexibility with multiple separators
- Uniform processing: Fixed Size chunking ensures consistent chunk dimensions
Pass the chosen strategy through the chunking_strategy parameter when configuring the reader.
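The pattern of handing a strategy to a reader can be pictured with the self-contained sketch below. Every class here (`ChunkingStrategy`, `ParagraphChunking`, `WholeDocumentChunking`, `TextReader`) is hypothetical and written only for this example; only the `chunking_strategy` parameter name comes from the text above, and the real reader and strategy classes should be taken from the library's reference.

```python
from typing import Optional, Protocol


class ChunkingStrategy(Protocol):
    """Hypothetical interface: anything with a chunk() method qualifies."""

    def chunk(self, text: str) -> list[str]: ...


class ParagraphChunking:
    """Hypothetical custom strategy: one chunk per blank-line-separated paragraph."""

    def chunk(self, text: str) -> list[str]:
        return [p.strip() for p in text.split("\n\n") if p.strip()]


class WholeDocumentChunking:
    """Hypothetical default: keep the whole text as a single chunk."""

    def chunk(self, text: str) -> list[str]:
        return [text] if text else []


class TextReader:
    """Hypothetical reader that delegates splitting to its chunking strategy."""

    def __init__(self, chunking_strategy: Optional[ChunkingStrategy] = None):
        self.chunking_strategy = chunking_strategy or WholeDocumentChunking()

    def read(self, text: str) -> list[str]:
        return self.chunking_strategy.chunk(text)


reader = TextReader(chunking_strategy=ParagraphChunking())
print(reader.read("Boil the noodles.\n\nFry the tofu."))
```

Keeping the strategy behind a small interface means a custom strategy can be swapped in without changing the reader.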
Consider your specific use case and performance requirements when choosing a chunking strategy, since different strategies vary in processing time and memory usage.