We refer to `content` as the building block of any piece of knowledge.
Content can be added to knowledge from different sources.

| Content Origin | Description |
|---|---|
| Path | Local files or directories containing files |
| URL | Direct links to files or other sites |
| Text | Raw text content |
| Topic | Search topics from repositories like arXiv or Wikipedia |
| Remote Content | Content stored in remote repositories like S3 or Google Cloud Storage |

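
To make the distinction between origins concrete, here is a minimal sketch of how a location string might be sorted into three of the categories above by its URI scheme. The names `ContentOrigin` and `guess_origin`, and the scheme-to-origin mapping, are hypothetical and chosen for illustration; they are not part of any library's API. Text and Topic cannot be inferred from a string alone, so a caller would have to declare them explicitly.

```python
from enum import Enum
from urllib.parse import urlparse


class ContentOrigin(Enum):
    """Hypothetical labels mirroring the content origins listed above."""
    PATH = "path"
    URL = "url"
    TEXT = "text"
    TOPIC = "topic"
    REMOTE_CONTENT = "remote_content"


# Assumed mapping from URI scheme to origin; a real library may decide differently.
_REMOTE_SCHEMES = {"s3", "gs"}
_WEB_SCHEMES = {"http", "https"}


def guess_origin(location: str) -> ContentOrigin:
    """Guess the origin of a location string from its URI scheme."""
    scheme = urlparse(location).scheme.lower()
    if scheme in _REMOTE_SCHEMES:
        return ContentOrigin.REMOTE_CONTENT
    if scheme in _WEB_SCHEMES:
        return ContentOrigin.URL
    # Anything without a recognized scheme is treated as a local path.
    return ContentOrigin.PATH


if __name__ == "__main__":
    for location in ("docs/report.pdf", "https://example.com/report.pdf", "s3://my-bucket/report.pdf"):
        print(location, "->", guess_origin(location).name)
```

Keeping the origin as an explicit enum value, rather than re-deriving it each time, lets later ingestion steps branch on it without parsing the location again.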
A `PDFReader` instance is created, but we update the `chunk_size`. Similarly, we can update the `chunking_strategy` and other parameters that influence how content is ingested and processed.
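
The effect of these two parameters can be illustrated with a small, library-independent sketch. `SketchReader`, `fixed_size_chunks`, and `paragraph_chunks` below are hypothetical stand-ins written for this example; they are not the real `PDFReader` or its chunking strategies, and the default of 5000 characters is an assumption. The sketch only shows how a chunk size and a pluggable strategy change the pieces a reader produces.

```python
from typing import Callable, List

ChunkingStrategy = Callable[[str, int], List[str]]


def fixed_size_chunks(text: str, chunk_size: int) -> List[str]:
    """Split text into consecutive pieces of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def paragraph_chunks(text: str, chunk_size: int) -> List[str]:
    """Split on blank lines, then cut any paragraph longer than chunk_size."""
    chunks: List[str] = []
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if paragraph:
            chunks.extend(fixed_size_chunks(paragraph, chunk_size))
    return chunks


class SketchReader:
    """Hypothetical reader stand-in; NOT the real PDFReader."""

    def __init__(
        self,
        chunk_size: int = 5000,  # assumed default, for illustration only
        chunking_strategy: ChunkingStrategy = fixed_size_chunks,
    ) -> None:
        if chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        self.chunk_size = chunk_size
        self.chunking_strategy = chunking_strategy

    def read(self, text: str) -> List[str]:
        """Return the chunks produced by the configured strategy."""
        return self.chunking_strategy(text, self.chunk_size)


if __name__ == "__main__":
    text = "a" * 25 + "\n\n" + "b" * 5
    by_size = SketchReader(chunk_size=10)
    by_paragraph = SketchReader(chunk_size=10, chunking_strategy=paragraph_chunks)
    print([len(chunk) for chunk in by_size.read(text)])
    print([len(chunk) for chunk in by_paragraph.read(text)])
```

A smaller chunk size yields more, shorter chunks, while the strategy decides where the boundaries fall: the fixed-size splitter ignores paragraph breaks, and the paragraph-aware one never lets a chunk span two paragraphs.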