Register cloud storage providers on a `Knowledge` instance via `content_sources`. Each provider config has `.file()` and `.folder()` methods that create content references you pass to `knowledge.insert()`.
```python
from agno.knowledge.knowledge import Knowledge
from agno.knowledge.remote_content import S3Config

knowledge = Knowledge(
    vector_db=vector_db,
    contents_db=contents_db,
    content_sources=[
        S3Config(
            id="company-docs",
            name="Company Documents",
            bucket_name="my-docs-bucket",
            region="us-east-1",
        ),
    ],
)

# Insert a single file
knowledge.insert(
    name="Q4 Report",
    remote_content=knowledge.content_sources[0].file("reports/q4-2025.pdf"),
)

# Insert an entire folder
knowledge.insert(
    name="Engineering Specs",
    remote_content=knowledge.content_sources[0].folder("specs/"),
)
```
## Supported Providers
| Provider | Config Class | Install |
|---|---|---|
| Amazon S3 | `S3Config` | `pip install boto3` |
| Google Cloud Storage | `GcsConfig` | `pip install google-cloud-storage` |
| SharePoint | `SharePointConfig` | `pip install msal requests` |
| GitHub | `GitHubConfig` | `pip install requests` |
| Azure Blob Storage | `AzureBlobConfig` | `pip install azure-identity azure-storage-blob` (`azure-identity` is only needed for Service Principal authentication) |
All config classes are importable from `agno.knowledge.remote_content`.
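For instance, several providers can be imported in a single statement (class names as listed in the table above):

```python
from agno.knowledge.remote_content import (
    AzureBlobConfig,
    GcsConfig,
    GitHubConfig,
    S3Config,
    SharePointConfig,
)
```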
## Provider Configuration

### S3Config
```python
from agno.knowledge.remote_content import S3Config

s3 = S3Config(
    id="s3-docs",
    name="S3 Documents",
    bucket_name="my-bucket",
    region="us-east-1",
    aws_access_key_id="...",      # optional, falls back to default credential chain
    aws_secret_access_key="...",  # optional, falls back to default credential chain
    prefix="documents/",          # optional, default prefix for browsing
)
```
| Field | Type | Default | Description |
|---|---|---|---|
| `id` | `str` | required | Unique identifier for this source |
| `name` | `str` | required | Display name |
| `bucket_name` | `str` | required | S3 bucket name |
| `region` | `Optional[str]` | `None` | AWS region |
| `aws_access_key_id` | `Optional[str]` | `None` | AWS access key. Falls back to the default credential chain. |
| `aws_secret_access_key` | `Optional[str]` | `None` | AWS secret key. Falls back to the default credential chain. |
| `prefix` | `Optional[str]` | `None` | Default prefix for browsing and listing |
### GcsConfig
```python
from agno.knowledge.remote_content import GcsConfig

gcs = GcsConfig(
    id="gcs-docs",
    name="GCS Documents",
    bucket_name="my-gcs-bucket",
    project="my-gcp-project",
)
```
| Field | Type | Default | Description |
|---|---|---|---|
| `id` | `str` | required | Unique identifier |
| `name` | `str` | required | Display name |
| `bucket_name` | `str` | required | GCS bucket name |
| `project` | `Optional[str]` | `None` | GCP project ID |
| `credentials_path` | `Optional[str]` | `None` | Path to GCP credentials file |
| `prefix` | `Optional[str]` | `None` | Default prefix |
### GitHubConfig
```python
from agno.knowledge.remote_content import GitHubConfig

github = GitHubConfig(
    id="my-repo",
    name="My Repository",
    repo="owner/repo",
    token="ghp_...",
    branch="main",
)
```
| Field | Type | Default | Description |
|---|---|---|---|
| `id` | `str` | required | Unique identifier |
| `name` | `str` | required | Display name |
| `repo` | `str` | required | Repository in `owner/repo` format |
| `token` | `Optional[str]` | `None` | GitHub personal access token (needs Contents: read) |
| `branch` | `Optional[str]` | `None` | Branch name |
| `path` | `Optional[str]` | `None` | Default path filter |
### SharePointConfig
```python
from agno.knowledge.remote_content import SharePointConfig

sharepoint = SharePointConfig(
    id="sharepoint-docs",
    name="SharePoint Documents",
    tenant_id="...",
    client_id="...",
    client_secret="...",
    hostname="contoso.sharepoint.com",
    site_path="/sites/Engineering",
)
```
| Field | Type | Default | Description |
|---|---|---|---|
| `id` | `str` | required | Unique identifier |
| `name` | `str` | required | Display name |
| `tenant_id` | `str` | required | Azure AD tenant ID |
| `client_id` | `str` | required | Azure AD application client ID |
| `client_secret` | `str` | required | Azure AD application client secret |
| `hostname` | `str` | required | SharePoint hostname |
| `site_path` | `Optional[str]` | `None` | Site path (e.g., `/sites/Engineering`) |
| `site_id` | `Optional[str]` | `None` | Full site ID |
| `folder_path` | `Optional[str]` | `None` | Default folder path |
### AzureBlobConfig

Supports two authentication methods: Service Principal (Azure AD client credentials) and SAS (Shared Access Signature) token. Provide one or the other, not both.
#### Service Principal

```python
from agno.knowledge.remote_content import AzureBlobConfig

azure = AzureBlobConfig(
    id="azure-docs",
    name="Azure Blob Documents",
    tenant_id="...",
    client_id="...",
    client_secret="...",
    storage_account="mystorageaccount",
    container="documents",
)
```

#### SAS Token

```python
from agno.knowledge.remote_content import AzureBlobConfig

azure = AzureBlobConfig(
    id="azure-docs",
    name="Azure Blob Documents",
    sas_token="sv=2022-11-02&ss=b&srt=sco&sp=rl&se=...",
    storage_account="mystorageaccount",
    container="documents",
)
```
| Field | Type | Default | Description |
|---|---|---|---|
| `id` | `str` | required | Unique identifier |
| `name` | `str` | required | Display name |
| `tenant_id` | `Optional[str]` | `None` | Azure AD tenant ID (Service Principal auth) |
| `client_id` | `Optional[str]` | `None` | Azure AD application client ID (Service Principal auth) |
| `client_secret` | `Optional[str]` | `None` | Azure AD application client secret (Service Principal auth) |
| `sas_token` | `Optional[str]` | `None` | SAS token string (SAS token auth) |
| `storage_account` | `str` | required | Azure storage account name |
| `container` | `str` | required | Blob container name |
| `prefix` | `Optional[str]` | `None` | Default prefix |
Requires the Storage Blob Data Reader (or Contributor) role on the storage account.
## Inserting Content

Each config has `.file()` and `.folder()` methods that return content references for `knowledge.insert()`.
```python
# Single file
knowledge.insert(
    name="Architecture Doc",
    remote_content=s3.file("docs/architecture.pdf"),
)

# Entire folder
knowledge.insert(
    name="All Specs",
    remote_content=gcs.folder("specs/"),
)

# GitHub file from a specific branch
knowledge.insert(
    name="README",
    remote_content=github.file("README.md", branch="develop"),
)

# SharePoint file from a specific site
knowledge.insert(
    name="Policy",
    remote_content=sharepoint.file("Shared Documents/policy.pdf", site_path="/sites/HR"),
)
```
## Browsing S3 Files

`S3Config` supports paginated file listing with `list_files()`. This is useful for building file pickers or exploring bucket contents before ingesting.
```python
result = s3.list_files(prefix="reports/", limit=50, page=1)

for folder in result.folders:
    print(f"Folder: {folder['name']}")
for file in result.files:
    print(f"File: {file['name']} ({file['size']} bytes)")

print(f"Page {result.page} of {result.total_pages}")
```
| Parameter | Type | Default | Description |
|---|---|---|---|
| `prefix` | `Optional[str]` | `None` | Path prefix filter. Overrides the config's `prefix`. |
| `delimiter` | `str` | `"/"` | Folder delimiter |
| `limit` | `int` | `100` | Files per page (1-1000) |
| `page` | `int` | `1` | Page number (1-indexed) |
An async variant, `alist_files()`, is also available with the same signature.
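A minimal sketch of the async variant, assuming `alist_files()` mirrors the `list_files()` parameters above and reusing the `s3` config from the S3Config section:

```python
import asyncio

from agno.knowledge.remote_content import S3Config

s3 = S3Config(id="s3-docs", name="S3 Documents", bucket_name="my-bucket")

async def browse_reports() -> None:
    # Same parameters as list_files(), but awaitable, so it can run
    # inside an async app without blocking the event loop
    result = await s3.alist_files(prefix="reports/", limit=50, page=1)
    for file in result.files:
        print(f"File: {file['name']}")

asyncio.run(browse_reports())
```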
## Multiple Sources
Register multiple providers on a single Knowledge instance.
```python
knowledge = Knowledge(
    vector_db=vector_db,
    contents_db=contents_db,
    content_sources=[s3, gcs, github, sharepoint, azure],
)

# Insert from different sources
knowledge.insert(name="S3 Doc", remote_content=s3.file("doc.pdf"))
knowledge.insert(name="GitHub Doc", remote_content=github.file("README.md"))
```
When running with AgentOS, registered sources are exposed via the `/knowledge/{id}/sources` API endpoint for listing and browsing.
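For example, a hypothetical request shape (the host and port depend on your AgentOS deployment, and the `{id}` path segment is your knowledge instance's identifier; both values below are placeholders):

```shell
# List registered content sources for a knowledge instance
# (replace localhost:7777 and "my-knowledge" with your deployment's values)
curl http://localhost:7777/knowledge/my-knowledge/sources
```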
## Next Steps

| Task | Guide |
|---|---|
| Content types overview | Content Types |
| Filter search results | Filtering |
| Set up a vector database | Vector Databases |