The JinaEmbedder class embeds text into vectors using the Jina AI API. To get started, sign up with Jina AI and get an API key. By default the embedder reads the key from the JINA_API_KEY environment variable.

Usage

jina_embedder.py
from agno.knowledge.knowledge import Knowledge
from agno.vectordb.pgvector import PgVector
from agno.knowledge.embedder.jina import JinaEmbedder

# Generate an embedding for a single piece of text
embeddings = JinaEmbedder(id="jina-embeddings-v3").get_embedding("The quick brown fox jumps over the lazy dog.")
# Print the embeddings and their dimensions
print(f"Embeddings: {embeddings[:5]}")
print(f"Dimensions: {len(embeddings)}")

# Use an embedder in a knowledge base
knowledge = Knowledge(
    vector_db=PgVector(
        db_url="postgresql+psycopg://ai:ai@localhost:5532/ai",
        table_name="jina_embeddings",
        embedder=JinaEmbedder(id="jina-embeddings-v3"),
    ),
    max_results=2,
)
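
Once you have embedding vectors, the usual way to compare them is cosine similarity. The following is a minimal sketch of that comparison in plain Python. It is not part of agno or the Jina API, the helper name `cosine_similarity` is an illustrative assumption, and the short toy vectors stand in for real embeddings returned by `get_embedding`.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for real embeddings (assumption)
query = [1.0, 0.0, 1.0]
doc_close = [2.0, 0.0, 2.0]  # same direction as the query
doc_far = [0.0, 1.0, 0.0]    # orthogonal to the query

print(f"close: {cosine_similarity(query, doc_close):.3f}")
print(f"far:   {cosine_similarity(query, doc_far):.3f}")
```

A vector database such as PgVector performs this kind of comparison for you at query time, so a helper like this is mainly useful for quick checks outside the database.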

Advanced Usage

import asyncio

from agno.knowledge.embedder.jina import JinaEmbedder

# Configure embedder with custom settings
embedder = JinaEmbedder(
    id="jina-embeddings-v3",
    dimensions=1024,
    embedding_type="float",
    late_chunking=True,
    batch_size=50,
    timeout=30.0,
)

# Use the async methods so other work can run while requests are in flight
async def embed_texts():
    embedder = JinaEmbedder()
    texts = ["First text", "Second text", "Third text"]
    
    # Get embeddings in batches
    embeddings, usage = await embedder.async_get_embeddings_batch_and_usage(texts)
    print(f"Generated {len(embeddings)} embeddings")
    print(f"Usage info: {usage[0]}")

# Run async example
asyncio.run(embed_texts())
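
To illustrate why the async methods help, the sketch below runs several simulated requests concurrently with `asyncio.gather`. The function `fake_embed_batch` is a hypothetical stand-in for a network call and is not part of agno or Jina. In real code you would await the embedder's async method instead.

```python
import asyncio
import time
from typing import List


async def fake_embed_batch(texts: List[str]) -> List[List[float]]:
    """Hypothetical stand-in for an API call; NOT part of agno or Jina."""
    await asyncio.sleep(0.1)  # simulate network latency
    # Return a dummy 1-dimensional "embedding" per text
    return [[float(len(t))] for t in texts]


async def embed_concurrently(batches: List[List[str]]) -> List[List[float]]:
    # Start all requests at once; gather returns results in input order
    results = await asyncio.gather(*(fake_embed_batch(b) for b in batches))
    # Flatten the per-batch results into one list of vectors
    return [vec for batch in results for vec in batch]


batches = [["a", "bb"], ["ccc"], ["dddd", "eeeee"]]
start = time.perf_counter()
vectors = asyncio.run(embed_concurrently(batches))
elapsed = time.perf_counter() - start
print(f"{len(vectors)} vectors in {elapsed:.2f}s")
```

Because the three simulated requests overlap, the total time is close to one request's latency rather than the sum of all three. With a real API, rate limits may call for capping concurrency, for example with `asyncio.Semaphore`.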

Params

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `id` | `str` | `"jina-embeddings-v3"` | The model ID to use for generating embeddings. |
| `dimensions` | `int` | `1024` | The number of dimensions for the embedding vectors. |
| `embedding_type` | `Literal["float", "base64", "int8"]` | `"float"` | The format type of the returned embeddings. |
| `late_chunking` | `bool` | `False` | Whether to enable late chunking optimization. |
| `user` | `Optional[str]` | `None` | User identifier for tracking purposes. Optional. |
| `api_key` | `Optional[str]` | `JINA_API_KEY` env var | The Jina AI API key. Can be set via environment variable. |
| `base_url` | `str` | `"https://api.jina.ai/v1/embeddings"` | The base URL for the Jina API. |
| `headers` | `Optional[Dict[str, str]]` | `None` | Additional headers to include in API requests. Optional. |
| `request_params` | `Optional[Dict[str, Any]]` | `None` | Additional parameters to include in the API request. Optional. |
| `timeout` | `Optional[float]` | `None` | Timeout in seconds for API requests. Optional. |
| `enable_batch` | `bool` | `False` | Enable batch processing to reduce API calls and avoid rate limits. |
| `batch_size` | `int` | `100` | Number of texts to process in each API call for batch operations. |
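
To make the `batch_size` parameter concrete, here is a minimal sketch of how a list of texts can be split into batches of at most `batch_size` items, one batch per API call. The helper `iter_batches` is a hypothetical illustration and is not necessarily how JinaEmbedder splits its input internally.

```python
from typing import Iterator, List


def iter_batches(texts: List[str], batch_size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each with at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


# 250 texts with the default batch size of 100 need three calls
texts = [f"document {i}" for i in range(250)]
batches = list(iter_batches(texts, batch_size=100))
print([len(batch) for batch in batches])  # [100, 100, 50]
```

A smaller `batch_size` means more calls with smaller payloads. A larger one means fewer calls, which is what helps with rate limits.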

Features

  • Async Support: Full async/await support for better performance in concurrent applications
  • Batch Processing: Efficient batch processing of multiple texts with configurable batch size
  • Late Chunking: Support for Jina’s late chunking optimization technique
  • Flexible Output: Multiple embedding formats (float, base64, int8)
  • Usage Tracking: Get detailed usage information for API calls
  • Error Handling: Robust error handling with fallback mechanisms
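
If you request the base64 output format, the embedding arrives as an encoded string and has to be decoded before use. The sketch below assumes the payload is a base64-encoded array of little-endian float32 values. That layout is an assumption you should verify against the Jina API documentation, and the helper `decode_base64_embedding` is hypothetical, not part of agno.

```python
import base64
import struct
from typing import List


def decode_base64_embedding(payload: str) -> List[float]:
    """Decode a base64 string into floats, ASSUMING little-endian float32."""
    raw = base64.b64decode(payload)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    return list(struct.unpack(f"<{count}f", raw))


# Round trip with values that float32 represents exactly
original = [0.5, -1.25, 2.0]
encoded = base64.b64encode(struct.pack("<3f", *original)).decode("ascii")
print(decode_base64_embedding(encoded))  # [0.5, -1.25, 2.0]
```

The base64 form is more compact on the wire than a JSON list of floats, which is the usual reason to choose it.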

Developer Resources

I