Data labeling and classification

Agents can now:

Turn text, images, audio, video, and PDFs into structured records.
Assign labels, label sets, taxonomy paths, and labeled spans.
Score and rank model outputs for evals and preference data.
Add a reviewer and an adjudicator when label quality matters.

Each task follows the same pattern: an agent configured with an output_schema that defines the shape of the result. Agno is natively multimodal and type-safe, so the full labeling stack can be built in pure Python.

Example

Here’s a quick example classifying reviews into {positive, negative, neutral} using gemini-3.5-flash. The agent outputs a valid Classification object.

cookbook/data_labeling/_01_text_classification/basic.py

from typing import Literal

from agno.agent import Agent
from pydantic import BaseModel, Field


class Classification(BaseModel):
    # Literal restricts the output to exactly these three labels.
    label: Literal["positive", "negative", "neutral"] = Field(
        ..., description="The assigned sentiment label"
    )


agent = Agent(
    model="google:gemini-3.5-flash",
    instructions="You classify product reviews by sentiment.",
    # The agent's response is parsed into this schema.
    output_schema=Classification,
)

# .content holds the parsed Classification object.
result = agent.run("Broken on arrival, total waste of money.").content
# Classification(label='negative')

Swap the schema and instructions and the same pattern covers data extraction, span labeling, scoring, and preference ranking.
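As a rough sketch of what swapping the schema might look like, the block below replaces Classification with a span-labeling schema. The names LabeledSpan, SpanLabels, and misaligned_spans, and the label set itself, are hypothetical and not taken from the cookbook. It assumes Pydantic v2. In the agent above, the only change would be passing SpanLabels as output_schema and adjusting the instructions. The helper is there because character offsets returned by a model are worth checking against the source text.

```python
from typing import Literal

from pydantic import BaseModel, Field


class LabeledSpan(BaseModel):
    start: int = Field(..., ge=0, description="Start character offset (inclusive)")
    end: int = Field(..., gt=0, description="End character offset (exclusive)")
    text: str = Field(..., description="The exact substring being labeled")
    # Hypothetical label set, chosen only for illustration.
    label: Literal["product", "defect", "price"]


class SpanLabels(BaseModel):
    spans: list[LabeledSpan]


def misaligned_spans(source: str, labels: SpanLabels) -> list[LabeledSpan]:
    """Return the spans whose offsets do not reproduce the quoted text."""
    return [s for s in labels.spans if source[s.start:s.end] != s.text]
```

Spans returned by misaligned_spans can be dropped, re-requested, or sent to review, depending on how strict the dataset needs to be.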

To jump straight into code, see the data labeling cookbook, which contains 40+ runnable recipes across 18 data labeling patterns.

Data labeling workflows

Pick the page that matches what you need.

Workload	Input	Output	Page
Data extraction	Any modality	Typed Pydantic object	Data extraction
Classification	Any modality	One label, label set, or spans	Classification
Scoring / evaluation	Prompt + response	Rubric scores	LLM as judge
Preference ranking	Prompt + two responses	Winner + rationale	Preference data
Non-text input	Image, audio, video, PDF	Any of the above	Multimodal inputs
Reviewed labels	Any input	Adjudicated label + audit trail	Quality pipeline

Model choice

We use gemini-3.5-flash across the cookbooks because it handles text, images, audio, video, and PDFs. Agno is model-agnostic, so you can swap in another model as needed.

Explore

Data extraction

Turn any modality into a typed object, with optional per-field confidence.
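One way per-field confidence could be modeled is to wrap each extracted value in a small generic container. This is a minimal sketch assuming Pydantic v2. Scored, Invoice, low_confidence_fields, and the 0.8 threshold are hypothetical names and values, not the cookbook's own. The confidence numbers would be self-reported by the model, so they are best treated as a triage signal rather than a calibrated probability.

```python
from typing import Generic, Optional, TypeVar

from pydantic import BaseModel, Field

T = TypeVar("T")


class Scored(BaseModel, Generic[T]):
    """An extracted value plus the model's self-reported confidence."""

    value: Optional[T] = None
    confidence: float = Field(..., ge=0.0, le=1.0)


class Invoice(BaseModel):
    # Hypothetical extraction target; any typed record works the same way.
    vendor: Scored[str]
    total: Scored[float]
    currency: Scored[str]


def low_confidence_fields(record: BaseModel, threshold: float = 0.8) -> list[str]:
    """List the field names whose confidence falls below the threshold."""
    flagged = []
    for name in type(record).model_fields:
        value = getattr(record, name)
        if isinstance(value, Scored) and value.confidence < threshold:
            flagged.append(name)
    return flagged
```

Records with any flagged field can then be routed to a human or to a second pass instead of being accepted as-is.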

Classification

Single-label, multi-label, hierarchical, and span labeling.
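The multi-label and hierarchical variants differ from the single-label example mainly in the schema. The sketch below assumes Pydantic v2 and uses a made-up topic list and taxonomy. MultiLabel, TaxonomyPath, and TAXONOMY are illustrative names only. The validators show one possible way to keep label sets duplicate-free and to reject taxonomy paths that do not exist.

```python
from typing import Literal

from pydantic import BaseModel, Field, field_validator

# Hypothetical label vocabulary and taxonomy, for illustration only.
Topic = Literal["shipping", "quality", "price", "support"]
TAXONOMY: dict = {
    "electronics": {"audio": {"headphones": {}, "speakers": {}}},
    "home": {"kitchen": {}},
}


class MultiLabel(BaseModel):
    labels: list[Topic] = Field(default_factory=list)

    @field_validator("labels")
    @classmethod
    def drop_duplicates(cls, labels: list[str]) -> list[str]:
        # dict.fromkeys keeps the first occurrence and preserves order.
        return list(dict.fromkeys(labels))


class TaxonomyPath(BaseModel):
    path: list[str] = Field(..., min_length=1)

    @field_validator("path")
    @classmethod
    def path_exists(cls, path: list[str]) -> list[str]:
        # Walk the taxonomy from the root, one level per path element.
        node = TAXONOMY
        for part in path:
            if part not in node:
                raise ValueError(f"{part!r} is not a valid child at this level")
            node = node[part]
        return path
```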

LLM as judge

Score outputs against a rubric. The same machinery, used for evals.
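A rubric can be expressed as a schema of per-criterion scores, which the judge agent fills in and plain code aggregates. This is a sketch under stated assumptions: Pydantic v2, a 1 to 5 scale, and hypothetical criteria and weights. CriterionScore, RubricScores, WEIGHTS, and weighted_score are illustrative names, not the cookbook's API.

```python
from pydantic import BaseModel, Field


class CriterionScore(BaseModel):
    criterion: str
    score: int = Field(..., ge=1, le=5, description="1 (poor) to 5 (excellent)")
    rationale: str


class RubricScores(BaseModel):
    scores: list[CriterionScore]


# Hypothetical rubric weights; adjust to the eval being run.
WEIGHTS = {"accuracy": 0.5, "helpfulness": 0.3, "tone": 0.2}


def weighted_score(result: RubricScores, weights: dict[str, float]) -> float:
    """Combine per-criterion scores into one weighted average."""
    by_name = {s.criterion: s.score for s in result.scores}
    missing = set(weights) - set(by_name)
    if missing:
        raise ValueError(f"judge did not score: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(by_name[c] * w for c, w in weights.items()) / total_weight
```

Keeping the aggregation outside the model means the weights can change without re-running the judge.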

Preference data

Rank A vs B for RLHF and DPO datasets.
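For preference data, the judge's output (a winner plus a rationale) still has to be turned into training records. The sketch below assumes Pydantic v2 and the prompt/chosen/rejected record layout that is commonly used for DPO datasets. Preference and to_dpo_record are hypothetical names, and the exact field names a given trainer expects should be checked against its documentation.

```python
from typing import Literal

from pydantic import BaseModel


class Preference(BaseModel):
    winner: Literal["A", "B"]
    rationale: str


def to_dpo_record(
    prompt: str, response_a: str, response_b: str, pref: Preference
) -> dict[str, str]:
    """Map an A-vs-B judgment onto a prompt/chosen/rejected record."""
    if pref.winner == "A":
        chosen, rejected = response_a, response_b
    else:
        chosen, rejected = response_b, response_a
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}
```

Judging each pair a second time with A and B swapped, and keeping only pairs where the winner is consistent, is one common way to reduce position bias.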

Multimodal inputs

Feed images, audio, video, and PDFs into any labeler.

Quality pipeline

Two labelers, a reviewer, and an adjudicator with an audit trail.
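The control flow of such a pipeline can be sketched with plain callables standing in for the agents. The routing rule here is an assumption, not the cookbook's documented behavior: labels are accepted when both labelers agree and the reviewer approves, and everything else goes to the adjudicator. AuditedLabel and run_pipeline are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AuditedLabel:
    text: str
    label: str
    trail: list[str] = field(default_factory=list)  # ordered audit log


def run_pipeline(
    text: str,
    labeler_a: Callable[[str], str],
    labeler_b: Callable[[str], str],
    reviewer: Callable[[str, str], bool],
    adjudicator: Callable[[str, list[str]], str],
) -> AuditedLabel:
    """Label with two labelers, review agreement, adjudicate the rest."""
    trail = []
    a = labeler_a(text)
    trail.append(f"labeler_a: {a}")
    b = labeler_b(text)
    trail.append(f"labeler_b: {b}")

    if a != b:
        trail.append("labelers disagree: escalated")
    elif reviewer(text, a):
        trail.append("reviewer: approved")
        return AuditedLabel(text, a, trail)
    else:
        trail.append("reviewer: rejected, escalated")

    final = adjudicator(text, [a, b])
    trail.append(f"adjudicator: {final}")
    return AuditedLabel(text, final, trail)
```

In a real pipeline each callable would wrap an agent run with its own schema and instructions, and the trail would typically be stored alongside the label.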
