An agent needs two kinds of state: what was said in this thread, and what the agent knows about this user. Agno provides both, so you do not need to build your own conversation store or retrieval layer.

from agno.agent import Agent
from agno.db.postgres import PostgresDb
from agno.models.openai import OpenAIResponses

db = PostgresDb(db_url="postgresql+psycopg://ai:ai@localhost:5532/ai")

agent = Agent(
    model=OpenAIResponses(id="gpt-5.5"),
    db=db,
    add_history_to_context=True,
    num_history_runs=5,
    enable_agentic_memory=True,
)

agent.run(
    "My name is Sarah and I prefer email over phone.",
    user_id="sarah@acme.com",
    session_id="thread-42",
)
reply = agent.run(
    "What's the best way to reach me?",
    user_id="sarah@acme.com",
    session_id="thread-42",
).content
# Uses both the thread history and the stored memory about Sarah.

Sessions vs memory

They solve different problems. Use both.
| | Session history | Memory |
| --- | --- | --- |
| Stores | The messages in this conversation thread | Learned facts about the user |
| Scope | One session_id | One user_id, across all their sessions |
| Enable with | add_history_to_context=True | enable_agentic_memory=True or update_memory_on_run=True |
| Answers | "What did we just discuss?" | "What do I know about this person?" |

Identifiers

| Identifier | Distinguishes | Maps to in your product |
| --- | --- | --- |
| user_id | The person | Your auth subject (user ID, email) |
| session_id | A conversation thread for that person | A chat tab, a Slack thread, a support case |

Pass both on every run. Threads are scoped by session_id. Memory is scoped by user_id.
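
One way to keep the two identifiers consistent is to derive them in a single place from values your product already has. The sketch below is illustrative only: `run_ids_for_slack`, its parameters, the lowercase normalization, and the `slack:` prefix are hypothetical choices, not part of Agno. It maps a Slack message to the `user_id` and `session_id` keyword arguments used with `agent.run` in the opening example.

```python
def run_ids_for_slack(user_email: str, channel_id: str, thread_ts: str) -> dict:
    """Build the user_id/session_id pair for one Slack thread (hypothetical helper)."""
    return {
        # Memory is scoped by user_id, so use one stable identity per person.
        "user_id": user_email.strip().lower(),
        # History is scoped by session_id, so each Slack thread gets its own value.
        "session_id": f"slack:{channel_id}:{thread_ts}",
    }


ids = run_ids_for_slack("Sarah@acme.com", "C024BE91L", "1712345678.000100")
# With an agent configured as in the opening example:
# agent.run("What's the best way to reach me?", **ids)
```

Two threads from the same person then share a `user_id` (and therefore memory) while keeping separate histories.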

Memory: automatic or agentic

| Mode | Set | Use when |
| --- | --- | --- |
| Automatic | update_memory_on_run=True | Memory should be computed after every run. |
| Agentic | enable_agentic_memory=True | The agent should decide when to update memory, using tool calls. |
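
If the mode is chosen from application configuration, a small helper can keep the choice explicit. This is a minimal sketch: `memory_kwargs` and its `agent_decides` parameter are hypothetical names, and the only Agno-specific parts are the two flags from the table above.

```python
def memory_kwargs(agent_decides: bool) -> dict:
    """Return the Agent keyword argument for the chosen memory mode (hypothetical helper)."""
    if agent_decides:
        # Agentic: the agent decides when to update memory, using tool calls.
        return {"enable_agentic_memory": True}
    # Automatic: memory is computed after every run.
    return {"update_memory_on_run": True}


kwargs = memory_kwargs(agent_decides=True)
# Pass the result alongside your other Agent arguments, for example:
# agent = Agent(model=OpenAIResponses(id="gpt-5.5"), db=db, **kwargs)
```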

Reading memory back

For a profile screen or a debug view, pull a user’s memories directly.
memories = agent.get_user_memories(user_id="sarah@acme.com")
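
To render the result on a profile screen, something like the following could work. It is a sketch under assumptions: it assumes each returned record exposes a `memory` string and, optionally, a `topics` list, which you should verify against the return type in your Agno version. `memory_lines` is a hypothetical helper, and the sample records are stand-ins so the block runs without a database.

```python
from types import SimpleNamespace


def memory_lines(memories) -> list:
    """Turn memory records into display strings for a profile or debug view."""
    lines = []
    for record in memories or []:
        # Assumed attribute names; fall back to str() if `memory` is absent.
        text = getattr(record, "memory", None) or str(record)
        topics = getattr(record, "topics", None) or []
        suffix = f" [{', '.join(topics)}]" if topics else ""
        lines.append(f"- {text}{suffix}")
    return lines


# Stand-in records in place of agent.get_user_memories(user_id="sarah@acme.com").
sample = [
    SimpleNamespace(memory="Name is Sarah", topics=["name"]),
    SimpleNamespace(memory="Prefers email over phone", topics=["contact", "preferences"]),
]
print("\n".join(memory_lines(sample)))
```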

Long conversations

History grows. Two ways to keep token cost bounded without losing continuity:
| Technique | Effect |
| --- | --- |
| num_history_runs=N | Only the last N turns flow into context |
| Session summaries (enable_session_summaries=True) | Older turns are condensed into a running summary |
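
To make the combined effect concrete, here is a conceptual sketch of how a bounded context could be assembled from a summary plus the last N turns. It is not Agno's implementation: `build_context`, the string-based turns, and the summary wording are all hypothetical, chosen only to show why token cost stops growing with conversation length.

```python
from typing import Optional


def build_context(turns: list, num_history_runs: int, summary: Optional[str] = None) -> list:
    """Illustrative only: a summary of older turns plus the last N turns verbatim."""
    # Guard against N <= 0, because turns[-0:] would return the whole list.
    recent = turns[-num_history_runs:] if num_history_runs > 0 else []
    older_count = len(turns) - len(recent)
    context = []
    if summary and older_count > 0:
        # Older turns contribute one condensed entry instead of their full text.
        context.append(f"Summary of {older_count} earlier turns: {summary}")
    context.extend(recent)
    return context


turns = [f"turn {i}" for i in range(1, 9)]
context = build_context(turns, num_history_runs=3, summary="Sarah asked about billing.")
```

With eight turns and `num_history_runs=3`, the context holds four entries: one summary line and the three most recent turns.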

Next steps

| Task | Guide |
| --- | --- |
| Put this behind an HTTP API | Serve as an API |
| Carry memory across Slack and web | Interfaces |
| Give the agent live data | Connecting your data |
