Quick Reference
- Default to automatic memory (enable_user_memories=True) unless you have a specific reason for agentic control
- Always provide user_id; don’t rely on the default “default” user
- Use cheaper models for memory operations when using agentic memory
- Implement pruning for long-running applications
- Monitor token usage in production to catch memory-related cost spikes
- Test with realistic data: 100+ memories behave very differently than 5 memories
The Agentic Memory Token Trap
The Problem: When you use enable_agentic_memory=True, every memory operation triggers a separate, nested LLM call. This architecture can cause token usage to explode, especially as memories accumulate.
Here’s what happens under the hood:
- User sends a message → Main LLM call processes it
- Agent decides to update memory → Calls the update_user_memory tool
- Nested LLM call fires with:
  - Detailed system prompt (~50 lines)
  - ALL existing user memories loaded into context
  - Memory management instructions and tools
- Memory LLM makes tool calls (add, update, delete)
- Control returns to main conversation
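To make the scaling concrete, here is a rough back-of-the-envelope model of the steps above. It is only a sketch: the function names and every token figure (600 tokens for the memory system prompt, 40 tokens per stored memory) are illustrative assumptions, not measured values from any library.

```python
def nested_call_tokens(num_memories, system_prompt_tokens=600, tokens_per_memory=40):
    """Estimate input tokens for ONE nested memory call.

    Both defaults are assumed, illustrative figures. The point is that the
    cost has a fixed part (the system prompt) plus a part that grows with
    every memory already stored, because all memories are loaded into context.
    """
    return system_prompt_tokens + num_memories * tokens_per_memory


def total_memory_overhead(num_updates, starting_memories=0, **kwargs):
    """Sum the nested-call cost over several memory updates.

    Assumes each update adds exactly one memory, so every later call
    carries a slightly larger context than the one before it.
    """
    total = 0
    count = starting_memories
    for _ in range(num_updates):
        total += nested_call_tokens(count, **kwargs)
        count += 1
    return total


if __name__ == "__main__":
    # Same three updates, very different cost depending on stored memories.
    print(total_memory_overhead(3, starting_memories=0))
    print(total_memory_overhead(3, starting_memories=100))
```

Under these assumed numbers, the same three updates cost several times more for a user who already has 100 memories than for a new user, which is why testing with only a handful of memories can hide the problem.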
Mitigation Strategy #1: Use Automatic Memory
For most use cases, automatic memory is your best bet because it is significantly more efficient.
Mitigation Strategy #2: Use a Cheaper Model for Memory Operations
If you do need agentic memory, use a less expensive model for memory management while keeping a powerful model for conversation.
Mitigation Strategy #3: Guide Memory Behavior with Instructions
Add explicit instructions to prevent frivolous memory updates.
Mitigation Strategy #4: Implement Memory Pruning
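One possible pruning policy is sketched below, assuming memories can be exported as simple records with an ID and a last-updated timestamp. The MemoryRecord type, the select_memories_to_prune helper, and the 90-day and 100-item thresholds are all hypothetical; the sketch only decides which IDs to remove and leaves the actual deletion to whatever API your memory store provides.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class MemoryRecord:
    # Hypothetical record shape; adapt to whatever your memory store returns.
    memory_id: str
    text: str
    updated_at: datetime


def select_memories_to_prune(memories, now, max_age_days=90, max_count=100):
    """Return IDs of memories to delete under two assumed rules:

    1. Anything not updated within max_age_days is stale.
    2. Of the remaining memories, only the max_count most recent are kept.
    """
    cutoff = now - timedelta(days=max_age_days)
    stale = [m for m in memories if m.updated_at < cutoff]
    fresh = sorted(
        (m for m in memories if m.updated_at >= cutoff),
        key=lambda m: m.updated_at,
        reverse=True,  # newest first, so the overflow is the oldest
    )
    overflow = fresh[max_count:]
    return [m.memory_id for m in stale + overflow]


if __name__ == "__main__":
    now = datetime(2025, 1, 1)
    records = [
        MemoryRecord("a", "prefers dark mode", datetime(2024, 12, 31)),
        MemoryRecord("b", "old shipping address", datetime(2024, 6, 1)),
        MemoryRecord("c", "works in finance", datetime(2024, 12, 1)),
    ]
    print(select_memories_to_prune(records, now, max_count=1))
```

Age and count are the simplest criteria to reason about; relevance-based pruning would need a scoring step that this sketch does not attempt.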
Prevent memory bloat by periodically cleaning up old or irrelevant memories.
Mitigation Strategy #5: Set Tool Call Limits
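The idea behind a tool call limit can be illustrated with a small counter. This is a generic sketch, not the mechanism of any particular framework: ToolCallBudget and try_consume are hypothetical names, and your agent framework may expose its own built-in setting for this, which would be preferable to hand-rolling one.

```python
class ToolCallBudget:
    """Tracks how many tool calls a single conversation may still make."""

    def __init__(self, limit):
        if limit < 0:
            raise ValueError("limit must be non-negative")
        self.limit = limit
        self.used = 0

    def try_consume(self):
        """Return True and count the call if budget remains, else False."""
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


if __name__ == "__main__":
    budget = ToolCallBudget(limit=2)
    for attempt in range(3):
        allowed = budget.try_consume()
        print(f"tool call {attempt + 1}: {'allowed' if allowed else 'blocked'}")
```

A hard cap like this bounds the worst case: even if the agent keeps trying to update memory, the number of nested calls per conversation cannot exceed the limit.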
Prevent runaway memory operations by limiting tool calls per conversation.
Common Pitfalls
The user_id Pitfall
The Problem: Forgetting to set user_id causes all memories to default to user_id="default", mixing different users’ memories together.
Always set user_id explicitly, especially in multi-user applications.
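One way to enforce this in application code is a small guard that runs before any agent call. The require_user_id helper below is hypothetical and not part of any library; it simply rejects missing, blank, or literal "default" IDs so the mistake fails loudly instead of silently merging users' memories.

```python
def require_user_id(user_id):
    """Return a cleaned user_id, or raise if it would fall back to the shared default.

    Rejects None, empty or whitespace-only strings, and the literal "default",
    since all of these would pool memories from different users.
    """
    if user_id is None:
        raise ValueError("user_id is required")
    cleaned = user_id.strip()
    if not cleaned or cleaned == "default":
        raise ValueError(f"refusing ambiguous user_id: {user_id!r}")
    return cleaned


if __name__ == "__main__":
    print(require_user_id("  alice@example.com "))
    try:
        require_user_id("default")
    except ValueError as err:
        print(f"rejected: {err}")
```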
The Double-Enable Pitfall
The Problem: Using both enable_user_memories=True and enable_agentic_memory=True doesn’t give you both; agentic mode overrides automatic mode.
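A configuration check can surface this before it causes confusion. The sketch below mirrors the precedence described above (agentic wins when both flags are set); resolve_memory_mode is a hypothetical helper for your own startup code, not a function provided by the library.

```python
import warnings


def resolve_memory_mode(enable_user_memories=False, enable_agentic_memory=False):
    """Report which memory mode is effectively active for a pair of flags.

    Follows the precedence stated in the text: agentic mode overrides
    automatic mode. Warns when both are set, since that is usually a mistake.
    """
    if enable_agentic_memory:
        if enable_user_memories:
            warnings.warn(
                "Both memory flags are set; agentic mode takes precedence "
                "and automatic memory will not run.",
                stacklevel=2,
            )
        return "agentic"
    if enable_user_memories:
        return "automatic"
    return "none"


if __name__ == "__main__":
    print(resolve_memory_mode(enable_user_memories=True))
    print(resolve_memory_mode(enable_user_memories=True, enable_agentic_memory=True))
```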