- Steer your agent or team toward specific behaviors or roles.
- Constrain or expand your agent or team’s capabilities.
- Ensure outputs are consistent, relevant, and aligned with your application’s needs.
- Enable advanced use cases such as multi-step reasoning, tool use, or structured output.
- System message: The system message is the main context that is sent to the agent or team, including all additional context
- User message: The user message is the message that is sent to the agent or team.
- Chat history: The chat history is the history of the conversation between the agent or team and the user.
- Additional input: Any few-shot examples or other additional input that is added to the context.
Context Caching
Most model providers support caching of system and user messages, though the implementation differs between providers. The general approach is to cache repetitive content and common instructions, and then reuse that cached content in subsequent requests as the prefix of your system message. In other words, if the model supports it, you can reduce the number of tokens sent to the model by putting static content at the start of your system message. Agno’s context construction is designed to place the most likely static content at the beginning of the system message.If you wish to fine-tune this, the recommended approach is to manually set the system message. Some examples of prompt caching:
- OpenAI’s prompt caching
- Anthropic prompt caching -> See an Agno example of this
- OpenRouter prompt caching