# Improve Agents

> Build, improve, and iterate on agents using Claude Code and evals.

Your platform is running. Now make it yours.

The templates are designed to be worked on by coding agents. Claude Code can read traces, run evals, and edit agent code in a tight feedback loop. Three of the five workflows run autonomously with no input from you.

## Claude Code Workflows

The `docs/` directory includes five prompts that cover the agent development lifecycle:

| Prompt                | What it does                                                | Autonomous |
| :-------------------- | :---------------------------------------------------------- | :--------- |
| `create-new-agent.md` | Scaffolds an agent, registers it, smoke-tests via cURL      | No         |
| `improve-agent.md`    | Derives probes from instructions, runs them, fixes failures | Yes        |
| `extend-agent.md`     | Adds tools or features; you direct, Claude executes         | No         |
| `hill-climb.md`       | Runs evals, diagnoses failures, fixes until all pass        | Yes        |
| `review-codebase.md`  | Finds drift between docs, code, and config                  | Yes        |

To run a prompt, tell Claude Code which file to run in a session:

```
Run docs/create-new-agent.md
```

## Evals

Lock in agent behavior with the eval suite:

```bash
python -m evals                # run the suite
python -m evals -v             # verbose with rich panels
python -m evals --case <name>  # run one case
```

Evals use [AgentAsJudgeEval](/evals/agent-as-judge) (LLM judge, binary pass/fail) and [ReliabilityEval](/evals/reliability) (tool-call assertions). Results log to Postgres so you can track behavior over time.
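
To make the idea of a tool-call assertion concrete, here is a minimal sketch in plain Python. It does not use Agno's eval classes: `ToolCall` and `missing_tool_calls` are hypothetical names chosen for illustration, and `ReliabilityEval` has its own API, described at the link above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical record of one tool call made during an agent run."""

    name: str


def missing_tool_calls(calls: list[ToolCall], expected: list[str]) -> list[str]:
    """Return the expected tool names that never appear in the run."""
    seen = {call.name for call in calls}
    return [name for name in expected if name not in seen]


# A run that searched the web but never fetched a page.
run = [ToolCall("web_search")]
print(missing_tool_calls(run, ["web_search", "fetch_page"]))
```

An empty result means every expected tool was called, which is the pass condition for this kind of check. An LLM-judge eval differs in that a model, not a fixed assertion, decides pass or fail.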

The `hill-climb.md` prompt runs the evals, diagnoses failures, and fixes whatever is in scope. It stops when all cases pass.

## Agent Patterns

The templates ship with two agents that demonstrate different patterns:

| Agent      | Pattern          | Description                                                       |
| :--------- | :--------------- | :---------------------------------------------------------------- |
| WebSearch  | Direct tools     | Agent sees each tool individually                                 |
| CodeSearch | Context provider | Agent sees one `query_<thing>` tool that hands off to a sub-agent |

**Direct tools** work when the agent needs fine-grained control over each tool call. **Context providers** work when you want to encapsulate a capability (like searching a codebase) behind a single interface.

Copy either pattern when building your own agents.
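
The difference between the two patterns comes down to what the agent's tool list looks like. The sketch below shows that shape in plain Python. It is not how the WebSearch or CodeSearch agents are implemented: `search_files`, `read_file`, and `make_context_provider` are hypothetical names, and the "sub-agent" is a stand-in function rather than a model.

```python
from typing import Callable

Tool = Callable[[str], str]


def search_files(query: str) -> str:
    """Hypothetical low-level tool: find files matching a query."""
    return f"files matching {query!r}"


def read_file(path: str) -> str:
    """Hypothetical low-level tool: return a file's contents."""
    return f"contents of {path}"


# Direct tools: the agent is handed every tool and chooses among them.
direct_tools: dict[str, Tool] = {
    "search_files": search_files,
    "read_file": read_file,
}


def make_context_provider(name: str, inner_tools: dict[str, Tool]) -> dict[str, Tool]:
    """Hide a set of tools behind a single query_<name> entry point."""

    def query(question: str) -> str:
        # Stand-in for a sub-agent. A real one would let a model decide
        # which inner tools to call; this one always calls search_files.
        return inner_tools["search_files"](question)

    return {f"query_{name}": query}


provider_tools = make_context_provider("codebase", direct_tools)
print(sorted(direct_tools))
print(sorted(provider_tools))
```

With direct tools the outer agent sees two entries and must plan each call. With the context provider it sees only `query_codebase`, and the details of searching and reading stay inside the wrapper.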

## Teams and Workflows

For most things, one agent is enough. When it isn't:

| Pattern                                  | When to use                                                        |
| :--------------------------------------- | :----------------------------------------------------------------- |
| [Multi-agent teams](/teams/overview)     | Route to specialists, coordinate parallel work, synthesize results |
| [Agentic workflows](/workflows/overview) | Deterministic pipelines that run the same way every time           |

Rule of thumb: **agents for open questions, teams for routing, workflows for processes.**

## Scheduled Tasks

The scheduler is enabled by default (`scheduler=True`). Schedule any agent or workflow on a cron schedule:

* **Maintenance.** Purge sessions older than 90 days. Vacuum tables.
* **Proactive runs.** Every weekday morning, summarize overnight news and send to Slack.
* **Periodic re-evaluation.** Run the eval suite on cron to catch behavior drift before users do.

See [Scheduler docs](/agent-os/scheduler/overview) for the cron API.
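
As a rough sketch of the three use cases listed above, the snippet below writes them as cron expressions and computes the 90-day purge cutoff. It assumes standard five-field cron syntax. The schedule names and specific times are hypothetical, and how a schedule is registered with the scheduler is not shown here; see the Scheduler docs for that.

```python
from datetime import datetime, timedelta, timezone

# Standard five-field cron: minute hour day-of-month month day-of-week.
# Names and times are illustrative; only "weekday morning" and "90 days"
# come from the text above.
SCHEDULES = {
    "purge_old_sessions": "0 3 * * *",      # every day at 03:00
    "morning_news_summary": "0 8 * * 1-5",  # Monday to Friday at 08:00
    "eval_suite": "0 2 * * 0",              # Sundays at 02:00
}


def purge_cutoff(now: datetime, retention_days: int = 90) -> datetime:
    """Return the timestamp before which sessions count as expired."""
    return now - timedelta(days=retention_days)


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
print(purge_cutoff(now).date())
for name, expression in SCHEDULES.items():
    print(f"{name}: {expression}")
```

Passing `now` in explicitly, instead of reading the clock inside the function, keeps the cutoff logic easy to test.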

## Interfaces

Agents should live where your users are.

**Slack** is pre-wired in the templates. Set `SLACK_BOT_TOKEN` and `SLACK_SIGNING_SECRET` in your env and the interface activates automatically. See [Slack setup](/deploy/interfaces/slack/overview).
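
If the Slack interface does not come up, a quick check that both variables are set can save some debugging. The helper below is a hypothetical sketch, not the template's own activation logic, which may differ.

```python
import os
from typing import Mapping

REQUIRED_SLACK_VARS = ("SLACK_BOT_TOKEN", "SLACK_SIGNING_SECRET")


def missing_slack_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the Slack variables that are unset or empty."""
    return [name for name in REQUIRED_SLACK_VARS if not env.get(name)]


missing = missing_slack_vars()
if missing:
    print("Slack variables missing:", ", ".join(missing))
else:
    print("Both Slack variables are set.")
```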

For Discord, Telegram, WhatsApp, or custom UIs, see the [Interfaces guide](/deploy/interfaces).

## Next Steps

<CardGroup cols={2}>
  <Card title="Tools Catalog" icon="wrench" href="/tools/toolkits">
    100+ integrations ready to add to your agents.
  </Card>

  <Card title="AgentOS Features" icon="sparkles" href="/agent-os/overview">
    Sessions, memory, tracing, and more.
  </Card>
</CardGroup>
