Pick the right starting point
Don’t build AgentOS from scratch. Start from the template that’s closest to what you want to ship.

| Template | Right when |
|---|---|
| Scout | You’re building a context agent that pulls from external systems (Slack, Drive, MCP, custom) |
| Dash | You’re building a data agent that answers questions from your database |
| Coda | You’re building a code companion that lives in Slack |
| Demo OS | You want a kitchen-sink reference with every feature wired up |
| Bare AgentOS | You want full control and don’t mind wiring it yourself |
Every template ships with a working .env.production flow. You’re 90% of the way to deploy on day one.
Replace the demo data
Templates ship with synthetic data so the first run works. Replace it with your own:

| Template | Swap |
|---|---|
| Scout | Configure context providers in scout/contexts.py. Point at your S3 buckets, your Drive folders, your MCP servers. |
| Dash | Replace scripts/generate_data.py with a loader for your dataset. Then rewrite knowledge/ for your tables. |
| Coda | Edit repos.yaml to point at your repos. Make sure your GITHUB_ACCESS_TOKEN has the right scopes. |
| Demo OS | Fork the agent that’s closest to yours. Modify its instructions, knowledge, tools. Remove the agents you don’t need. |
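
For Dash, the swap is the most mechanical: the loader just has to land your tables in the database the agent reads from. A minimal sketch of what a replacement for scripts/generate_data.py might look like, assuming your source data is CSV files and your database is Postgres; the DB_URL environment variable, file paths, and table names are illustrative, not part of the template:

```python
# load_data.py -- hypothetical replacement for scripts/generate_data.py
import os

import pandas as pd
from sqlalchemy import create_engine

# Assumes the same Postgres instance AgentOS is configured against.
engine = create_engine(os.environ["DB_URL"])

# Illustrative source files and target tables -- swap in your own.
TABLES = {
    "accounts": "data/accounts.csv",
    "subscriptions": "data/subscriptions.csv",
    "revenue": "data/revenue.csv",
}

for table, path in TABLES.items():
    df = pd.read_csv(path)
    # Replace wholesale on each run; switch to append/upsert once this is a real pipeline.
    df.to_sql(table, engine, if_exists="replace", index=False)
    print(f"loaded {len(df)} rows into {table}")
```

Then rewrite knowledge/ so the table and column descriptions match what you just loaded; the agent’s queries are only as good as that documentation.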
The shipping checklist
Each row maps to a page in this section; the detail lives there. This is the punch list.

| Step | Do | Where |
|---|---|---|
| Pick interfaces | Slack for B2B, Telegram for personal, WhatsApp for support, AG-UI for browser, custom HTTP for everything else. Start with one. | Interfaces |
| Wire auth | RUNTIME_ENV=prd and JWT_VERIFICATION_KEY. Control plane issues tokens, your service verifies. | Security & Auth |
| Turn on tracing | tracing=True from day one. The first time a user reports a bad answer, you’ll have the trace tree to debug from. | Observability |
| Gate irreversible actions | requires_confirmation=True for user approval, @approval for admin approval. Don’t add approval everywhere — friction kills adoption. | Human Approval |
| Schedule proactive work | Register recurring jobs in the app lifespan so they survive restarts. | Scheduling |
| Deploy | Railway via template scripts, AWS/GCP/Azure via container + managed Postgres + secrets, or self-hosted Docker Compose. | Deploy |
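
Most rows are a one-line setting; scheduling is the one that needs a little structure. A rough sketch of the lifespan pattern, assuming your AgentOS instance is served as a FastAPI application and that nightly_digest is your own coroutine; AgentOS’s actual scheduling helpers are covered on the Scheduling page:

```python
import asyncio
from contextlib import asynccontextmanager

from fastapi import FastAPI


async def nightly_digest() -> None:
    """Placeholder for your recurring job (report, sync, cleanup)."""
    ...


async def run_every(seconds: float, job) -> None:
    # Simple in-process loop; swap for APScheduler or cron if you need precision.
    while True:
        await job()
        await asyncio.sleep(seconds)


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Registering the job here means it starts with the process and
    # comes back automatically after every restart or redeploy.
    task = asyncio.create_task(run_every(24 * 3600, nightly_digest))
    try:
        yield
    finally:
        task.cancel()


app = FastAPI(lifespan=lifespan)
```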
Iterate from real usage
The first version is wrong about things, and that’s expected. The interesting question is how fast you can find what’s wrong and fix it. The loop:

- A user reports a bad answer (Slack, support ticket, your own dogfooding).
- Find the run. Filter agno_sessions by user_id and created_at to narrow it down; the session ID gives you the full thread (a query sketch follows this list).
- Pull the trace from agno_traces and agno_spans. Look at what tools were called, what the model saw, and what came back.
- Replay it locally. Run the same input through the same agent in a script and reproduce the failure.
- Patch the prompt, swap the tool, add a learning, fix the knowledge. Re-run.
- Add the case to your eval suite so the regression can’t come back.
- Ship.
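
A minimal sketch of the find-and-pull steps, assuming direct SQL access to the AgentOS Postgres database. The user_id, created_at, status, start_time, and end_time columns appear in the table below; session_id, name, input, and output are assumptions about the schema, so check yours before relying on them:

```python
# find_run.py -- hypothetical debugging helper; some column names are assumptions.
import os

import psycopg

with psycopg.connect(os.environ["DB_URL"]) as conn:
    # Step 2: narrow down the session the complaint came from.
    sessions = conn.execute(
        """
        SELECT session_id, created_at
        FROM agno_sessions
        WHERE user_id = %s AND created_at >= %s
        ORDER BY created_at DESC
        """,
        ("user_123", "2025-06-01"),  # illustrative values
    ).fetchall()

    # Step 3: pull the spans behind that session's trace -- which tools ran,
    # what the model saw, what came back, and where the time went.
    spans = conn.execute(
        """
        SELECT name, status, start_time, end_time, input, output
        FROM agno_spans
        WHERE session_id = %s
        ORDER BY start_time
        """,
        (sessions[0][0],),
    ).fetchall()
```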
Beyond individual bug reports, a few signals worth watching:

| Signal | Where |
|---|---|
| Wrong answers users complained about | agno_sessions joined with the feedback table you build |
| Tools failing in production | agno_spans filtered by status='error' |
| Slow runs | agno_spans end_time - start_time per agent |
| Cost spikes | agno_sessions.total_tokens grouped by agent_id and day |
| Regression after a change | Run evals before deploying |
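
Two of those signals as concrete queries, again assuming direct SQL access; the name column on agno_spans is an assumption, the other columns come from the table above:

```python
# Hypothetical monitoring queries; check them against your schema version.
FAILING_TOOLS = """
    SELECT name, count(*) AS failures
    FROM agno_spans
    WHERE status = 'error' AND start_time >= now() - interval '7 days'
    GROUP BY name
    ORDER BY failures DESC
"""

COST_BY_AGENT_AND_DAY = """
    SELECT agent_id, date_trunc('day', created_at) AS day, sum(total_tokens) AS tokens
    FROM agno_sessions
    GROUP BY agent_id, day
    ORDER BY day DESC, tokens DESC
"""
```

Run them on a schedule or wire them into whatever dashboard you already have.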
Keep the agent learning
The agents that get better over time have feedback loops baked in. The agents that get worse are the ones nobody is watching. Three patterns the templates use:

LearningMachine. Agno’s built-in pattern for agents that store discovered facts and retrieve them on the next run. When the agent figures out that the revenue_v2 table replaced revenue last March, it writes that to a learnings table; the next time someone asks a revenue question, that learning is in the prompt. Dash uses this end-to-end. See Learning.
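
LearningMachine’s API is documented on the Learning page; purely as an illustration of the shape of the pattern (not Agno’s actual API), the mechanism boils down to two operations around a learnings table, whose name and columns here are hypothetical:

```python
# Illustration of the learnings-table idea, not Agno's LearningMachine API.
import os

import psycopg


def add_learning(topic: str, fact: str) -> None:
    # What the agent does when it discovers something worth keeping.
    with psycopg.connect(os.environ["DB_URL"]) as conn:
        conn.execute(
            "INSERT INTO learnings (topic, fact) VALUES (%s, %s)",
            (topic, fact),
        )


def learnings_for(topic: str) -> list[str]:
    # What gets pulled into the prompt on the next run about that topic.
    with psycopg.connect(os.environ["DB_URL"]) as conn:
        rows = conn.execute(
            "SELECT fact FROM learnings WHERE topic = %s", (topic,)
        ).fetchall()
    return [fact for (fact,) in rows]


# e.g. add_learning("revenue", "revenue_v2 replaced revenue in March; query revenue_v2")
```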
Knowledge updates as a feedback loop. When you find a wrong-answer pattern, add the right answer to knowledge. The user who asked the question gets a fix; every subsequent user gets the right answer the first time. Pal and Scout both lean on this.
Eval-gated deploys. Every deploy runs an eval suite. Regressions block the merge. The eval suite grows with the bugs you find — every postmortem ends with a new eval case. Over time the suite becomes the institutional memory of what your agent should and shouldn’t do.
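
A sketch of what one of those eval cases can look like, written as an ordinary pytest test so CI can gate the merge on it; ask_dash is a hypothetical stand-in for whatever harness runs a question through your Dash agent and returns the final answer:

```python
# test_evals.py -- illustrative eval case; ask_dash is a stand-in for your own harness.

def ask_dash(question: str) -> str:
    """Hypothetical harness: run `question` through the Dash agent, return the final answer."""
    raise NotImplementedError("replace with a call into your agent")


def test_revenue_questions_use_the_current_table():
    # Added after the postmortem where the agent kept querying the retired table.
    answer = ask_dash("What was revenue last quarter?")
    assert "revenue_v2" in answer.lower()
```

Run the suite in CI and block the merge on failure; that is the whole gate.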
What’s left
You shipped. The rest is normal product work that AgentOS doesn’t try to solve:

| Concern | Approach |
|---|---|
| Pricing, billing, rate limits | A layer in front of AgentOS. Stripe, your own metering, your own limits. |
| Multi-tenant isolation | Per-tenant db instances or schema-per-tenant within one db |
| Compliance (SOC 2, HIPAA, etc.) | RBAC + audit logs + data residency choices in your storage backend |
| Custom UIs | AG-UI for the chat surface; build the rest as a normal web app calling AgentOS |
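
None of these need anything AgentOS-specific. As one example of the metering-and-limits layer, a per-tenant rate limit is just middleware in front of the app; the X-Tenant-ID header and the limits below are placeholders for whatever your billing layer already knows:

```python
# Hypothetical rate-limit middleware in front of an AgentOS FastAPI app.
import time
from collections import defaultdict

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()  # or the FastAPI app your AgentOS instance serves

WINDOW_SECONDS = 60
MAX_REQUESTS = 30
_hits: dict[str, list[float]] = defaultdict(list)


@app.middleware("http")
async def per_tenant_rate_limit(request: Request, call_next):
    tenant = request.headers.get("X-Tenant-ID", "anonymous")
    now = time.monotonic()
    # Keep only the hits still inside the window. In-memory, so single-process
    # only -- back this with Redis for anything real.
    _hits[tenant] = [t for t in _hits[tenant] if now - t < WINDOW_SECONDS]
    if len(_hits[tenant]) >= MAX_REQUESTS:
        return JSONResponse({"detail": "rate limit exceeded"}, status_code=429)
    _hits[tenant].append(now)
    return await call_next(request)
```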
See it work end-to-end
| Tutorial | Product |
|---|---|
| Scout tutorial | Enterprise context agent over S3 + Slack + Drive + MCP |
| Dash tutorial | Self-learning data agent over a SaaS metrics dataset |
| Coda tutorial | Code companion that lives in Slack and triages issues |