# Background Output Evaluation

> Use Agent as Judge evaluation to assess responses as a background task

This example demonstrates how to use Agent as Judge evaluation to assess the main agent's output as a background task.

Unlike blocking validation, background evaluation:

* Does NOT block the response to the user
* Logs evaluation results for monitoring and analytics
* Can trigger alerts or store metrics without affecting latency

**Use cases:**

* Quality monitoring in production
* Compliance auditing
* Detecting hallucinations or other inappropriate content

```python background_output_evaluation.py
from agno.agent import Agent
from agno.db.sqlite import AsyncSqliteDb
from agno.eval.agent_as_judge import AgentAsJudgeEval
from agno.models.openai import OpenAIResponses
from agno.os import AgentOS

# Setup database for agent and evaluation storage
db = AsyncSqliteDb(db_file="tmp/evaluation.db")

# Create the evaluator using Agent as Judge
evaluator = AgentAsJudgeEval(
    db=db,
    name="Response Quality Check",
    model=OpenAIResponses(id="gpt-5.2"),
    criteria="Response should be helpful, accurate, and well-structured",
    additional_guidelines=[
        "Evaluate if the response addresses the user's question directly",
        "Check if the information provided is correct and reliable",
        "Assess if the response is well-organized and easy to understand",
    ],
    threshold=7,
    run_in_background=True,  # Runs evaluation without blocking the response
)

# Create the main agent with Agent as Judge evaluation
main_agent = Agent(
    id="support-agent",
    name="CustomerSupportAgent",
    model=OpenAIResponses(id="gpt-5.2"),
    instructions=[
        "You are a helpful customer support agent.",
        "Provide clear, accurate, and friendly responses.",
        "If you don't know something, say so honestly.",
    ],
    db=db,
    post_hooks=[evaluator],  # Automatically evaluates each response
    markdown=True,
)

# Create AgentOS
agent_os = AgentOS(agents=[main_agent])
app = agent_os.get_app()

if __name__ == "__main__":
    agent_os.serve(app="background_output_evaluation:app", port=7777, reload=True)
```

```bash
uv pip install -U agno openai uvicorn
```

```bash Mac/Linux
export OPENAI_API_KEY="your_openai_api_key_here"
```

```bash Windows
$Env:OPENAI_API_KEY="your_openai_api_key_here"
```

```bash Mac/Linux
python background_output_evaluation.py
```

```bash Windows
python background_output_evaluation.py
```

```bash
curl -X POST http://localhost:7777/agents/support-agent/runs \
  -F "message=How do I reset my password?" \
  -F "stream=false"
```

The response is returned immediately. The evaluation runs in the background, and its results are stored in the database.

## What Happens

1. User sends a request to the agent
2. The agent processes and generates a response
3. The response is sent to the user **immediately**
4. Background evaluation runs:
   * `AgentAsJudgeEval` automatically evaluates the response against the criteria
   * Scores the response on a scale of 1-10
   * Stores results in the database

### Production Extensions

In production, you could extend this pattern to:

| Extension            | Description                                                 |
| -------------------- | ----------------------------------------------------------- |
| **Database Storage** | Store evaluations for analytics dashboards                  |
| **Alerting**         | Use `on_fail` callback to send alerts when evaluations fail |
| **Observability**    | Log to platforms like Datadog or OpenTelemetry              |
| **A/B Testing**      | Compare response quality across model versions              |
| **Training Data**    | Build datasets for fine-tuning                              |

Background evaluation is ideal for quality monitoring without impacting user experience. For scenarios where you need to block bad responses, use synchronous hooks instead.
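To make the flow under "What Happens" concrete, here is a minimal standard-library sketch of the same idea: the handler returns its response first, and the evaluation finishes afterwards. It does not use Agno and is not how Agno implements `run_in_background`. The names `fake_judge`, `evaluate`, `handle_request`, and the shape of the `on_fail` callback are hypothetical, a plain list stands in for the database, and the scoring rule is a toy placeholder for a judge model.

```python
import asyncio

THRESHOLD = 7  # mirrors threshold=7 in the example above


async def fake_judge(response: str) -> int:
    """Hypothetical stand-in for the judge model: returns a 1-10 score."""
    await asyncio.sleep(0.05)  # simulate judge latency
    return 8 if "reset" in response.lower() else 3


async def evaluate(response: str, results: list, on_fail=None) -> None:
    """Score a response, record the result, and call on_fail if it fails."""
    score = await fake_judge(response)
    record = {"response": response, "score": score, "passed": score >= THRESHOLD}
    results.append(record)  # a list stands in for the database
    if not record["passed"] and on_fail is not None:
        on_fail(record)  # hypothetical alert hook


async def handle_request(message: str, results: list, tasks: set, on_fail=None) -> str:
    """Build a response and schedule its evaluation without awaiting it."""
    if "password" in message.lower():
        response = "To reset your password, open Settings and choose Reset password."
    else:
        response = "Sorry, I don't know."
    task = asyncio.create_task(evaluate(response, results, on_fail))
    tasks.add(task)  # keep a reference so the task is not garbage collected
    task.add_done_callback(tasks.discard)
    return response  # returned before the evaluation has run


async def main():
    results, alerts, tasks = [], [], set()
    good = await handle_request("How do I reset my password?", results, tasks, alerts.append)
    bad = await handle_request("What is your refund policy?", results, tasks, alerts.append)
    pending_at_return = len(results) == 0  # no evaluation has completed yet
    await asyncio.gather(*tasks)  # a long-running server would not need this wait
    return good, bad, pending_at_return, results, alerts


if __name__ == "__main__":
    good, bad, pending_at_return, results, alerts = asyncio.run(main())
    print("evaluations pending when responses returned:", pending_at_return)
    for record in results:
        print(record["score"], record["passed"])
```

In this sketch the only blocking work on the request path is building the response. The final `asyncio.gather` exists so the short script does not exit before the evaluations complete.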
## Related Examples

* Run all hooks as background tasks
* Mix synchronous and background hooks