The OpenAI Moderation Guardrail is a built-in guardrail that detects content in your Agent's input that violates OpenAI's content policy. This lets you catch violating content early, without firing an API request that would fail anyway. It is also useful if you use a different model provider but still want to apply OpenAI's moderation guidelines.

Usage

To use the OpenAI Moderation Guardrail, import it and pass it to your Agent via the pre_hooks parameter:
from agno.agent import Agent
from agno.guardrails import OpenAIModerationGuardrail
from agno.models.openai import OpenAIChat

# The guardrail runs before each model call and checks the input with OpenAI's Moderation API
openai_moderation_guardrail = OpenAIModerationGuardrail()

agent = Agent(
    name="OpenAI Moderation Guardrail Agent",
    model=OpenAIChat(id="gpt-5-mini"),
    pre_hooks=[openai_moderation_guardrail],
)
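
If the moderation check flags the input, the guardrail stops the run by raising an exception before any request is sent to the model. Here is a minimal sketch of handling that, assuming the guardrail raises agno.exceptions.InputCheckError (check the exact exception type for your agno version):
from agno.exceptions import InputCheckError

try:
    agent.print_response("Some input that violates the content policy")
except InputCheckError as e:
    # The run never reaches the model; handle the flagged input here
    print(f"Input was flagged by moderation: {e}")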

Moderation model

By default, the OpenAI Moderation Guardrail uses OpenAI's omni-moderation-latest model. You can change which model is used for moderation via the moderation_model parameter:
openai_moderation_guardrail = OpenAIModerationGuardrail(
    moderation_model="omni-moderation-latest",
)
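
For reproducible moderation results, you can pin a dated snapshot instead of the -latest alias. A small sketch, assuming the snapshot name below is still current (check OpenAI's model list):
# Pin a specific moderation snapshot rather than the moving -latest alias
openai_moderation_guardrail = OpenAIModerationGuardrail(
    moderation_model="omni-moderation-2024-09-26",
)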

Moderation categories

You can specify which categories the guardrail should check. By default, it considers all existing moderation categories; you can find the full list in OpenAI's docs. To override the default list, use the raise_for_categories parameter:
openai_moderation_guardrail = OpenAIModerationGuardrail(
    raise_for_categories=["violence", "hate"],
)
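
To see which category names exist (and which ones a given input triggers), you can call OpenAI's Moderation API directly with the official openai Python SDK. A minimal sketch, assuming OPENAI_API_KEY is set in your environment:
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.moderations.create(
    model="omni-moderation-latest",
    input="...your input text...",
)

# Each result exposes a boolean flag per moderation category;
# by_alias=True keeps the API's category names (e.g. "harassment/threatening")
result = response.results[0]
flagged = [name for name, hit in result.categories.model_dump(by_alias=True).items() if hit]
print(f"Flagged categories: {flagged}")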

Developer Resources