pip install "upsonic[safety-engine]"
uv pip install "upsonic[safety-engine]"

What is Profanity Policy?

Profanity policies detect profanity, toxic content, and inappropriate language using the Detoxify ML library. They support multiple Detoxify models, including unbiased, original, multilingual, and lightweight variants.

Why is it Important?

Profanity policies are essential for maintaining a professional and respectful environment when using AI agents. These policies prevent toxic content from being processed by LLMs, which helps maintain platform quality, protect users from harmful language, and ensure appropriate communication standards.
  • Prevents sending toxic content to LLM: Blocks profanity and toxic language from being processed by language models, maintaining platform quality
  • Maintains professional standards: Ensures your AI agent interactions remain respectful and appropriate for all audiences
  • Protects user experience: Prevents exposure to offensive or harmful language, creating a safer communication environment

Usage

from upsonic import Agent, Task
from upsonic.safety_engine.policies import ProfanityBlockPolicy_LLM

agent = Agent(
    "anthropic/claude-sonnet-4-5",
    user_policy=ProfanityBlockPolicy_LLM,
    debug=True
)

task = Task("You are an idiot!")
result = agent.print_do(task)
# Blocked with educational message
print("Result:", result)

Available Variants

  • ProfanityBlockPolicy: Standard blocking with unbiased model
  • ProfanityBlockPolicy_Original: BERT-based original model
  • ProfanityBlockPolicy_Multilingual: Multi-language support (7 languages)
  • ProfanityBlockPolicy_LLM: LLM-powered contextual block messages
  • ProfanityRaiseExceptionPolicy: Raises DisallowedOperation exception
  • ProfanityRaiseExceptionPolicy_LLM: LLM-generated exception messages
  • GPU variants: ProfanityBlockPolicy_GPU, ProfanityRaiseExceptionPolicy_GPU, etc.
  • CPU variants: ProfanityBlockPolicy_CPU, ProfanityRaiseExceptionPolicy_CPU, etc.
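The RaiseException variants raise a DisallowedOperation exception instead of returning a block message, so callers can handle violations with ordinary try/except. Since running the real policy requires the safety-engine extra and an agent, the sketch below uses a local stand-in class and a placeholder `run_task` function (both hypothetical, not part of Upsonic's API) purely to illustrate the handling pattern:

```python
# Minimal sketch of the exception-handling pattern used with the
# ProfanityRaiseExceptionPolicy variants. DisallowedOperation here is a
# local stand-in for the exception raised by the real policy.

class DisallowedOperation(Exception):
    """Stand-in for the exception raised by exception-raising policies."""

def run_task(text: str) -> str:
    # Placeholder for agent.print_do(Task(text)) with an
    # exception-raising profanity policy attached.
    if "idiot" in text.lower():
        raise DisallowedOperation("Profanity detected in user input")
    return f"Processed: {text}"

try:
    run_task("You are an idiot!")
except DisallowedOperation as exc:
    print("Blocked:", exc)
```

In a real application you would catch the DisallowedOperation exported by the safety engine around the agent call, rather than defining your own.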

Installation

uv sync --extra safety-engine

Custom Policy

from upsonic.safety_engine.policies.profanity_policies import ProfanityRule, ProfanityBlockAction
from upsonic.safety_engine.base.policy import Policy

custom_rule = ProfanityRule(
    model_name="unbiased",  # or "original", "multilingual", "original-small", "unbiased-small"
    device="cuda"  # or "cpu" or None for auto
)

policy = Policy(
    name="Custom Profanity Policy",
    description="With custom model and device",
    rule=custom_rule,
    action=ProfanityBlockAction(min_confidence=0.5)
)
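To make the `min_confidence` parameter concrete: Detoxify models return a dict of label-to-probability scores, and a blocking action compares those scores against the threshold. The `should_block` helper below is a hypothetical illustration of that gating logic, not Upsonic's actual implementation:

```python
# Sketch of how a min_confidence gate over Detoxify-style scores might work.
# Detoxify returns a mapping like {"toxicity": 0.97, "insult": 0.88, ...};
# a blocking action can trigger when any label meets the threshold.

def should_block(scores: dict, min_confidence: float = 0.5) -> bool:
    """Block when any toxicity label meets or exceeds the threshold."""
    return max(scores.values()) >= min_confidence

benign = {"toxicity": 0.02, "insult": 0.01}
toxic = {"toxicity": 0.97, "insult": 0.88}
print(should_block(benign))  # False
print(should_block(toxic))   # True
```

Raising `min_confidence` reduces false positives at the cost of letting more borderline content through; 0.5 is a common starting point to tune from.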