What is the Safety Engine?
The Safety Engine is a content filtering and policy enforcement system that helps you keep AI interactions safe, appropriate, and compliant. Just as people need guidelines and rules for appropriate behavior, AI agents need safety policies to filter content, block inappropriate material, and protect sensitive information. The key benefits of the Safety Engine are:
- Content Filtering: Automatically detect and block inappropriate content like adult material, hate speech, or sensitive topics
- Privacy Protection: Anonymize or redact sensitive information like phone numbers, emails, or personal data
- Compliance: Ensure your AI applications meet regulatory requirements and platform policies
- Customizable: Create your own policies or use pre-built ones for common use cases
- Dual Protection: Apply policies to both user input (user_policy) and agent output (agent_policy)
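The dual-protection idea can be sketched without any particular framework. The snippet below is a minimal, self-contained illustration and not the Safety Engine's actual API: `guarded_call`, `block_crypto`, `redact_emails`, and `echo_agent` are hypothetical names, and only the `user_policy` and `agent_policy` parameter names come from the text above.

```python
import re
from typing import Callable, Optional

# Assumption for this sketch: a policy is a function that returns the
# (possibly modified) text, or None to signal that the text is blocked.
Policy = Callable[[str], Optional[str]]

BLOCK_MESSAGE = "This request was blocked by a safety policy."


def block_crypto(text: str) -> Optional[str]:
    """Hypothetical input policy: block text that mentions crypto terms."""
    if re.search(r"\b(bitcoin|crypto|ethereum)\b", text, re.IGNORECASE):
        return None
    return text


def redact_emails(text: str) -> Optional[str]:
    """Hypothetical output policy: replace email addresses with a placeholder."""
    return re.sub(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", "[EMAIL]", text)


def guarded_call(
    agent: Callable[[str], str],
    prompt: str,
    user_policy: Optional[Policy] = None,
    agent_policy: Optional[Policy] = None,
) -> str:
    """Run user_policy on the input, call the agent, then run agent_policy on the output."""
    if user_policy is not None:
        checked = user_policy(prompt)
        if checked is None:
            return BLOCK_MESSAGE  # the agent is never called
        prompt = checked
    reply = agent(prompt)
    if agent_policy is not None:
        checked = agent_policy(reply)
        if checked is None:
            return BLOCK_MESSAGE
        reply = checked
    return reply


def echo_agent(prompt: str) -> str:
    """Stand-in for a real agent; it leaks an email address on purpose."""
    return f"You said: {prompt}. Contact support@example.com for help."
```

Calling `guarded_call(echo_agent, "How do I buy Bitcoin?", user_policy=block_crypto)` returns the block message before the agent runs, while passing `agent_policy=redact_emails` cleans the reply on the way out.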
Core Principles For Safety Policies
When implementing safety policies, consider these important elements:
- Policy Types: Choose between blocking, anonymizing, or raising exceptions based on your needs
- Detection Methods: Use keyword-based detection for speed or LLM-based detection for accuracy
- Input vs Output: Apply user_policy to filter incoming requests and agent_policy to filter outgoing responses
- Language Support: Policies can detect and respond in multiple languages automatically
- Confidence Thresholds: Policies use confidence scores to determine when to take action
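To make the detection and threshold principles concrete, here is a small sketch of keyword-based detection that yields a confidence score, plus a threshold check. Everything in it is an assumption made for illustration: the `Detection` shape, the scoring rule (0.5 per distinct keyword, capped at 1.0), and the default threshold of 0.7 are not taken from the Safety Engine itself.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Detection:
    """Hypothetical detection result: a score in [0, 1] and the keywords that hit."""
    confidence: float
    matched: Tuple[str, ...]


def keyword_confidence(text: str, keywords: Tuple[str, ...]) -> Detection:
    """Fast keyword detection using case-insensitive substring matching.

    Keywords are expected in lowercase. Each distinct keyword found adds
    0.5 to the score, capped at 1.0 (an arbitrary rule for this sketch).
    """
    lowered = text.lower()
    matched = tuple(k for k in keywords if k in lowered)
    return Detection(confidence=min(1.0, 0.5 * len(matched)), matched=matched)


def should_act(detection: Detection, threshold: float = 0.7) -> bool:
    """A policy acts only when the score reaches the threshold."""
    return detection.confidence >= threshold


CRYPTO_KEYWORDS = ("bitcoin", "wallet", "ethereum")
strong = keyword_confidence("Send Bitcoin to my wallet", CRYPTO_KEYWORDS)
weak = keyword_confidence("Where is my wallet?", CRYPTO_KEYWORDS)
```

With the default threshold, `strong` triggers the policy and `weak` does not; lowering the threshold to 0.5 makes the policy stricter. An LLM-based detector could return the same `Detection` shape, trading speed for accuracy.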
Understanding Policy Actions
- Blocking Policies: Completely prevent content from being processed and return a block message
- Anonymization Policies: Replace sensitive information with safe alternatives while preserving context
- Exception Policies: Raise exceptions that stop execution when inappropriate content is detected
Policy Types and Use Cases
Blocking Policies
- CryptoBlockPolicy: Essential for banks and financial institutions to comply with anti-money laundering (AML) and Know Your Customer (KYC) regulations
- AdultContentBlockPolicy: Required for professional banking environments and customer-facing applications
- SensitiveSocialBlockPolicy: Critical for maintaining professional communication standards in financial services
Anonymization Policies
- AnonymizePhoneNumbersPolicy: Protects customer PII (Personally Identifiable Information) by replacing phone numbers with safe alternatives
- AnonymizePhoneNumbersPolicy_LLM_Finder: Uses AI for more accurate detection of phone numbers and other sensitive data in customer communications
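As a rough picture of what keyword or regex-based phone anonymization involves, the sketch below masks the digits of anything that looks like a phone number while keeping separators, so the surrounding sentence still reads naturally. It is not the implementation of `AnonymizePhoneNumbersPolicy`; the pattern is deliberately loose, and `PHONE_RE` and `anonymize_phone_numbers` are hypothetical names.

```python
import re

# Loose pattern (an assumption for this sketch): an optional "+", a digit,
# at least seven digits or separators, and a final digit.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def anonymize_phone_numbers(text: str) -> str:
    """Replace each digit of a detected phone number with 'X', keeping its layout."""

    def mask(match: "re.Match[str]") -> str:
        return re.sub(r"\d", "X", match.group(0))

    return PHONE_RE.sub(mask, text)
```

A pattern this permissive will also catch other long digit sequences, which is one reason an LLM-based finder can be worth the extra cost when accuracy matters.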
Exception Policies
- CryptoRaiseExceptionPolicy: Raises exceptions for compliance logging and regulatory reporting when crypto-related content is detected
- AdultContentRaiseExceptionPolicy: Stops execution and logs incidents for audit trails in professional banking environments
Let’s Create a Banking Assistant with Compliance Controls
In this example, we’ll create a banking assistant that enforces financial regulations and protects sensitive customer information.
Creating Custom Safety Policies
You can build policies that integrate seamlessly with your AI agents.
Policy Architecture Overview
Custom policies follow a three-component architecture:
- Rule: Defines what content to detect (using regex patterns or LLM-based detection)
- Action: Specifies what to do when content is detected (block, anonymize, or raise exception)
- Policy: Combines the rule and action into a complete safety policy
Step-by-Step Custom Policy Creation
Let’s create a custom policy to protect credit card information in a financial application:
Step 1: Create the Rule Class
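A regex-based rule for card numbers might look like the sketch below. It is an illustration under stated assumptions rather than the library's base class: `CreditCardRule`, `RuleMatch`, and the `find` method are hypothetical names. The rule pairs a candidate pattern with the Luhn checksum, a standard validity check for card numbers, to cut down on false positives.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RuleMatch:
    """Hypothetical match record: where the sensitive text sits and what it is."""
    start: int
    end: int
    text: str


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


class CreditCardRule:
    """Detects 13 to 19 digit sequences, optionally separated by spaces or dashes."""

    PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")

    def find(self, text: str) -> List[RuleMatch]:
        return [
            RuleMatch(m.start(), m.end(), m.group(0))
            for m in self.PATTERN.finditer(text)
            if luhn_valid(m.group(0))
        ]
```

The well-known test number `4111 1111 1111 1111` passes the checksum and is reported, while an arbitrary sequence such as `1234 5678 9012 3456` matches the pattern but is filtered out.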
Step 2: Create the Action Class
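The action decides what happens to the text once a rule has fired. The sketch below assumes the rule reports matches as `(start, end)` character offsets and masks all but the last four digits of each card number. `MaskCardAction` and its `apply` method are hypothetical names chosen for this illustration.

```python
from typing import Iterable, Tuple


class MaskCardAction:
    """Masks detected card numbers, leaving the last few digits visible."""

    def __init__(self, keep_last: int = 4, mask_char: str = "*") -> None:
        self.keep_last = keep_last
        self.mask_char = mask_char

    def _mask(self, card: str) -> str:
        total_digits = sum(c.isdigit() for c in card)
        to_hide = max(total_digits - self.keep_last, 0)
        masked = []
        for c in card:
            if c.isdigit() and to_hide > 0:
                masked.append(self.mask_char)
                to_hide -= 1
            else:
                masked.append(c)  # separators and trailing digits stay as-is
        return "".join(masked)

    def apply(self, text: str, spans: Iterable[Tuple[int, int]]) -> str:
        # Work right to left so earlier offsets stay valid even if a
        # replacement ever changes the length of the text.
        for start, end in sorted(spans, reverse=True):
            text = text[:start] + self._mask(text[start:end]) + text[end:]
        return text
```

A blocking or exception action would have the same entry point but would return a block message or raise instead of rewriting the text.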
Step 3: Create the Policy
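The policy ties a rule and an action together under a name and description. The sketch below shows that composition with structural typing, so any object with a matching `find` or `apply` method can be plugged in. All names here (`Policy`, `Rule`, `Action`, `DigitRunRule`, `RedactAction`, `check`) are hypothetical, and the stand-in rule and action are kept deliberately simple so the block runs on its own.

```python
import re
from dataclasses import dataclass
from typing import List, Protocol, Tuple

Span = Tuple[int, int]


class Rule(Protocol):
    def find(self, text: str) -> List[Span]: ...


class Action(Protocol):
    def apply(self, text: str, spans: List[Span]) -> str: ...


@dataclass
class Policy:
    """Combines what to detect (rule) with what to do about it (action)."""
    name: str
    description: str
    rule: Rule
    action: Action

    def check(self, text: str) -> str:
        spans = self.rule.find(text)
        if not spans:
            return text  # nothing detected, pass the text through unchanged
        return self.action.apply(text, spans)


class DigitRunRule:
    """Stand-in rule: flags unbroken runs of 13 to 19 digits."""

    def find(self, text: str) -> List[Span]:
        return [m.span() for m in re.finditer(r"\b\d{13,19}\b", text)]


class RedactAction:
    """Stand-in action: replaces each flagged span with a fixed placeholder."""

    def apply(self, text: str, spans: List[Span]) -> str:
        for start, end in sorted(spans, reverse=True):
            text = text[:start] + "[CARD]" + text[end:]
        return text


credit_card_policy = Policy(
    name="Credit Card Protection",
    description="Redacts card numbers in customer messages",
    rule=DigitRunRule(),
    action=RedactAction(),
)
```

Keeping the rule and action separate means the same detection logic can later be paired with a blocking or exception action without being rewritten.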
Advanced: LLM-Enhanced Detection
For more sophisticated detection, you can use LLM-based rules that leverage AI for better accuracy.
Policy Action Types
Choose the appropriate action type based on your security requirements:
1. Allow Actions
2. Blocking Actions
3. Replace Actions
4. Anonymization Actions
5. Exception Actions
6. LLM-Enhanced Blocking Actions
7. LLM-Enhanced Exception Actions
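The differences between the basic action types are easiest to see side by side. The sketch below applies each of the five non-LLM actions to the same text and pattern. `ActionType`, `PolicyViolation`, and `apply_action` are hypothetical names, the placeholder and block message strings are assumptions, and the two LLM-enhanced variants are left out because they would need a model call.

```python
import re
from enum import Enum
from itertools import count
from typing import Dict


class ActionType(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    REPLACE = "replace"
    ANONYMIZE = "anonymize"
    EXCEPTION = "exception"


class PolicyViolation(Exception):
    """Raised by the EXCEPTION action so callers can log and stop execution."""


def apply_action(action: ActionType, text: str, pattern: str) -> str:
    """Apply one action type to `text` wherever `pattern` matches."""
    if action is ActionType.ALLOW or not re.search(pattern, text):
        return text  # allowed, or nothing was detected
    if action is ActionType.BLOCK:
        return "Content blocked by policy."
    if action is ActionType.REPLACE:
        # The same fixed placeholder for every match.
        return re.sub(pattern, "[REDACTED]", text)
    if action is ActionType.ANONYMIZE:
        # A unique token per distinct value; repeated values share a token.
        tokens: Dict[str, str] = {}
        counter = count(1)

        def unique(match: "re.Match[str]") -> str:
            value = match.group(0)
            if value not in tokens:
                tokens[value] = f"<ID_{next(counter)}>"
            return tokens[value]

        return re.sub(pattern, unique, text)
    if action is ActionType.EXCEPTION:
        raise PolicyViolation(f"pattern {pattern!r} matched")
    raise ValueError(f"unknown action: {action}")
```

Replace keeps the content flowing with a generic placeholder, anonymize additionally preserves which values were the same, and block and exception preserve nothing.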
Integrating Custom Policies with Agents
Once you’ve created your custom policy, integrate it with your AI agents.
Multi-Language Support
Custom policies automatically support multiple languages.
Advanced Configuration Options
Custom policies support advanced configuration for enterprise use cases.
Best Practices for Custom Policies
- Start Simple: Begin with regex-based rules for performance, then enhance with LLM detection
- Test Thoroughly: Validate your policies with diverse test cases
- Consider Performance: Balance accuracy with processing speed
- Document Clearly: Provide clear descriptions for policy maintenance
- Handle Edge Cases: Account for various input formats and languages
- Monitor Effectiveness: Track policy performance and adjust confidence thresholds
Choosing the Right Action Type
Select the appropriate action based on your security requirements and use case:
| Action Type | Use Case | Security Level | Content Preservation |
|---|---|---|---|
| Allow | Low-risk content, testing | Low | Full |
| Replace | Moderate risk, content flow needed | Medium | Partial (placeholder) |
| Anonymize | High risk, unique replacements needed | High | Partial (anonymized) |
| Block | Critical security, complete prevention | Very High | None |
| Exception | Compliance, audit trails | Very High | None |
- Multiple Policy Types: Combine blocking, anonymization, and exception policies for comprehensive regulatory compliance
- LLM-Enhanced Detection: Use AI-powered content detection for better accuracy in identifying financial risks and compliance violations
- Privacy Protection: Automatically anonymize sensitive customer information like SSNs, account numbers, and personal data
- Custom Policy Configuration: Create tailored policies with specific language support for international banking operations
- Dual Protection: Apply different policies to customer input (user_policy) and agent responses (agent_policy) for complete coverage
- Language Support: Automatic language detection and localized responses for global banking and fintech applications
- Audit Trail: Monitor policy triggers, confidence scores, and compliance actions for regulatory reporting and risk management

