Overview
What is Safety Engine?
Safety Engine is a comprehensive content filtering and policy enforcement system for AI agents. It allows you to control what goes into your agents (user input) and what comes out (agent responses) by applying policies that automatically detect and handle sensitive content like PII, prohibited topics, adult content, hate speech, or custom safety rules.
How Safety Engine Works
Safety Engine operates at two key points in your agent’s lifecycle:
- Before Processing (user_policy): Filters and validates user input before it reaches your agent
- After Processing (agent_policy): Sanitizes and validates agent output before it’s returned to the user
Each policy combines two components:
- Rule: Detects specific content (e.g., “Does this text contain credit card numbers?”)
- Action: Decides what to do when content is detected (e.g., “Block it”, “Anonymize it”, “Replace it”)
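To make the rule/action split and the two enforcement points concrete, here is a minimal, self-contained sketch. It does not use the Safety Engine API: `Detection`, `SketchPolicy`, `password_rule`, `replace_action`, and `run_agent` are all hypothetical names invented for illustration. Only the ideas (a rule detects, an action decides, and policies run before and after the agent) come from the text above.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Detection:
    """What a rule found and how confident it is (hypothetical type)."""
    confidence: float
    keywords: List[str]


@dataclass
class SketchPolicy:
    """Hypothetical stand-in for a policy: one rule plus one action."""
    rule: Callable[[str], Detection]
    action: Callable[[str, Detection], str]

    def apply(self, text: str) -> str:
        # The rule inspects the text; the action decides what to pass on.
        return self.action(text, self.rule(text))


def password_rule(text: str) -> Detection:
    """Rule: does the text mention the word 'password'?"""
    found = ["password"] if "password" in text.lower() else []
    return Detection(confidence=1.0 if found else 0.0, keywords=found)


def replace_action(text: str, detection: Detection) -> str:
    """Action: replace every detected keyword, ignoring case."""
    for keyword in detection.keywords:
        text = re.sub(re.escape(keyword), lambda _m: "[REDACTED]", text,
                      flags=re.IGNORECASE)
    return text


def run_agent(
    user_text: str,
    agent: Callable[[str], str],
    user_policy: Optional[SketchPolicy] = None,
    agent_policy: Optional[SketchPolicy] = None,
) -> str:
    """Apply user_policy before the agent runs and agent_policy after it."""
    if user_policy is not None:
        user_text = user_policy.apply(user_text)
    reply = agent(user_text)
    if agent_policy is not None:
        reply = agent_policy.apply(reply)
    return reply


policy = SketchPolicy(rule=password_rule, action=replace_action)
echo_agent = lambda text: f"You said: {text}"  # toy agent for the demo

print(run_agent("My Password is hunter2", echo_agent, user_policy=policy))
# prints: You said: My [REDACTED] is hunter2
```

Passing the same policy as `agent_policy` instead would leave the input untouched and sanitize only the reply.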
Why Safety Engine is Important
- Compliance: Meet regulatory requirements (GDPR, HIPAA, PCI-DSS, etc.)
- Privacy Protection: Automatically detect and protect sensitive personal information
- Content Moderation: Block inappropriate, harmful, or prohibited content
- Risk Mitigation: Prevent your AI from exposing sensitive data or violating policies
- Multi-language Support: Automatically adapts to the user’s language
- Flexibility: Use pre-built policies or create custom ones for your specific needs
Prebuilt Policies
Cryptocurrency Policies
What is Crypto Policy?
Cryptocurrency policies detect and handle crypto-related content. Designed for financial institutions that need to block or control cryptocurrency discussions, trading advice, or wallet addresses.
Usage
- CryptoBlockPolicy: Static keyword detection with blocking
- CryptoBlockPolicy_LLM_Block: Static detection with LLM-generated block messages
- CryptoBlockPolicy_LLM_Finder: LLM-powered detection for better accuracy
- CryptoReplace: Replaces crypto keywords with placeholder text
- CryptoRaiseExceptionPolicy: Raises DisallowedOperation exception
- CryptoRaiseExceptionPolicy_LLM_Raise: LLM-generated exception messages
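The difference between a "block" variant and a "raise exception" variant can be sketched as follows. This is an illustration only, not the library's implementation: the keyword list, the function names, and the block message are assumptions, and `DisallowedOperation` is defined here as a local stand-in for the exception named above.

```python
from typing import List

# Hypothetical keyword list; the real policies ship their own.
CRYPTO_KEYWORDS = ("bitcoin", "ethereum", "crypto wallet")


class DisallowedOperation(Exception):
    """Local stand-in for the exception the policies are said to raise."""


def find_keywords(text: str) -> List[str]:
    """Static detection: case-insensitive substring match per keyword."""
    lowered = text.lower()
    return [keyword for keyword in CRYPTO_KEYWORDS if keyword in lowered]


def block(text: str) -> str:
    """Block style: swap the content for a fixed message, no exception."""
    if find_keywords(text):
        return "Blocked: cryptocurrency content is not allowed here."
    return text


def raise_on_match(text: str) -> str:
    """Exception style: stop processing so the caller must handle it."""
    hits = find_keywords(text)
    if hits:
        raise DisallowedOperation(
            f"Cryptocurrency content detected: {', '.join(hits)}"
        )
    return text
```

Blocking suits conversational flows where the user should see a polite refusal; raising suits pipelines where the calling code needs to branch on the failure.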
PII (Personally Identifiable Information) Policies
What is PII Policy?
PII policies detect and protect personally identifiable information, including emails, phone numbers, SSNs, addresses, credit cards, driver’s licenses, passports, IP addresses, and other sensitive personal data.
Usage
- PIIBlockPolicy: Blocks any content with PII
- PIIBlockPolicy_LLM: LLM-powered block messages
- PIIBlockPolicy_LLM_Finder: LLM detection for better accuracy
- PIIAnonymizePolicy: Anonymizes PII with unique replacements
- PIIReplacePolicy: Replaces PII with [PII_REDACTED]
- PIIRaiseExceptionPolicy: Raises DisallowedOperation exception
- PIIRaiseExceptionPolicy_LLM: LLM-generated exception messages
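The contrast between replacing and anonymizing is easiest to see on a single PII type. The sketch below handles only email addresses with a deliberately simple regular expression. The function names, the pattern, and the `userN@example.invalid` scheme are assumptions made for illustration; the real policies cover many more PII types and may generate replacements differently.

```python
import re
from typing import Dict

# Simplified email pattern for the demo; real-world matching is stricter.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def replace_emails(text: str) -> str:
    """Replace style: every match becomes the same fixed placeholder."""
    return EMAIL_PATTERN.sub("[PII_REDACTED]", text)


def anonymize_emails(text: str) -> str:
    """Anonymize style: each distinct value gets its own stable substitute."""
    mapping: Dict[str, str] = {}

    def substitute(match: "re.Match[str]") -> str:
        original = match.group(0)
        if original not in mapping:
            mapping[original] = f"user{len(mapping) + 1}@example.invalid"
        return mapping[original]

    return EMAIL_PATTERN.sub(substitute, text)


sample = "a@b.com, c@d.org, a@b.com"
print(replace_emails(sample))
# prints: [PII_REDACTED], [PII_REDACTED], [PII_REDACTED]
print(anonymize_emails(sample))
# prints: user1@example.invalid, user2@example.invalid, user1@example.invalid
```

Replacement discards the distinction between values, while anonymization keeps it: a downstream reader can still tell that the first and third addresses were the same person.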
Phone Number Policies
What is Phone Number Policy?
Phone number policies specifically detect and anonymize phone numbers in various formats (US, international, etc.).
Usage
- AnonymizePhoneNumbersPolicy: Pattern-based detection and anonymization
- AnonymizePhoneNumbersPolicy_LLM_Finder: LLM-powered detection for better accuracy
Adult Content Policies
What is Adult Content Policy?
Adult content policies detect explicit sexual content, adult themes, age-restricted material, and inappropriate content for general audiences.
Usage
- AdultContentBlockPolicy: Keyword and pattern detection with blocking
- AdultContentBlockPolicy_LLM: LLM-powered block messages
- AdultContentBlockPolicy_LLM_Finder: LLM detection for context awareness
- AdultContentRaiseExceptionPolicy: Raises DisallowedOperation exception
- AdultContentRaiseExceptionPolicy_LLM: LLM-generated exception messages
Sensitive Social Policies
What is Sensitive Social Policy?
Sensitive social policies detect racism, hate speech, discriminatory language, and other sensitive social issues to maintain respectful and inclusive communication.
Usage
- SensitiveSocialBlockPolicy: Keyword and pattern detection with blocking
- SensitiveSocialBlockPolicy_LLM: LLM-powered block messages
- SensitiveSocialBlockPolicy_LLM_Finder: LLM detection for context awareness
- SensitiveSocialRaiseExceptionPolicy: Raises DisallowedOperation exception
- SensitiveSocialRaiseExceptionPolicy_LLM: LLM-generated exception messages
Financial Information Policies
What is Financial Info Policy?
Financial information policies detect and protect credit cards, bank accounts, SSN, routing numbers, IBAN, SWIFT codes, tax IDs, investment accounts, cryptocurrency wallets, and other sensitive financial data.
Usage
- FinancialInfoBlockPolicy: Pattern detection with blocking
- FinancialInfoBlockPolicy_LLM: LLM-powered block messages
- FinancialInfoBlockPolicy_LLM_Finder: LLM detection for better accuracy
- FinancialInfoAnonymizePolicy: Anonymizes financial data
- FinancialInfoReplacePolicy: Replaces with [FINANCIAL_INFO_REDACTED]
- FinancialInfoRaiseExceptionPolicy: Raises DisallowedOperation exception
- FinancialInfoRaiseExceptionPolicy_LLM: LLM-generated exception messages
Medical Information Policies
What is Medical Info Policy?
Medical information policies detect and protect health records, diagnoses, prescriptions, medical IDs, insurance information, and other Protected Health Information (PHI) for HIPAA compliance.
Usage
- MedicalInfoBlockPolicy: Pattern detection with blocking
- MedicalInfoBlockPolicy_LLM: LLM-powered block messages
- MedicalInfoBlockPolicy_LLM_Finder: LLM detection for better accuracy
- MedicalInfoAnonymizePolicy: Anonymizes medical data
- MedicalInfoReplacePolicy: Replaces with [MEDICAL_INFO_REDACTED]
- MedicalInfoRaiseExceptionPolicy: Raises DisallowedOperation exception
- MedicalInfoRaiseExceptionPolicy_LLM: LLM-generated exception messages
Legal Information Policies
What is Legal Info Policy?
Legal information policies detect and protect case numbers, legal IDs, court documents, attorney-client privileged information, and other sensitive legal data.
Available Variants
- LegalInfoBlockPolicy: Pattern detection with blocking
- LegalInfoBlockPolicy_LLM: LLM-powered block messages
- LegalInfoBlockPolicy_LLM_Finder: LLM detection for better accuracy
- LegalInfoAnonymizePolicy: Anonymizes legal data
- LegalInfoReplacePolicy: Replaces with placeholder
- LegalInfoRaiseExceptionPolicy: Raises DisallowedOperation exception
- LegalInfoRaiseExceptionPolicy_LLM: LLM-generated exception messages
Technical Security Policies
What is Technical Security Policy?
Technical security policies detect and protect API keys, access tokens, passwords, private keys, database credentials, encryption keys, and other technical security credentials.
Available Variants
- TechnicalSecurityBlockPolicy: Pattern detection with blocking
- TechnicalSecurityBlockPolicy_LLM: LLM-powered block messages
- TechnicalSecurityBlockPolicy_LLM_Finder: LLM detection for better accuracy
- TechnicalSecurityAnonymizePolicy: Anonymizes security credentials
- TechnicalSecurityReplacePolicy: Replaces with placeholder
- TechnicalSecurityRaiseExceptionPolicy: Raises DisallowedOperation exception
- TechnicalSecurityRaiseExceptionPolicy_LLM: LLM-generated exception messages
Cybersecurity Policies
What is Cybersecurity Policy?
Cybersecurity policies detect vulnerability disclosures, exploit code, attack vectors, malware signatures, hacking techniques, and other cybersecurity threats.
Available Variants
- CybersecurityBlockPolicy: Pattern detection with blocking
- CybersecurityBlockPolicy_LLM: LLM-powered block messages
- CybersecurityBlockPolicy_LLM_Finder: LLM detection for better accuracy
- CybersecurityAnonymizePolicy: Anonymizes threat data
- CybersecurityReplacePolicy: Replaces with placeholder
- CybersecurityRaiseExceptionPolicy: Raises DisallowedOperation exception
- CybersecurityRaiseExceptionPolicy_LLM: LLM-generated exception messages
Data Privacy Policies
What is Data Privacy Policy?
Data privacy policies detect user tracking, cookie data, privacy violations, consent issues, and data retention concerns for GDPR and privacy compliance.
Available Variants
- DataPrivacyBlockPolicy: Pattern detection with blocking
- DataPrivacyBlockPolicy_LLM: LLM-powered block messages
- DataPrivacyBlockPolicy_LLM_Finder: LLM detection for better accuracy
- DataPrivacyAnonymizePolicy: Anonymizes privacy data
- DataPrivacyReplacePolicy: Replaces with placeholder
- DataPrivacyRaiseExceptionPolicy: Raises DisallowedOperation exception
- DataPrivacyRaiseExceptionPolicy_LLM: LLM-generated exception messages
Fraud Detection Policies
What is Fraud Detection Policy?
Fraud detection policies identify phishing attempts, scam indicators, fraudulent schemes, identity theft, and suspicious financial activities.
Available Variants
- FraudDetectionBlockPolicy: Pattern detection with blocking
- FraudDetectionBlockPolicy_LLM: LLM-powered block messages
- FraudDetectionBlockPolicy_LLM_Finder: LLM detection for better accuracy
- FraudDetectionAnonymizePolicy: Anonymizes fraud indicators
- FraudDetectionReplacePolicy: Replaces with placeholder
- FraudDetectionRaiseExceptionPolicy: Raises DisallowedOperation exception
- FraudDetectionRaiseExceptionPolicy_LLM: LLM-generated exception messages
Phishing Policies
What is Phishing Policy?
Phishing policies detect suspicious links, credential harvesting attempts, spoofed domains, social engineering tactics, and email phishing patterns.
Available Variants
- PhishingBlockPolicy: Pattern detection with blocking
- PhishingBlockPolicy_LLM: LLM-powered block messages
- PhishingBlockPolicy_LLM_Finder: LLM detection for better accuracy
- PhishingAnonymizePolicy: Anonymizes phishing indicators
- PhishingReplacePolicy: Replaces with placeholder
- PhishingRaiseExceptionPolicy: Raises DisallowedOperation exception
- PhishingRaiseExceptionPolicy_LLM: LLM-generated exception messages
Insider Threat Policies
What is Insider Threat Policy?
Insider threat policies detect data exfiltration, unauthorized access, policy violations, suspicious behavior, and insider risk indicators.
Available Variants
- InsiderThreatBlockPolicy: Pattern detection with blocking
- InsiderThreatBlockPolicy_LLM: LLM-powered block messages
- InsiderThreatBlockPolicy_LLM_Finder: LLM detection for better accuracy
- InsiderThreatAnonymizePolicy: Anonymizes threat indicators
- InsiderThreatReplacePolicy: Replaces with placeholder
- InsiderThreatRaiseExceptionPolicy: Raises DisallowedOperation exception
- InsiderThreatRaiseExceptionPolicy_LLM: LLM-generated exception messages
Custom Policy
Creating custom policies allows you to define your own content detection and handling logic specific to your application’s needs.
Creating Rule
Overall Class Structure
Rules inherit from RuleBase and must implement the process method. They detect specific content and return a RuleOutput with a confidence score and the detected keywords.
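The shape of a custom rule can be sketched as below. To keep the example runnable on its own, `RuleBase` and `RuleOutput` are defined here as simplified stand-ins rather than imported from the library. Their field names (`confidence`, `triggered_keywords`), the assumption that `process` receives a plain string, and the `CompetitorMentionRule` class are all hypothetical; check the library's own base classes for the real signatures.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class RuleOutput:
    """Stand-in result type: a confidence score plus detected keywords."""
    confidence: float
    triggered_keywords: List[str] = field(default_factory=list)


class RuleBase(ABC):
    """Stand-in base class: subclasses must implement process()."""

    @abstractmethod
    def process(self, text: str) -> RuleOutput:
        ...


class CompetitorMentionRule(RuleBase):
    """Hypothetical custom rule: flags mentions of configured names."""

    def __init__(self, names: Iterable[str]) -> None:
        self.names = [name.lower() for name in names]

    def process(self, text: str) -> RuleOutput:
        lowered = text.lower()
        found = [name for name in self.names if name in lowered]
        # Keyword matching is all-or-nothing, so confidence is 1.0 or 0.0.
        return RuleOutput(
            confidence=1.0 if found else 0.0,
            triggered_keywords=found,
        )


rule = CompetitorMentionRule(["Acme", "Globex"])
result = rule.process("We beat ACME on price")
print(result.confidence, result.triggered_keywords)
# prints: 1.0 ['acme']
```

A rule only reports what it found. Deciding whether to block, replace, or allow is left to the action paired with it.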
Creating Action
Overall Class Structure
Actions inherit from ActionBase and must implement the action method. They decide what to do when content is detected (block, allow, replace, anonymize, or raise exception).
- allow_content(): Let content pass through
- raise_block_error(message): Block with a message
- replace_triggered_keywords(replacement): Replace keywords with text
- anonymize_triggered_keywords(): Anonymize with random values
- raise_exception(message): Raise DisallowedOperation exception
- llm_raise_block_error(reason): Generate block message with LLM
- llm_raise_exception(reason): Generate exception with LLM
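A custom action typically inspects the rule's output and picks one of the helpers listed above. The sketch below mirrors three of those helper names on a simplified stand-in `ActionBase` so that it runs on its own. The `run` entry point, the `ActionResult` type, the way helpers return results, and the confidence thresholds are assumptions made for illustration, not the library's documented behavior.

```python
import re
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass
class RuleOutput:
    """Stand-in for the rule result handed to an action."""
    confidence: float
    triggered_keywords: List[str]


@dataclass
class ActionResult:
    """Hypothetical result: the text to pass on and whether it was blocked."""
    text: str
    blocked: bool = False


class ActionBase(ABC):
    """Stand-in base class exposing a few of the helper names listed above."""

    def run(self, text: str, rule_output: RuleOutput) -> ActionResult:
        # Hypothetical entry point: remember the content, then delegate.
        self._text = text
        self._keywords = rule_output.triggered_keywords
        return self.action(rule_output)

    @abstractmethod
    def action(self, rule_output: RuleOutput) -> ActionResult:
        ...

    def allow_content(self) -> ActionResult:
        return ActionResult(self._text)

    def raise_block_error(self, message: str) -> ActionResult:
        return ActionResult(message, blocked=True)

    def replace_triggered_keywords(self, replacement: str) -> ActionResult:
        text = self._text
        for keyword in self._keywords:
            text = re.sub(re.escape(keyword), lambda _m: replacement, text,
                          flags=re.IGNORECASE)
        return ActionResult(text)


class TieredAction(ActionBase):
    """Hypothetical action: block when sure, redact when unsure, else allow."""

    def action(self, rule_output: RuleOutput) -> ActionResult:
        if rule_output.confidence >= 0.9:
            return self.raise_block_error("This content is not allowed.")
        if rule_output.confidence >= 0.5:
            return self.replace_triggered_keywords("[REDACTED]")
        return self.allow_content()


outcome = TieredAction().run("Project Falcon ships soon",
                             RuleOutput(0.6, ["falcon"]))
print(outcome.text)
# prints: Project [REDACTED] ships soon
```

Tiering on confidence is one design choice among many. An action could equally ignore the score and always block, or choose a helper based on which keywords were triggered.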

