What are Tool Safety Policies?

Tool safety policies provide two layers of protection for AI agent tool usage:
  1. Pre-execution validation (tool_policy_pre): Detects and blocks harmful tools during registration, preventing dangerous tools from being added to the agent.
  2. Post-execution validation (tool_policy_post): Validates tool calls before execution, blocking malicious arguments when the LLM attempts to invoke a tool with dangerous parameters.
These policies use LLM-powered analysis to identify tools and tool calls that could perform dangerous operations such as system manipulation, data destruction, network attacks, and other security violations.

Why Is It Important?

Tool safety policies are critical for protecting your systems from malicious tool usage in AI agents: they stop harmful tools from being registered and block malicious tool calls before execution, guarding against system manipulation, data destruction, and security breaches.
  • Prevents harmful tools from being registered: Blocks tools with dangerous functionality like file deletion, system shutdown, or privilege escalation from being added to your agent
  • Blocks malicious tool calls before execution: Detects and prevents suspicious tool arguments that could cause system damage or security violations
  • Maintains system security and integrity: Ensures your AI agent can only use safe tools with safe parameters, protecting your infrastructure from harm

Usage

Pre-Execution Validation (tool_policy_pre)

Validates tools during registration to block harmful tools before they’re added:
from upsonic import Agent, Task
from upsonic.tools import tool
from upsonic.safety_engine.policies import HarmfulToolBlockPolicy

@tool
def delete_all_files(directory: str) -> str:
    """Delete all files in a directory recursively."""
    return f"Would delete all files in {directory}"

agent = Agent(
    model="openai/gpt-4o",
    tool_policy_pre=HarmfulToolBlockPolicy,
    debug=True,
)

task = Task(
    description="Use the delete_all_files tool to delete all files in /tmp/test_directory",
    tools=[delete_all_files]
)

result = agent.do(task)
print(f"Task result: {result}")

Post-Execution Validation (tool_policy_post)

Validates tool calls before execution to block malicious arguments when the LLM invokes a tool:
from upsonic import Agent, Task
from upsonic.tools import tool
from upsonic.safety_engine.policies import MaliciousToolCallBlockPolicy

@tool
def run_command(command: str) -> str:
    """Execute a system command."""
    return f"Executed: {command}"

agent = Agent(
    model="openai/gpt-4o",
    tool_policy_post=MaliciousToolCallBlockPolicy,
    debug=True,
)

task = Task(
    description="Use the run_command tool to execute: rm -rf /tmp/test",
    tools=[run_command]
)

result = agent.do(task)
print(f"Task result: {result}")

Combined Pre + Post Validation

Use both policies together for defense-in-depth security:
from upsonic import Agent, Task
from upsonic.safety_engine.policies import (
    HarmfulToolBlockPolicy,
    MaliciousToolCallBlockPolicy
)

def safe_calculator(a: int, b: int) -> int:
    """A safe calculator tool that adds two numbers."""
    return a + b

agent = Agent(
    model="openai/gpt-4o",
    tools=[safe_calculator],
    tool_policy_pre=HarmfulToolBlockPolicy,      # Blocks harmful tools at registration
    tool_policy_post=MaliciousToolCallBlockPolicy  # Blocks malicious calls at execution
)
# Provides two layers of protection
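
To exercise the combined setup, run a task with the same Task/do pattern used in the earlier examples (a minimal sketch; the task text is illustrative):
task = Task(
    description="Use the safe_calculator tool to add 2 and 3",
    tools=[safe_calculator]
)

# The calculator passes both layers: it is safe to register, and the call arguments are benign.
result = agent.do(task)
print(f"Task result: {result}")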

Available Variants

Harmful Tool Detection Policies

  • HarmfulToolBlockPolicy: LLM-powered detection with blocking during tool registration
  • HarmfulToolBlockPolicy_LLM: LLM-powered block messages for harmful tools
  • HarmfulToolRaiseExceptionPolicy: Raises DisallowedOperation exception for harmful tools
  • HarmfulToolRaiseExceptionPolicy_LLM: LLM-generated exception messages for harmful tools

Malicious Tool Call Detection Policies

  • MaliciousToolCallBlockPolicy: LLM-powered detection with blocking before tool execution
  • MaliciousToolCallBlockPolicy_LLM: LLM-powered block messages for malicious tool calls
  • MaliciousToolCallRaiseExceptionPolicy: Raises DisallowedOperation exception for malicious tool calls
  • MaliciousToolCallRaiseExceptionPolicy_LLM: LLM-generated exception messages for malicious tool calls
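
The RaiseExceptionPolicy variants surface violations as a DisallowedOperation exception instead of a block message, so callers can catch and handle them explicitly. Below is a minimal sketch of that pattern, reusing the delete_all_files tool from the first example; it assumes the exception propagates out of agent.do(), and the import path for DisallowedOperation is an assumption that may differ in your Upsonic version:
from upsonic import Agent, Task
from upsonic.safety_engine.policies import HarmfulToolRaiseExceptionPolicy
# Assumed import path for DisallowedOperation; adjust to wherever your
# Upsonic version exports it.
from upsonic.safety_engine.exceptions import DisallowedOperation

agent = Agent(
    model="openai/gpt-4o",
    tool_policy_pre=HarmfulToolRaiseExceptionPolicy,
)

task = Task(
    description="Use the delete_all_files tool to delete all files in /tmp/test_directory",
    tools=[delete_all_files]  # the harmful tool defined in the first example
)

try:
    agent.do(task)
except DisallowedOperation:
    # The exception-raising variants let callers log, alert, or re-route
    # blocked operations instead of receiving a block message in the result.
    print("Blocked: the tool was flagged as harmful during registration.")

MaliciousToolCallRaiseExceptionPolicy and its _LLM variant follow the same pattern on tool_policy_post, raising DisallowedOperation when a malicious tool call is detected.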