Managing Guardrails
Guardrails help you manage risk in your AI applications by detecting and preventing security, privacy, and operational risks. This guide explains how to view guardrails, enable or disable them, switch their enforcement mode, and track their versions in Agent Ops Director.
Understanding Guardrails
A guardrail in Agent Ops Director is a configurable control mechanism that evaluates AI interactions against defined risk criteria. Guardrails consist of:
- Detection Rules: Conditions that trigger the guardrail
- Risk Category: Classification based on frameworks like FINOS and OWASP
- Severity Level: Impact rating (Low, Medium, High)
- Enforcement Mode: Monitoring (log only) or Enforce (block risky requests)
- Metadata: Version information and other attributes
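As a mental model, the components listed above can be pictured as fields on a single record. The following is a minimal sketch only: the class, field, and example names are hypothetical and are not taken from the Agent Ops Director API or its storage format. The `enabled` flag and the Monitoring default are likewise assumptions, drawn from the status and best-practice guidance later in this guide.

```python
from dataclasses import dataclass, field
from enum import Enum


class Severity(Enum):
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"


class EnforcementMode(Enum):
    MONITORING = "Monitoring"  # log only
    ENFORCE = "Enforce"        # block risky requests


@dataclass
class Guardrail:
    # Hypothetical in-memory model; names are illustrative, not the product schema.
    name: str
    detection_rules: list[str]   # conditions that trigger the guardrail
    risk_category: str           # e.g. a FINOS- or OWASP-based classification
    severity: Severity
    enforcement_mode: EnforcementMode = EnforcementMode.MONITORING
    enabled: bool = True
    metadata: dict[str, str] = field(default_factory=dict)


# Example values are made up for illustration.
example = Guardrail(
    name="pii-leak",
    detection_rules=["response contains an email address"],
    risk_category="Privacy",
    severity=Severity.HIGH,
    metadata={"version": "1"},
)
```

Defaulting `enforcement_mode` to Monitoring in this sketch mirrors the recommendation to observe a new guardrail before letting it block requests.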
Viewing Available Guardrails
Guardrail Studio is the central place in Agent Ops Director for viewing and managing every guardrail in your environment:
- Navigate to the Guardrail Studio section in the left navigation menu
- View the list of all available guardrails, including both default and custom guardrails
- Use filters to sort by category, status, or enforcement mode
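If you keep your own copy of the guardrail list (for example, in a review script), the same filtering can be reproduced in a few lines. This is a sketch under assumptions: `GuardrailRow`, `filter_guardrails`, and the sample rows are hypothetical, and the status and category values are illustrative rather than the exact labels shown in Guardrail Studio.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuardrailRow:
    # Hypothetical record for one row of the guardrail list.
    name: str
    category: str
    status: str  # assumed values: "Enabled" or "Disabled"
    mode: str    # "Monitoring" or "Enforce"


def filter_guardrails(rows, category=None, status=None, mode=None):
    """Return the rows that match every filter that is not None."""
    return [
        r for r in rows
        if (category is None or r.category == category)
        and (status is None or r.status == status)
        and (mode is None or r.mode == mode)
    ]


# Sample data, made up for illustration.
rows = [
    GuardrailRow("prompt-injection", "Security", "Enabled", "Enforce"),
    GuardrailRow("pii-leak", "Privacy", "Enabled", "Monitoring"),
    GuardrailRow("cost-spike", "Operational", "Disabled", "Monitoring"),
]

enforced = filter_guardrails(rows, status="Enabled", mode="Enforce")
print([r.name for r in enforced])  # ['prompt-injection']
```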
Enabling and Disabling Guardrails
To change a guardrail's operational status:
- Locate the desired guardrail in the Guardrail Studio
- Click on the guardrail to view its details
- Use the Enable/Disable button to change its status
- Click Save to apply the changes
A disabled guardrail does not evaluate any requests, regardless of its enforcement mode.
Switching Between Monitoring and Enforce Modes
Each guardrail can operate in one of two modes:
- Monitoring Mode: Logs violations without blocking requests
- Enforce Mode: Actively blocks requests that violate the guardrail rules
To change a guardrail's mode:
- Open the guardrail details view
- Locate the Mode dropdown in the details section
- Select either "Monitoring" or "Enforce"
- Click Save to apply the change
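The way a guardrail's status and mode combine can be summarized as a small decision function. This is an illustrative sketch of the behavior described in this guide, not the product's implementation; `Action` and `decide` are hypothetical names.

```python
from enum import Enum


class Action(Enum):
    SKIP = "skip"    # guardrail disabled: the request is not evaluated
    ALLOW = "allow"  # no violation detected
    LOG = "log"      # violation logged, request proceeds
    BLOCK = "block"  # request rejected


def decide(enabled: bool, mode: str, violated: bool) -> Action:
    """Map guardrail status, mode, and detection result to an outcome."""
    # A disabled guardrail never processes requests, whatever its mode.
    if not enabled:
        return Action.SKIP
    if mode not in ("Monitoring", "Enforce"):
        raise ValueError(f"unknown mode: {mode!r}")
    if not violated:
        return Action.ALLOW
    return Action.BLOCK if mode == "Enforce" else Action.LOG


print(decide(enabled=True, mode="Monitoring", violated=True))  # Action.LOG
print(decide(enabled=False, mode="Enforce", violated=True))    # Action.SKIP
```

Checking `enabled` first is what makes the mode irrelevant for disabled guardrails.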
Best Practices for Managing Guardrails
- Start in Monitoring Mode: When implementing a new guardrail, begin in Monitoring Mode to observe its behavior and adjust for false positives
- Regular Review: Periodically review guardrail analytics to evaluate effectiveness and adjust as needed
- Incremental Deployment: Roll out guardrails gradually, especially in Enforce Mode, to minimize disruption
- Documentation: Maintain documentation of which guardrails are active and their configurations for compliance and operational purposes
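One lightweight way to follow the documentation practice is to keep a dated snapshot of the active guardrails under version control. The sketch below assumes you have already collected the guardrail settings into plain dictionaries by hand or by some other means; the `snapshot` function, the dictionary keys, and the sample values are all hypothetical and do not reflect an Agent Ops Director export format.

```python
import json
from datetime import date


def snapshot(guardrails, as_of):
    """Serialize enabled guardrails to JSON, sorted by name for stable diffs."""
    active = sorted(
        (g for g in guardrails if g["enabled"]),
        key=lambda g: g["name"],
    )
    return json.dumps({"as_of": as_of.isoformat(), "active": active}, indent=2)


# Sample data, made up for illustration.
guardrails = [
    {"name": "pii-leak", "enabled": True, "mode": "Monitoring", "version": 3},
    {"name": "cost-spike", "enabled": False, "mode": "Monitoring", "version": 1},
    {"name": "prompt-injection", "enabled": True, "mode": "Enforce", "version": 7},
]

report = snapshot(guardrails, date(2025, 1, 15))
print(report)
```

Sorting by name keeps successive snapshots easy to compare, so a change in mode or version shows up as a one-line difference.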
Managing Guardrail Versions
Agent Ops Director maintains version history for all guardrails:
- In the guardrail details view, look for the Version information
- Editing and then saving a guardrail increments its version automatically
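The save-and-increment behavior can be pictured as follows. This is a sketch only: it assumes the version is a simple integer counter, which this guide does not confirm, and `GuardrailVersion` and `save_edit` are hypothetical names.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GuardrailVersion:
    name: str
    mode: str
    version: int  # assumed to be a simple integer counter


def save_edit(current, **changes):
    """Return a new record with the changes applied and the version bumped."""
    if "version" in changes:
        raise ValueError("version is managed automatically")
    return replace(current, **changes, version=current.version + 1)


v1 = GuardrailVersion(name="pii-leak", mode="Monitoring", version=1)
v2 = save_edit(v1, mode="Enforce")
print(v2.version, v2.mode)  # 2 Enforce
```

Because the record is immutable, the earlier version stays intact, which is the property a version history depends on.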
Next Steps
- Create custom detection rules for organization-specific needs
- Learn how to test guardrails to ensure effectiveness
- Understand how to monitor guardrail analytics for ongoing management