Skip to main content
logoTetrate Agent Operations DirectorVersion: Latest

Creating Detection Rules

Detection rules are the core components of guardrails that identify potential risks in AI interactions. This guide explains how to create and manage detection rules in Agent Ops Director.

Understanding Detection Rules

Detection rules define specific patterns or conditions that, when matched in AI request or response content, trigger a guardrail event. Rules can be organized into categories and have configurable severity levels.

Each rule consists of:

  • Pattern: A regular expression or matching condition
  • Location: Where to apply the rule (request body, headers, etc.)
  • Event Name: Identifier for the triggered event
  • Severity: Impact level (Low, Medium, High)

Creating a New Detection Rule

To create a new detection rule:

  1. Navigate to the Guardrail Studio section
  2. Select a guardrail to edit or create a new one
  3. Go to the Detection Rules tab
  4. Click + Add Detection Rule
  5. Configure the rule details:
    • Enter a descriptive name
    • Select or create a rule category
    • Define the pattern using regular expressions
    • Specify where to apply the rule (location)
    • Set the event name and severity

Adding Detection Rules

Using Rule Categories

Rules are organized into categories for easier management. Default categories include:

  • Security: Rules for detecting security risks like prompt injections
  • Regulatory: Rules related to compliance requirements
  • Operational: Rules for operational concerns like model versioning

To work with categories:

  1. When creating/editing a rule, select an existing category or create a new one
  2. Use the Group by: Category dropdown in the Detection Rules view to organize rules by category
  3. Filter rules by category using the filter options

Pattern Matching Examples

Effective rules rely on well-crafted patterns. Here are some examples:

Email Detection

[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}

Credit Card Detection

4[0-9]{3}(?:[ -]?[0-9]{4}){3}|5[1-5][0-9]{2}(?:[ -]?[0-9]{4}){3}

Prompt Injection Detection

(ignore|disregard).+(previous|above).+(instructions|prompt)

Setting Rule Severity

Each rule should have an appropriate severity level:

  • Low: Minor risks with limited impact
  • Medium: Moderate risks that may affect operations
  • High: Significant risks requiring immediate attention

The severity level determines how the rule behaves in different enforcement modes and may affect alerting thresholds.

Testing Detection Rules

After creating a rule:

  1. Go to the Tests tab in the guardrail details view
  2. Create test cases with sample inputs that should trigger your rule
  3. Run tests to validate the rule's effectiveness
  4. Adjust patterns as needed based on test results

For more details on testing, see Testing Guardrails.

Best Practices for Detection Rules

  • Start Specific: Begin with precise patterns and broaden as needed
  • Consider False Positives: Balance detection coverage with false positive risk
  • Use Rule Comments: Add comments to document complex patterns
  • Regular Updates: Review and update rules as new risk patterns emerge
  • Consistent Naming: Use clear, consistent naming conventions for rules and events

Next Steps