
Testing Guardrails

Testing guardrails is essential to ensure they effectively identify real risks while minimizing false positives. This guide explains the testing workflows available in Agent Ops Director.

Understanding Guardrail Testing

Guardrail testing allows you to:

  • Verify detection rules are working as expected
  • Identify false positives before deploying to production
  • Measure detection accuracy and performance
  • Create reproducible test cases for guardrail validation

The Test Tab Interface

Each guardrail in Agent Ops Director includes a dedicated testing interface. To access it:

  1. Navigate to the Guardrail Studio section
  2. Select the guardrail you want to test
  3. Click the Tests tab

(Screenshot: the Guardrail Tests interface)

Creating Test Cases

To create a test case:

  1. Click + Add Test Case in the Tests tab
  2. Configure the test case:
    • Test Name: A descriptive name for the test
    • Test Input: The sample content to test against the guardrail
    • Expected Event: The event you expect the guardrail to trigger (or "none" if no event should trigger)
  3. Click Save to add the test case

Consider creating both positive test cases (should trigger the guardrail) and negative test cases (should not trigger the guardrail).
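The fields above map naturally onto a simple record. The following is a minimal sketch in Python, assuming a hypothetical TestCase type whose field names mirror the UI fields (none of this is the product's API); it shows one positive and one negative case:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical test-case record; field names mirror the UI fields above
# and are illustrative only, not part of Agent Ops Director's API.
@dataclass
class TestCase:
    name: str                      # Test Name: a descriptive name
    test_input: str                # Test Input: sample content to evaluate
    expected_event: Optional[str]  # Expected Event, or None if nothing should trigger

# Positive case: the guardrail should trigger.
positive = TestCase(
    name="Credit card number in prompt",
    test_input="My card is 4111 1111 1111 1111, can you store it?",
    expected_event="pii-detected",
)

# Negative case: the guardrail should stay silent.
negative = TestCase(
    name="Benign numeric content",
    test_input="Order #4111 shipped on 11/11.",
    expected_event=None,
)
```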

Running Tests

You can run tests individually or as a batch:

  • Single Test: Click the Run button next to a specific test case
  • All Tests: Click the Run All Tests button to execute all test cases for the guardrail

After running tests, each test case will show either:

  • PASS: The actual event matched the expected event
  • FAIL: The actual event didn't match the expected event
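Conceptually, a batch run evaluates each case and compares the actual event to the expected one. Here is a minimal sketch of that logic in Python, assuming a hypothetical evaluate callable that stands in for the guardrail's detection rules (an illustration of the pass/fail comparison, not the product's implementation):

```python
from typing import Callable, Optional

def run_all(cases: list[dict], evaluate: Callable[[str], Optional[str]]) -> None:
    """Run every test case and print a PASS/FAIL verdict for each."""
    for case in cases:
        actual = evaluate(case["test_input"])
        verdict = "PASS" if actual == case["expected_event"] else "FAIL"
        print(f"{verdict}  {case['name']}: "
              f"expected={case['expected_event']!r}, actual={actual!r}")

# Toy evaluator: flags any input containing the word "secret".
cases = [
    {"name": "Leaks a secret", "test_input": "the secret token is abc",
     "expected_event": "token-leak"},
    {"name": "Harmless chat", "test_input": "hello there",
     "expected_event": None},
]
run_all(cases, lambda text: "token-leak" if "secret" in text else None)
```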

Interpreting Test Results

The test results provide valuable insights:

  • Test Summary: Shows passing and failing tests at a glance
  • False Positive Rate: The percentage of test cases where the guardrail triggered an event when none was expected
  • Expected vs. Actual: For each test, compare what was expected with what actually happened

Use these insights to refine your detection rules for better accuracy.
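The docs don't spell out the exact formula, but a common definition computes the false positive rate over negative test cases only (cases where no event was expected). A short sketch under that assumption, with illustrative result values:

```python
# Each result pairs the expected event (None = should not trigger)
# with the event that actually fired. Values are illustrative.
results = [
    {"expected": "pii-detected", "actual": "pii-detected"},  # true positive
    {"expected": None,           "actual": "pii-detected"},  # false positive
    {"expected": None,           "actual": None},            # true negative
    {"expected": "pii-detected", "actual": None},            # missed detection
]

passed = sum(r["expected"] == r["actual"] for r in results)
negatives = [r for r in results if r["expected"] is None]
false_positives = sum(r["actual"] is not None for r in negatives)

print(f"Passing: {passed}/{len(results)}")
if negatives:  # avoid dividing by zero when there are no negative cases
    print(f"False positive rate: {100 * false_positives / len(negatives):.0f}%")
```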

Best Practices for Testing

  • Test Edge Cases: Include unusual but valid inputs that might cause false positives (see the sketch after this list)
  • Include Real-World Examples: Use anonymized examples from your actual usage
  • Maintain Test Coverage: Update test cases when adding or modifying rules
  • Regular Retesting: Periodically rerun tests to ensure continued effectiveness
  • Document Test Cases: Add clear descriptions of what each test is validating
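As an example of edge-case coverage, a PII guardrail might be probed with inputs like the following (all values illustrative):

```python
# Hypothetical edge cases for a PII guardrail: unusual but legitimate
# inputs that detection rules often misclassify as sensitive data.
edge_cases = [
    "Call me at extension 4111",             # digits resembling a card prefix
    "ISBN 978-3-16-148410-0",                # hyphenated digits, not a phone number
    "My SSN was redacted: ***-**-****",      # masked PII should not trigger
    "email me at example [at] example.com",  # obfuscated address; decide policy explicitly
]
```

Each of these would typically become a negative test case, with Expected Event set to "none".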

Using Test Results to Improve Guardrails

After testing:

  1. Review all failing tests to understand why they failed
  2. Update detection rules to address false positives or missed detections
  3. Add test cases to cover newly discovered edge cases
  4. Rerun tests to verify improvements

Next Steps