Guardrails Engine¶
Added in: v0.15.0
Guardrails validate content before (input) and after (output) every LLM call. They catch unsafe inputs, redact PII, enforce output formats, and block toxic content — all without changing your application code.
Quick Start¶
from selectools import Agent, AgentConfig, OpenAIProvider, tool
from selectools.guardrails import GuardrailsPipeline, TopicGuardrail, PIIGuardrail
@tool(description="Look up a customer by email")
def lookup_customer(email: str) -> str:
return f"Customer found: John Doe ({email})"
guardrails = GuardrailsPipeline(
input=[
TopicGuardrail(deny=["politics", "religion"]),
PIIGuardrail(action="rewrite"), # redact PII in user messages
],
output=[], # no output guardrails for now
)
agent = Agent(
tools=[lookup_customer],
provider=OpenAIProvider(),
config=AgentConfig(guardrails=guardrails),
)
# This works fine:
result = agent.ask("Look up customer john@example.com")
# Input is rewritten: "Look up customer [EMAIL:********]"
# This raises GuardrailError:
result = agent.ask("What do you think about politics?")
# GuardrailError: Guardrail 'topic' blocked: Denied topics detected: politics
How It Works¶
User Message → Input Guardrails → LLM Call → Output Guardrails → Response
↓ ↓
block / rewrite / warn block / rewrite / warn
- Input guardrails run on every user message before it reaches the LLM
- Output guardrails run on the LLM response before it's returned to you
- Guardrails execute in order — if one rewrites content, the next sees the rewritten version
- If a guardrail blocks, processing stops immediately with a
GuardrailError
Failure Actions¶
Every guardrail has an action that controls what happens when content fails the check:
| Action | Behaviour | Use Case |
|---|---|---|
block (default) |
Raises GuardrailError |
Hard safety boundaries |
rewrite |
Returns sanitised content | PII redaction, length truncation |
warn |
Logs a warning, continues | Monitoring without blocking |
from selectools.guardrails import GuardrailAction, TopicGuardrail
# Block (default) — raises exception
TopicGuardrail(deny=["politics"], action=GuardrailAction.BLOCK)
# Warn — logs and continues
TopicGuardrail(deny=["politics"], action=GuardrailAction.WARN)
Built-in Guardrails¶
TopicGuardrail¶
Block content mentioning denied topics using keyword matching with word boundaries.
from selectools.guardrails import TopicGuardrail
# Basic usage
g = TopicGuardrail(deny=["politics", "religion", "gambling"])
# Case-sensitive matching
g = TopicGuardrail(deny=["API_KEY"], case_sensitive=True)
# Warn instead of block
g = TopicGuardrail(deny=["competitors"], action="warn")
PIIGuardrail¶
Detect and redact personally identifiable information using regex patterns.
Built-in PII types: email, phone_us, ssn, credit_card, ipv4
from selectools.guardrails import PIIGuardrail, GuardrailAction
# Redact all PII (default action is rewrite)
g = PIIGuardrail()
result = g.check("Email me at user@example.com, SSN 123-45-6789")
# result.content = "Email me at [EMAIL:********], SSN [SSN:********]"
# Detect specific types only
g = PIIGuardrail(detect=["email", "credit_card"])
# Block instead of redact
g = PIIGuardrail(action=GuardrailAction.BLOCK)
# Add custom patterns
g = PIIGuardrail(custom_patterns={
"employee_id": r"EMP-\d{6}",
"internal_ip": r"10\.\d{1,3}\.\d{1,3}\.\d{1,3}",
})
# Just detect without a guardrail pipeline
matches = g.detect("Contact user@example.com")
for m in matches:
print(f" {m.pii_type}: '{m.value}' at {m.start}-{m.end}")
ToxicityGuardrail¶
Score content against a keyword blocklist. Configurable threshold controls sensitivity.
from selectools.guardrails import ToxicityGuardrail
# Block on any toxic word (threshold=0.0)
g = ToxicityGuardrail(threshold=0.0)
# Only block when many toxic words appear
g = ToxicityGuardrail(threshold=0.3)
# Custom blocklist
g = ToxicityGuardrail(blocklist={"spam", "scam", "phishing"})
# Check score without blocking
score = g.score("Some text to check")
matched = g.matched_words("Some text to check")
FormatGuardrail¶
Validate output format — JSON structure, required keys, length bounds.
from selectools.guardrails import FormatGuardrail
# Require valid JSON
g = FormatGuardrail(require_json=True)
# Require specific keys in JSON
g = FormatGuardrail(require_json=True, required_keys=["intent", "confidence"])
# Length bounds (characters)
g = FormatGuardrail(min_length=10, max_length=5000)
LengthGuardrail¶
Enforce content length in characters or words. Supports truncation on rewrite.
from selectools.guardrails import LengthGuardrail, GuardrailAction
# Hard limit
g = LengthGuardrail(max_chars=10000)
# Truncate to fit (rewrite mode)
g = LengthGuardrail(max_words=500, action=GuardrailAction.REWRITE)
# Minimum length (useful for output guardrails)
g = LengthGuardrail(min_words=10)
Pipeline Examples¶
Input: PII Redaction + Topic Blocking¶
pipeline = GuardrailsPipeline(
input=[
PIIGuardrail(action="rewrite"), # Step 1: redact PII
TopicGuardrail(deny=["internal_only"]), # Step 2: block restricted topics
],
)
Output: JSON Validation + Length Cap¶
pipeline = GuardrailsPipeline(
output=[
FormatGuardrail(require_json=True, required_keys=["answer"]),
LengthGuardrail(max_chars=2000, action="rewrite"),
],
)
Both Input and Output¶
pipeline = GuardrailsPipeline(
input=[
PIIGuardrail(action="rewrite"),
TopicGuardrail(deny=["violence", "illegal"]),
],
output=[
ToxicityGuardrail(threshold=0.0),
LengthGuardrail(max_chars=5000, action="rewrite"),
],
)
agent = Agent(
tools=[...],
provider=provider,
config=AgentConfig(guardrails=pipeline),
)
Custom Guardrails¶
Subclass Guardrail and override check():
from selectools.guardrails import Guardrail, GuardrailAction, GuardrailResult
import re
class NoProfanityGuardrail(Guardrail):
name = "no_profanity"
action = GuardrailAction.BLOCK
def __init__(self, words: list[str]) -> None:
self._patterns = [re.compile(rf"\b{re.escape(w)}\b", re.IGNORECASE) for w in words]
def check(self, content: str) -> GuardrailResult:
for pattern in self._patterns:
if pattern.search(content):
return GuardrailResult(
passed=False,
content=content,
reason=f"Profanity detected: {pattern.pattern}",
guardrail_name=self.name,
)
return GuardrailResult(passed=True, content=content, guardrail_name=self.name)
# Use it
pipeline = GuardrailsPipeline(
input=[NoProfanityGuardrail(words=["badword1", "badword2"])],
)
Error Handling¶
When a guardrail with action=block fails, it raises GuardrailError:
from selectools.guardrails import GuardrailError
try:
result = agent.ask("Tell me about politics")
except GuardrailError as e:
print(f"Blocked by: {e.guardrail_name}")
print(f"Reason: {e.reason}")
Trace Integration¶
Guardrail activations appear in the execution trace:
result = agent.ask("Some input")
for step in result.trace:
if step.type == "guardrail":
print(f"Guardrail fired: {step.summary}")
API Reference¶
| Class | Description |
|---|---|
GuardrailsPipeline(input=[], output=[]) |
Ordered pipeline of input and output guardrails |
Guardrail |
Base class — subclass and override check() |
GuardrailResult(passed, content, reason) |
Result of a single check |
GuardrailError(guardrail_name, reason) |
Raised when action=block fails |
GuardrailAction.BLOCK |
Raise exception on failure |
GuardrailAction.REWRITE |
Return sanitised content |
GuardrailAction.WARN |
Log warning and continue |
TopicGuardrail(deny=[...]) |
Keyword-based topic blocking |
PIIGuardrail(detect=[...], action=...) |
PII detection and redaction |
ToxicityGuardrail(threshold=0.0) |
Keyword-based toxicity scoring |
FormatGuardrail(require_json=True) |
JSON/length format validation |
LengthGuardrail(max_chars=..., max_words=...) |
Content length enforcement |