Trust & Policies
Waitroom's trust scoring and policy engine together form the permissions model for the agent era: they define what agents can do freely, what requires approval, and what is forbidden.
Trust Scoring
Every agent has a separate trust score in each room. The score reflects how reliably the agent has behaved in that room over time, and the policy engine uses it for auto-approve decisions.
| Parameter | Value |
|---|---|
| initial_score | 15 |
| min_score | 0 |
| max_score | 100 |
Score Adjustments
An agent's trust score is adjusted after every check-in outcome:
| Event | Weight | Effect |
|---|---|---|
| approval | +1.0 | Agent acted appropriately, trust increases |
| modification | +0.6 | Action was mostly correct but needed adjustment |
| rejection | -0.3 | Agent proposed something inappropriate |
| expiry | -0.1 | Check-in was abandoned or ignored |
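As a rough illustration of how the parameters and weights above could combine, here is a minimal Python sketch. It assumes each weight is added directly to the score and that the result is clamped to the min_score/max_score range; the source does not state the exact formula, and the `apply_event` helper is hypothetical.

```python
# Weights copied from the Score Adjustments table. Treating each weight as a
# direct additive change to the score is an assumption of this sketch.
WEIGHTS = {"approval": 1.0, "modification": 0.6, "rejection": -0.3, "expiry": -0.1}
INITIAL_SCORE, MIN_SCORE, MAX_SCORE = 15.0, 0.0, 100.0


def apply_event(score: float, event: str) -> float:
    """Return the score after one check-in outcome, clamped to [MIN_SCORE, MAX_SCORE]."""
    return max(MIN_SCORE, min(MAX_SCORE, score + WEIGHTS[event]))


# A new agent that is approved twice, modified once, and rejected once.
score = INITIAL_SCORE
for event in ["approval", "approval", "modification", "rejection"]:
    score = apply_event(score, event)
print(round(score, 1))  # 17.3
```

Under this additive reading, an agent starting at 15 would need several dozen approvals before reaching a threshold such as 60, which fits the advice to start strict and loosen gradually.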
Policy Engine
When an agent creates a check-in, the policy engine evaluates the room's rules to determine the initial status. Rules are evaluated in strict priority order:
1. Forbid rules (highest priority). If any forbid rule matches, the check-in is rejected immediately with a 403 POLICY_FORBIDS error.
2. Auto-approve rules. If a rule with the auto_approve action matches, the check-in is approved instantly.
3. Trust-based thresholds. If the agent's trust score meets the room's threshold for the check-in's risk level, the check-in is auto-approved.
4. Default action (fallback). Usually require_approval, meaning the check-in stays pending for human review.
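The priority order can be sketched in Python as follows. This is an illustrative reading of the steps above, not Waitroom's implementation: the `evaluate` function, the `(status, reason)` return shape, and the `match_risk_only` stand-in matcher are all hypothetical. Rules whose action is require_approval are ignored here because the documented order does not say where they rank.

```python
from typing import Any, Callable

# Hypothetical shapes: a check-in is a dict with at least "risk_level";
# a matcher decides whether one rule applies to one check-in.
Matcher = Callable[[dict[str, Any], dict[str, Any], float], bool]


def evaluate(policy: dict[str, Any], checkin: dict[str, Any],
             trust_score: float, matches: Matcher) -> tuple[str, str]:
    """Return (status, reason) following the four-step priority order."""
    rules = policy.get("rules", [])

    # 1. Forbid rules win over everything else.
    if any(r["action"] == "forbid" and matches(r, checkin, trust_score) for r in rules):
        return ("rejected", "POLICY_FORBIDS")

    # 2. Explicit auto-approve rules.
    if any(r["action"] == "auto_approve" and matches(r, checkin, trust_score) for r in rules):
        return ("approved", "auto")

    # 3. Trust thresholds, looked up as auto_approve_<risk_level>. Only the
    #    low and medium keys appear in the documented schema.
    thresholds = policy.get("trust_thresholds", {})
    threshold = thresholds.get(f"auto_approve_{checkin['risk_level']}")
    if threshold is not None and trust_score >= threshold:
        return ("approved", "trust")

    # 4. Fall back to the room's default action.
    default = policy.get("default_action", "require_approval")
    status = {"auto_approve": "approved", "forbid": "rejected"}.get(default, "pending")
    return (status, default)


def match_risk_only(rule: dict[str, Any], checkin: dict[str, Any], trust_score: float) -> bool:
    # Simplified stand-in matcher that looks at risk_level only.
    return checkin["risk_level"] in rule.get("conditions", {}).get("risk_level", [])


policy = {
    "default_action": "require_approval",
    "trust_thresholds": {"auto_approve_low": 50, "auto_approve_medium": 80},
    "rules": [{"action": "forbid", "conditions": {"risk_level": ["critical"]}}],
}
print(evaluate(policy, {"risk_level": "critical"}, 90, match_risk_only))  # ('rejected', 'POLICY_FORBIDS')
print(evaluate(policy, {"risk_level": "low"}, 55, match_risk_only))       # ('approved', 'trust')
print(evaluate(policy, {"risk_level": "medium"}, 55, match_risk_only))    # ('pending', 'require_approval')
```

Note that the forbid check runs before the trust check, so a high trust score never overrides a matching forbid rule.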
Check-in arrives
│
├─ Forbid rules match? ──▸ REJECTED (403 POLICY_FORBIDS)
│
├─ Auto-approve match? ──▸ APPROVED (auto)
│
├─ Trust threshold met? ──▸ APPROVED (trust)
│
└─ Default action ──▸ PENDING (require_approval)
Policy Actions
| Action | Behavior |
|---|---|
| auto_approve | Check-in is approved immediately without human review |
| require_approval | Check-in stays pending until a human decides |
| forbid | Check-in is rejected immediately — agent cannot proceed |
Timeout Actions
| Action | Behavior |
|---|---|
| auto_approve | If no human responds within the timeout, approve automatically |
| cancel | If no human responds, expire the check-in (default) |
| hold | Keep the check-in pending indefinitely until someone decides |
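The three timeout actions could be resolved along these lines. This is a sketch under assumptions: the `resolve_timeout` function, the status strings, and the choice to treat a check-in as timed out once the full timeout_minutes have elapsed are illustrative, not taken from Waitroom's code.

```python
from datetime import datetime, timedelta, timezone


def resolve_timeout(created_at: datetime, now: datetime,
                    timeout_minutes: int = 60, timeout_action: str = "cancel") -> str:
    """Return the status of a still-undecided check-in at time `now`."""
    if now - created_at < timedelta(minutes=timeout_minutes):
        return "pending"  # the timeout has not elapsed yet
    # Timeout reached: apply the room's timeout_action.
    return {
        "auto_approve": "approved",  # approve automatically
        "cancel": "expired",         # expire the check-in (the default)
        "hold": "pending",           # keep waiting for a human decision
    }[timeout_action]


created = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(resolve_timeout(created, created + timedelta(minutes=30)))                  # pending
print(resolve_timeout(created, created + timedelta(minutes=61)))                  # expired
print(resolve_timeout(created, created + timedelta(minutes=61), 60, "hold"))      # pending
```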
Policy Configuration
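Policies are stored as JSONB on the rooms table. The full schema, annotated with // comments for illustration (comments are not valid JSON and must be removed before storing):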
{
  "default_action": "require_approval",  // auto_approve | require_approval | forbid
  "timeout_minutes": 60,                 // 1 to 10080 minutes (10080 = 7 days)
  "timeout_action": "cancel",            // auto_approve | cancel | hold
  "rules": [
    {
      "action": "forbid",
      "conditions": {
        "risk_level": ["critical"],         // match any risk level in the array
        "action_type": ["delete", "drop"],  // match any action keyword
        "agent_id": ["agent_123"],          // match specific agents
        "min_trust_score": 80               // agent's trust score must be >= this value
      },
      "reason": "Critical actions are always blocked"
    }
  ],
  "trust_thresholds": {                  // optional
    "auto_approve_low": 60,              // auto-approve low-risk if score >= 60
    "auto_approve_medium": 85            // auto-approve medium-risk if score >= 85
  }
}
Rule Conditions
Each rule has a conditions object. All specified conditions must match (AND logic). Within an array condition (e.g. risk_level), any value can match (OR logic).
| Condition | Type | Description |
|---|---|---|
| risk_level | string[] | Match if check-in risk level is in this array |
| action_type | string[] | Match if check-in action contains any of these keywords |
| agent_id | string[] | Match if the check-in agent is in this list |
| min_trust_score | number | Match if agent's trust score is at or above this value (0-100) |
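The AND/OR semantics described above can be sketched as a small matcher. The check-in field names (`risk_level`, `action`, `agent_id`), the case-insensitive keyword comparison, and the choice that an empty conditions object matches every check-in are assumptions of this sketch rather than documented behavior.

```python
from typing import Any


def rule_matches(conditions: dict[str, Any], checkin: dict[str, Any],
                 trust_score: float) -> bool:
    """True only if every specified condition matches (AND across conditions)."""
    # risk_level: any listed value may match (OR within the array).
    if "risk_level" in conditions and checkin["risk_level"] not in conditions["risk_level"]:
        return False
    # action_type: match if the action text contains any listed keyword.
    if "action_type" in conditions:
        action = checkin["action"].lower()
        if not any(keyword.lower() in action for keyword in conditions["action_type"]):
            return False
    # agent_id: the check-in's agent must be in the list.
    if "agent_id" in conditions and checkin["agent_id"] not in conditions["agent_id"]:
        return False
    # min_trust_score: the score must be at or above the value.
    if "min_trust_score" in conditions and trust_score < conditions["min_trust_score"]:
        return False
    return True


conditions = {"agent_id": ["untrusted-bot-id"], "action_type": ["delete", "drop", "destroy"]}
checkin = {"risk_level": "high", "action": "drop table users", "agent_id": "untrusted-bot-id"}
print(rule_matches(conditions, checkin, 40))                                # True
print(rule_matches(conditions, {**checkin, "agent_id": "other-bot"}, 40))   # False
```

Because action_type is a substring match, a broad keyword such as "get" would also match an action named "forget credentials", so keyword lists are worth choosing carefully.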
Rule Examples
Forbid all critical actions
{
  "action": "forbid",
  "conditions": { "risk_level": ["critical"] },
  "reason": "Critical actions require manual execution"
}
Auto-approve low-risk from trusted agents
{
  "action": "auto_approve",
  "conditions": {
    "risk_level": ["low"],
    "min_trust_score": 70
  }
}
Auto-approve read-only actions
{
  "action": "auto_approve",
  "conditions": {
    "action_type": ["read", "list", "get", "view"]
  }
}
Forbid a specific agent from destructive ops
{
  "action": "forbid",
  "conditions": {
    "agent_id": ["untrusted-bot-id"],
    "action_type": ["delete", "drop", "destroy"]
  },
  "reason": "This agent is not authorized for destructive operations"
}
Trust-based auto-approve thresholds
{
  "default_action": "require_approval",
  "trust_thresholds": {
    "auto_approve_low": 50,     // agents with score >= 50 auto-approved for low risk
    "auto_approve_medium": 80   // agents with score >= 80 auto-approved for medium risk
  },
  "rules": [
    {
      "action": "forbid",
      "conditions": { "risk_level": ["critical"] }
    }
  ]
}
Best Practices
- Start strict, loosen gradually. Begin with require_approval as the default action. Add auto-approve rules only after agents have built trust.
- Always forbid destructive operations. Actions like "delete database", "drop table", or "revoke access" should have explicit forbid rules regardless of trust score.
- Use trust thresholds for routine work. Once an agent consistently gets approved for low-risk actions, set a trust threshold to auto-approve them — reducing human burden without sacrificing safety.
- Scope agents to rooms. Use room_scopes when registering agents to limit which rooms a key can access.
- Review audit logs regularly. The audit trail shows every decision, trust score change, and policy evaluation. Use it to tune policies over time.