OpenClaw Failure Modes
Platform-specific detectors for OpenClaw multi-agent chat sessions. These catch issues unique to OpenClaw's event-driven session model, multi-channel messaging, agent spawning, sandbox isolation, and elevated privilege modes.
Session Loops
| Field | Value |
|---|---|
| Detector key | openclaw_session_loop |
| Severity | Critical |
Plain language: Your agent is stuck in a loop within its session. It keeps calling the same tool, sending the same message, or spawning the same sub-agent over and over -- burning time and resources without making progress.
Technical: Analyzes the session event stream for exact tool call repetition (same name + input hash, 3+ consecutive), fuzzy tool loops (same tool and key structure but different values, 5+ consecutive), spawn/send ping-pong patterns (A-B-A-B alternation with 4+ cycles), and repeated identical messages (3+ consecutive). Uses SHA256 hashing for exact matching and structural hashing (key names + type placeholders) for fuzzy matching.
Examples (non-technical):
- The agent calls the same search tool 8 times with identical queries
- Two sub-agents keep handing a task back and forth between each other
- The agent sends the same response message to the user 5 times in a row
Examples (technical):
- Tool call loop: `search_web({"query": "weather SF"})` called 6 consecutive times (exact hash match, threshold: 3)
- Fuzzy tool loop: `query_db({"table": "users", "limit": 10})` then `query_db({"table": "users", "limit": 20})` -- same structure, different values, 5 times
- Spawn ping-pong: `session.spawn(agent_B)` → `session.send(agent_A)` → `session.spawn(agent_B)` → ... (4+ alternations)
- Message loop: `message.sent("I'll help you with that")` repeated 4 times with identical content
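The difference between exact and fuzzy matching can be sketched as follows. This is a minimal illustration, not the detector's actual code: the function names (`exact_hash`, `structural_hash`, `longest_run`) and the JSON-based canonical serialization are assumptions, and the gap tolerance described for the exact loop check is left out for brevity.

```python
import hashlib
import json
from typing import Any


def exact_hash(tool_name: str, tool_input: dict[str, Any]) -> str:
    """SHA-256 over the tool name plus a canonical (key-sorted) JSON dump of the input."""
    payload = tool_name + json.dumps(tool_input, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def structure_of(value: Any) -> Any:
    """Keep key names but replace every leaf value with a type placeholder."""
    if isinstance(value, dict):
        return {key: structure_of(val) for key, val in value.items()}
    if isinstance(value, list):
        return [structure_of(item) for item in value]
    return type(value).__name__  # e.g. "str", "int"


def structural_hash(tool_name: str, tool_input: dict[str, Any]) -> str:
    """Hash of the input's shape only, so differing values still collide."""
    payload = tool_name + json.dumps(structure_of(tool_input), sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def longest_run(hashes: list[str]) -> int:
    """Length of the longest run of identical consecutive hashes."""
    best = current = 0
    previous = None
    for digest in hashes:
        current = current + 1 if digest == previous else 1
        best = max(best, current)
        previous = digest
    return best


# Hypothetical session: same tool and keys, different "limit" each time.
calls = [("query_db", {"table": "users", "limit": n}) for n in (10, 20, 30, 40, 50)]
exact_run = longest_run([exact_hash(name, args) for name, args in calls])
fuzzy_run = longest_run([structural_hash(name, args) for name, args in calls])
print(exact_run >= 3, fuzzy_run >= 5)  # exact check stays quiet, fuzzy check fires
```

In this sketch the five `query_db` calls never repeat exactly, so only the structural hash sees them as one run.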
Detection methods:
- Exact Tool Loop: SHA256 hash of `tool_name + tool_input`, flags 3+ consecutive matches (tolerates up to 4 event gaps)
- Fuzzy Tool Loop: Structural hash of key names + type placeholders, flags 5+ consecutive
- Spawn Ping-Pong: Detects repeated targeting of same session or A-B-A-B alternation between two targets
- Message Loop: Identifies 3+ consecutive identical message.sent/message.send events
Sub-types: tool_call_loop, fuzzy_tool_loop, spawn_ping_pong, abab_ping_pong, message_sent_loop
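A-B-A-B alternation can be found with a single pass over the sequence of spawn/send targets. The sketch below assumes that one "cycle" means one A-B pair and that targets are plain session or agent identifiers; `count_abab_cycles`, `is_ping_pong`, and `MIN_CYCLES` are hypothetical names, not part of OpenClaw.

```python
MIN_CYCLES = 4  # threshold from the description above (4+ cycles)


def count_abab_cycles(targets: list[str]) -> int:
    """Return the number of A-B pairs in the longest strictly alternating run."""
    best_len = run_len = 0
    for i, target in enumerate(targets):
        if i >= 2 and target == targets[i - 2] and target != targets[i - 1]:
            run_len += 1  # continues an existing A-B-A... run
        elif i >= 1 and target != targets[i - 1]:
            run_len = 2   # a new pair starts with the previous target
        else:
            run_len = 1   # first event, or the same target repeated
        best_len = max(best_len, run_len)
    return best_len // 2


def is_ping_pong(targets: list[str]) -> bool:
    return count_abab_cycles(targets) >= MIN_CYCLES


# Hypothetical target sequence taken from spawn/send events.
print(is_ping_pong(["agent_B", "agent_A"] * 4))
```

Repeated targeting of a single session (A-A-A) is a separate case in the detector and would need its own run counter.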
Tool Abuse
| Field | Value |
|---|---|
| Detector key | openclaw_tool_abuse |
| Severity | High |
Plain language: Your agent is misusing its tools. It's making too many calls, calling the same tool redundantly, failing more than half the time, or using sensitive tools like file deletion or shell commands that could be dangerous.
Technical: Monitors tool call volume (flags >4 total calls), per-tool redundancy (3+ calls to same tool), error rate (>50% failure when 3+ calls), and sensitive tool usage via exact name matching (exec, eval, shell, delete_file, etc.) and keyword matching (delete, ban, bulk, export, dump, etc.).
Examples (non-technical):
- The agent made 12 tool calls in a single session -- that's excessive for a simple task
- The agent called the same database query 5 times when once would have been enough
- The agent tried to run a shell command -- a tool it should never need for customer support
Examples (technical):
- Excessive calls: 9 tool.call events in one session (threshold: 4)
- Redundant: `fetch_user_profile` called 4 times (threshold: 3) -- same data, no state change between calls
- High error rate: 5/8 tool calls returned `status: "failed"` (62.5%, threshold: 50%)
- Sensitive tool: agent called `run_command` (exact match) and `bulk_delete_users` (keyword match: "bulk" + "delete")
Detection methods:
- Volume Tracking: Total tool.call events per session (threshold: 4)
- Redundancy Detection: Per-tool call counts (threshold: 3 for same tool)
- Error Rate Monitoring: Failed/total ratio when 3+ calls made (threshold: 50%)
- Sensitive Tool Detection: Exact match against blocklist + keyword matching against dangerous operation terms
Sub-types: excessive_calls, redundant_calls, high_error_rate, sensitive_tools
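The four checks can be combined in one pass over a session's tool calls. The sketch below is illustrative only: `ToolCall` and `detect_tool_abuse` are hypothetical names, the sensitive-name and keyword sets contain only the items listed above (the real lists are longer), and keyword matching is assumed to be a case-insensitive substring test.

```python
from collections import Counter
from dataclasses import dataclass

MAX_TOTAL_CALLS = 4           # flag when the session exceeds this many calls
REDUNDANT_CALL_COUNT = 3      # flag when one tool is called this many times
MIN_CALLS_FOR_ERROR_RATE = 3  # error rate is only evaluated from 3 calls up
MAX_ERROR_RATE = 0.5

# Partial lists, assumed for illustration.
SENSITIVE_NAMES = {"exec", "eval", "shell", "delete_file"}
SENSITIVE_KEYWORDS = ("delete", "ban", "bulk", "export", "dump")


@dataclass
class ToolCall:
    name: str
    failed: bool = False


def detect_tool_abuse(calls: list[ToolCall]) -> set[str]:
    """Return the sub-types triggered by one session's tool calls."""
    findings: set[str] = set()
    if len(calls) > MAX_TOTAL_CALLS:
        findings.add("excessive_calls")

    counts = Counter(call.name for call in calls)
    if any(n >= REDUNDANT_CALL_COUNT for n in counts.values()):
        findings.add("redundant_calls")

    if len(calls) >= MIN_CALLS_FOR_ERROR_RATE:
        failed = sum(call.failed for call in calls)
        if failed / len(calls) > MAX_ERROR_RATE:
            findings.add("high_error_rate")

    for name in counts:
        lowered = name.lower()
        if lowered in SENSITIVE_NAMES or any(k in lowered for k in SENSITIVE_KEYWORDS):
            findings.add("sensitive_tools")
    return findings


session = [ToolCall("fetch_user_profile")] * 4 + [ToolCall("bulk_delete_users")]
print(sorted(detect_tool_abuse(session)))
```

Plain substring matching is simple but can over-match (a tool named `bandwidth_check` contains "ban"), so a production detector would likely tokenize tool names first.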
Elevated Privilege Risk
| Field | Value |
|---|---|
| Detector key | openclaw_elevated_risk |
| Severity | High |
Plain language: Your agent is performing risky operations, either in elevated mode (where it has extra permissions) or attempting dangerous actions without proper authorization. This could mean file system access, code execution, or admin-level operations.
Technical: Checks the session's elevated_mode flag, then categorizes each tool call against risk categories: file system operations (read/write/delete), code execution (exec, eval, run_code), system commands (shell, subprocess), admin actions (delete, ban, suspend), permission operations (escalate, grant), data operations (bulk, export, dump), and credential operations (password, reset). Also scans tool inputs for dangerous patterns (rm -rf, /etc/passwd, SSN, bulk delete). Non-elevated sessions using risky tools are flagged as escalation attempts (highest severity).
Examples (non-technical):
- An agent in elevated mode is deleting files and running shell commands -- these are risky even when authorized
- A regular (non-elevated) agent tried to execute code -- it's attempting to escalate its privileges
- An agent's tool input contains `rm -rf /` -- a destructive system command
Examples (technical):
- Elevated mode + tool `write_file`: legitimate but logged as `elevated_operation` (MODERATE)
- Non-elevated + tool `exec`: escalation attempt -- agent shouldn't have code execution (SEVERE)
- Tool input matches pattern: `"rm -rf /"` or `"/etc/passwd"` (risky_input)
- Risk categories triggered: `{file_system, code_execution, system_commands}` -- 3 categories (MODERATE threshold: 3)
- Tool `bulk_export_data` matches keyword "bulk" + "export" (data_operations category)
Detection methods:
- Privilege Context Check: Compares tool risk against session's elevated_mode flag
- Escalation Detection: Flags risky tool usage in non-elevated sessions (highest severity)
- Risk Categorization: Classifies tools into 7 risk categories via exact match and keyword matching
- Input Pattern Scanning: Regex matching against dangerous command patterns in tool inputs
Sub-types: elevated_operation, escalation_attempt, risky_input; risk categories: file_system, code_execution, system_commands, admin_actions, permission_ops, data_operations, credential_ops
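A rough sketch of how the privilege context changes the meaning of a risky tool call is shown below. The keyword lists, the regex, and the names `categorize` and `assess` are assumptions made for illustration. In particular, the file-system keywords and the way "SSN" is matched are guesses, since the description above only names the patterns.

```python
import re

# Assumed keyword lists per risk category, based on the terms named above.
RISK_CATEGORIES = {
    "file_system": ("read_file", "write_file", "delete_file"),
    "code_execution": ("exec", "eval", "run_code"),
    "system_commands": ("shell", "subprocess"),
    "admin_actions": ("delete", "ban", "suspend"),
    "permission_ops": ("escalate", "grant"),
    "data_operations": ("bulk", "export", "dump"),
    "credential_ops": ("password", "reset"),
}

# Assumed regex covering the dangerous input patterns listed above.
DANGEROUS_INPUT = re.compile(
    r"rm\s+-rf|/etc/passwd|\bssn\b|bulk\s+delete", re.IGNORECASE
)


def categorize(tool_name: str) -> set[str]:
    """Return every risk category whose keywords appear in the tool name."""
    lowered = tool_name.lower()
    return {
        category
        for category, keywords in RISK_CATEGORIES.items()
        if any(keyword in lowered for keyword in keywords)
    }


def assess(tool_name: str, tool_input: str, elevated_mode: bool) -> list[str]:
    """Classify one tool call against the session's elevated_mode flag."""
    findings = []
    if categorize(tool_name):
        # Same tool, different meaning depending on the privilege context.
        findings.append("elevated_operation" if elevated_mode else "escalation_attempt")
    if DANGEROUS_INPUT.search(tool_input):
        findings.append("risky_input")
    return findings


print(assess("write_file", "notes.txt", elevated_mode=True))
print(assess("exec", "print(1)", elevated_mode=False))
```

A tool name can land in more than one category here (for example `delete_file` matches both `file_system` and `admin_actions`), which is how a single session can reach the three-category threshold quickly.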
Spawn Chain Depth
| Field | Value |
|---|---|
| Detector key | openclaw_spawn_chain |
| Severity | High |
Plain language: Your agents are spawning too many levels of sub-agents. Agent A spawns Agent B, which spawns Agent C, which spawns Agent D -- creating a deep chain that's hard to control. Even worse, some agents might be spawning each other in circles, or escalating to privileged agents they shouldn't access.
Technical: Tracks session.spawn events to measure chain depth (safe threshold: 3 levels), detects circular references (spawned session ID already seen), and identifies privilege escalation patterns (spawning agents matching privilege keywords like admin, root, supervisor, master, controller).
Examples (non-technical):
- An agent spawned 5 levels of sub-agents to handle a simple question -- way too deep
- Agent A spawned Agent B, which spawned Agent A again -- creating an infinite loop
- A regular agent spawned an "admin" agent -- attempting to gain higher privileges
Examples (technical):
- Spawn depth: 6 `session.spawn` events in one session (safe threshold: 3)
- Circular reference: `spawned_session_id: "sess_abc"` already in seen_ids set -- agent is re-spawning itself
- Privilege escalation: spawn targets include `"admin_agent"` and `"supervisor_bot"` -- 2 privileged agents in chain
- Depth field: `data.depth: 5` exceeds MAX_SAFE_SPAWN_DEPTH (3)
- Total children: spawn event with `child_session_ids: ["s1", "s2", "s3"]` -- fan-out of 3 at one level
Detection methods:
- Depth Tracking: Counts spawn events and explicit depth fields per session
- Circular Reference Detection: Maintains seen_ids set, flags if spawned ID already exists
- Privilege Keyword Matching: Case-insensitive substring match against admin/root/supervisor/master/controller/elevated/privileged/system/internal
- Fan-Out Monitoring: Counts total child sessions spawned
Sub-types: excessive_depth, circular_reference, privilege_escalation
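The depth, circular-reference, and privilege-keyword checks fit naturally into one loop over a session's spawn events. The following is a sketch under assumptions: `SpawnEvent`, its `target_agent` field, and `analyze_spawns` are hypothetical, and seeding the seen set with the root session ID is a guess at how self-respawning is caught.

```python
from dataclasses import dataclass
from typing import Optional

MAX_SAFE_SPAWN_DEPTH = 3
PRIVILEGE_KEYWORDS = (
    "admin", "root", "supervisor", "master", "controller",
    "elevated", "privileged", "system", "internal",
)


@dataclass
class SpawnEvent:
    spawned_session_id: str
    target_agent: str             # hypothetical field naming the spawned agent
    depth: Optional[int] = None   # explicit depth, when the event carries one


def analyze_spawns(root_session_id: str, events: list[SpawnEvent]) -> set[str]:
    """Return the sub-types triggered by one session's spawn events."""
    findings: set[str] = set()
    seen_ids = {root_session_id}
    max_depth = len(events)  # the spawn count doubles as a depth estimate

    for event in events:
        if event.spawned_session_id in seen_ids:
            findings.add("circular_reference")
        seen_ids.add(event.spawned_session_id)

        if event.depth is not None:
            max_depth = max(max_depth, event.depth)

        target = event.target_agent.lower()  # case-insensitive substring match
        if any(keyword in target for keyword in PRIVILEGE_KEYWORDS):
            findings.add("privilege_escalation")

    if max_depth > MAX_SAFE_SPAWN_DEPTH:
        findings.add("excessive_depth")
    return findings


events = [
    SpawnEvent("s1", "worker"),
    SpawnEvent("s2", "admin_agent"),
    SpawnEvent("s1", "worker"),  # s1 was already spawned
]
print(sorted(analyze_spawns("root", events)))
```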
Channel Mismatch
| Field | Value |
|---|---|
| Detector key | openclaw_channel_mismatch |
| Severity | Medium |
Plain language: Your agent is sending messages formatted wrong for the messaging channel. Code blocks sent to WhatsApp don't render, messages are too long for the platform's limits, or sensitive personal information is being sent through channels where it shouldn't appear.
Technical: Checks for cross-channel routing (event channel differs from session channel) and channel-specific content violations: WhatsApp (code blocks, markdown tables, messages >1000 chars), Telegram (messages >4096 chars), Slack (SSN and credit card number patterns), Discord (messages >2000 chars).
Examples (non-technical):
- The agent sends a code snippet with triple backticks to WhatsApp -- it shows as garbled text
- A response meant for Telegram is 6,000 characters long -- it'll be truncated or rejected
- The agent sends a customer's credit card number through Slack -- a serious PII violation
Examples (technical):
- WhatsApp: message contains ```` ```python\nprint("hello")\n``` ```` -- code blocks don't render (formatting violation)
- WhatsApp: message is 1,847 chars (threshold: 1,000) -- too long for mobile display
- Telegram: message is 5,200 chars (threshold: 4,096) -- exceeds API limit
- Slack: message contains `"SSN: 123-45-6789"` pattern match (pii_exposure, SEVERE)
- Discord: message is 2,500 chars (threshold: 2,000) -- exceeds message length limit
- Cross-channel: event.channel = `"telegram"` but session.channel = `"whatsapp"` (routing error)
Detection methods:
- Cross-Channel Detection: Compares event channel against session channel
- WhatsApp Format Check: Detects code blocks (```), markdown tables (|---|), and long messages (>1000 chars)
- Telegram Length Check: Flags messages exceeding 4096-character API limit
- Slack PII Detection: Regex matching for SSN (###-##-####) and credit card (####-####-####-####) patterns
- Discord Length Check: Flags messages exceeding 2000-character limit
Sub-types: cross_channel_routing, formatting, length, pii_exposure
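The per-channel rules can be expressed as a small lookup plus a few regexes. This is a sketch, not the detector's implementation: `check_message` and the constant names are hypothetical, and it assumes the content rules of the event's channel apply when the event and session channels differ.

```python
import re

LENGTH_LIMITS = {"whatsapp": 1000, "telegram": 4096, "discord": 2000}

FENCE = "`" * 3  # triple backtick, built this way to keep the example readable
TABLE_ROW = re.compile(r"\|\s*-{3,}\s*\|")          # markdown table separator
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")          # ###-##-####
CARD = re.compile(r"\b\d{4}-\d{4}-\d{4}-\d{4}\b")   # ####-####-####-####


def check_message(session_channel: str, event_channel: str, text: str) -> set[str]:
    """Return the sub-types triggered by one outgoing message."""
    findings: set[str] = set()
    if event_channel != session_channel:
        findings.add("cross_channel_routing")

    limit = LENGTH_LIMITS.get(event_channel)
    if limit is not None and len(text) > limit:
        findings.add("length")

    if event_channel == "whatsapp" and (FENCE in text or TABLE_ROW.search(text)):
        findings.add("formatting")

    if event_channel == "slack" and (SSN.search(text) or CARD.search(text)):
        findings.add("pii_exposure")
    return findings


print(check_message("whatsapp", "telegram", "x" * 5200))
print(check_message("slack", "slack", "SSN: 123-45-6789"))
```

The card pattern only covers the hyphen-separated form named above; numbers written with spaces or no separators would need additional patterns.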
Sandbox Escape
| Field | Value |
|---|---|
| Detector key | openclaw_sandbox_escape |
| Severity | Critical |
Plain language: Your agent is trying to break out of its sandbox. The sandbox is supposed to restrict what the agent can do, but it's attempting to access files, make network calls, execute code, or query databases -- all things it's not supposed to do in a sandboxed environment.
Technical: Checks the session's sandbox_enabled flag, then monitors all tool.call events against 5 restricted categories: file system (read_file, write_file, delete_file, list_dir), network (http_request, api_call, fetch_url, webhook_forward), code execution (exec, eval, run_code, shell), system commands (os_command, system, subprocess), and database (query_database, sql_query, db_execute). When sandbox is enabled, any matching tool call is an escape attempt. When sandbox is disabled, matches are logged as audit flags at lower severity.
Examples (non-technical):
- A sandboxed agent tried to read files from the server -- it shouldn't have file access
- A sandboxed agent attempted to make an HTTP request to an external API -- network access is restricted
- A sandboxed agent tried to run a shell command -- code execution is blocked in sandbox mode
Examples (technical):
- Sandbox enabled + tool `read_file`: escape attempt in `file_system` category (SEVERE)
- Sandbox enabled + tool `http_request`: escape attempt in `network` category (SEVERE)
- Sandbox enabled + tool `eval`: escape attempt in `code_execution` category (SEVERE)
- Sandbox enabled + tools across 3 categories: `{file_system, network, code_execution}` -- multi-vector escape (SEVERE, 2+ categories)
- Sandbox disabled + tool `delete_file`: audit flag, lower severity (MINOR), confidence reduced by 0.2
Detection methods:
- Tool Category Matching: Exact tool name matching against 5 restricted category blocklists
- Sandbox Context: Severity escalation when `sandbox_enabled=True` (escape attempt vs audit)
- Multi-Category Detection: Tracks number of distinct violation categories (escalates at 2+)
- Audit Mode: When sandbox is disabled, still logs suspicious tool usage at lower confidence
Sub-types: file_system, network, code_execution, system, database
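Because this detector uses exact tool-name matching, the core logic reduces to set lookups plus the `sandbox_enabled` flag. The sketch below uses the tool names listed above; `check_sandbox` and the shape of its result dictionary are assumptions, and the confidence adjustment for audit mode is omitted because the base confidence is not documented here.

```python
# Restricted tool names per category, as listed in the description above.
RESTRICTED_TOOLS = {
    "file_system": {"read_file", "write_file", "delete_file", "list_dir"},
    "network": {"http_request", "api_call", "fetch_url", "webhook_forward"},
    "code_execution": {"exec", "eval", "run_code", "shell"},
    "system": {"os_command", "system", "subprocess"},
    "database": {"query_database", "sql_query", "db_execute"},
}


def check_sandbox(tool_names: list[str], sandbox_enabled: bool) -> dict:
    """Classify a session's tool calls as escape attempts or audit flags."""
    categories = {
        category
        for category, names in RESTRICTED_TOOLS.items()
        for tool in tool_names
        if tool in names
    }
    if not categories:
        return {"kind": None, "severity": None, "categories": set(), "multi_vector": False}
    return {
        # Same tools, different meaning depending on the sandbox flag.
        "kind": "escape_attempt" if sandbox_enabled else "audit_flag",
        "severity": "SEVERE" if sandbox_enabled else "MINOR",
        "categories": categories,
        "multi_vector": sandbox_enabled and len(categories) >= 2,
    }


result = check_sandbox(["read_file", "http_request", "eval"], sandbox_enabled=True)
print(result["kind"], sorted(result["categories"]), result["multi_vector"])
```

Exact matching keeps false positives low, at the cost of missing renamed or wrapped tools (for example a hypothetical `safe_read_file`), which the keyword-based detectors elsewhere on this page would catch.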