capabilities table

F1 is the real-trace (external-lane) score for detectors with real coverage, and the blended (external + synthetic) score for detectors with thin or synthetic coverage. Coverage is the number of real external traces backing the score: real (>=30 traces, externally validated), thin (1-29 traces, score is synthetic-backed), or synthetic (0 traces, capability signal pending real traces).
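
The coverage tiers above reduce to a threshold check on the real-trace count. The following is a minimal sketch of that rule; the helper names `coverage_tier` and `f1_lane` are hypothetical and not part of any project API.

```python
def coverage_tier(real_traces: int) -> str:
    """Map a count of real external traces to a coverage tier.

    Thresholds follow the definition in the text:
    real (>=30), thin (1-29), synthetic (0).
    """
    if real_traces < 0:
        raise ValueError("trace count cannot be negative")
    if real_traces >= 30:
        return "real"
    if real_traces >= 1:
        return "thin"
    return "synthetic"


def f1_lane(tier: str) -> str:
    """Return which F1 score the table reports for a given tier.

    Only 'real' detectors report the external-lane score; thin and
    synthetic detectors report the blended (external + synthetic) score.
    """
    return "external" if tier == "real" else "blended"
```

For example, a detector backed by 29 traces is classified as thin and reports the blended score, while one backed by 30 is classified as real and reports the external-lane score.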

Detector	F1	Coverage (real traces)	Status
`overflow`	1.000	real (50)	Production
`withholding`	1.000	real (50)	Production
`approval_bypass`	1.000	real (50)	Production
`chunk_attribution`	1.000	thin (20)	Production
`openclaw_channel_mismatch`	0.993	thin (20)	Production
`mcp_protocol`	0.990	synthetic	Production
`impersonation_risk`	0.985	real (85)	Production
`deception`	0.985	real (85)	Production
`openclaw_session_loop`	0.985	thin (20)	Production
`task_starvation`	0.980	synthetic	Production
`tool_error_recovery`	0.980	synthetic	Production
`computer_use`	0.977	synthetic	Production
`entity_confusion`	0.969	synthetic	Production
`critic_quality`	0.967	synthetic	Production
`consensus_collapse`	0.967	real (60)	Production
`specification_compliance`	0.966	real (30)	Production
`scope_escalation`	0.960	real (85)	Production
`reward_hacking`	0.960	real (50)	Production
`n8n_timeout`	0.959	thin (19)	Production
`dify_rag_poisoning`	0.952	synthetic	Production
`analytical_semantics`	0.947	real (32)	Production
`langgraph_tool_failure`	0.944	synthetic	Production
`scheduled_task`	0.942	synthetic	Production
`citation`	0.935	thin (15)	Production
`openclaw_spawn_chain`	0.933	thin (20)	Production
`n8n_resource`	0.931	thin (13)	Production
`delegation`	0.930	real (47)	Production
`corruption`	0.930	real (94)	Production
`retrieval_quality`	0.930	thin (29)	Production
`planning_fallacy`	0.920	synthetic	Production
`persona_drift`	0.920	real (89)	Production
`rag_poisoning`	0.916	real (100)	Production
`sycophancy`	0.902	real (50)	Production
`dify_classifier_drift`	0.900	synthetic	Production
`langgraph_parallel_sync`	0.899	synthetic	Production
`langgraph_checkpoint_corruption`	0.896	synthetic	Production
`subagent_boundary`	0.896	synthetic	Production
`multi_chain`	0.891	synthetic	Production
`propagation`	0.889	synthetic	Production
`adaptive_thinking`	0.887	synthetic	Production
`openclaw_elevated_risk`	0.886	thin (20)	Production
`dify_variable_leak`	0.886	synthetic	Production
`convergence`	0.883	synthetic	Production
`openclaw_tool_abuse`	0.882	thin (20)	Production
`orchestration_quality`	0.880	synthetic	Production
`injection`	0.878	real (100)	Production
`n8n_complexity`	0.876	thin (14)	Production
`langgraph_edge_misroute`	0.869	synthetic	Production
`dispatch_async`	0.869	synthetic	Production
`openclaw_sandbox_escape`	0.864	thin (20)	Production
`hallucination`	0.862	real (250)	Production
`parallel_consistency`	0.860	synthetic	Production
`role_usurpation`	0.857	real (170)	Production
`multi_agent_contagion`	0.857	real (60)	Production
`role_usurpation_exec`	0.851	real (83)	Production
`role_usurpation_canonical`	0.849	real (97)	Production
`compaction_quality`	0.847	synthetic	Production
`n8n_error`	0.844	thin (17)	Production
`over_refusal`	0.843	real (300)	Production
`context`	0.838	real (221)	Production
`langgraph_state_corruption`	0.833	synthetic	Production
`memory_staleness`	0.822	synthetic	Production
`reasoning_consistency`	0.816	synthetic	Production
`grounding`	0.807	real (215)	Production
`specification`	0.806	real (276)	Production
`routing`	0.800	synthetic	Production
`loop`	0.794	real (129)	Beta
`cowork_safety`	0.792	synthetic	Beta
`context_precision`	0.789	real (30)	Beta
`n8n_cycle`	0.781	thin (14)	Beta
`completion`	0.777	real (199)	Beta
`n8n_schema`	0.765	thin (15)	Beta
`chunk_relevance`	0.765	thin (28)	Beta
`decomposition`	0.738	real (232)	Beta
`exploration_safety`	0.716	synthetic	Beta
`model_selection`	0.683	synthetic	Beta
`workflow`	0.667	real (100)	Beta
`communication`	0.664	real (213)	Beta
`authority_gradient`	0.641	synthetic	Experimental
`derailment`	0.623	real (223)	Experimental
`coordination`	0.591	real (235)	Experimental
`under_refusal`	0.562	real (300)	Experimental
`jailbreak_compliance`	0.507	real (300)	Experimental
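
To filter the table programmatically, the rows can be parsed into records. This is a sketch only, assuming the table is stored as tab-separated text exactly as shown; `Row`, `parse_row`, and `COVERAGE_RE` are hypothetical names, and the sample rows are copied from the table above.

```python
import re
from typing import NamedTuple


class Row(NamedTuple):
    detector: str
    f1: float
    tier: str         # "real", "thin", or "synthetic"
    real_traces: int  # 0 when the coverage cell carries no count
    status: str


# Matches "real (50)", "thin (14)", or a bare "synthetic".
COVERAGE_RE = re.compile(r"^(real|thin|synthetic)(?: \((\d+)\))?$")


def parse_row(line: str) -> Row:
    """Parse one tab-separated table row (assumed layout) into a Row."""
    detector, f1, coverage, status = line.rstrip("\n").split("\t")
    match = COVERAGE_RE.match(coverage)
    if match is None:
        raise ValueError(f"unrecognised coverage cell: {coverage!r}")
    tier, count = match.groups()
    return Row(detector.strip("`"), float(f1), tier, int(count or 0), status)


SAMPLE = [
    "`overflow`\t1.000\treal (50)\tProduction",
    "`mcp_protocol`\t0.990\tsynthetic\tProduction",
    "`n8n_cycle`\t0.781\tthin (14)\tBeta",
    "`jailbreak_compliance`\t0.507\treal (300)\tExperimental",
]

rows = [parse_row(line) for line in SAMPLE]

# Production detectors whose F1 is backed by real external traces.
validated = [r.detector for r in rows if r.tier == "real" and r.status == "Production"]
print(validated)  # ['overflow']
```

Filtering on both tier and status matters because a Production status alone does not imply external validation: several Production rows carry thin or synthetic coverage.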