Pisama
Process-level failure detection for multi-agent LLM systems. Catch loops, state corruption, persona drift, coordination breakdown, and convergence failures across Claude Managed Agents, LangGraph, n8n, Dify, OpenClaw, and Semantic Kernel. Open source. MIT licensed.
-
Python SDK
pip install pisama: detect failures in 3 lines. No server needed.
-
Full Platform
Dashboard, tiered detection, self-healing, REST API. Docker Compose.
-
API Reference
REST API for traces, detections, healing, and integrations.
-
Detection Reference
Per-detector documentation with F1 scores and accuracy benchmarks.
-
Cookbook
Integration examples for Claude Managed Agents, LangGraph, n8n, Dify, OpenClaw, and Semantic Kernel.
-
OSS vs Cloud
Compare the free SDK with the full platform.
-
Auto-Instrumentation
One line of code. Patches Anthropic + OpenAI SDKs automatically.
-
Chaos Engineering
Inject failures to test agent resilience. 6 experiment types with safety controls.
What Pisama detects
LLM agents fail silently. A coding agent loops for 40 minutes. A research agent hallucinates citations. A support agent drifts from its persona. Standard monitoring misses all of it.
Pisama provides 57 calibrated detectors (mean F1 0.876, 51 production-tier) built on the MAST taxonomy:
| 51 production-tier | F1 >= 0.80 — loop detection, injection, hallucination, citation, persona drift, and more |
| 5-tier escalation | Hash ($0.00) → state delta → embeddings → LLM judge ($0.02) → human review |
| 5 frameworks | Purpose-built detectors for LangGraph, n8n, Dify, OpenClaw, and Claude Managed Agents |
| $0.05 avg/trace | 90%+ resolved at Tier 1-2, zero LLM cost |
| OTEL native | OpenTelemetry with gen_ai.* semantic conventions |
| Self-healing | Fix generation, approval workflows, rollback |
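The tiered escalation idea in the table can be sketched in a few lines. The following is a minimal illustration, not Pisama's API: the names `step_hash`, `hash_tier`, `escalate`, and `Verdict` are hypothetical, the "LLM judge" is a stub, and the state delta and embeddings tiers are omitted for brevity. Only the per-tier costs ($0.00 for hashing, $0.02 for the judge) come from the table above. The cheap hash check runs first, and a trace moves to a more expensive tier only when the cheaper one cannot decide.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    tier: str                 # tier that produced the decision
    failed: Optional[bool]    # True = failure, False = clean, None = undecided
    cost_usd: float           # total spent across the tiers that ran


def step_hash(step: dict) -> str:
    """Stable hash of one agent step (hypothetical shape: tool name plus args)."""
    payload = json.dumps(step, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def hash_tier(trace: list, max_repeats: int = 3) -> Optional[bool]:
    """Cheapest tier: detect loops as runs of identical consecutive steps.

    Returns True for a loop, False when no step repeats at all,
    and None (escalate) when there is repetition below the threshold.
    """
    hashes = [step_hash(s) for s in trace]
    run = longest = 1 if hashes else 0
    for prev, cur in zip(hashes, hashes[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    if longest >= max_repeats:
        return True
    if len(set(hashes)) == len(hashes):
        return False
    return None


def escalate(trace: list, tiers: list) -> Verdict:
    """Run (name, cost, check) tiers in order; stop at the first decision."""
    spent = 0.0
    for name, cost, check in tiers:
        spent += cost
        result = check(trace)
        if result is not None:
            return Verdict(name, result, spent)
    # No automated tier decided: hand the trace to a person.
    return Verdict("human review", None, spent)


# Stub standing in for an LLM judge; a real one would call a model.
def stub_judge(trace: list) -> Optional[bool]:
    return False


TIERS = [("hash", 0.00, hash_tier), ("llm judge", 0.02, stub_judge)]

looping = [{"tool": "run_tests", "args": {"path": "."}}] * 4
ambiguous = [
    {"tool": "search", "args": {"q": "a"}},
    {"tool": "read", "args": {"id": 1}},
    {"tool": "search", "args": {"q": "a"}},
]

print(escalate(looping, TIERS))    # decided by the hash tier at no cost
print(escalate(ambiguous, TIERS))  # escalated to the stub judge
```

In this sketch a trace that repeats the same step four times in a row never reaches the judge, which is the property that keeps the average cost per trace low when most traces resolve in the early tiers.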
Supported frameworks
LangGraph · n8n · Dify · OpenClaw · Semantic Kernel · Claude Managed Agents · OpenAI Assistants · Bedrock Agents · Claude Code · Any OTEL source
Quick links
- SDK Quickstart: pip install pisama and detect failures in 3 lines
- Cookbook: Framework integration examples
- Installation: Full setup guide
- Configuration: Environment variables
- Detection Reference: All detectors with F1 scores
- Deployment: Production deployment