Pisama
Failure detection for multi-agent LLM systems. 27 production-grade detectors across 4 frameworks. Open source. MIT licensed.
- Python SDK: `pip install pisama` and detect failures in 3 lines. No server needed.
- Full Platform: Dashboard, tiered detection, self-healing, REST API. Docker Compose.
- API Reference: REST API for traces, detections, healing, and integrations.
- Detection Reference: Per-detector documentation with F1 scores and accuracy benchmarks.
- Cookbook: Integration examples for LangGraph, n8n, Dify, OpenClaw, CrewAI, and Claude.
- OSS vs Cloud: Compare the free SDK with the full platform.
What Pisama detects
LLM agents fail silently. A coding agent loops for 40 minutes. A research agent hallucinates citations. A support agent drifts from its persona. Standard monitoring misses all of it.
Pisama provides 41 calibrated detectors (17 core + 24 framework-specific) built on the MAST taxonomy:
| Capability | Details |
|---|---|
| 27 production-grade | F1 >= 0.80: loop detection, state corruption, coordination failure, persona drift, and more |
| 5-tier escalation | Hash ($0.00) → state delta → embeddings → LLM judge ($0.02) → human review |
| 4 frameworks | Purpose-built detectors for LangGraph, n8n, Dify, and OpenClaw |
| $0.05 avg/trace | 90%+ resolved at Tier 1-2, zero LLM cost |
| OTEL native | OpenTelemetry with `gen_ai.*` semantic conventions |
| Self-healing | Fix generation, approval workflows, rollback |
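To illustrate the idea behind tiered escalation, here is a minimal sketch in plain Python. It is not Pisama's API or implementation: the step format, `hash_tier`, `stub_llm_judge`, and `escalate` are hypothetical names introduced for this example, and only two of the five tiers are modeled. The per-tier costs are the ones quoted in the table. Each tier either returns a decision or `None`, and the more expensive tier runs only when the cheaper one is inconclusive.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Steps = List[dict]
# A check returns True (failure), False (clean), or None (inconclusive).
Check = Callable[[Steps], Optional[bool]]


@dataclass
class Verdict:
    tier: str
    failed: Optional[bool]
    cost_usd: float


def step_hash(step: dict) -> str:
    """Stable hash of one agent step (hypothetical step format)."""
    payload = json.dumps(step, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def hash_tier(steps: Steps, max_repeats: int = 3) -> Optional[bool]:
    """Cheap check: look for the same step repeated back-to-back."""
    hashes = [step_hash(s) for s in steps]
    longest = run = 1 if hashes else 0
    for prev, cur in zip(hashes, hashes[1:]):
        run = run + 1 if prev == cur else 1
        longest = max(longest, run)
    if longest >= max_repeats:
        return True   # clear loop
    if longest <= 1:
        return False  # no consecutive repetition at all
    return None       # some repetition: let a costlier tier decide


def stub_llm_judge(steps: Steps) -> Optional[bool]:
    """Stand-in for an LLM judge; a real one would call a model."""
    return False


# (name, cost in USD, check); costs taken from the table above.
TIERS: List[Tuple[str, float, Check]] = [
    ("hash", 0.00, hash_tier),
    ("llm_judge", 0.02, stub_llm_judge),
]


def escalate(steps: Steps, tiers: List[Tuple[str, float, Check]] = TIERS) -> Verdict:
    """Run tiers cheapest-first and stop at the first one that decides."""
    spent = 0.0
    for name, cost, check in tiers:
        spent += cost
        result = check(steps)
        if result is not None:
            return Verdict(name, result, spent)
    # Nothing decided: hand off to a person.
    return Verdict("human_review", None, spent)


looping = [{"tool": "read_file", "args": {"path": "a.py"}}] * 4
print(escalate(looping))
```

In this sketch a trace with an obvious loop is settled by the hash tier without any model call, which is the property the table's cost figures rely on.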
Supported frameworks
LangGraph · n8n · Dify · OpenClaw · CrewAI · AutoGen · Claude Code · Claude Managed Agents · Any OTEL source
Quick links
- SDK Quickstart: `pip install pisama` and detect failures in 3 lines
- Cookbook: Framework integration examples
- Installation: Full setup guide
- Configuration: Environment variables
- Detection Reference: All detectors with F1 scores
- Deployment: Production deployment