Pisama

Failure detection for multi-agent LLM systems. 27 production-grade detectors across 4 frameworks. Open source. MIT licensed.

Get started in 3 minutes · View on GitHub

Python SDK
pip install pisama, then detect failures in 3 lines. No server needed.
SDK Quickstart

Full Platform
Dashboard, tiered detection, self-healing, REST API. Docker Compose.
Platform Quickstart

API Reference
REST API for traces, detections, healing, and integrations.
API docs

Detection Reference
Per-detector documentation with F1 scores and accuracy benchmarks.
Detectors

Cookbook
Integration examples for LangGraph, n8n, Dify, OpenClaw, CrewAI, and Claude.
Cookbook

OSS vs Cloud
Compare the free SDK with the full platform.
Compare

What Pisama detects

LLM agents fail silently. A coding agent loops for 40 minutes. A research agent hallucinates citations. A support agent drifts from its persona. Standard monitoring misses all of it.

Pisama provides 41 calibrated detectors (17 core + 24 framework-specific) built on the MAST taxonomy:


27 production-grade	F1 >= 0.80 — loop detection, state corruption, coordination failure, persona drift, and more
5-tier escalation	Hash ($0.00) → state delta → embeddings → LLM judge ($0.02) → human review
4 frameworks	Purpose-built detectors for LangGraph, n8n, Dify, and OpenClaw
$0.05 avg/trace	90%+ resolved at Tier 1-2, zero LLM cost
OTEL native	OpenTelemetry with `gen_ai.*` semantic conventions
Self-healing	Fix generation, approval workflows, rollback
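
The tiered escalation idea can be illustrated with a minimal sketch. This is not Pisama's API: every name here (`Verdict`, `state_hash`, `tier1_hash_loop`, `tier2_state_delta`, `detect`), the thresholds, and the ignored bookkeeping keys are hypothetical and chosen only for illustration. It covers just the two cheapest tiers, a hash check and a state-delta check, and returns `None` where a real system would escalate to embeddings, an LLM judge, or human review.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Verdict:
    tier: int         # tier that produced the verdict
    failure: str      # short label for the failure that was found
    cost_usd: float   # illustrative cost of reaching this verdict


def state_hash(state: dict[str, Any]) -> str:
    """Stable fingerprint of an agent state; key order does not matter."""
    payload = json.dumps(state, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def tier1_hash_loop(states: list[dict[str, Any]], min_repeats: int = 3) -> Optional[Verdict]:
    """Tier 1: flag a loop once the same fingerprint has been seen min_repeats times."""
    counts: dict[str, int] = {}
    for state in states:
        fingerprint = state_hash(state)
        counts[fingerprint] = counts.get(fingerprint, 0) + 1
        if counts[fingerprint] >= min_repeats:
            return Verdict(tier=1, failure="loop", cost_usd=0.0)
    return None


def tier2_state_delta(states: list[dict[str, Any]], max_stalled_steps: int = 3) -> Optional[Verdict]:
    """Tier 2: flag a stall when consecutive steps change only bookkeeping keys."""
    ignored = {"step", "timestamp"}  # hypothetical bookkeeping keys
    stalled = 0
    for prev, cur in zip(states, states[1:]):
        keys = (set(prev) | set(cur)) - ignored
        changed = {k for k in keys if prev.get(k) != cur.get(k)}
        stalled = stalled + 1 if not changed else 0
        if stalled >= max_stalled_steps:
            return Verdict(tier=2, failure="stalled_state", cost_usd=0.0)
    return None


def detect(states: list[dict[str, Any]]) -> Optional[Verdict]:
    """Run the cheap tiers in order and stop at the first one that fires."""
    for check in (tier1_hash_loop, tier2_state_delta):
        verdict = check(states)
        if verdict is not None:
            return verdict
    return None  # unresolved: a real system would escalate to a costlier tier


# Usage: an agent repeating the same tool call is caught by the hash tier.
trace = [{"tool": "search", "query": "pisama"}] * 3
print(detect(trace))
```

Ordering the checks from cheapest to most expensive is what keeps the average cost per trace low: a costlier tier only runs when every cheaper one returns no verdict.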

Supported frameworks

LangGraph · n8n · Dify · OpenClaw · CrewAI · AutoGen · Claude Code · Claude Managed Agents · Any OTEL source

Quick links

SDK Quickstart — pip install pisama and detect failures in 3 lines
Cookbook — Framework integration examples
Installation — Full setup guide
Configuration — Environment variables
Detection Reference — All detectors with F1 scores
Deployment — Production deployment