
Pisama

Process-level failure detection for multi-agent LLM systems. Catch loops, state corruption, persona drift, coordination breakdown, and convergence failures across Claude Managed Agents, LangGraph, n8n, Dify, OpenClaw, and Semantic Kernel. Open source. MIT licensed.

Get started in 3 minutes · View on GitHub

  • Python SDK


    pip install pisama — detect failures in 3 lines. No server needed.

    SDK Quickstart

  • Full Platform


    Dashboard, tiered detection, self-healing, REST API. Docker Compose.

    Platform Quickstart

  • API Reference


    REST API for traces, detections, healing, and integrations.

    API docs

  • Detection Reference


    Per-detector documentation with F1 scores and accuracy benchmarks.

    Detectors

  • Cookbook


    Integration examples for Claude Managed Agents, LangGraph, n8n, Dify, OpenClaw, and Semantic Kernel.

    Cookbook

  • OSS vs Cloud


    Compare the free SDK with the full platform.

    Compare

  • Auto-Instrumentation


    One line of code. Patches Anthropic + OpenAI SDKs automatically.

    pisama-auto

  • Chaos Engineering


    Inject failures to test agent resilience. 6 experiment types with safety controls.

    Chaos Engineering


What Pisama detects

LLM agents fail silently. A coding agent loops for 40 minutes. A research agent hallucinates citations. A support agent drifts from its persona. Standard monitoring misses all of it.

Pisama provides 25 detectors that are externally validated at production grade: each reaches an F1 of 0.80 or higher on at least 30 real traces, with a mean F1 of 0.84. Of the 85 registered capabilities, 83 are now calibrated, and 28 more reach production-grade F1 on synthetic data. The detectors are built on the MAST taxonomy:

  • 25 externally validated: Production-grade on real traces (F1 >= 0.80): injection, hallucination, persona drift, corruption, impersonation, and more
  • 5-tier escalation: Hash ($0.00), state delta, embeddings, LLM judge ($0.02), human review
  • 5 frameworks: Purpose-built detectors for LangGraph, n8n, Dify, OpenClaw, and Claude Managed Agents
  • $0.05 avg/trace: 90%+ resolved at Tier 1-2, zero LLM cost
  • OTEL native: OpenTelemetry with gen_ai.* semantic conventions
  • Self-healing: Fix generation, approval workflows, rollback
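
To make the tiered escalation idea concrete, here is a minimal sketch of how the first two tiers (hash, then state delta) might short-circuit before any LLM call. This is not the Pisama API: the trace shape (a list of dicts with `action` and `state` keys), the function names, the `Verdict` type, and the thresholds are all assumptions chosen for illustration.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Verdict:
    tier: int      # tier that resolved the trace (hypothetical numbering)
    failure: str   # label of the detected failure


def step_hash(step: Dict) -> str:
    """Stable hash of a whole step (action plus state)."""
    payload = json.dumps(step, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def tier1_hash_loop(trace: List[Dict], min_repeats: int = 3) -> Optional[str]:
    """Tier 1 (assumed): flag a loop when an identical step recurs."""
    counts: Dict[str, int] = {}
    for step in trace:
        h = step_hash(step)
        counts[h] = counts.get(h, 0) + 1
        if counts[h] >= min_repeats:
            return "loop"
    return None


def tier2_state_delta(trace: List[Dict], window: int = 3) -> Optional[str]:
    """Tier 2 (assumed): flag a stall when state stops changing
    across `window` consecutive transitions, even if actions differ."""
    unchanged = 0
    for prev, cur in zip(trace, trace[1:]):
        if prev["state"] == cur["state"]:
            unchanged += 1
            if unchanged >= window:
                return "stalled_state"
        else:
            unchanged = 0
    return None


# Cheapest detector first; later tiers (embeddings, LLM judge,
# human review) are omitted from this sketch.
TIERS = [(1, tier1_hash_loop), (2, tier2_state_delta)]


def escalate(trace: List[Dict]) -> Optional[Verdict]:
    """Run tiers in order and stop at the first one that fires."""
    for tier, detector in TIERS:
        failure = detector(trace)
        if failure is not None:
            return Verdict(tier=tier, failure=failure)
    return None  # unresolved here; a real system would escalate further
```

Running the cheapest check first is what would let most traces resolve without LLM cost; only traces that pass both tiers unresolved would need the more expensive ones.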

Supported frameworks

LangGraph · n8n · Dify · OpenClaw · Semantic Kernel · Claude Managed Agents · OpenAI Assistants · Bedrock Agents · Claude Code · Any OTEL source