
Architecture

Pisama is a full-stack platform for detecting and healing failure modes in multi-agent LLM systems. This page describes the system architecture, key components, and data flow.

System Overview

```mermaid
graph TB
    subgraph sources["Trace Sources"]
        LG[LangGraph]
        N8N[n8n]
        DifyS[Dify]
        OpenClaw[OpenClaw]
        SK[Semantic Kernel]
        Claude[Claude Agent SDK]
        ManagedAgents[Claude Managed Agents]
        OTEL[Any OTEL-instrumented agent<br/>via gen_ai.* conventions]
    end

    subgraph oss["OSS Packages — pip install pisama"]
        SDK["pisama SDK<br/>analyze, async_analyze"]
        CLI["pisama CLI<br/>analyze, watch, detectors"]
        Core["pisama-core<br/>20 heuristic detectors"]
        AgentSDK["pisama-agent-sdk<br/>hooks, check, evaluator"]
        MCP["MCP Server<br/>Cursor, Claude Desktop"]
    end

    subgraph cloud["Cloud Platform"]
        FE["Frontend<br/>Next.js Dashboard"]
        API["REST API<br/>FastAPI, 34 endpoints"]
        Ingest["Ingestion Pipeline<br/>OTEL, n8n, Dify webhooks"]
        ML["ML Detection<br/>Tiered escalation, LLM Judge"]
        Heal["Self-Healing<br/>Fix generation, approval, rollback"]
        DB["Storage<br/>PostgreSQL + pgvector"]
    end

    sources --> SDK
    sources --> Ingest
    SDK --> Core
    CLI --> Core
    AgentSDK --> Core
    MCP --> SDK
    Ingest --> DB
    DB --> ML
    ML --> FE
    ML --> Heal
    API --> DB
    API --> ML
    FE --> API
```

OSS Boundary

The OSS packages (pisama, pisama-core, pisama-agent-sdk) run entirely offline with zero network calls. They include all 20 heuristic detectors, the CLI, and the MCP server. The Cloud platform adds ML-based tiered detection, a dashboard, self-healing, and multi-tenancy. See OSS vs Cloud for a full comparison.

Technology Stack

| Layer | Technology |
|---|---|
| Backend | FastAPI, SQLAlchemy, PostgreSQL 16+, pgvector, Alembic |
| Frontend | Next.js 16, React 18, TailwindCSS 3.4, Zustand, TanStack Query 5 |
| ML / Embeddings | E5-large-instruct (1024d), nomic-embed-text-v1.5 (768d), sentence-transformers |
| LLM | Claude (Anthropic) for judge and fixes; Gemini for budget tier |
| SDK | Python with LangGraph, n8n, Dify, OpenClaw, Semantic Kernel, and Claude Managed Agents adapters; generic OTEL ingestion for other frameworks |
| CLI | Click-based with MCP server support |
| Infrastructure | Docker, Fly.io (backend), Vercel (frontend), PostgreSQL 16 |

Data Flow

Trace Ingestion Pipeline

Traces enter Pisama through the ingestion pipeline, which normalizes data from any supported framework into a common internal format.

```mermaid
graph TD
    Source["Trace Source<br/>OTEL / webhook / SDK"] --> Parser["Ingestion Parser<br/>OTEL, n8n, conversation, raw JSON"]
    Parser --> Parsed["ParsedState Objects<br/>trace_id, agent_id, sequence_num<br/>state_delta, state_hash<br/>token_count, latency_ms"]
    Parsed --> Storage["Storage Layer<br/>PostgreSQL + pgvector"]
```
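
As a sketch of the normalization step, the following shows how raw framework steps might be reduced to `ParsedState` objects carrying the fields from the diagram (`trace_id`, `state_delta`, `state_hash`, and so on). The dataclass shape, delta computation, and hashing scheme are illustrative assumptions, not the actual ingestion code:

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass
class ParsedState:
    """One normalized step of an agent trace (fields from the diagram above)."""
    trace_id: str
    agent_id: str
    sequence_num: int
    state_delta: dict
    state_hash: str = ""
    token_count: int = 0
    latency_ms: float = 0.0


def hash_state(state: dict) -> str:
    """Deterministic hash of a state snapshot; key order must not matter."""
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]


def parse_steps(trace_id: str, agent_id: str, raw_steps: list[dict]) -> list[ParsedState]:
    """Normalize raw framework steps into ParsedState objects."""
    parsed, prev = [], {}
    for i, step in enumerate(raw_steps):
        state = step.get("state", {})
        # Delta = keys whose values changed relative to the previous step.
        delta = {k: v for k, v in state.items() if prev.get(k) != v}
        parsed.append(ParsedState(
            trace_id=trace_id,
            agent_id=agent_id,
            sequence_num=i,
            state_delta=delta,
            state_hash=hash_state(state),
            token_count=step.get("token_count", 0),
            latency_ms=step.get("latency_ms", 0.0),
        ))
        prev = state
    return parsed
```

The hash over a canonical JSON serialization is what makes cheap Tier 1 comparisons possible: two identical states always hash identically regardless of key order.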

Agent identification uses framework-specific OTEL attributes:

| Framework | Agent attribute | State attribute |
|---|---|---|
| Standard OTEL | `gen_ai.agent.name` | `gen_ai.state` |
| LangGraph | `langgraph.node.name` | `langgraph.state` |
| OpenClaw | `openclaw.agent.name` | `openclaw.session.state` |
| Claude Managed Agents | `managed_agents.agent.id` | -- |
| OpenAI Assistants | `gen_ai.assistant.id` | `run.steps` |
| Bedrock Agents | `aws.bedrock.agent.id` | `orchestrationTrace` |
| Semantic Kernel | `semantic_kernel.agent.name` | -- |
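
The lookup itself can be sketched as an ordered scan over these attribute pairs, falling back to the standard `gen_ai.*` convention when no framework-specific attribute is present. The ordering and return shape here are illustrative assumptions:

```python
# (agent attribute, state attribute) pairs from the table above; the first
# matching agent attribute wins. None means the framework has no state attribute.
_AGENT_ATTRIBUTES = [
    ("langgraph.node.name", "langgraph.state"),
    ("openclaw.agent.name", "openclaw.session.state"),
    ("managed_agents.agent.id", None),
    ("gen_ai.assistant.id", "run.steps"),
    ("aws.bedrock.agent.id", "orchestrationTrace"),
    ("semantic_kernel.agent.name", None),
    ("gen_ai.agent.name", "gen_ai.state"),  # standard OTEL fallback
]


def identify_agent(span_attributes: dict) -> tuple:
    """Return (agent_id, state) for a span, or (None, None) if unrecognized."""
    for agent_key, state_key in _AGENT_ATTRIBUTES:
        if agent_key in span_attributes:
            state = span_attributes.get(state_key) if state_key else None
            return span_attributes[agent_key], state
    return None, None
```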

Detection Pipeline

The DetectionOrchestrator is the main entry point for trace analysis. It runs all applicable detectors and returns a DiagnosisResult.

```mermaid
graph TD
    Trace["Trace / ParsedStates"] --> Orch["Detection Orchestrator<br/>Runs detectors sequentially"]
    Orch --> Filter["Filter & Sort<br/>Keep detected=True only<br/>Sort by severity, then confidence"]
    Filter --> Result["DiagnosisResult<br/>primary_failure, all_detections<br/>root_cause_explanation<br/>self_healing_available"]
```

Each detector follows a cheapest-first strategy, escalating through tiers only when lower tiers are inconclusive. See Detection Tiers for details.
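The filter-and-sort step above can be sketched as follows; the `Detection` shape and the severity ranking are simplified assumptions, not the actual `DiagnosisResult` model:

```python
from dataclasses import dataclass

# Illustrative severity ranking; unknown severities sort last.
SEVERITY_ORDER = {"critical": 3, "high": 2, "medium": 1, "low": 0}


@dataclass
class Detection:
    detector: str
    detected: bool
    severity: str
    confidence: float


def diagnose(detections: list) -> dict:
    """Keep detected=True only, sort by severity then confidence, and
    surface the top hit as primary_failure (as in the diagram above)."""
    hits = [d for d in detections if d.detected]
    hits.sort(key=lambda d: (SEVERITY_ORDER.get(d.severity, -1), d.confidence),
              reverse=True)
    return {
        "primary_failure": hits[0].detector if hits else None,
        "all_detections": [d.detector for d in hits],
    }
```

Ties on severity fall through to confidence, so a high-severity, high-confidence finding always becomes the primary failure.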

Self-Healing Pipeline

When failures are detected, the self-healing pipeline can generate and apply fixes:

```mermaid
graph TD
    Det["Detection Result"] --> Gen["Fix Generator<br/>AI-powered fix generation<br/>Code suggestions, best practices"]
    Gen --> Approve["Approval Policy<br/>Manual or automatic<br/>High-risk requires human approval"]
    Approve --> Apply["Apply & Validate<br/>Execute fix with checkpoint<br/>Rollback if validation fails"]
```

See Self-Healing for the full pipeline description.
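
A minimal sketch of the approval, checkpoint, and rollback semantics, assuming a dict-shaped target state and a caller-supplied validator (both illustrative, not the real healing API):

```python
def apply_fix(state: dict, fix: dict, validate, risk: str = "low",
              approved: bool = False) -> dict:
    """Apply a generated fix with the semantics from the diagram above:
    high-risk fixes require approval; failed validation rolls back."""
    if risk == "high" and not approved:
        raise PermissionError("high-risk fix requires human approval")
    checkpoint = dict(state)   # snapshot before touching anything
    state.update(fix)          # apply the fix in place
    if not validate(state):    # validation failed: restore the checkpoint
        state.clear()
        state.update(checkpoint)
    return state
```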

Key Directories

| Directory | Purpose |
|---|---|
| `backend/app/api/v1/` | REST API endpoints |
| `backend/app/detection/` | Core heuristic detection algorithms (20 detectors) |
| `backend/app/detection_enterprise/` | Enterprise ML/tiered detection, calibration |
| `backend/app/detection/llm_judge/` | LLM-as-Judge verification |
| `backend/app/ingestion/` | Trace parsing (OTEL, n8n, universal) |
| `backend/app/storage/` | Database models and migrations |
| `backend/app/fixes/` | AI-powered fix suggestions |
| `backend/app/healing/` | Self-healing orchestration |
| `backend/app/core/` | Auth, security, rate limiting, feature gates |
| `backend/tests/` | pytest tests |
| `frontend/src/app/` | Next.js pages and components |
| `packages/` | Python packages (pisama-core, pisama-claude-code) |
| `cli/` | CLI with MCP server support |

Architecture Principles

  1. Tiered Detection -- Always start at Tier 1 (hash), escalate only if needed through Tier 2 (state delta), Tier 3 (embeddings), Tier 4 (LLM), Tier 5 (human).

  2. OTEL-First -- All traces use OpenTelemetry with gen_ai.* semantic conventions.

  3. Framework-Agnostic Core -- No framework-specific imports in core detection code. Framework-specific logic lives in adapters in packages/pisama-core/src/pisama_core/adapters/.

  4. Cost-Aware -- Track tokens, compute time, and dollar cost per detection. Target: $0.05/trace average.

  5. Safety-First Healing -- Require checkpoints, rollback capability, and approval policies for high-risk fixes.

  6. Feature-Flagged Enterprise -- Core code never imports from enterprise modules. Enterprise code can import from core. Feature gates return HTTP 402 when features are not enabled.
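
Principle 1 (together with the cost tracking from principle 4) can be sketched as a cheapest-first loop; the tier names, per-tier costs, and confidence threshold below are illustrative assumptions:

```python
def escalate(trace, tiers, threshold: float = 0.8):
    """Run tiers in cheapest-first order, stopping at the first confident
    verdict, while accumulating per-detection cost. Each tier is a
    (name, cost_usd, check) tuple where check(trace) returns
    (verdict, confidence) and verdict is None when inconclusive."""
    total_cost = 0.0
    for name, cost, check in tiers:
        total_cost += cost
        verdict, confidence = check(trace)
        if verdict is not None and confidence >= threshold:
            return {"tier": name, "verdict": verdict, "cost_usd": total_cost}
    # Every automated tier was inconclusive: escalate to a human (Tier 5).
    return {"tier": "human", "verdict": None, "cost_usd": total_cost}
```

Because most traces resolve at the hash or state-delta tiers, the expensive embedding and LLM tiers only ever pay for the ambiguous tail, which is what keeps the average cost per trace low.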