Architecture¶
Pisama is a full-stack platform for detecting and healing failure modes in multi-agent LLM systems. This page describes the system architecture, key components, and data flow.
System Overview¶
```mermaid
graph TB
    subgraph sources["Trace Sources"]
        LG[LangGraph]
        N8N[n8n]
        DifyS[Dify]
        OpenClaw[OpenClaw]
        SK[Semantic Kernel]
        Claude[Claude Agent SDK]
        ManagedAgents[Claude Managed Agents]
        OTEL[Any OTEL-instrumented agent<br/>via gen_ai.* conventions]
    end
    subgraph oss["OSS Packages — pip install pisama"]
        SDK["pisama SDK<br/>analyze, async_analyze"]
        CLI["pisama CLI<br/>analyze, watch, detectors"]
        Core["pisama-core<br/>20 heuristic detectors"]
        AgentSDK["pisama-agent-sdk<br/>hooks, check, evaluator"]
        MCP["MCP Server<br/>Cursor, Claude Desktop"]
    end
    subgraph cloud["Cloud Platform"]
        FE["Frontend<br/>Next.js Dashboard"]
        API["REST API<br/>FastAPI, 34 endpoints"]
        Ingest["Ingestion Pipeline<br/>OTEL, n8n, Dify webhooks"]
        ML["ML Detection<br/>Tiered escalation, LLM Judge"]
        Heal["Self-Healing<br/>Fix generation, approval, rollback"]
        DB["Storage<br/>PostgreSQL + pgvector"]
    end
    sources --> SDK
    sources --> Ingest
    SDK --> Core
    CLI --> Core
    AgentSDK --> Core
    MCP --> SDK
    Ingest --> DB
    DB --> ML
    ML --> FE
    ML --> Heal
    API --> DB
    API --> ML
    FE --> API
```

OSS boundary¶
The OSS packages (pisama, pisama-core, pisama-agent-sdk) run entirely offline with zero network calls. They include all 20 heuristic detectors, the CLI, and the MCP server. The Cloud platform adds ML-based tiered detection, a dashboard, self-healing, and multi-tenancy. See OSS vs Cloud for a full comparison.
Technology Stack¶
| Layer | Technology |
|---|---|
| Backend | FastAPI, SQLAlchemy, PostgreSQL 16+, pgvector, Alembic |
| Frontend | Next.js 16, React 18, TailwindCSS 3.4, Zustand, TanStack Query 5 |
| ML / Embeddings | E5-large-instruct (1024d), nomic-embed-text-v1.5 (768d), sentence-transformers |
| LLM | Claude (Anthropic) for judge and fixes; Gemini for budget tier |
| SDK | Python with LangGraph, n8n, Dify, OpenClaw, Semantic Kernel, and Claude Managed Agents adapters; generic OTEL ingestion for other frameworks |
| CLI | Click-based with MCP server support |
| Infrastructure | Docker, Fly.io (backend), Vercel (frontend), PostgreSQL 16 |
Data Flow¶
Trace Ingestion Pipeline¶
Traces enter Pisama through the ingestion pipeline, which normalizes data from any supported framework into a common internal format.
```mermaid
graph TD
    Source["Trace Source<br/>OTEL / webhook / SDK"] --> Parser["Ingestion Parser<br/>OTEL, n8n, conversation, raw JSON"]
    Parser --> Parsed["ParsedState Objects<br/>trace_id, agent_id, sequence_num<br/>state_delta, state_hash<br/>token_count, latency_ms"]
    Parsed --> Storage["Storage Layer<br/>PostgreSQL + pgvector"]
```

Agent identification uses framework-specific OTEL attributes:
| Framework | Agent attribute | State attribute |
|---|---|---|
| Standard OTEL | gen_ai.agent.name | gen_ai.state |
| LangGraph | langgraph.node.name | langgraph.state |
| OpenClaw | openclaw.agent.name | openclaw.session.state |
| Claude Managed Agents | managed_agents.agent.id | -- |
| OpenAI Assistants | gen_ai.assistant.id | run.steps |
| Bedrock Agents | aws.bedrock.agent.id | orchestrationTrace |
| Semantic Kernel | semantic_kernel.agent.name | -- |
Detection Pipeline¶
The DetectionOrchestrator is the main entry point for trace analysis. It runs all applicable detectors and returns a DiagnosisResult.
```mermaid
graph TD
    Trace["Trace / ParsedStates"] --> Orch["Detection Orchestrator<br/>Runs detectors sequentially"]
    Orch --> Filter["Filter & Sort<br/>Keep detected=True only<br/>Sort by severity, then confidence"]
    Filter --> Result["DiagnosisResult<br/>primary_failure, all_detections<br/>root_cause_explanation<br/>self_healing_available"]
```

Each detector follows a cheapest-first strategy, escalating through tiers only when lower tiers are inconclusive. See Detection Tiers for details.
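The filter-and-sort step can be sketched in a few lines, assuming each detector result carries `detected`, `severity`, and `confidence` fields. The `Detection` class and `diagnose` function are illustrative stand-ins; the real DetectionOrchestrator's interface may differ.

```python
from dataclasses import dataclass

# Illustrative stand-in for a detector result; field names are assumptions.
@dataclass
class Detection:
    failure_mode: str
    detected: bool
    severity: int      # higher = more severe
    confidence: float  # 0.0 to 1.0

def diagnose(detections: list[Detection]) -> dict:
    """Keep detected=True results, sort by severity then confidence,
    and promote the top result to primary_failure."""
    hits = [d for d in detections if d.detected]
    hits.sort(key=lambda d: (d.severity, d.confidence), reverse=True)
    return {
        "primary_failure": hits[0].failure_mode if hits else None,
        "all_detections": [d.failure_mode for d in hits],
    }
```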
Self-Healing Pipeline¶
When failures are detected, the self-healing pipeline can generate and apply fixes:
```mermaid
graph TD
    Det["Detection Result"] --> Gen["Fix Generator<br/>AI-powered fix generation<br/>Code suggestions, best practices"]
    Gen --> Approve["Approval Policy<br/>Manual or automatic<br/>High-risk requires human approval"]
    Approve --> Apply["Apply & Validate<br/>Execute fix with checkpoint<br/>Rollback if validation fails"]
```

See Self-Healing for the full pipeline description.
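The checkpoint-and-rollback step can be illustrated with a small sketch. The function signature, the approval rule, and the `validate` hook are hypothetical, not Pisama's actual healing API.

```python
# Hypothetical apply-with-rollback sketch; names are illustrative.

def apply_fix(state: dict, fix: dict, high_risk: bool, approved: bool) -> dict:
    """Apply a generated fix behind a checkpoint; roll back if validation fails."""
    if high_risk and not approved:
        raise PermissionError("high-risk fix requires human approval")
    checkpoint = dict(state)          # snapshot before mutating
    state.update(fix["changes"])      # apply the fix
    if not fix["validate"](state):    # validation hook
        state.clear()
        state.update(checkpoint)      # restore the checkpoint
        return {"applied": False, "rolled_back": True}
    return {"applied": True, "rolled_back": False}
```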
Key Directories¶
| Directory | Purpose |
|---|---|
| backend/app/api/v1/ | REST API endpoints |
| backend/app/detection/ | Core heuristic detection algorithms (20 detectors) |
| backend/app/detection_enterprise/ | Enterprise ML/tiered detection, calibration |
| backend/app/detection/llm_judge/ | LLM-as-Judge verification |
| backend/app/ingestion/ | Trace parsing (OTEL, n8n, universal) |
| backend/app/storage/ | Database models and migrations |
| backend/app/fixes/ | AI-powered fix suggestions |
| backend/app/healing/ | Self-healing orchestration |
| backend/app/core/ | Auth, security, rate limiting, feature gates |
| backend/tests/ | pytest tests |
| frontend/src/app/ | Next.js pages and components |
| packages/ | Python packages (pisama-core, pisama-claude-code) |
| cli/ | CLI with MCP server support |
Architecture Principles¶
- Tiered Detection -- Always start at Tier 1 (hash), escalate only if needed through Tier 2 (state delta), Tier 3 (embeddings), Tier 4 (LLM), Tier 5 (human).
- OTEL-First -- All traces use OpenTelemetry with gen_ai.* semantic conventions.
- Framework-Agnostic Core -- No framework-specific imports in core detection code. Framework-specific logic lives in adapters in packages/pisama-core/src/pisama_core/adapters/.
- Cost-Aware -- Track tokens, compute time, and dollar cost per detection. Target: $0.05/trace average.
- Safety-First Healing -- Require checkpoints, rollback capability, and approval policies for high-risk fixes.
- Feature-Flagged Enterprise -- Core code never imports from enterprise modules. Enterprise code can import from core. Feature gates return HTTP 402 when features are not enabled.