Architecture¶
Pisama is a full-stack platform for detecting and healing failure modes in multi-agent LLM systems. This page describes the system architecture, key components, and data flow.
System Overview¶
```mermaid
graph TB
    subgraph sources["Trace Sources"]
        LG[LangGraph]
        N8N[n8n]
        DifyS[Dify]
        OpenClaw[OpenClaw]
        SK[Semantic Kernel]
        Claude[Claude Agent SDK]
        ManagedAgents[Claude Managed Agents]
        OTEL[Any OTEL-instrumented agent<br/>via gen_ai.* conventions]
    end
    subgraph oss["OSS Packages — pip install pisama"]
        SDK["pisama SDK<br/>analyze, async_analyze"]
        CLI["pisama CLI<br/>analyze, watch, detectors"]
        Core["pisama-core<br/>20 heuristic detectors"]
        AgentSDK["pisama-agent-sdk<br/>hooks, check, evaluator"]
        MCP["MCP Server<br/>Cursor, Claude Desktop"]
    end
    subgraph cloud["Cloud Platform"]
        FE["Frontend<br/>Next.js Dashboard"]
        API["REST API<br/>FastAPI, 34 endpoints"]
        Ingest["Ingestion Pipeline<br/>OTEL, n8n, Dify webhooks"]
        ML["ML Detection<br/>Tiered escalation, LLM Judge"]
        Heal["Self-Healing<br/>Fix generation, approval, rollback"]
        DB["Storage<br/>PostgreSQL + pgvector"]
    end
    sources --> SDK
    sources --> Ingest
    SDK --> Core
    CLI --> Core
    AgentSDK --> Core
    MCP --> SDK
    Ingest --> DB
    DB --> ML
    ML --> FE
    ML --> Heal
    API --> DB
    API --> ML
    FE --> API
```

OSS boundary¶
The OSS packages (pisama, pisama-core, pisama-agent-sdk) run entirely offline with zero network calls. They include all 20 heuristic detectors, the CLI, and the MCP server. The Cloud platform adds ML-based tiered detection, a dashboard, self-healing, and multi-tenancy. See OSS vs Cloud for a full comparison.
Technology Stack¶
| Layer | Technology |
|---|---|
| Backend | FastAPI, SQLAlchemy, PostgreSQL 16+, pgvector, Alembic |
| Frontend | Next.js 16, React 18, TailwindCSS 3.4, Zustand, TanStack Query 5 |
| ML / Embeddings | E5-large-instruct (1024d), nomic-embed-text-v1.5 (768d), sentence-transformers |
| LLM | Claude (Anthropic) for judge and fixes; Gemini for budget tier |
| SDK | Python with LangGraph, n8n, Dify, OpenClaw, Semantic Kernel, and Claude Managed Agents adapters; generic OTEL ingestion for other frameworks |
| CLI | Click-based with MCP server support |
| Infrastructure | Docker, Fly.io (backend), Vercel (frontend), PostgreSQL 16 |
Data Flow¶
Trace Ingestion Pipeline¶
Traces enter Pisama through the ingestion pipeline, which normalizes data from any supported framework into a common internal format.
```mermaid
graph TD
    Source["Trace Source<br/>OTEL / webhook / SDK"] --> Parser["Ingestion Parser<br/>OTEL, n8n, conversation, raw JSON"]
    Parser --> Parsed["ParsedState Objects<br/>trace_id, agent_id, sequence_num<br/>state_delta, state_hash<br/>token_count, latency_ms"]
    Parsed --> Storage["Storage Layer<br/>PostgreSQL + pgvector"]
```

Agent identification uses framework-specific OTEL attributes:
| Framework | Agent attribute | State attribute |
|---|---|---|
| Standard OTEL | gen_ai.agent.name | gen_ai.state |
| LangGraph | langgraph.node.name | langgraph.state |
| OpenClaw | openclaw.agent.name | openclaw.session.state |
| Claude Managed Agents | managed_agents.agent.id | -- |
| OpenAI Assistants | gen_ai.assistant.id | run.steps |
| Bedrock Agents | aws.bedrock.agent.id | orchestrationTrace |
| Semantic Kernel | semantic_kernel.agent.name | -- |
Detection Pipeline¶
The DetectionOrchestrator is the main entry point for trace analysis. It runs all applicable detectors and returns a DiagnosisResult.
```mermaid
graph TD
    Trace["Trace / ParsedStates"] --> Orch["Detection Orchestrator<br/>Runs detectors sequentially"]
    Orch --> Filter["Filter & Sort<br/>Keep detected=True only<br/>Sort by severity, then confidence"]
    Filter --> Result["DiagnosisResult<br/>primary_failure, all_detections<br/>root_cause_explanation<br/>self_healing_available"]
```

Each detector follows a cheapest-first strategy, escalating through tiers only when lower tiers are inconclusive. See Detection Tiers for details.
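The filter-and-sort step can be sketched in a few lines, assuming each detector result carries `detected`, `severity`, and `confidence` fields. The `Detection` class and `diagnose` function are illustrative stand-ins; the real DetectionOrchestrator's interface may differ.

```python
from dataclasses import dataclass

# Illustrative stand-in for a detector result; field names are assumptions.
@dataclass
class Detection:
    failure_mode: str
    detected: bool
    severity: int      # higher = more severe
    confidence: float  # 0.0 to 1.0

def diagnose(detections: list[Detection]) -> dict:
    """Keep detected=True results, sort by severity then confidence,
    and promote the top result to primary_failure."""
    hits = [d for d in detections if d.detected]
    hits.sort(key=lambda d: (d.severity, d.confidence), reverse=True)
    return {
        "primary_failure": hits[0].failure_mode if hits else None,
        "all_detections": [d.failure_mode for d in hits],
    }
```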
Self-Healing Pipeline¶
When failures are detected, the self-healing pipeline can generate and apply fixes:
```mermaid
graph TD
    Det["Detection Result"] --> Gen["Fix Generator<br/>AI-powered fix generation<br/>Code suggestions, best practices"]
    Gen --> Approve["Approval Policy<br/>Manual or automatic<br/>High-risk requires human approval"]
    Approve --> Apply["Apply & Validate<br/>Execute fix with checkpoint<br/>Rollback if validation fails"]
```

See Self-Healing for the full pipeline description.
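The checkpoint-and-rollback step can be illustrated with a small sketch. The function signature, the approval rule, and the `validate` hook are hypothetical, not Pisama's actual healing API.

```python
# Hypothetical apply-with-rollback sketch; names are illustrative.

def apply_fix(state: dict, fix: dict, high_risk: bool, approved: bool) -> dict:
    """Apply a generated fix behind a checkpoint; roll back if validation fails."""
    if high_risk and not approved:
        raise PermissionError("high-risk fix requires human approval")
    checkpoint = dict(state)          # snapshot before mutating
    state.update(fix["changes"])      # apply the fix
    if not fix["validate"](state):    # validation hook
        state.clear()
        state.update(checkpoint)      # restore the checkpoint
        return {"applied": False, "rolled_back": True}
    return {"applied": True, "rolled_back": False}
```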
Key Directories¶
| Directory | Purpose |
|---|---|
| backend/app/api/v1/ | REST API endpoints |
| backend/app/detection/ | Core heuristic detection algorithms (20 detectors) |
| backend/app/detection_enterprise/ | Enterprise ML/tiered detection, calibration |
| backend/app/detection/llm_judge/ | LLM-as-Judge verification |
| backend/app/ingestion/ | Trace parsing (OTEL, n8n, universal) |
| backend/app/storage/ | Database models and migrations |
| backend/app/fixes/ | AI-powered fix suggestions |
| backend/app/healing/ | Self-healing orchestration |
| backend/app/core/ | Auth, security, rate limiting, feature gates |
| backend/tests/ | pytest tests |
| frontend/src/app/ | Next.js pages and components |
| packages/ | Python packages (pisama-core, pisama-claude-code) |
| cli/ | CLI with MCP server support |
Architecture Principles¶
- Tiered Detection -- Always start at Tier 1 (hash), escalate only if needed through Tier 2 (state delta), Tier 3 (embeddings), Tier 4 (LLM), Tier 5 (human).
- OTEL-First -- All traces use OpenTelemetry with gen_ai.* semantic conventions.
- Framework-Agnostic Core -- No framework-specific imports in core detection code. Framework-specific logic lives in adapters in packages/pisama-core/src/pisama_core/adapters/.
- Cost-Aware -- Track tokens, compute time, and dollar cost per detection. Target: $0.05/trace average.
- Safety-First Healing -- Require checkpoints, rollback capability, and approval policies for high-risk fixes.
- Feature-Flagged Enterprise -- Core code never imports from enterprise modules. Enterprise code can import from core. Feature gates return HTTP 402 when features are not enabled.