Self-Healing

Pisama detects agent failures, recommends fixes, and applies them where it's safe to do so. "Safe to do so" is a closed vocabulary, not a marketing slogan — see Risk Levels below.

Coverage at a glance

Against the 87 detectors in backend/data/capability_registry.json, bucketed by the lowest-risk fix each detector can offer:

| Bucket | Count | What happens when this detector fires |
|---|---|---|
| SAFE auto-heal in-loop | 34 (39%) | Config-only fix applied inline; permissionDecision: deny + prompt patch; agent retries within seconds |
| MEDIUM verify-then-apply | 8 (9%) | verify_inline() simulates the apply via a replay shim and applies the fix if it is predicted to resolve the failure |
| MEDIUM escalates | 10 (11%) | No verify shim yet, so routed to human approval |
| DANGEROUS escalates | 24 (28%) | Fix recommended, blocked pending human approval (prompt rewrites, output filters, governance / safety handoffs) |
| No remediation path yet | 11 (13%) | Deferred quality / experimental / topology-diagnostic detectors; see Deferred detectors |

Total recommendation coverage: 76 of 87 detectors (87%). Autonomous in-loop fix (SAFE plus verified MEDIUM, no approval): 42 of 87 (48%).
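
The headline figures follow directly from the bucket counts in the table. A minimal sketch of the arithmetic (this is not the coverage_audit script, whose internals are not shown on this page; the variable names are illustrative):

```python
# Bucket counts copied from the coverage table above.
buckets = {
    "SAFE auto-heal in-loop": 34,
    "MEDIUM verify-then-apply": 8,
    "MEDIUM escalates": 10,
    "DANGEROUS escalates": 24,
    "No remediation path yet": 11,
}

total = sum(buckets.values())
# Every bucket except the deferred one produces a recommendation.
with_recommendation = total - buckets["No remediation path yet"]
# Only SAFE and verified MEDIUM fixes apply without human approval.
autonomous = buckets["SAFE auto-heal in-loop"] + buckets["MEDIUM verify-then-apply"]


def pct(part: int, whole: int) -> int:
    """Share of `whole`, rounded to a whole-number percentage."""
    return round(100 * part / whole)


print(f"recommendation coverage: {with_recommendation} of {total} ({pct(with_recommendation, total)}%)")
print(f"autonomous in-loop: {autonomous} of {total} ({pct(autonomous, total)}%)")
```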

Run python -m scripts.synth_agents.coverage_audit to regenerate these numbers against the current code. The number that actually matters — real-world fix efficacy — is gated on A5 (fix-outcome verification scheduling); see What's not measured.

There are two healing surfaces:

  • In-loop (Python SDK): a tool call about to fail can be patched and retried inside the agent's loop, in under two seconds. See In-Loop Healing.
  • Async (hosted frameworks): detections that fire on ingested traces (n8n, LangGraph, Dify, OpenClaw, Managed Agents) generate fix suggestions that auto-apply at SAFE risk and route to approval at MEDIUM or DANGEROUS risk. See Async Healing.

The two paths share the same fix-generator inventory and risk classification.

Pipeline Overview

Detection
    │
    ▼
Generate fix suggestions  ◄── per-detector template + optional LLM enrichment
    │
    ▼
Classify risk (SAFE / MEDIUM / DANGEROUS)
    ├── SAFE       ──► auto-apply (in-loop or async)
    ├── MEDIUM     ──► auto-apply with verification, rollback on failure
    └── DANGEROUS  ──► escalate via approval flow
                       (POST /healing/{id}/approve)

Fix Suggestions

Every detection comes with a one-line suggested_fix (populated by the per-detector template generator) and an optional structured FixSuggestion list returned by GET /detections/{id}/fixes.

Each suggestion carries:

  • fix_type: the closed vocabulary the applicator dispatches on (e.g. retry_limit, prompt_reinforcement).
  • confidence: HIGH / MEDIUM / LOW.
  • recommendation_source: template, llm, or llm_cached. When the LLM enricher is enabled (env PISAMA_FIX_ENRICHER_ENABLED=true and ANTHROPIC_API_KEY set), the description and rationale are rewritten against the actual detection evidence and the customer's framework, with a cache to keep cost under ~$0.001 per detection.
  • applicable_frameworks: empty means framework-agnostic; otherwise the API returns the fix only when the caller's framework matches.
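
To make the applicable_frameworks rule concrete, here is a minimal sketch of the suggestion shape and the framework filter. The dataclass is a simplified, hypothetical stand-in (the real FixSuggestion carries more fields, such as the description and rationale), and fixes_for_framework is an illustrative helper, not a Pisama API:

```python
from dataclasses import dataclass, field


@dataclass
class FixSuggestion:
    """Simplified stand-in with only the fields listed above."""
    fix_type: str                      # e.g. "retry_limit"
    confidence: str                    # "HIGH" / "MEDIUM" / "LOW"
    recommendation_source: str         # "template", "llm", or "llm_cached"
    applicable_frameworks: list[str] = field(default_factory=list)


def fixes_for_framework(suggestions: list[FixSuggestion], framework: str) -> list[FixSuggestion]:
    """Keep framework-agnostic fixes (empty list) and fixes matching the caller."""
    return [
        s for s in suggestions
        if not s.applicable_frameworks or framework in s.applicable_frameworks
    ]


suggestions = [
    FixSuggestion("retry_limit", "HIGH", "template"),
    FixSuggestion("prompt_reinforcement", "LOW", "llm", ["langgraph"]),
]
print([s.fix_type for s in fixes_for_framework(suggestions, "n8n")])
```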

Risk Levels

| Level | Examples | Policy |
|---|---|---|
| SAFE | retry limit, circuit breaker, execution timeout, budget limiter, checkpoint recovery, state validation | Auto-apply |
| MEDIUM | exponential backoff, context pruning, summarization, schema enforcement, token optimizer | Auto-apply with verification + rollback on failure |
| DANGEROUS | prompt reinforcement, role boundary, input filtering, fact checking, source grounding | Requires human approval before apply |

The mapping lives in backend/app/healing/models.py::FIX_RISK_MAP; treat that file as the source of truth, not this doc.
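
As a rough illustration of how a risk map can drive the policy column, the sketch below uses a small subset of fix types. Only retry_limit and prompt_reinforcement are spellings confirmed on this page; the other keys, the policy strings, and the fallback for unknown fix types are assumptions, and the real FIX_RISK_MAP may be structured differently:

```python
from enum import Enum


class FixRiskLevel(Enum):
    SAFE = "safe"
    MEDIUM = "medium"
    DANGEROUS = "dangerous"


# Illustrative subset only; circuit_breaker and exponential_backoff are guessed spellings.
FIX_RISK_MAP = {
    "retry_limit": FixRiskLevel.SAFE,
    "circuit_breaker": FixRiskLevel.SAFE,
    "exponential_backoff": FixRiskLevel.MEDIUM,
    "prompt_reinforcement": FixRiskLevel.DANGEROUS,
}

# Policy per risk level, paraphrasing the table above.
POLICY = {
    FixRiskLevel.SAFE: "auto_apply",
    FixRiskLevel.MEDIUM: "auto_apply_with_verification",
    FixRiskLevel.DANGEROUS: "require_human_approval",
}


def policy_for(fix_type: str) -> str:
    # Assumption: an unmapped fix type is treated as the most restrictive level.
    risk = FIX_RISK_MAP.get(fix_type, FixRiskLevel.DANGEROUS)
    return POLICY[risk]


print(policy_for("retry_limit"))
print(policy_for("prompt_reinforcement"))
```

Defaulting unknown fix types to approval keeps the sketch fail-closed; whether the real applicator behaves this way is not stated here.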

In-Loop Healing (Python SDK)

When an agent is about to call a tool that the bridge would block, the pre_tool_use hook can request a SAFE-risk fix synchronously and retry the step with the patched system message — all inside the agent's loop.

from pisama_agent_sdk.hooks import PreToolUseHook

# Enable in-loop healing in code, or set PISAMA_AUTO_HEAL=1 in the environment.
hook = PreToolUseHook(auto_heal=True)
agent.hooks.pre_tool_use = hook  # `agent` is your existing agent instance

Behaviour:

  • SAFE fix available → permissionDecision: deny + systemMessage carrying the fix. Claude SDK and LangGraph natively re-run the step.
  • DANGEROUS / no fix → standard permissionDecision: block plus an approval URL in the system message.
  • heal_now() timeout, network error, or missing PISAMA_API_KEY → fall through to the block path.
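
The three cases above reduce to a single decision function. The sketch below is a hypothetical model of that logic, not the SDK's implementation: Healing mirrors the attributes used with heal_now() on this page, None stands in for a timeout, network error, or missing PISAMA_API_KEY, and the returned dict shape and message text are assumptions:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Healing:
    """Hypothetical stand-in for the object heal_now() returns."""
    applied: bool
    prompt_patch: Optional[str] = None
    approval_url: Optional[str] = None


def hook_response(healing: Optional[Healing]) -> dict:
    """Map a healing outcome to a hook decision; None models a failed heal_now() call."""
    if healing is not None and healing.applied and healing.prompt_patch:
        # SAFE fix available: deny the call and hand the patch back so the step re-runs.
        return {"permissionDecision": "deny", "systemMessage": healing.prompt_patch}
    # DANGEROUS, no fix, or heal_now() failed: fall through to the block path.
    message = "Tool call blocked."
    if healing is not None and healing.approval_url:
        message += f" Approve at: {healing.approval_url}"
    return {"permissionDecision": "block", "systemMessage": message}


print(hook_response(Healing(applied=True, prompt_patch="Stop after 3 retries.")))
print(hook_response(None))
```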

Direct call from custom orchestration:

from pisama_agent_sdk import heal_now

healing = heal_now(
    detection_type="loop",
    details={"states": [...]},
    framework="claude_sdk",
)
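if healing.applied:
    # A SAFE fix was applied: retry the step with the patched system message.
    system_message = healing.prompt_patch
elif healing.escalated:
    # surface_to_human is a placeholder for your own notification helper.
    surface_to_human(healing.approval_url)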

Each registered detector also accepts an on_detection callback that can return an InterventionAction (allow / block / retry_with_fix / escalate) — useful when wiring Pisama into a non-Claude agent loop:

from pisama_detectors import detect_loop, InterventionAction

def cb(result):
    if result.detected and result.confidence > 0.9:
        return InterventionAction(decision="retry_with_fix")
    return InterventionAction(decision="allow")

detect_loop(states=[...], on_detection=cb)

Async Healing (Hosted Frameworks)

For traces ingested from n8n, LangGraph, Dify, OpenClaw, and Anthropic Managed Agents, healing runs after detection completes. backend/app/healing/engine.py orchestrates the full detect → generate → validate → apply pipeline; the AutoApplyService gates by risk level and provides rollback on validation failure.

Triggering a healing operation for a stored detection:

curl -X POST "$API/api/v1/healing/trigger/$DETECTION_ID" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"approval_required": false}'

Synchronous variant for the in-loop case (no HealingRecord persisted; sub-second envelope):

curl -X POST "$API/api/v1/healing/trigger/sync" \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"detection_type":"loop","details":{...},"framework":"claude_sdk"}'

Approve a DANGEROUS-risk fix:

curl -X POST "$API/api/v1/healing/$HEALING_ID/approve" \
  -H "Authorization: Bearer $TOKEN"

Roll back an applied fix:

curl -X POST "$API/api/v1/healing/$HEALING_ID/rollback" \
  -H "Authorization: Bearer $TOKEN"

List healing operations:

curl "$API/api/v1/healing" \
  -H "Authorization: Bearer $TOKEN"

Safety Guarantees

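  1. Risk-gated auto-apply: FixRiskLevel.SAFE fixes apply without human approval. MEDIUM fixes apply only after passing verification (see Verify-then-apply for MEDIUM fixes) and escalate otherwise. DANGEROUS fixes always escalate.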
  2. Checkpoint before apply: every applied fix records original_state on the HealingRecord so it can be reverted.
  3. Validation after apply: applied fixes run a verification step; failure triggers automatic rollback.
  4. In-loop fallback to block: when auto_heal cannot get a SAFE fix in time, the SDK falls back to the standard block-and-warn path.
  5. Audit trail: every healing operation is logged on the HealingRecord with timestamps, approvers, and the applied fix payload.
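
Guarantees 2 and 3 amount to a checkpoint, apply, validate, rollback cycle. A minimal sketch under stated assumptions: the HealingRecord here is a simplified, hypothetical stand-in for the stored model, fixes are modelled as flat config overrides, and apply_with_rollback is an illustrative name rather than the AutoApplyService API:

```python
import copy
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class HealingRecord:
    """Simplified stand-in: keeps the checkpoint and a minimal audit log."""
    original_state: dict
    applied_fix: dict
    status: str = "pending"
    audit: list = field(default_factory=list)


def apply_with_rollback(config: dict, fix: dict, validate: Callable[[dict], bool]) -> HealingRecord:
    # Checkpoint before apply so the change can always be reverted.
    record = HealingRecord(original_state=copy.deepcopy(config), applied_fix=fix)
    config.update(fix)
    record.audit.append("applied")
    if validate(config):
        record.status = "applied"
    else:
        # Validation failed: restore the checkpoint in place.
        config.clear()
        config.update(copy.deepcopy(record.original_state))
        record.status = "rolled_back"
        record.audit.append("rolled_back")
    return record


config = {"max_retries": 10}
record = apply_with_rollback(config, {"max_retries": 3}, lambda c: c["max_retries"] <= 5)
print(record.status, config)
```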

Governance failures never auto-heal

Nine detectors classify as governance, safety, or integrity failures — they catch the agent breaking a process, safety, or integrity contract, not a quality threshold:

  • authority_gradient — sub-agent suppressed orchestrator ground truth (MSR §15.3)
  • role_usurpation — agent acted outside its declared role (MAST FM-2.4)
  • approval_bypass — agent skipped a required approval gate
  • exploration_safety — agent left its declared sandbox boundary
  • role_usurpation_canonical / role_usurpation_exec — the F9 hybrid detector's variants: the agent assumed, or executed privileged actions under, an ungranted role
  • reward_hacking — agent gamed verification (weakened tests, hardcoded outputs, skip/xfail) to fake completion
  • cowork_safety — autonomous desktop agent took or attempted a destructive filesystem / cloud operation
  • subagent_boundary — sub-agent used a tool outside its declared allowed_tools

Auto-applying a "fix" for any of these would defeat the detector that caught the violation, and there is no config knob for "the agent cheated." Each escalates with a structured handoff payload (who to page, what evidence to attach, recommended rollback, SLA) so the human paged for review can act in minutes, not hours. See backend/app/fixes/governance_fixes.py.
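
For orientation, a handoff payload with the four elements named above might look like the following. Every field name and value is hypothetical; the actual structure is defined in governance_fixes.py and is not reproduced here:

```python
from dataclasses import asdict, dataclass, field


@dataclass
class GovernanceHandoff:
    """Hypothetical payload shape: who to page, evidence, rollback, SLA."""
    detector: str
    page: str                      # who to page
    evidence: list[str]            # what evidence to attach
    recommended_rollback: str
    sla_minutes: int
    # Governance failures never auto-heal, so this is fixed rather than configurable.
    auto_apply: bool = field(default=False, init=False)


handoff = GovernanceHandoff(
    detector="approval_bypass",
    page="on-call-reviewer",                     # placeholder recipient
    evidence=["trace_id", "skipped_gate_name"],  # placeholder evidence keys
    recommended_rollback="revert actions taken after the skipped gate",
    sla_minutes=30,                              # placeholder SLA
)
print(asdict(handoff))
```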

Verify-then-apply for MEDIUM fixes

verify_inline() (backend/app/healing/verify_inline.py) is the safety net that unlocks in-loop auto-heal for MEDIUM-tier fixes — those that change agent behaviour (reranker order, route fallback, summarisation chunk size). The flow:

  1. Generator emits a MEDIUM fix.
  2. verify_inline() calls a per-detector replay shim that reconstructs the detector's input post-apply (e.g. for retrieval: re-sort retrieved docs by score, simulating the reranker).
  3. The original detector runs against the simulated post-apply state.
  4. Apply if and only if the predicted improvement meets the threshold (detected flips to False, or confidence drops by at least 0.30); escalate otherwise.

When the replay shim doesn't exist for a detection type or the prediction is inconclusive, the path conservatively escalates. The shim registry is opt-in — new detectors land as "MEDIUM escalates" until a shim ships.
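
The apply-or-escalate decision can be sketched as a small pure function. This is an illustration of the rule as described on this page, not the contents of verify_inline.py: DetectionResult and should_apply are hypothetical names, and None stands in for a missing shim or an inconclusive prediction:

```python
from dataclasses import dataclass
from typing import Optional

IMPROVEMENT_THRESHOLD = 0.30  # minimum confidence drop, from the flow above


@dataclass
class DetectionResult:
    detected: bool
    confidence: float


def should_apply(before: DetectionResult, after: Optional[DetectionResult]) -> bool:
    """Return True to apply the MEDIUM fix, False to escalate."""
    if after is None:
        # No replay shim or inconclusive prediction: escalate conservatively.
        return False
    if before.detected and not after.detected:
        return True
    # Round to sidestep float noise right at the threshold.
    drop = round(before.confidence - after.confidence, 6)
    return drop >= IMPROVEMENT_THRESHOLD


before = DetectionResult(detected=True, confidence=0.9)
print(should_apply(before, DetectionResult(detected=False, confidence=0.2)))
print(should_apply(before, DetectionResult(detected=True, confidence=0.8)))
print(should_apply(before, None))
```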

Deferred detectors

Eleven detectors intentionally have no fix generator today. Each is a quality / diagnostic score, an experimental (untested) detector, or a multi-agent topology diagnostic where there is no single correct in-loop remediation, so a templated fix would be guesswork:

  • dispatch_async: experimental detector for phone→desktop async-dispatch failures; untested readiness.
  • computer_use: screen-interaction error-rate signal where a 27.5% failure rate is normal (OSWorld); not a fixable config.
  • adaptive_thinking: statistical variance signal on the thinking budget; a baseline, not a fix.
  • scheduled_task: experimental /loop recurring-run drift detector; untested readiness.
  • sycophancy: behavioural signal (agent caves to user pressure); the honest remediation is a context-dependent prompt intervention, surfaced as insight rather than auto-applied.
  • analytical_semantics: checks whether a generated query matches the user's analytical intent; a semantic diagnostic with no clean config fix.
  • chunk_attribution: hard-blocked from auto-apply (HARD_BLOCKED_DETECTORS); attribution remediation is unsettled.
  • instruction_compliance: experimental (untested, no measured F1) instruction-adherence detector; there is no validated behaviour to remediate yet, so it stays deferred until it is calibrated and promoted.
  • silent_cascade: sub-agent topology diagnostic (a sub-agent swallows an error that then cascades upstream); the remediation is an architectural change to how the orchestrator surfaces sub-agent failures, not a config template.
  • synthesis_failure: sub-agent topology diagnostic (the orchestrator fails to merge sub-agent outputs into a coherent result); no single in-loop fix, so it is surfaced as insight.
  • redundant_delegation_conflict: multi-agent delegation-conflict diagnostic (two agents redundantly own the same task and conflict); the fix is topology-dependent and cannot be templated.

(output_validation left this list on 2026-06-11: F12 "incorrect verification" has a canonical structural remediation — an independent fail-closed verifier stage plus boundary schema enforcement — now shipped as OutputValidationFixGenerator.)

A weak fix generator here would be worse than the current "no recommendation" — it would dilute trust in the higher-quality recommendations. The set is asserted in scripts/synth_agents/tests/test_coverage_regression.py::_DEFERRED_AFTER_F; covering one means shipping a generator and removing it there.

What's not measured

This page lists coverage, not fix rate. Whether an applied fix actually resolves the failure on retry is the open question. The plumbing for measuring it (A5 in the implementation plan) is deferred:

  • After fix apply, APScheduler should enqueue a +1h and +24h re-run of the same detector against a fresh trace window for the affected agent.
  • Before/after firing rates land on HealingRecord.detector_progress (column already exists; population is the missing piece).
  • Aggregate weekly into a per-FixGenerator historical_efficacy score, which feeds back into FixSuggestion.confidence.
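
Since A5 is not built yet, the following is only a sketch of the bookkeeping the bullets describe: computing the +1h and +24h re-run times and a before/after firing-rate pair. The function names and the shape of the detector_progress value are assumptions, and the APScheduler wiring is deliberately left out:

```python
from datetime import datetime, timedelta, timezone

# Re-run offsets named in the plan above.
RERUN_OFFSETS = (timedelta(hours=1), timedelta(hours=24))


def rerun_times(applied_at: datetime) -> list[datetime]:
    """When the detector should be re-run after a fix is applied."""
    return [applied_at + offset for offset in RERUN_OFFSETS]


def firing_rate(fired: int, runs: int) -> float:
    """Fraction of runs in a trace window where the detector fired."""
    return fired / runs if runs else 0.0


def detector_progress(before: tuple[int, int], after: tuple[int, int]) -> dict:
    """Hypothetical shape for HealingRecord.detector_progress: (fired, runs) per window."""
    return {"before": firing_rate(*before), "after": firing_rate(*after)}


applied_at = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
print(rerun_times(applied_at))
print(detector_progress(before=(8, 10), after=(2, 10)))
```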

Until A5 ships, the honest framing is: "Pisama recommends a fix for 76 of 87 failure modes; 42 apply automatically in-loop (SAFE plus verified MEDIUM) without human approval; real-world fix efficacy is the next workstream." See TODO(A5) on backend/app/storage/models.py::HealingRecord.

Availability by Plan

| Capability | Free | Indie | Pro | Team | Enterprise |
|---|---|---|---|---|---|
| Template fix suggestions | Yes | Yes | Yes | Yes | Yes |
| LLM-enriched recommendations | -- | Yes | Yes | Yes | Yes |
| Code-level fixes | -- | Yes | Yes | Yes | Yes |
| In-loop healing (SDK auto_heal) | -- | Yes | Yes | Yes | Yes |
| Verify-then-apply for MEDIUM fixes | -- | -- | Yes | Yes | Yes |
| Async healing (hosted frameworks) | -- | -- | Yes | Yes | Yes |
| Approval workflow for DANGEROUS fixes | -- | -- | Yes | Yes | Yes |
| Governance escalation with handoff payload | -- | -- | Yes | Yes | Yes |
| Canary-style staged rollout | -- | -- | -- | -- | Yes |

See pisama.ai/pricing for the authoritative plan matrix and limits.