
check_compliance (beta)

pisama_agent_sdk.check_compliance runs Pisama's specification-compliance detector on an agent trace. It extracts behavioural rules from the agent's system prompt and reports any rule violations the trace contains.

Beta API

This call is gated behind a feature flag and the response shape may change before it reaches GA. Pin the SDK version if you depend on the field set.

What it does

Given the agent's system prompt and a list of trace events (tool calls, agent messages, user messages, tool results), the detector runs a two-stage LLM pipeline:

  1. The first stage extracts behavioural rules from the system prompt: things the agent must do, must not do, or must do conditionally.
  2. The second stage scans the trace for each rule. Concrete rules (such as a forbidden tool name) match deterministically. Ambiguous rules escalate to a single LLM call per rule.
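The deterministic branch of the second stage can be pictured as a plain scan over the trace. The sketch below is only an illustration of that idea, not Pisama's implementation: the helper name `find_forbidden_tool_calls` is hypothetical, and the event layout is taken from the example events used elsewhere on this page.

```python
from typing import Any


def find_forbidden_tool_calls(
    trace_events: list[dict[str, Any]], forbidden_tools: set[str]
) -> list[tuple[int, str]]:
    """Return (event index, tool name) for every call to a forbidden tool."""
    hits = []
    for index, event in enumerate(trace_events):
        # Only tool_call events can violate a "never call X" rule.
        if event.get("type") == "tool_call" and event.get("name") in forbidden_tools:
            hits.append((index, event["name"]))
    return hits


trace = [
    {"type": "user_message", "content": "Clean up old records."},
    {"type": "tool_call", "name": "drop_table", "args": {"name": "users"}},
]
print(find_forbidden_tool_calls(trace, {"drop_table"}))  # [(1, 'drop_table')]
```

A rule such as "always confirm before destructive SQL" has no single string to match, which is the kind of case the detector hands to an LLM call instead.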

The return value is a ComplianceResult containing the detection verdict, each detected violation with evidence, the full extracted rule set (for transparency), and the LLM cost of the run.

Enable the feature flag

Set PISAMA_ENABLE_CHECK_COMPLIANCE=1 in the environment that runs the SDK. The values 1, true, and yes (case-insensitive) all enable the flag. Without it, the call raises PisamaFeatureNotEnabledError immediately, before any network round-trip.

export PISAMA_ENABLE_CHECK_COMPLIANCE=1
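If the application should skip the compliance check quietly when the flag is off, it can test the environment itself before calling the SDK. This is a minimal sketch that mirrors the accepted values listed above; the helper name `compliance_flag_enabled` is hypothetical, and the SDK's own parsing may differ in edge cases such as surrounding whitespace.

```python
import os

# Values documented as enabling the flag; compared case-insensitively.
TRUTHY_VALUES = {"1", "true", "yes"}


def compliance_flag_enabled(environ=None) -> bool:
    """Return True if PISAMA_ENABLE_CHECK_COMPLIANCE holds an enabling value."""
    env = os.environ if environ is None else environ
    return env.get("PISAMA_ENABLE_CHECK_COMPLIANCE", "").lower() in TRUTHY_VALUES


print(compliance_flag_enabled({"PISAMA_ENABLE_CHECK_COMPLIANCE": "True"}))  # True
print(compliance_flag_enabled({}))  # False
```

Passing a dictionary instead of reading os.environ directly keeps the helper easy to unit test.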

Minimal example

import asyncio
from pisama_agent_sdk import check_compliance

async def main():
    result = await check_compliance(
        system_prompt=(
            "You are a careful database assistant. "
            "Never call drop_table. Always confirm with the user "
            "before running destructive SQL."
        ),
        trace_events=[
            {"type": "user_message", "content": "Clean up old records."},
            {"type": "tool_call", "name": "drop_table", "args": {"name": "users"}},
        ],
    )
    if result.detected:
        for v in result.violations:
            print(f"{v.rule_id}: {v.explanation}")
            print(f"  evidence: {v.evidence}")
            print(f"  confidence: {v.confidence:.2f}")
    print(f"cost: ${result.cost_usd:.4f} ({result.tokens_used} tokens)")

asyncio.run(main())

Result shape

ComplianceResult has these fields:

| Field | Type | Description |
| --- | --- | --- |
| detected | bool | True if any rule was violated. |
| confidence | float | Highest confidence among the detected violations, from 0.0 to 1.0. |
| violations | list[Violation] | Each detected rule violation. |
| extracted_rules | list[BehavioralRule] | Every rule the extractor pulled from the prompt. |
| tokens_used | int | Total LLM tokens across both pipeline stages. |
| cost_usd | float | Total LLM cost in USD. |

Violation carries rule_id, evidence, explanation, and confidence. BehavioralRule carries rule_id, description, trigger, optional required_action and forbidden_action, plus a severity of critical, high, medium, or low.
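Because severity lives on the rule while confidence lives on the violation, triaging a result usually means joining the two by rule_id. The sketch below shows one way to do that. The `Rule` and `Violation` dataclasses are local stand-ins that copy only the fields named above (they are not the SDK's classes), `rank_violations` is a hypothetical helper, and it assumes every violation's rule_id refers to an entry in extracted_rules.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass
class Rule:  # stand-in for BehavioralRule, reduced to the fields used here
    rule_id: str
    description: str
    severity: str


@dataclass
class Violation:  # stand-in mirroring the documented Violation fields
    rule_id: str
    evidence: str
    explanation: str
    confidence: float


def rank_violations(violations, rules):
    """Sort violations most severe first, then by descending confidence."""
    severity_of = {rule.rule_id: rule.severity for rule in rules}

    def sort_key(violation):
        severity = severity_of.get(violation.rule_id)
        # Unknown rule ids or severities sort after "low".
        return (SEVERITY_ORDER.get(severity, len(SEVERITY_ORDER)), -violation.confidence)

    return sorted(violations, key=sort_key)


rules = [
    Rule("r1", "Never call drop_table.", "critical"),
    Rule("r2", "Confirm before destructive SQL.", "high"),
]
violations = [
    Violation("r2", "no confirmation message", "Skipped confirmation.", 0.70),
    Violation("r1", "tool_call drop_table", "Called a forbidden tool.", 0.95),
]
print([v.rule_id for v in rank_violations(violations, rules)])  # ['r1', 'r2']
```

The same function should work on a real result by passing result.violations and result.extracted_rules, provided the objects expose the attributes listed above.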

Errors

| Exception | When |
| --- | --- |
| PisamaFeatureNotEnabledError | The flag is unset or set to a falsy value. |
| urllib.error.URLError | The backend is unreachable. |
| TimeoutError | The backend call exceeds timeout_ms (default 60000 ms, i.e. 60 seconds). |

A backend response with malformed JSON is treated as a soft failure: the SDK returns an empty ComplianceResult rather than crashing the agent.
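An agent loop that should keep running when the backend is down can wrap the call and treat network failures as "no verdict". The following is a sketch under assumptions rather than a recommended pattern: `check_or_none` is a hypothetical wrapper, it takes the check function as a parameter so it can be exercised without the SDK, and treating an empty extracted_rules list as inconclusive is a guess at what an "empty ComplianceResult" looks like (a prompt with no extractable rules would look the same).

```python
import asyncio
import urllib.error


async def check_or_none(check, **kwargs):
    """Await check(**kwargs); return None when no usable verdict is available."""
    try:
        result = await check(**kwargs)
    except (urllib.error.URLError, TimeoutError) as exc:
        # Backend unreachable or slower than the timeout: skip the check.
        print(f"compliance check skipped: {exc!r}")
        return None
    # Assumption: a soft failure surfaces as a result with no extracted rules.
    if not result.extracted_rules:
        return None
    return result


# Stand-in for check_compliance that simulates an unreachable backend.
async def unreachable_backend(**kwargs):
    raise urllib.error.URLError("connection refused")


outcome = asyncio.run(
    check_or_none(unreachable_backend, system_prompt="", trace_events=[])
)
print(outcome)
```

In real use the first argument would be check_compliance itself. PisamaFeatureNotEnabledError is deliberately not caught here, since a missing flag is a configuration problem that is better surfaced than hidden.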