Heal

Automatic fix generation and deployment for AI agent failures.

Risicare's self-healing pipeline automatically detects errors, diagnoses root causes, and generates fixes. Fix deployment via A/B testing is available as an opt-in feature.

Beyond observability

No other platform offers automated error diagnosis with a 154-code taxonomy, fix generation across 7 fix types, and statistical A/B deployment. While competitors stop at showing you the error, Risicare diagnoses why it happened and generates a fix.

Overview

The healing pipeline follows the DoVer methodology (Diagnosis via Observation of Verification):

  1. Generate Hypotheses - Create testable hypotheses about fixes
  2. Validate Statistically - Test fixes with A/B testing
  3. Deploy Safely - Canary release with automatic rollback

Fix Types

Risicare can generate 7 types of fixes:

| Type | What It Does | Example |
|---|---|---|
| Prompt | Modify system prompt or add few-shot examples | Add clarifying instructions |
| Parameter | Adjust LLM parameters | Lower temperature, increase max_tokens |
| Tool | Fix tool configuration | Add timeout, fix validation |
| Retry | Add retry logic | Exponential backoff on transient errors |
| Fallback | Use alternative model/strategy | Fall back to gpt-4o-mini on timeout |
| Guard | Add input/output validation | JSON schema validation |
| Routing | Change agent delegation | Route to different specialist agent |

Fix Configuration

Fixes are JSON configurations, not code:

{
  "fix_id": "fix-abc123",
  "fix_type": "retry",
  "config": {
    "max_retries": 3,
    "initial_delay_ms": 1000,
    "exponential_base": 2.0,
    "max_delay_ms": 30000,
    "jitter": true,
    "retry_on": ["TimeoutError"]
  },
  "rollback_strategy": {
    "type": "immediate",
    "trigger": "error_rate > 0.1"
  }
}
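To make the declarative config concrete, here is a minimal sketch of how an SDK could interpret the retry fix above at runtime. The `with_retry` helper and its exception-matching logic are illustrative, not Risicare's actual implementation:

```python
import random
import time

# The retry config from the fix above (the "config" object).
RETRY_CONFIG = {
    "max_retries": 3,
    "initial_delay_ms": 1000,
    "exponential_base": 2.0,
    "max_delay_ms": 30000,
    "jitter": True,
    "retry_on": ["TimeoutError"],
}

def with_retry(call, config=RETRY_CONFIG):
    """Invoke `call`, retrying per the declarative retry config."""
    for attempt in range(config["max_retries"] + 1):
        try:
            return call()
        except Exception as exc:
            # Only retry the error types listed in the config,
            # and never after the final attempt.
            if (type(exc).__name__ not in config["retry_on"]
                    or attempt == config["max_retries"]):
                raise
            delay_ms = min(
                config["initial_delay_ms"] * config["exponential_base"] ** attempt,
                config["max_delay_ms"],
            )
            if config["jitter"]:
                delay_ms *= random.uniform(0.5, 1.0)  # jittered backoff
            time.sleep(delay_ms / 1000)
```

Because the fix is pure data, swapping strategies (e.g. changing `exponential_base` or the `retry_on` list) requires no code change in the agent itself.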

No Code Injection

Fixes are declarative configurations applied by the SDK at runtime. Risicare never injects code into your system.

Hypothesis Testing

Before deployment, fixes are validated through hypothesis testing:

Generate Hypotheses

Diagnosis: TOOL.EXECUTION.TIMEOUT on weather_api

Hypothesis 1: Adding retry with backoff will reduce timeout errors
  Prior probability: 0.75 (based on similar patterns)

Hypothesis 2: Increasing timeout to 60s will reduce errors
  Prior probability: 0.60

Hypothesis 3: Adding fallback to cached data will maintain uptime
  Prior probability: 0.55

Statistical Validation

Each hypothesis is tested with:

  • Sample size calculation for statistical power (0.8)
  • Two-proportion z-test for significance (p < 0.05)
  • Bayesian updates to posterior probability
  • O'Brien-Fleming boundaries for early stopping

Test Results:
  Baseline error rate: 12.3%
  Treatment error rate: 2.1%
  Effect size (Cohen's h): 0.38
  P-value: 0.0023 ✓

  Decision: Hypothesis VALIDATED
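The statistics above follow from standard formulas. A sketch of the two-proportion z-test and Cohen's h using only the standard library (the sample sizes of 500 per arm are assumed for illustration, since the docs report only rates):

```python
import math

def normal_sf(z):
    """Survival function of the standard normal (1 - CDF)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def two_proportion_ztest(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, one-sided p for p1 > p2)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, normal_sf(z)

def cohens_h(p1, p2):
    """Effect size for two proportions (arcsine transform)."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))

# Illustrative counts approximating the rates above: 62/500 baseline
# errors vs 10/500 treatment errors.
z, p = two_proportion_ztest(62, 500, 10, 500)
h = cohens_h(0.123, 0.021)
```

With a large enough sample, a drop from 12.3% to 2.1% yields a p-value far below the 0.05 threshold, so the hypothesis is validated.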

Deployment Pipeline

Fix Created
     ↓
┌─────────────────┐
│ Canary (5%)     │  Minimum 100 samples
│                 │  Monitor error rate
└─────────────────┘
     ↓ (if passing)
┌─────────────────┐
│ Ramp (25%)      │  Statistical A/B test
│                 │  O'Brien-Fleming boundaries
└─────────────────┘
     ↓ (if winning)
┌─────────────────┐
│ Ramp (50%)      │  Continue testing
│                 │
└─────────────────┘
     ↓ (if winning)
┌─────────────────┐
│ Graduate (100%) │  Hold for 24 hours
│                 │  Mark as graduated
└─────────────────┘
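The stage progression above amounts to a small state machine: advance only when the minimum sample count is met and the statistical check passes, otherwise hold. A sketch (stage names and the `passing` input are illustrative, not the platform's internal representation):

```python
# Canary -> ramp -> graduate progression, per the diagram above.
STAGES = [
    {"name": "canary", "traffic": 0.05, "min_samples": 100},
    {"name": "ramp_25", "traffic": 0.25, "min_samples": 100},
    {"name": "ramp_50", "traffic": 0.50, "min_samples": 100},
    {"name": "graduate", "traffic": 1.00, "min_samples": 0},
]

def next_stage(current, samples, passing):
    """Advance one stage only if sample count and the A/B check both pass."""
    idx = next(i for i, s in enumerate(STAGES) if s["name"] == current)
    stage = STAGES[idx]
    if samples >= stage["min_samples"] and passing and idx + 1 < len(STAGES):
        return STAGES[idx + 1]["name"]
    return current  # hold; rollback is handled by the monitor, not here
```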

Automatic Rollback

Fixes are automatically rolled back if:

  • Error rate increases >10% vs baseline
  • P99 latency exceeds 2x baseline
  • Manual rollback triggered

Rollback latency target: under 500ms (Redis routing update)

Fix Runtime

The SDK includes a fix runtime that:

  1. Loads fixes from the API on startup
  2. Caches locally with periodic refresh
  3. Routes requests based on A/B assignment
  4. Applies fixes at LLM call time

# Fix runtime is automatic when using the SDK
import risicare
from openai import OpenAI  # or any supported LLM client

risicare.init()
client = OpenAI()

# Fixes are applied automatically to LLM calls
response = client.chat.completions.create(...)

Knowledge Base

Successful fixes are stored in a knowledge base:

  • Error patterns as embeddings (pgvector)
  • Fix templates with parameters
  • Cross-customer learning (federated, no raw data)
  • Similarity threshold: 0.85

When a new error occurs, the knowledge base is checked first before generating a new fix.
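A knowledge-base lookup against the 0.85 similarity threshold might look like the following sketch. The entry format is hypothetical, and a real deployment would query pgvector rather than loop in Python:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def find_known_fix(error_embedding, knowledge_base, threshold=0.85):
    """Return the best-matching stored fix at or above the threshold, else None."""
    best, best_sim = None, threshold
    for entry in knowledge_base:
        sim = cosine_similarity(error_embedding, entry["embedding"])
        if sim >= best_sim:
            best, best_sim = entry, sim
    return best
```

Only when `find_known_fix` returns `None` does the pipeline fall through to generating a fresh fix.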

Next Steps