# Fix Runtime

Apply self-healing fixes at runtime.

The Fix Runtime enables dynamic application of fixes without redeployment.
## Overview

When Risicare detects an error pattern and generates a fix, the Fix Runtime can apply it immediately:

```python
from risicare.runtime import FixRuntime, FixRuntimeConfig

runtime = FixRuntime(
    config=FixRuntimeConfig(
        api_endpoint="https://app.risicare.ai",
        api_key="rsk-...",
        project_id="proj-...",
        enabled=True,
    )
)
runtime.start()
```

## FixRuntimeConfig
```python
from risicare.runtime import FixRuntimeConfig

config = FixRuntimeConfig(
    api_endpoint="",                # str: Risicare API endpoint
    api_key=None,                   # str | None: API key
    project_id=None,                # str | None: Project ID
    cache_enabled=True,             # Enable local fix caching
    cache_ttl_seconds=300,          # Cache TTL (5 minutes)
    cache_max_entries=1000,         # Max cached fixes
    auto_refresh=True,              # Auto-refresh fixes from API
    refresh_interval_seconds=60,    # Refresh interval
    enabled=True,                   # Enable fix runtime
    dry_run=False,                  # Log fixes without applying
    max_retries=3,                  # Max retries per fix
    ab_testing_enabled=True,        # Enable A/B testing
    track_effectiveness=True,       # Track fix effectiveness
    timeout_ms=1000,                # Fix application timeout
    debug=False,                    # Debug logging
)
```

The values shown are the defaults each field takes when omitted. You can also create the config from environment variables:

```python
config = FixRuntimeConfig.from_env()
```

This reads `RISICARE_ENDPOINT`, `RISICARE_API_KEY`, `RISICARE_PROJECT_ID`, and other `RISICARE_*` environment variables.
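As a rough sketch of what `from_env()` does, an environment-driven loader might look like the following. Only the three variable names listed above come from the docs; the boolean parsing and the `RISICARE_ENABLED` name are illustrative assumptions, not the SDK's actual behavior:

```python
import os

def config_from_env() -> dict:
    """Illustrative stand-in for FixRuntimeConfig.from_env().

    Reads the documented RISICARE_* variables; the parsing shown
    here is an assumption, not the SDK's implementation.
    """
    return {
        "api_endpoint": os.environ.get("RISICARE_ENDPOINT", ""),
        "api_key": os.environ.get("RISICARE_API_KEY"),
        "project_id": os.environ.get("RISICARE_PROJECT_ID"),
        # Hypothetical flag name, shown only to illustrate boolean parsing.
        "enabled": os.environ.get("RISICARE_ENABLED", "true").lower() == "true",
    }

os.environ["RISICARE_ENDPOINT"] = "https://app.risicare.ai"
cfg = config_from_env()
```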
## Fix Types

The runtime supports 7 fix types:

| Type | Description |
|---|---|
| `prompt` | System prompt modifications |
| `parameter` | LLM parameter adjustments (temperature, etc.) |
| `tool` | Tool configuration fixes |
| `retry` | Retry logic with backoff |
| `fallback` | Alternative model/provider fallback |
| `guard` | Input/output validation guards |
| `routing` | Request routing changes |
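To make the `retry` type concrete, here is a generic retry-with-exponential-backoff helper in plain Python. It illustrates the pattern only; it is not the SDK's implementation, and the delay constants are arbitrary:

```python
import time

def retry_with_backoff(fn, max_retries=3, base_delay=0.1):
    """Call fn(), retrying on exception with exponential backoff."""
    for attempt in range(1, max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise  # out of attempts: re-raise the last error
            time.sleep(base_delay * 2 ** (attempt - 1))

# Simulated operation that fails twice, then succeeds.
calls = []
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise TimeoutError("simulated timeout")
    return "ok"

result = retry_with_backoff(flaky, max_retries=3, base_delay=0.0)
```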
## Public Methods

### start() / stop()

Start and stop the runtime lifecycle:

```python
runtime = FixRuntime(config=config)

# Start: loads fixes from API and begins background refresh
runtime.start()

# ... your application runs ...

# Stop: halts background refresh and clears cache
runtime.stop()
```

### get_fix()
Look up the active fix for a given error code:
```python
fix = runtime.get_fix(
    error_code="TOOL.EXECUTION.TIMEOUT",
    session_id="session-123",  # optional, used for A/B bucketing
)
if fix:
    print(f"Fix {fix.fix_id} (type: {fix.fix_type})")
```

### wrap_call()
Wrap a synchronous operation with fix interception and retry handling. This is a convenience method that returns a new callable -- it is not a decorator.
```python
def call_llm(messages: list) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages
    )
    return response.choices[0].message.content

wrapped_fn = runtime.wrap_call(
    call_llm,
    operation_type="llm_call",
    operation_name="gpt-4o",
    session_id="session-123",
    max_retries=3,
)

# Call the wrapped function -- fixes are applied automatically
result = wrapped_fn(messages=[{"role": "user", "content": "Hello"}])
```

> **wrap_call is not a decorator.** `wrap_call` returns a wrapped callable. Do not use it as `@runtime.wrap_call`. Pass the function as the first argument and call the returned wrapper.
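The wrap-not-decorate shape can be mimicked in plain Python. This sketch is not the SDK's code; it only shows why the *returned* wrapper, rather than the original function, must be called:

```python
def wrap_call(fn, max_retries=2):
    """Return a NEW callable that retries fn; fn itself is untouched."""
    def wrapper(*args, **kwargs):
        last_exc = None
        for _ in range(max_retries + 1):
            try:
                return fn(*args, **kwargs)
            except Exception as exc:
                last_exc = exc
        raise last_exc
    return wrapper

# Simulated operation: fails once, then succeeds.
attempts = []
def unstable(x):
    attempts.append(x)
    if len(attempts) < 2:
        raise RuntimeError("first call fails")
    return x * 2

wrapped = wrap_call(unstable, max_retries=2)
result = wrapped(21)  # call the wrapper, not unstable()
```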
### wrap_async_call()

Async version of wrap_call:

```python
async def call_llm_async(messages: list) -> str:
    response = await async_client.chat.completions.create(
        model="gpt-4o",
        messages=messages
    )
    return response.choices[0].message.content

wrapped_fn = await runtime.wrap_async_call(
    call_llm_async,
    operation_type="llm_call",
    operation_name="gpt-4o",
    session_id="session-123",
    max_retries=3,
)

result = await wrapped_fn(messages=[{"role": "user", "content": "Hello"}])
```

### intercept_call() / intercept_response() / intercept_error()
Low-level interception methods for manual control:
```python
# Pre-call: apply prompt and parameter fixes
messages, params, context = runtime.intercept_call(
    operation_type="llm_call",
    operation_name="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    params={"temperature": 0.7},
    session_id="session-123",
    error_code="TOOL.EXECUTION.TIMEOUT",  # if retrying after an error
)

# Make the call with the modified messages/params
try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        **params,
    )
    # Post-call: apply output guards
    response, should_continue = runtime.intercept_response(context, response)
except Exception as e:
    # Error: decide whether to retry
    should_retry, modified_params = runtime.intercept_error(context, e)
    if should_retry:
        # Retry with modified_params
        pass
```

### refresh_fixes() / refresh_fixes_async()
Manually refresh fixes from the API (sync and async variants):
```python
# Synchronous
fixes = runtime.refresh_fixes()

# Asynchronous
fixes = await runtime.refresh_fixes_async()
```

### get_effectiveness_stats()
Get fix success rates tracked by the runtime:
```python
stats = runtime.get_effectiveness_stats()
for fix_id, data in stats.items():
    print(f"{fix_id}: {data['success_rate']:.1%} "
          f"({data['successes']}/{data['applications']})")
```

### add_interceptor()
Add a custom interceptor to the chain:
```python
runtime.add_interceptor(MyCustomInterceptor())
```

## Custom Interceptors
Create custom interceptors by subclassing FixInterceptor:
```python
from risicare.runtime.interceptors import FixInterceptor, InterceptContext

class MyInterceptor(FixInterceptor):
    def pre_call(self, context: InterceptContext, messages=None, params=None):
        """Called before an LLM/tool call. Return (messages, params)."""
        if context.operation_name == "gpt-4o":
            params = params or {}
            params["temperature"] = 0.3
        return messages, params

    def post_call(self, context: InterceptContext, response):
        """Called after a call. Return (response, should_continue)."""
        # Validate output, transform response, etc.
        return response, True

    def on_error(self, context: InterceptContext, error: Exception):
        """Called on error. Return (should_retry, modified_params)."""
        if "timeout" in str(error).lower():
            return True, {"timeout": 60000}
        return False, None

runtime.add_interceptor(MyInterceptor())
```

## InterceptContext
The context object passed through all interceptor methods:
| Field | Type | Description |
|---|---|---|
| `operation_type` | `str` | `"llm_call"`, `"tool_call"`, `"agent_delegate"` |
| `operation_name` | `str` | Model name, tool name, or agent name |
| `session_id` | `str \| None` | Session ID for A/B bucketing |
| `trace_id` | `str \| None` | Current trace ID |
| `span_id` | `str \| None` | Current span ID |
| `error_code` | `str \| None` | Error code (when retrying) |
| `error_message` | `str \| None` | Error message (when retrying) |
| `attempt` | `int` | Current attempt number (starts at 1) |
| `applied_fixes` | `list[ApplyResult]` | Fixes applied during this intercept chain |
## A/B Testing

A/B testing is controlled by two settings:

- `ab_testing_enabled` on `FixRuntimeConfig` -- enables the A/B testing system
- `traffic_percentage` on each `ActiveFix` -- controls what percentage of sessions receive the fix

Traffic splitting uses session-hash-based bucketing, ensuring the same session always gets the same treatment:

```python
config = FixRuntimeConfig(
    ab_testing_enabled=True,  # Enable A/B testing
)
```

The runtime checks each fix's `traffic_percentage` and uses `hash(session_id) % 100` for consistent bucketing. Results appear in the dashboard with:
- Error rate comparison (control vs treatment)
- Latency impact
- Statistical confidence
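Session-hash bucketing can be sketched as follows. Note that Python's built-in `hash()` is salted per process for strings, so this sketch substitutes a stable digest to stay deterministic; the modulo-100 comparison mirrors the `hash(session_id) % 100` check described above, and the function itself is illustrative, not the SDK's:

```python
import hashlib

def in_treatment(session_id: str, traffic_percentage: int) -> bool:
    """Deterministically bucket a session into treatment or control."""
    # Stable digest -> integer bucket in [0, 100)
    bucket = int(hashlib.sha256(session_id.encode()).hexdigest(), 16) % 100
    return bucket < traffic_percentage

# The same session always lands in the same bucket...
assert in_treatment("session-123", 50) == in_treatment("session-123", 50)

# ...and 0% / 100% are the degenerate cases.
control_only = in_treatment("session-123", 0)
treat_all = in_treatment("session-123", 100)
```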
## Rollback

Rollback is managed through the deployment API, not through the SDK runtime.

```bash
# Rollback a deployment via API
curl -X DELETE "https://app.risicare.ai/v1/deployments/{id}" \
  -H "Authorization: Bearer rsk-..."
```

Automatic rollback triggers when:
| Condition | Threshold |
|---|---|
| Error rate increase | >10% vs baseline |
| P99 latency increase | >2x baseline |
Rollback updates Redis routing instantly. The SDK picks up changes on its next refresh cycle (default: 60 seconds), or immediately if push invalidation is configured.
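The two thresholds in the table can be expressed as a simple predicate. This is a sketch, and it assumes the ">10%" error-rate condition is measured in absolute percentage points vs baseline (the docs do not say whether it is absolute or relative):

```python
def should_rollback(baseline_error_rate: float, current_error_rate: float,
                    baseline_p99_ms: float, current_p99_ms: float) -> bool:
    """Apply the documented triggers: error rate up >10% vs baseline,
    or P99 latency more than 2x baseline."""
    error_increase = current_error_rate - baseline_error_rate
    return error_increase > 0.10 or current_p99_ms > 2 * baseline_p99_ms

healthy = should_rollback(0.02, 0.05, 800, 1200)   # within both thresholds
errors = should_rollback(0.02, 0.20, 800, 1200)    # error rate up 18 points
latency = should_rollback(0.02, 0.03, 800, 2000)   # p99 is 2.5x baseline
```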
## Fix Lifecycle

1. Error detected -> diagnosis runs
2. Fix generated -> appears in dashboard
3. Fix approved -> synced to runtime via API
4. A/B test -> `traffic_percentage` controls split
5. Statistical validation
6. Full rollout or rollback
## Global Runtime

For convenience, you can use the global runtime functions:

```python
from risicare.runtime import init_runtime, get_runtime, shutdown_runtime

# Initialize and start
runtime = init_runtime(config=config, auto_start=True)

# Get the global instance anywhere
runtime = get_runtime()

# Shutdown
shutdown_runtime()
```
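A global accessor of this shape boils down to a module-level singleton. The sketch below mirrors the three function names above but is purely illustrative (a plain dict stands in for the `FixRuntime` instance):

```python
_runtime = None  # module-level singleton slot

def init_runtime(config, auto_start=True):
    """Create (or replace) the process-wide runtime instance."""
    global _runtime
    _runtime = {"config": config, "started": auto_start}  # stand-in object
    return _runtime

def get_runtime():
    """Return the global instance, or None before init."""
    return _runtime

def shutdown_runtime():
    """Drop the global instance."""
    global _runtime
    _runtime = None

rt = init_runtime({"enabled": True})
same = get_runtime()   # same object from anywhere in the process
shutdown_runtime()
after = get_runtime()  # None once shut down
```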