# Fix Runtime

Apply self-healing fixes at runtime.

The Fix Runtime enables dynamic application of fixes without redeployment.
## Overview

When Risicare detects an error pattern and generates a fix, the Fix Runtime can apply it immediately:

```python
from risicare.runtime import FixRuntime, FixRuntimeConfig

runtime = FixRuntime(
    config=FixRuntimeConfig(
        api_endpoint="https://app.risicare.ai",
        api_key="rsk-...",
        project_id="proj-...",
        enabled=True,
    )
)
runtime.start()
```

## FixRuntimeConfig
```python
from risicare.runtime import FixRuntimeConfig

config = FixRuntimeConfig(
    api_endpoint="",                # str: Risicare API endpoint
    api_key=None,                   # str | None: API key
    project_id=None,                # str | None: Project ID
    cache_enabled=True,             # Enable local fix caching
    cache_ttl_seconds=300,          # Cache TTL (5 minutes)
    cache_max_entries=1000,         # Max cached fixes
    auto_refresh=True,              # Auto-refresh fixes from API
    refresh_interval_seconds=60,    # Refresh interval
    enabled=True,                   # Enable fix runtime
    dry_run=False,                  # Log fixes without applying
    max_retries=3,                  # Max retries per fix
    ab_testing_enabled=True,        # Enable A/B testing
    track_effectiveness=True,       # Track fix effectiveness
    timeout_ms=1000,                # Fix application timeout
    debug=False,                    # Debug logging
)
```

The values shown are the defaults each field takes when omitted. You can also create the config from environment variables:

```python
config = FixRuntimeConfig.from_env()
```

This reads `RISICARE_ENDPOINT`, `RISICARE_API_KEY`, `RISICARE_PROJECT_ID`, and other `RISICARE_*` environment variables.
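As a rough sketch of what `from_env()` does, an environment-driven loader might look like the following. Only the three variable names listed above come from the docs; the boolean parsing and the `RISICARE_ENABLED` name are illustrative assumptions, not the SDK's actual behavior:

```python
import os

def config_from_env() -> dict:
    """Illustrative stand-in for FixRuntimeConfig.from_env().

    Reads the documented RISICARE_* variables; the parsing shown
    here is an assumption, not the SDK's implementation.
    """
    return {
        "api_endpoint": os.environ.get("RISICARE_ENDPOINT", ""),
        "api_key": os.environ.get("RISICARE_API_KEY"),
        "project_id": os.environ.get("RISICARE_PROJECT_ID"),
        # Hypothetical flag name, shown only to illustrate boolean parsing.
        "enabled": os.environ.get("RISICARE_ENABLED", "true").lower() == "true",
    }

os.environ["RISICARE_ENDPOINT"] = "https://app.risicare.ai"
cfg = config_from_env()
```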
## Fix Types

The runtime supports 7 fix types:

| Type | Description |
|---|---|
| `prompt` | System prompt modifications |
| `parameter` | LLM parameter adjustments (temperature, etc.) |
| `tool` | Tool configuration fixes |
| `retry` | Retry logic with backoff |
| `fallback` | Alternative model/provider fallback |
| `guard` | Input/output validation guards |
| `routing` | Request routing changes |
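To make the `retry` type concrete, here is a generic retry-with-exponential-backoff helper in plain Python. It illustrates the pattern only; it is not the SDK's implementation, and the delay constants are arbitrary:

```python
import time

def retry_with_backoff(fn, max_retries=3, base_delay=0.1):
    """Call fn(), retrying on exception with exponential backoff."""
    for attempt in range(1, max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise  # out of attempts: re-raise the last error
            time.sleep(base_delay * 2 ** (attempt - 1))

# Simulated operation that fails twice, then succeeds.
calls = []
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise TimeoutError("simulated timeout")
    return "ok"

result = retry_with_backoff(flaky, max_retries=3, base_delay=0.0)
```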
## Public Methods

### start() / stop()

Start and stop the runtime lifecycle:

```python
runtime = FixRuntime(config=config)

# Start: loads fixes from API and begins background refresh
runtime.start()

# ... your application runs ...

# Stop: halts background refresh and clears cache
runtime.stop()
```

### get_fix()
Look up the active fix for a given error code:
```python
fix = runtime.get_fix(
    error_code="TOOL.EXECUTION.TIMEOUT",
    session_id="session-123",  # optional, used for A/B bucketing
)
if fix:
    print(f"Fix {fix.fix_id} (type: {fix.fix_type})")
```

### wrap_call()
Wrap a synchronous operation with fix interception and retry handling. This is a convenience method that returns a new callable -- it is not a decorator.
```python
def call_llm(messages: list) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages
    )
    return response.choices[0].message.content

wrapped_fn = runtime.wrap_call(
    call_llm,
    operation_type="llm_call",
    operation_name="gpt-4o",
    session_id="session-123",
    max_retries=3,
)

# Call the wrapped function -- fixes are applied automatically
result = wrapped_fn(messages=[{"role": "user", "content": "Hello"}])
```

> **wrap_call is not a decorator.** `wrap_call` returns a wrapped callable. Do not use it as `@runtime.wrap_call`. Pass the function as the first argument and call the returned wrapper.
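The wrap-not-decorate shape can be mimicked in plain Python. This sketch is not the SDK's code; it only shows why the *returned* wrapper, rather than the original function, must be called:

```python
def wrap_call(fn, max_retries=2):
    """Return a NEW callable that retries fn; fn itself is untouched."""
    def wrapper(*args, **kwargs):
        last_exc = None
        for _ in range(max_retries + 1):
            try:
                return fn(*args, **kwargs)
            except Exception as exc:
                last_exc = exc
        raise last_exc
    return wrapper

# Simulated operation: fails once, then succeeds.
attempts = []
def unstable(x):
    attempts.append(x)
    if len(attempts) < 2:
        raise RuntimeError("first call fails")
    return x * 2

wrapped = wrap_call(unstable, max_retries=2)
result = wrapped(21)  # call the wrapper, not unstable()
```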
### wrap_async_call()

Async version of wrap_call:

```python
async def call_llm_async(messages: list) -> str:
    response = await async_client.chat.completions.create(
        model="gpt-4o",
        messages=messages
    )
    return response.choices[0].message.content

wrapped_fn = await runtime.wrap_async_call(
    call_llm_async,
    operation_type="llm_call",
    operation_name="gpt-4o",
    session_id="session-123",
    max_retries=3,
)

result = await wrapped_fn(messages=[{"role": "user", "content": "Hello"}])
```

### intercept_call() / intercept_response() / intercept_error()
Low-level interception methods for manual control:
```python
# Pre-call: apply prompt and parameter fixes
messages, params, context = runtime.intercept_call(
    operation_type="llm_call",
    operation_name="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
    params={"temperature": 0.7},
    session_id="session-123",
    error_code="TOOL.EXECUTION.TIMEOUT",  # if retrying after an error
)

# Make the call with the modified messages/params
try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        **params,
    )
    # Post-call: apply output guards
    response, should_continue = runtime.intercept_response(context, response)
except Exception as e:
    # Error: decide whether to retry
    should_retry, modified_params = runtime.intercept_error(context, e)
    if should_retry:
        # Retry with modified_params
        pass
```

### refresh_fixes() / refresh_fixes_async()
Manually refresh fixes from the API (sync and async variants):
```python
# Synchronous
fixes = runtime.refresh_fixes()

# Asynchronous
fixes = await runtime.refresh_fixes_async()
```

### get_effectiveness_stats()
Get fix success rates tracked by the runtime:
```python
stats = runtime.get_effectiveness_stats()
for fix_id, data in stats.items():
    print(f"{fix_id}: {data['success_rate']:.1%} "
          f"({data['successes']}/{data['applications']})")
```

### add_interceptor()
Add a custom interceptor to the chain:
```python
runtime.add_interceptor(MyCustomInterceptor())
```

## Custom Interceptors
Create custom interceptors by subclassing FixInterceptor:
```python
from risicare.runtime.interceptors import FixInterceptor, InterceptContext

class MyInterceptor(FixInterceptor):
    def pre_call(self, context: InterceptContext, messages=None, params=None):
        """Called before an LLM/tool call. Return (messages, params)."""
        if context.operation_name == "gpt-4o":
            params = params or {}
            params["temperature"] = 0.3
        return messages, params

    def post_call(self, context: InterceptContext, response):
        """Called after a call. Return (response, should_continue)."""
        # Validate output, transform response, etc.
        return response, True

    def on_error(self, context: InterceptContext, error: Exception):
        """Called on error. Return (should_retry, modified_params)."""
        if "timeout" in str(error).lower():
            return True, {"timeout": 60000}
        return False, None

runtime.add_interceptor(MyInterceptor())
```

## InterceptContext
The context object passed through all interceptor methods:
| Field | Type | Description |
|---|---|---|
| `operation_type` | `str` | `"llm_call"`, `"tool_call"`, `"agent_delegate"` |
| `operation_name` | `str` | Model name, tool name, or agent name |
| `session_id` | `str \| None` | Session ID for A/B bucketing |
| `trace_id` | `str \| None` | Current trace ID |
| `span_id` | `str \| None` | Current span ID |
| `error_code` | `str \| None` | Error code (when retrying) |
| `error_message` | `str \| None` | Error message (when retrying) |
| `attempt` | `int` | Current attempt number (starts at 1) |
| `applied_fixes` | `list[ApplyResult]` | Fixes applied during this intercept chain |
## A/B Testing

A/B testing is controlled by two settings:

- `ab_testing_enabled` on `FixRuntimeConfig` -- enables the A/B testing system
- `traffic_percentage` on each `ActiveFix` -- controls what percentage of sessions receive the fix

Traffic splitting uses session-hash-based bucketing, ensuring the same session always gets the same treatment:

```python
config = FixRuntimeConfig(
    ab_testing_enabled=True,  # Enable A/B testing
)
```

The runtime checks each fix's `traffic_percentage` and uses `hash(session_id) % 100` for consistent bucketing. Results appear in the dashboard with:
- Error rate comparison (control vs treatment)
- Latency impact
- Statistical confidence
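Session-hash bucketing can be sketched as follows. Note that Python's built-in `hash()` is salted per process for strings, so this sketch substitutes a stable digest to stay deterministic; the modulo-100 comparison mirrors the `hash(session_id) % 100` check described above, and the function itself is illustrative, not the SDK's:

```python
import hashlib

def in_treatment(session_id: str, traffic_percentage: int) -> bool:
    """Deterministically bucket a session into treatment or control."""
    # Stable digest -> integer bucket in [0, 100)
    bucket = int(hashlib.sha256(session_id.encode()).hexdigest(), 16) % 100
    return bucket < traffic_percentage

# The same session always lands in the same bucket...
assert in_treatment("session-123", 50) == in_treatment("session-123", 50)

# ...and 0% / 100% are the degenerate cases.
control_only = in_treatment("session-123", 0)
treat_all = in_treatment("session-123", 100)
```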
## Rollback

Rollback is managed through the deployment API, not through the SDK runtime.

```bash
# Rollback a deployment via API
curl -X DELETE "https://app.risicare.ai/v1/deployments/{id}" \
  -H "Authorization: Bearer rsk-..."
```

Automatic rollback triggers when:
| Condition | Threshold |
|---|---|
| Error rate increase | >10% vs baseline |
| P99 latency increase | >2x baseline |
Rollback updates Redis routing instantly. The SDK picks up changes on its next refresh cycle (default: 60 seconds), or immediately if push invalidation is configured.
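The two thresholds in the table can be expressed as a simple predicate. This is a sketch, and it assumes the ">10%" error-rate condition is measured in absolute percentage points vs baseline (the docs do not say whether it is absolute or relative):

```python
def should_rollback(baseline_error_rate: float, current_error_rate: float,
                    baseline_p99_ms: float, current_p99_ms: float) -> bool:
    """Apply the documented triggers: error rate up >10% vs baseline,
    or P99 latency more than 2x baseline."""
    error_increase = current_error_rate - baseline_error_rate
    return error_increase > 0.10 or current_p99_ms > 2 * baseline_p99_ms

healthy = should_rollback(0.02, 0.05, 800, 1200)   # within both thresholds
errors = should_rollback(0.02, 0.20, 800, 1200)    # error rate up 18 points
latency = should_rollback(0.02, 0.03, 800, 2000)   # p99 is 2.5x baseline
```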
## Fix Lifecycle

1. Error detected -> diagnosis runs
2. Fix generated -> appears in dashboard
3. Fix approved -> synced to runtime via API
4. A/B test -> `traffic_percentage` controls split
5. Statistical validation
6. Full rollout or rollback
## Global Runtime

For convenience, you can use the global runtime functions:

```python
from risicare.runtime import init_runtime, get_runtime, shutdown_runtime

# Initialize and start
runtime = init_runtime(config=config, auto_start=True)

# Get the global instance anywhere
runtime = get_runtime()

# Shutdown
shutdown_runtime()
```
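A global accessor of this shape boils down to a module-level singleton. The sketch below mirrors the three function names above but is purely illustrative (a plain dict stands in for the `FixRuntime` instance):

```python
_runtime = None  # module-level singleton slot

def init_runtime(config, auto_start=True):
    """Create (or replace) the process-wide runtime instance."""
    global _runtime
    _runtime = {"config": config, "started": auto_start}  # stand-in object
    return _runtime

def get_runtime():
    """Return the global instance, or None before init."""
    return _runtime

def shutdown_runtime():
    """Drop the global instance."""
    global _runtime
    _runtime = None

rt = init_runtime({"enabled": True})
same = get_runtime()   # same object from anywhere in the process
shutdown_runtime()
after = get_runtime()  # None once shut down
```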