Cost Tracking

Track LLM costs across providers.

Risicare automatically tracks LLM costs across 14 providers using real-time pricing data. On the main dashboard, the LLM Cost KPI card shows total spend and the Models chart breaks down requests by model.

Automatic Cost Calculation

Cost is calculated for every LLM call:

import risicare
from openai import OpenAI
 
risicare.init()
 
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)
 
# Span includes:
# - gen_ai.usage.prompt_tokens: 10
# - gen_ai.usage.completion_tokens: 15
# - cost.usd: 0.000175
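
The cost.usd value above follows from the token counts and the per-1M-token prices listed under Pricing Examples. The following is a minimal sketch of that arithmetic, not Risicare's internal implementation; the function name `llm_call_cost_usd` is hypothetical and used only for illustration.

```python
def llm_call_cost_usd(prompt_tokens, completion_tokens,
                      input_price_per_million, output_price_per_million):
    """Return the USD cost of one call, given prices quoted per 1M tokens."""
    input_cost = prompt_tokens * input_price_per_million / 1_000_000
    output_cost = completion_tokens * output_price_per_million / 1_000_000
    return input_cost + output_cost


# gpt-4o: $2.50 input and $10.00 output per 1M tokens
cost = llm_call_cost_usd(10, 15, 2.50, 10.00)
print(f"{cost:.6f}")  # 0.000175
```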

Supported Providers

Provider	Pricing	Cache Support
OpenAI	Per-token	50% cached discount
Anthropic	Per-token	90% cached discount
Google	Per-token	-
Cohere	Per-token	-
Mistral	Per-token	-
Groq	Per-token	-
Together AI	Per-token	-
Amazon Bedrock	Per-token	-
Vertex AI	Per-token	-
Cerebras	Per-token	-
HuggingFace	Per-request	-
Fireworks	Per-token	-
xAI	Per-token	-
Ollama	Free (local)	-

Pricing Examples

Key model pricing (per 1M tokens, as of February 2026):

OpenAI

Model	Input	Output	Cached Input
gpt-4o	$2.50	$10.00	$1.25
gpt-4o-mini	$0.15	$0.60	$0.075
o1	$15.00	$60.00	$7.50
o1-mini	$3.00	$12.00	$1.50
gpt-4-turbo	$10.00	$30.00	-

Anthropic

Model	Input	Output	Cached Input
claude-opus-4-5	$15.00	$75.00	$1.50
claude-sonnet-4-5	$3.00	$15.00	$0.30
claude-haiku-4-5	$0.80	$4.00	$0.08
claude-3-5-sonnet	$3.00	$15.00	$0.30
claude-3-haiku	$0.25	$1.25	$0.03

Google

Model	Input	Output
gemini-2.0-pro	$1.25	$5.00
gemini-2.0-flash	$0.10	$0.40
gemini-1.5-pro	$1.25	$5.00
gemini-1.5-flash	$0.075	$0.30

Cache Token Support

Cached tokens are automatically discounted:

Anthropic (90% discount)

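import anthropic

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-5-20250929",
    max_tokens=1024,  # required by the Messages API
    system=[{
        "type": "text",
        "text": long_prompt,  # long, reusable system prompt defined elsewhere
        "cache_control": {"type": "ephemeral"}
    }],
    messages=[...]
)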
 
# Span includes:
# - gen_ai.usage.cache_read_input_tokens: 5000
# - cost.cached_usd: 0.0015 (90% less than uncached)

OpenAI (50% discount)

# Cached requests are detected automatically
# cost.cached_usd reflects the 50% discount
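
To make the discounts concrete, the sketch below reproduces the cached-input arithmetic for both providers using the prices from the tables above. It is an illustration under stated assumptions rather than Risicare's actual calculation; `cached_input_cost_usd` is a hypothetical helper name.

```python
def cached_input_cost_usd(cached_tokens, input_price_per_million, cache_discount):
    """Cost of cached input tokens after applying a fractional discount."""
    discounted_price = input_price_per_million * (1 - cache_discount)
    return cached_tokens * discounted_price / 1_000_000


# claude-sonnet-4-5: $3.00 per 1M input tokens, 90% cached discount
anthropic_cached = cached_input_cost_usd(5000, 3.00, 0.90)
# gpt-4o: $2.50 per 1M input tokens, 50% cached discount
openai_cached = cached_input_cost_usd(5000, 2.50, 0.50)

print(f"{anthropic_cached:.4f}")  # 0.0015
print(f"{openai_cached:.5f}")     # 0.00625
```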

Dashboard Views

Cost by Provider

View total cost breakdown by provider:

OpenAI:     $142.50 (45%)
Anthropic:   $98.20 (31%)
Google:      $45.30 (14%)
Others:      $31.00 (10%)
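
The percentages in this view are each provider's share of total spend. A small sketch of that calculation, using the sample figures above (the function name `cost_shares` is an assumption for illustration):

```python
def cost_shares(costs_by_provider):
    """Return each provider's share of total cost as a whole-number percentage."""
    total = sum(costs_by_provider.values())
    if total == 0:
        return {name: 0 for name in costs_by_provider}
    return {name: round(100 * cost / total)
            for name, cost in costs_by_provider.items()}


shares = cost_shares({
    "OpenAI": 142.50,
    "Anthropic": 98.20,
    "Google": 45.30,
    "Others": 31.00,
})
print(shares)  # {'OpenAI': 45, 'Anthropic': 31, 'Google': 14, 'Others': 10}
```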

Cost by Model

See which models cost the most:

gpt-4o:                $98.50
claude-sonnet-4-5:     $67.20
gpt-4o-mini:           $22.00
gemini-2.0-pro:        $18.30

Cost by Feature

Track costs per feature or endpoint:

/api/chat:           $145.00
/api/summarize:       $67.00
/api/search:          $32.00

API Access

Cost data is available per-trace and per-span via the standard management API:

# Get traces with cost data
curl "https://app.risicare.ai/api/v1/traces?limit=50" \
  -H "Authorization: Bearer rsk-..."

Each trace includes total_cost_usd and each span includes llm_cost_usd in the response. Use the dashboard for aggregated cost breakdowns by provider, model, or time period.

Programmatic cost aggregation

A dedicated cost analytics API endpoint is on the roadmap. For now, aggregate cost data by querying traces and summing total_cost_usd, or use the dashboard's built-in cost views.
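
Until that endpoint exists, client-side aggregation might look like the sketch below. It assumes the traces response has already been fetched and parsed into a list of dictionaries, each carrying the total_cost_usd field described above. The exact shape of the response body is not documented here, so the sample data and the `total_cost_usd` helper are hypothetical.

```python
def total_cost_usd(traces):
    """Sum total_cost_usd across traces, treating missing or null values as 0."""
    return sum(trace.get("total_cost_usd") or 0.0 for trace in traces)


# Hypothetical parsed traces; real responses may carry additional fields.
sample_traces = [
    {"id": "trace-1", "total_cost_usd": 0.25},
    {"id": "trace-2", "total_cost_usd": 0.5},
    {"id": "trace-3"},  # a trace with no LLM calls may lack a cost (assumption)
]
print(total_cost_usd(sample_traces))  # 0.75
```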

Cost Alerts

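Configure a cost alert through the dashboard or the API. The example below defines a daily spend threshold of $100 with notifications sent to a Slack webhook: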

# Via dashboard or API
alert = {
    "type": "cost",
    "threshold": 100.00,  # USD per day
    "channel": "slack",
    "webhook": "https://hooks.slack.com/..."
}
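
As a rough illustration of what such a threshold means, the sketch below checks a day's spend against an alert definition of the shape shown above. How Risicare evaluates alerts internally is not specified here; the strict greater-than comparison and the `exceeds_cost_threshold` name are assumptions.

```python
def exceeds_cost_threshold(daily_spend_usd, alert):
    """Return True if a cost alert's daily USD threshold has been exceeded."""
    # Assumption: the alert fires only when spend is strictly above the threshold.
    return alert["type"] == "cost" and daily_spend_usd > alert["threshold"]


example_alert = {"type": "cost", "threshold": 100.00, "channel": "slack"}
print(exceeds_cost_threshold(142.50, example_alert))  # True
print(exceeds_cost_threshold(98.20, example_alert))   # False
```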

Cost Optimization

Risicare identifies cost optimization opportunities:

Model downgrade suggestions: "Use gpt-4o-mini for simple queries"
Caching opportunities: "Enable Anthropic caching for system prompts"
Token reduction: "Reduce prompt length by 40%"
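
To gauge what a model downgrade suggestion could be worth, the sketch below compares the cost of the same token volume on two models, using the OpenAI prices from the table above. It is a back-of-the-envelope estimate, not the logic Risicare uses to generate suggestions; the helper names are hypothetical.

```python
# (input, output) prices in USD per 1M tokens, from the OpenAI pricing table
PRICES_PER_MILLION = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
}


def workload_cost_usd(model, input_tokens, output_tokens):
    """Cost of a token volume on one model, ignoring cache discounts."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def downgrade_savings_usd(current_model, cheaper_model, input_tokens, output_tokens):
    """USD saved by moving the same token volume to a cheaper model."""
    return (workload_cost_usd(current_model, input_tokens, output_tokens)
            - workload_cost_usd(cheaper_model, input_tokens, output_tokens))


savings = downgrade_savings_usd("gpt-4o", "gpt-4o-mini", 1_000_000, 1_000_000)
print(f"{savings:.2f}")  # 11.75
```

The estimate only covers price; whether the cheaper model handles the queries well enough is a separate question.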

Next Steps

Evaluations: Evaluate LLM outputs
Traces: View execution traces
