
OpenAI-Compatible

Instrument any OpenAI-compatible API provider.

Many providers offer OpenAI-compatible APIs. Risicare automatically detects and correctly labels these providers.

How It Works

When using the OpenAI SDK with a custom base_url, Risicare automatically detects the provider from the URL and sets the correct gen_ai.system attribute.

import risicare
from openai import OpenAI
 
risicare.init()
 
# Using Together AI via OpenAI SDK
client = OpenAI(
    base_url="https://api.together.xyz/v1",
    api_key="your-together-key"
)
 
# Span will show gen_ai.system = "together" (auto-detected)
response = client.chat.completions.create(
    model="meta-llama/Llama-3-70b-chat-hf",
    messages=[{"role": "user", "content": "Hello!"}]
)

Supported Providers

| Provider | Base URL | Auto-detected |
| --- | --- | --- |
| Together AI | api.together.xyz | Yes |
| Groq | api.groq.com | Yes |
| DeepSeek | api.deepseek.com | Yes |
| xAI (Grok) | api.x.ai | Yes |
| Fireworks | api.fireworks.ai | Yes |
| Baseten | inference.baseten.co | Yes |
| Novita | api.novita.ai | Yes |
| BytePlus | ark.cn-beijing.byteplus.com | Yes |
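Conceptually, this auto-detection amounts to a hostname lookup against a table like the one above. A minimal sketch of the idea — the mapping below is illustrative, not Risicare's actual internals; only the "together" label is confirmed elsewhere on this page, and the other label strings are assumptions:

```python
from urllib.parse import urlparse

# Hypothetical hostname-to-label table. "together" matches the example
# earlier on this page; the remaining values are illustrative guesses.
_HOST_TO_SYSTEM = {
    "api.together.xyz": "together",
    "api.groq.com": "groq",
    "api.deepseek.com": "deepseek",
    "api.x.ai": "xai",
    "api.fireworks.ai": "fireworks",
    "inference.baseten.co": "baseten",
    "api.novita.ai": "novita",
    "ark.cn-beijing.byteplus.com": "byteplus",
}

def detect_system(base_url: str) -> str:
    """Return a gen_ai.system label for a base_url, defaulting to 'openai'."""
    host = urlparse(base_url).hostname or ""
    # Local servers are labeled "openai" (see Local Detection below).
    if host in ("localhost", "127.0.0.1"):
        return "openai"
    return _HOST_TO_SYSTEM.get(host, "openai")
```

Unknown hosts fall back to "openai", since any server reached through the OpenAI SDK speaks the OpenAI-compatible API.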

Provider Examples

DeepSeek

client = OpenAI(
    base_url="https://api.deepseek.com/v1",
    api_key="your-deepseek-key"
)
 
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Hello!"}]
)

xAI (Grok)

client = OpenAI(
    base_url="https://api.x.ai/v1",
    api_key="your-xai-key"
)
 
response = client.chat.completions.create(
    model="grok-beta",
    messages=[{"role": "user", "content": "Hello!"}]
)

Fireworks

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key="your-fireworks-key"
)
 
response = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3-70b-instruct",
    messages=[{"role": "user", "content": "Hello!"}]
)

Captured Attributes

All OpenAI-compatible providers capture:

| Attribute | Description |
| --- | --- |
| gen_ai.system | Provider name (auto-detected) |
| gen_ai.request.model | Model name requested |
| gen_ai.usage.prompt_tokens | Input tokens |
| gen_ai.usage.completion_tokens | Output tokens |
| gen_ai.usage.total_tokens | Total tokens |
| gen_ai.response.model | Actual model used |
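To make the mapping concrete, here is a sketch of how a completion's fields line up with the attributes above. The helper name is my own, and the response is shown as a plain dict for clarity; the SDK returns objects with the same field names:

```python
def usage_attributes(request_model: str, response: dict) -> dict:
    """Map a chat completion's fields onto the span attributes listed above."""
    return {
        "gen_ai.request.model": request_model,
        "gen_ai.response.model": response["model"],
        "gen_ai.usage.prompt_tokens": response["usage"]["prompt_tokens"],
        "gen_ai.usage.completion_tokens": response["usage"]["completion_tokens"],
        "gen_ai.usage.total_tokens": response["usage"]["total_tokens"],
    }
```

Note that gen_ai.response.model can differ from gen_ai.request.model when a provider resolves an alias to a specific model version.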

Streaming

All providers support streaming:

stream = client.chat.completions.create(
    model="meta-llama/Llama-3-70b-chat-hf",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
)
 
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")

Local Servers

Risicare also works with local OpenAI-compatible servers:

# LM Studio
client = OpenAI(
    base_url="http://localhost:1234/v1",
    api_key="lm-studio"
)
 
# vLLM
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="vllm"
)
 
# Text Generation Inference
client = OpenAI(
    base_url="http://localhost:8080/v1",
    api_key="tgi"
)

Local Detection

Local servers (localhost, 127.0.0.1) are labeled as gen_ai.system = "openai" since they use the OpenAI-compatible API.
