Together AI
Auto-instrument Together AI for open-source models.
Risicare automatically instruments the Together AI SDK for open-source model inference.
Installation
```bash
pip install risicare together
```
Auto-Instrumentation
```python
import risicare
from together import Together

risicare.init()
client = Together()

# Automatically traced
response = client.chat.completions.create(
    model="meta-llama/Llama-3-70b-chat-hf",
    messages=[{"role": "user", "content": "Hello!"}]
)
```
Captured Attributes
| Attribute | Description |
|---|---|
| gen_ai.system | together |
| gen_ai.request.model | Requested model name |
| gen_ai.response.model | Model name returned by API |
| gen_ai.response.id | Response ID |
| gen_ai.request.temperature | Sampling temperature |
| gen_ai.request.max_tokens | Max output tokens |
| gen_ai.request.stream | Whether streaming was requested |
| gen_ai.request.has_tools | Whether tools were provided |
| gen_ai.usage.prompt_tokens | Input tokens |
| gen_ai.usage.completion_tokens | Output tokens |
| gen_ai.usage.total_tokens | Total tokens |
| gen_ai.completion.tool_calls | Number of tool calls made |
| gen_ai.completion.finish_reason | Stop reason |
| gen_ai.latency_ms | Request latency in milliseconds |
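Auto-instrumentation of this kind is typically implemented by wrapping the SDK's `create` method. The sketch below is not Risicare's actual implementation; it only illustrates how the request-side and latency attributes above could be captured. The `wrap_create` helper and its `record` callback are hypothetical names.

```python
import time
from typing import Any, Callable


def wrap_create(
    create: Callable[..., Any],
    record: Callable[[dict], None],
) -> Callable[..., Any]:
    """Illustrative wrapper: records gen_ai.* attributes around a chat call."""

    def traced(*args: Any, **kwargs: Any) -> Any:
        attrs = {
            "gen_ai.system": "together",
            "gen_ai.request.model": kwargs.get("model"),
            "gen_ai.request.stream": bool(kwargs.get("stream", False)),
            "gen_ai.request.has_tools": "tools" in kwargs,
        }
        start = time.perf_counter()
        try:
            return create(*args, **kwargs)
        finally:
            # Latency is recorded even if the call raises.
            attrs["gen_ai.latency_ms"] = (time.perf_counter() - start) * 1000.0
            record(attrs)

    return traced
```

Response-side attributes (usage, finish reason) would be read from the returned object in the same wrapper.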
Streaming
```python
stream = client.chat.completions.create(
    model="meta-llama/Llama-3-70b-chat-hf",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
```
Popular Models
| Model | Description |
|---|---|
| meta-llama/Llama-3-70b-chat-hf | Llama 3 70B Chat |
| meta-llama/Llama-3-8b-chat-hf | Llama 3 8B Chat |
| mistralai/Mixtral-8x7B-Instruct-v0.1 | Mixtral 8x7B |
| mistralai/Mistral-7B-Instruct-v0.2 | Mistral 7B |
| Qwen/Qwen2-72B-Instruct | Qwen2 72B |
Embeddings
```python
response = client.embeddings.create(
    model="togethercomputer/m2-bert-80M-8k-retrieval",
    input=["Hello, world!"]
)
```
Instrumentation Scope
The Risicare Together provider patch instruments chat completions only. Embedding calls are still traced, but via OpenAI-compatible host detection rather than a dedicated Together embeddings patch.