# Groq

Auto-instrument Groq for ultra-fast inference.

Risicare automatically instruments the Groq SDK, tracing every chat completion made through its ultra-low-latency inference API.
## Installation

```bash
pip install risicare groq
```

## Auto-Instrumentation
```python
import risicare
from groq import Groq

risicare.init()

client = Groq()

# Automatically traced
response = client.chat.completions.create(
    model="llama-3.1-70b-versatile",
    messages=[{"role": "user", "content": "Hello!"}]
)
```
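The Groq SDK also ships an async client. A minimal sketch, assuming Risicare patches `AsyncGroq` the same way it patches `Groq` (check your version's integration list):

```python
import asyncio

import risicare
from groq import AsyncGroq

risicare.init()

client = AsyncGroq()

async def main():
    # Traced the same way as the sync client,
    # assuming the async client is also instrumented
    response = await client.chat.completions.create(
        model="llama-3.1-8b-instant",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response.choices[0].message.content)

asyncio.run(main())
```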
## Captured Attributes

| Attribute | Description |
|---|---|
| `gen_ai.system` | `groq` |
| `gen_ai.request.model` | Requested model name |
| `gen_ai.response.model` | Model name returned by the API |
| `gen_ai.response.id` | Response ID |
| `gen_ai.request.temperature` | Sampling temperature |
| `gen_ai.request.max_tokens` | Maximum output tokens |
| `gen_ai.request.stream` | Whether streaming was requested |
| `gen_ai.request.has_tools` | Whether tools were provided |
| `gen_ai.usage.prompt_tokens` | Input tokens |
| `gen_ai.usage.completion_tokens` | Output tokens |
| `gen_ai.usage.total_tokens` | Total tokens |
| `gen_ai.completion.tool_calls` | Number of tool calls made |
| `gen_ai.completion.finish_reason` | Stop reason |
| `gen_ai.latency_ms` | Request latency in milliseconds |
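For example, a request that passes `tools` should set `gen_ai.request.has_tools`, and if the model invokes a tool, `gen_ai.completion.tool_calls` should record the count. The `get_weather` tool below is purely illustrative:

```python
import risicare
from groq import Groq

risicare.init()

client = Groq()

# Hypothetical tool schema, defined only to show the tool attributes
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"]
        }
    }
}]

# Traced with gen_ai.request.has_tools = true; if the model calls
# get_weather, gen_ai.completion.tool_calls reflects the number of calls
response = client.chat.completions.create(
    model="llama-3.1-70b-versatile",
    messages=[{"role": "user", "content": "What's the weather in Oslo?"}],
    tools=tools
)
```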
## Streaming

```python
stream = client.chat.completions.create(
    model="llama-3.1-70b-versatile",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
)

for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="")
```
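Note: attributes that depend on the end of the response, such as `gen_ai.usage.*`, `gen_ai.completion.finish_reason`, and `gen_ai.latency_ms`, can typically only be recorded once the stream has been fully consumed, so iterate to the final chunk before expecting them on the span.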
## Supported Models

| Model | Description |
|---|---|
| `llama-3.1-405b-reasoning` | Llama 3.1 405B |
| `llama-3.1-70b-versatile` | Llama 3.1 70B |
| `llama-3.1-8b-instant` | Llama 3.1 8B |
| `mixtral-8x7b-32768` | Mixtral 8x7B |
| `gemma2-9b-it` | Gemma 2 9B |