# Vertex AI

Auto-instrument Google Vertex AI.
Risicare automatically instruments Google Vertex AI for Gemini and other models.
## Installation

```bash
pip install risicare google-cloud-aiplatform
```

## Auto-Instrumentation
```python
import risicare
import vertexai
from vertexai.generative_models import GenerativeModel

risicare.init()
vertexai.init(project="your-project", location="us-central1")

model = GenerativeModel("gemini-1.5-pro")

# Automatically traced
response = model.generate_content("Hello!")
```

## Captured Attributes
| Attribute | Description |
|---|---|
| `gen_ai.system` | `vertexai` |
| `gen_ai.request.model` | Model name |
| `gen_ai.request.stream` | Whether streaming was requested |
| `gen_ai.usage.prompt_tokens` | Input tokens (from `prompt_token_count`) |
| `gen_ai.usage.completion_tokens` | Output tokens (from `candidates_token_count`) |
| `gen_ai.usage.total_tokens` | Total tokens (from `total_token_count`) |
| `gen_ai.completion.finish_reason` | Stop reason |
| `gen_ai.latency_ms` | Request latency in milliseconds |
| `gen_ai.response.candidates` | Number of response candidates |
| `gen_ai.prompt.parts` | Number of prompt parts |
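Because these names resemble the OpenTelemetry `gen_ai.*` conventions, downstream code can key on them directly. As an illustration only (not part of the Risicare API), a hypothetical filter that flags expensive calls from a span's attributes, treated here as a flat dict, might look like:

```python
# Hypothetical helper: decide whether a finished Vertex AI span was
# "expensive", given its attributes as a flat dict. Attribute names match
# the table above; the thresholds are arbitrary examples.

def is_expensive_call(attributes: dict,
                      token_limit: int = 4000,
                      latency_limit_ms: float = 5000.0) -> bool:
    """Flag spans that used many tokens or took too long."""
    total_tokens = attributes.get("gen_ai.usage.total_tokens", 0)
    latency_ms = attributes.get("gen_ai.latency_ms", 0.0)
    return total_tokens > token_limit or latency_ms > latency_limit_ms

span_attrs = {
    "gen_ai.system": "vertexai",
    "gen_ai.request.model": "gemini-1.5-pro",
    "gen_ai.usage.prompt_tokens": 1200,
    "gen_ai.usage.completion_tokens": 3400,
    "gen_ai.usage.total_tokens": 4600,
    "gen_ai.latency_ms": 1800.0,
}
print(is_expensive_call(span_attrs))  # → True (4600 tokens exceeds the limit)
```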
## Streaming

```python
responses = model.generate_content(
    "Write a story",
    stream=True
)
for response in responses:
    print(response.text, end="")
```

## Chat Sessions
**Not instrumented:** `ChatSession.send_message` is not instrumented; only `GenerativeModel.generate_content` calls are traced. If you need tracing for multi-turn conversations, call `generate_content` directly with the full message history instead of using `ChatSession`.
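One way to do that is to keep the conversation as a list of role/parts dictionaries and pass the whole list to `generate_content` on each turn, so every turn goes through the instrumented call. A minimal sketch, assuming your SDK version accepts the dict form of `Content` (otherwise build `Content`/`Part` objects instead):

```python
# Sketch: multi-turn conversation without ChatSession, so every turn is
# traced via GenerativeModel.generate_content. Assumes generate_content
# accepts Content-style dicts; adapt to Content/Part objects if needed.

history = []

def add_turn(role: str, text: str) -> None:
    """Append one message to the running conversation history."""
    history.append({"role": role, "parts": [{"text": text}]})

add_turn("user", "What is the capital of France?")
# response = model.generate_content(history)  # traced by Risicare
# add_turn("model", response.text)
add_turn("model", "Paris.")  # placeholder for the real response text
add_turn("user", "And its population?")
# response = model.generate_content(history)  # traced again, full context

print(len(history))        # → 3
print(history[0]["role"])  # → user
```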
## Supported Models

| Model | Description |
|---|---|
| `gemini-1.5-pro` | Gemini 1.5 Pro |
| `gemini-1.5-flash` | Gemini 1.5 Flash |
| `gemini-1.0-pro` | Gemini 1.0 Pro |
| `text-bison` | PaLM 2 Text |
| `chat-bison` | PaLM 2 Chat |
## Function Calling

```python
from vertexai.generative_models import FunctionDeclaration, GenerativeModel, Tool

get_weather = FunctionDeclaration(
    name="get_weather",
    description="Get weather for a location",
    parameters={
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
)

weather_tool = Tool(function_declarations=[get_weather])
model = GenerativeModel("gemini-1.5-pro", tools=[weather_tool])

response = model.generate_content("What's the weather in Paris?")
# Function calls are captured as child spans
```
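Tracing records the function call, but the application still has to execute it. A hypothetical dispatcher sketch, assuming the returned call exposes a `name` and an `args` mapping (as the SDK's `FunctionCall` objects do); the weather lookup itself is a stub:

```python
# Hypothetical dispatcher: route a model-issued function call to a local
# Python implementation. `name`/`args` mirror the shape of the SDK's
# FunctionCall objects; the weather lookup is a stub for illustration.

def get_weather(location: str) -> str:
    """Stub implementation of the declared get_weather tool."""
    return f"Sunny in {location}"

HANDLERS = {"get_weather": get_weather}

def dispatch(name: str, args: dict) -> str:
    """Execute the tool the model asked for, or fail loudly."""
    try:
        handler = HANDLERS[name]
    except KeyError:
        raise ValueError(f"Model requested unknown tool: {name}")
    return handler(**args)

# In real code, extract the call from the traced response, e.g.:
#   call = response.candidates[0].function_calls[0]
#   result = dispatch(call.name, dict(call.args))
print(dispatch("get_weather", {"location": "Paris"}))  # → Sunny in Paris
```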