LlamaIndex

Auto-instrument LlamaIndex for RAG applications.

Risicare automatically instruments LlamaIndex for comprehensive RAG observability.

Installation

pip install "risicare[llamaindex]"
# or
pip install risicare llama-index

Version Compatibility

Requires llama-index-core >= 0.10.20.
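If you want to verify the installed version at runtime, a standard-library sketch (it assumes a plain numeric version string like `0.10.20`):

```python
from importlib.metadata import PackageNotFoundError, version

MINIMUM = (0, 10, 20)

def parse_version(v: str) -> tuple:
    # "0.10.20" -> (0, 10, 20); assumes a plain numeric version string
    return tuple(int(part) for part in v.split(".")[:3])

def meets_minimum() -> bool:
    # True only if llama-index-core is installed and new enough
    try:
        return parse_version(version("llama-index-core")) >= MINIMUM
    except PackageNotFoundError:
        return False

if not meets_minimum():
    print("llama-index-core >= 0.10.20 is required for this integration")
```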

Auto-Instrumentation

import risicare
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
 
risicare.init()
 
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
 
# Automatically traced
response = query_engine.query("What is Risicare?")

What's Captured

| Feature | Description |
| --- | --- |
| Query Engine | Full query execution |
| Retrievers | Document retrieval with scores |
| LLM Calls | Synthesis and completion calls (deduplicated) |
| Embeddings | Embedding generation |
| Node Parsing | Document chunking |
| Index Operations | Index building and updates |

Span Hierarchy

llamaindex.query/{cls}
├── llamaindex.retrieve/{cls}
│   └── llamaindex.embedding/{model}
├── llamaindex.synthesize/{cls}
│   └── llamaindex.llm/{model}
└── llamaindex.component/{cls}
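The `{cls}` and `{model}` placeholders are filled in from the runtime object, e.g. the component's class name. A hypothetical illustration of that naming scheme (the class is a stand-in, not a real LlamaIndex type):

```python
class VectorIndexRetriever:
    """Stand-in for a LlamaIndex retriever class."""

def span_name(prefix: str, obj: object) -> str:
    # Span names combine an operation prefix with the component's class name
    return f"{prefix}/{type(obj).__name__}"

print(span_name("llamaindex.retrieve", VectorIndexRetriever()))
# -> llamaindex.retrieve/VectorIndexRetriever
```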

Provider Deduplication

When using LlamaIndex, underlying LLM and embedding provider spans are automatically suppressed to avoid duplicate traces. You don't need to disable provider instrumentation manually. Other provider span types (e.g., tool calls) are not suppressed.

Query Engines

All query engine types are traced:

# Simple query engine
query_engine = index.as_query_engine()
 
# Chat engine
chat_engine = index.as_chat_engine()
response = chat_engine.chat("Hello!")
 
# Retriever-based
retriever = index.as_retriever()
nodes = retriever.retrieve("query")

Retrievers

Retrieval operations capture document details:

retriever = index.as_retriever(similarity_top_k=5)
nodes = retriever.retrieve("What is AI?")
 
# Each node's score and content are captured
for node in nodes:
    print(node.score, node.get_content()[:80])

Agents

LlamaIndex agents are fully traced:

from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI
 
def search(query: str) -> str:
    return f"Results for {query}"
 
tool = FunctionTool.from_defaults(fn=search)
llm = OpenAI()
agent = ReActAgent.from_tools([tool], llm=llm)
 
# Agent iterations and tool calls traced
response = agent.chat("Search for AI news")

Embeddings

Embedding operations are captured:

from llama_index.embeddings.openai import OpenAIEmbedding
 
embed_model = OpenAIEmbedding()
embedding = embed_model.get_text_embedding("Hello")
 
# Embedding dimension and model captured

Index Building

Index creation is traced:

# Document loading
documents = SimpleDirectoryReader("data").load_data()
 
# Index building (chunking + embedding)
index = VectorStoreIndex.from_documents(
    documents,
    show_progress=True
)
# Node parsing and embedding spans

Streaming

query_engine = index.as_query_engine(streaming=True)
response = query_engine.query("Write about AI")
 
for text in response.response_gen:
    print(text, end="")

Next Steps