LlamaIndex
Auto-instrument LlamaIndex for RAG applications.
Risicare automatically instruments LlamaIndex for comprehensive RAG observability.
Installation
pip install risicare[llamaindex]
# or
pip install risicare llama-index
Version Compatibility
Requires llama-index-core >= 0.10.20.
Auto-Instrumentation
import risicare
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
risicare.init()
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
# Automatically traced
response = query_engine.query("What is Risicare?")
What's Captured
| Feature | Description |
|---|---|
| Query Engine | Full query execution |
| Retrievers | Document retrieval with scores |
| LLM Calls | Synthesis and completion calls (deduplicated) |
| Embeddings | Embedding generation |
| Node Parsing | Document chunking |
| Index Operations | Index building and updates |
Span Hierarchy
llamaindex.query/{cls}
├── llamaindex.retrieve/{cls}
│ └── llamaindex.embedding/{model}
├── llamaindex.synthesize/{cls}
│ └── llamaindex.llm/{model}
└── llamaindex.component/{cls}
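As an illustration of how this nesting arises, here is a minimal, stdlib-only sketch of parent tracking with a contextvars stack. The `start_span` helper and the concrete class/model names are hypothetical stand-ins, not Risicare's API:

```python
import contextvars
from contextlib import contextmanager

# Stack of currently open span names; empty tuple means "no parent".
_stack = contextvars.ContextVar("span_stack", default=())

spans = []

@contextmanager
def start_span(name):
    # The parent is whatever span is currently innermost on the stack.
    current = _stack.get()
    spans.append({"name": name, "parent": current[-1] if current else None})
    token = _stack.set(current + (name,))
    try:
        yield
    finally:
        _stack.reset(token)

# Reproduce the hierarchy shown above (names are illustrative).
with start_span("llamaindex.query/RetrieverQueryEngine"):
    with start_span("llamaindex.retrieve/VectorIndexRetriever"):
        with start_span("llamaindex.embedding/text-embedding-3-small"):
            pass
    with start_span("llamaindex.synthesize/CompactAndRefine"):
        with start_span("llamaindex.llm/gpt-4o"):
            pass
```

The root query span has no parent; each nested span records the span that was open when it started, which is what produces the tree shown above.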
Provider Deduplication
When using LlamaIndex, underlying LLM and embedding provider spans are automatically suppressed to avoid duplicate traces. You don't need to disable provider instrumentation manually. Other provider span types (e.g., tool calls) are not suppressed.
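The suppression behavior described above can be sketched as a context-scoped flag: while a framework-level LLM span is open, provider LLM spans are dropped, but other provider span types still pass through. This is an illustrative sketch of the mechanism, not Risicare's internals; all helper names are hypothetical:

```python
import contextvars
from contextlib import contextmanager

# True while a LlamaIndex-level LLM span is open in this context.
_suppress_llm = contextvars.ContextVar("suppress_llm", default=False)

emitted = []

@contextmanager
def llamaindex_llm_span(name):
    emitted.append(name)              # framework-level span is always kept
    token = _suppress_llm.set(True)   # suppress nested provider LLM spans
    try:
        yield
    finally:
        _suppress_llm.reset(token)

def provider_llm_span(name):
    if _suppress_llm.get():
        return                        # duplicate of the framework span: dropped
    emitted.append(name)

def provider_tool_span(name):
    emitted.append(name)              # tool-call spans are never suppressed

with llamaindex_llm_span("llamaindex.llm/gpt-4o"):
    provider_llm_span("openai.chat.completions")   # suppressed
    provider_tool_span("openai.tool_call")         # kept

provider_llm_span("openai.chat.completions")       # outside LlamaIndex: kept
```

Because the flag is a contextvar rather than a global, suppression stays scoped to the LlamaIndex call even under concurrency: direct provider calls made elsewhere are still traced.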
Query Engines
All query engine types are traced:
# Simple query engine
query_engine = index.as_query_engine()
# Chat engine
chat_engine = index.as_chat_engine()
response = chat_engine.chat("Hello!")
# Retriever-based
retriever = index.as_retriever()
nodes = retriever.retrieve("query")
Retrievers
Retrieval operations capture document details:
retriever = index.as_retriever(similarity_top_k=5)
nodes = retriever.retrieve("What is AI?")
# Each node's score and content are captured
Agents
LlamaIndex agents are fully traced:
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
def search(query: str) -> str:
    return f"Results for {query}"
tool = FunctionTool.from_defaults(fn=search)
agent = ReActAgent.from_tools([tool], llm=llm)  # llm: any LlamaIndex LLM instance
# Agent iterations and tool calls traced
response = agent.chat("Search for AI news")
Embeddings
Embedding operations are captured:
from llama_index.embeddings.openai import OpenAIEmbedding
embed_model = OpenAIEmbedding()
embedding = embed_model.get_text_embedding("Hello")
# Embedding dimension and model captured
Index Building
Index creation is traced:
# Document loading
documents = SimpleDirectoryReader("data").load_data()
# Index building (chunking + embedding)
index = VectorStoreIndex.from_documents(
    documents,
    show_progress=True
)
# Node parsing and embedding spans
Streaming
Streaming responses are also traced:
query_engine = index.as_query_engine(streaming=True)
response = query_engine.query("Write about AI")
for text in response.response_gen:
    print(text, end="")
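One way to trace a streamed response is to wrap the token generator so that output is recorded only once the stream is fully consumed. This is a hedged sketch of that pattern in plain Python; `record_span` is a stand-in, not a Risicare function:

```python
captured = {}

def record_span(name, output):
    # Stand-in for whatever records span output in a real tracer.
    captured[name] = output

def traced_stream(name, token_gen):
    """Yield tokens through unchanged; finalize the span on exhaustion."""
    chunks = []
    try:
        for chunk in token_gen:
            chunks.append(chunk)
            yield chunk
    finally:
        # The full response only exists after the caller drains the stream.
        record_span(name, "".join(chunks))

# Simulated token stream standing in for response.response_gen.
stream = traced_stream("llamaindex.query/QueryEngine", iter(["AI ", "is ", "useful."]))
text = "".join(stream)
```

The wrapper is transparent to the caller; the span's recorded output matches exactly what the consumer printed or joined.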