LlamaIndex

Auto-instrument LlamaIndex for RAG applications.

Risicare automatically instruments LlamaIndex for comprehensive RAG observability.

Installation

pip install "risicare[llamaindex]"
# or
pip install risicare llama-index

Version Compatibility

Requires llama-index-core >= 0.10.20.
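If you want to verify the installed version at runtime, a standard-library sketch (it assumes a plain numeric version string like `0.10.20`):

```python
from importlib.metadata import PackageNotFoundError, version

MINIMUM = (0, 10, 20)

def parse_version(v: str) -> tuple:
    # "0.10.20" -> (0, 10, 20); assumes a plain numeric version string
    return tuple(int(part) for part in v.split(".")[:3])

def meets_minimum() -> bool:
    # True only if llama-index-core is installed and new enough
    try:
        return parse_version(version("llama-index-core")) >= MINIMUM
    except PackageNotFoundError:
        return False

if not meets_minimum():
    print("llama-index-core >= 0.10.20 is required for this integration")
```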

Auto-Instrumentation

import risicare
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
 
risicare.init()
 
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
 
# Automatically traced
response = query_engine.query("What is Risicare?")

What's Captured

| Feature | Description |
| --- | --- |
| Query Engine | Full query execution |
| Retrievers | Document retrieval with scores |
| LLM Calls | Synthesis and completion calls (deduplicated) |
| Embeddings | Embedding generation |
| Node Parsing | Document chunking |
| Index Operations | Index building and updates |

Span Hierarchy

llamaindex.query/{cls}
├── llamaindex.retrieve/{cls}
│   └── llamaindex.embedding/{model}
├── llamaindex.synthesize/{cls}
│   └── llamaindex.llm/{model}
└── llamaindex.component/{cls}
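The `{cls}` and `{model}` placeholders are filled in from the runtime object, e.g. the component's class name. A hypothetical illustration of that naming scheme (the class is a stand-in, not a real LlamaIndex type):

```python
class VectorIndexRetriever:
    """Stand-in for a LlamaIndex retriever class."""

def span_name(prefix: str, obj: object) -> str:
    # Span names combine an operation prefix with the component's class name
    return f"{prefix}/{type(obj).__name__}"

print(span_name("llamaindex.retrieve", VectorIndexRetriever()))
# -> llamaindex.retrieve/VectorIndexRetriever
```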

Provider Deduplication

When using LlamaIndex, underlying LLM and embedding provider spans are automatically suppressed to avoid duplicate traces. You don't need to disable provider instrumentation manually. Other provider span types (e.g., tool calls) are not suppressed.

Query Engines

All query engine types are traced:

# Simple query engine
query_engine = index.as_query_engine()
 
# Chat engine
chat_engine = index.as_chat_engine()
response = chat_engine.chat("Hello!")
 
# Retriever-based
retriever = index.as_retriever()
nodes = retriever.retrieve("query")

Retrievers

Retrieval operations capture document details:

retriever = index.as_retriever(similarity_top_k=5)
nodes = retriever.retrieve("What is AI?")
 
# Each node's score and content are captured
for node in nodes:
    print(node.score, node.get_content()[:80])

Agents

LlamaIndex agents are fully traced:

from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI
 
def search(query: str) -> str:
    return f"Results for {query}"
 
tool = FunctionTool.from_defaults(fn=search)
llm = OpenAI()
agent = ReActAgent.from_tools([tool], llm=llm)
 
# Agent iterations and tool calls traced
response = agent.chat("Search for AI news")

Embeddings

Embedding operations are captured:

from llama_index.embeddings.openai import OpenAIEmbedding
 
embed_model = OpenAIEmbedding()
embedding = embed_model.get_text_embedding("Hello")
 
# Embedding dimension and model captured

Index Building

Index creation is traced:

# Document loading
documents = SimpleDirectoryReader("data").load_data()
 
# Index building (chunking + embedding)
index = VectorStoreIndex.from_documents(
    documents,
    show_progress=True
)
# Node parsing and embedding spans

Streaming

query_engine = index.as_query_engine(streaming=True)
response = query_engine.query("Write about AI")
 
for text in response.response_gen:
    print(text, end="")

Next Steps