TruLens is an observability framework for Large Language Model (LLM) applications that provides instrumentation, evaluation, and visualization capabilities. The framework wraps existing LLM applications with OpenTelemetry-based tracing to capture execution details, computes evaluation metrics through LLM-based feedback functions, and stores results in configurable databases (SQLite, PostgreSQL, Snowflake).
This page provides a high-level introduction to TruLens' architecture, package organization, and core concepts. For detailed subsystem documentation, see: Core Architecture, Package Structure, Snowflake Integration, User Interfaces, Advanced Features, and Development and Deployment.
Sources: README.md10-26
The TruLens architecture consists of seven primary layers:
Layer Descriptions:
| Layer | Purpose | Key Components |
|---|---|---|
| User Interface | Entry points for developers | Jupyter notebooks, CLI scripts |
| Application Wrapper | Framework-specific instrumentation | TruChain, TruLlama, TruGraph, TruBasicApp, TruCustomApp classes |
| Session Management | Central orchestration and configuration | TruSession class manages database connections, OTEL setup, evaluation lifecycle |
| Core Services | Execution tracing and evaluation | @instrument decorator, TracerProvider, Evaluator thread, RunManager |
| Provider | LLM API abstractions for evaluation | LLMProvider interface with implementations for OpenAI, Bedrock, Cortex, etc. |
| Persistence | Data storage abstractions | DBConnector interface, SQLAlchemyDB, SnowflakeEventTableDB implementations |
| Visualization | Data exploration interface | Streamlit dashboard with trulens_leaderboard(), record viewers, trace viewers |
Sources: README.md27-36
TruLens provides four primary capabilities for LLM application observability:
| Capability | Implementation | Details |
|---|---|---|
| Instrumentation | @instrument decorator, OTEL spans | Wraps application methods to generate structured spans (RECORD_ROOT, GENERATION, RETRIEVAL, EVAL) using OpenTelemetry TracerProvider |
| Evaluation | Feedback class, LLMProvider interface | Computes metrics (groundedness, relevance, coherence) by calling LLM APIs with structured prompts; supports OpenAI, Bedrock, Cortex, HuggingFace, LiteLLM |
| Persistence | DBConnector, SQLAlchemyDB, SnowflakeEventTableDB | Stores spans and evaluation results in SQLite (default), PostgreSQL, MySQL, or Snowflake account event tables |
| Visualization | Streamlit dashboard (trulens.dashboard) | Web interface with trulens_leaderboard() for app comparison, record viewers for trace inspection, feedback visualization |
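The evaluation capability can be illustrated with a plain-Python stand-in (conceptual only, not the actual TruLens Feedback API): a feedback function maps selected inputs to a score in [0, 1], and an aggregator combines per-item scores, as TruLens does across records.

```python
from statistics import mean

def context_relevance(question: str, context: str) -> float:
    """Toy relevance score: fraction of question words found in the context.
    A real TruLens feedback function would prompt an LLM provider instead."""
    q_words = set(question.lower().split())
    c_words = set(context.lower().split())
    return len(q_words & c_words) / len(q_words) if q_words else 0.0

contexts = [
    "TruLens instruments LLM applications",
    "The weather today is sunny",
]
scores = [context_relevance("What instruments LLM applications?", c) for c in contexts]
aggregate = mean(scores)  # TruLens similarly applies an aggregator (e.g., mean)
```

The split between per-item implementation and aggregator mirrors the imp/aggregator separation in the Feedback class described below.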
Supported Frameworks:
- LangChain (TruChain class in trulens-apps-langchain)
- LangGraph (TruGraph class in trulens-apps-langgraph)
- LlamaIndex (TruLlama class in trulens-apps-llamaindex)
- Custom Python functions and objects (TruBasicApp, TruCustomApp classes in trulens-core)

Sources: README.md14-26
The following diagram maps TruLens components to their implementing classes and modules:
Core Component Table:
| Component | Module Path | Key Methods/Attributes | Purpose |
|---|---|---|---|
| App | trulens.core.app | main_call(), with_record() | Base class for wrapping applications with instrumentation |
| TruBasicApp | trulens.core.app.basic | __init__(app_function) | Wraps simple Python functions |
| TruCustomApp | trulens.core.app.custom | __init__(app) | Wraps custom objects with dynamic method detection |
| TruSession | trulens.core.session | __init__(database_url), start_evaluator() | Main API entry point; manages database, OTEL, evaluation |
| @instrument | trulens.core.instruments | Decorator parameters: class_filter, method_filter | Wraps methods to generate OTEL spans |
| DBConnector | trulens.core.database.connector | insert_record(), get_records() | Abstract interface for database operations |
| SQLAlchemyDB | trulens.core.database.sqlalchemy | __init__(database_url) | SQLAlchemy implementation for SQLite/Postgres/MySQL |
| SnowflakeEventTableDB | trulens.connectors.snowflake.connector | workspace_name, workspace_version | Snowflake account event table implementation |
| Feedback | trulens.feedback.feedback | __init__(imp, selector, aggregator) | Defines evaluation metrics with implementation, selector, aggregator |
| LLMProvider | trulens.feedback.llm_provider | generate_score() | Abstract interface for calling LLM APIs |
| Evaluator | trulens.core.experimental.feature | start(), stop() | Background thread for computing feedback on spans |
Sources: README.md27-36
TruLens uses a modular package structure with trulens-core as the foundation and optional extension packages for specific frameworks and providers.
Package Descriptions:
| Package Name | Directory | Key Exports | Dependencies |
|---|---|---|---|
| trulens-otel-semconv | src/otel/semconv/ | OTEL semantic conventions (span kinds, attributes) | opentelemetry-semantic-conventions>=0.36b0 |
| trulens-core | src/core/ | App, TruSession, DBConnector, @instrument | trulens-otel-semconv, opentelemetry-api, opentelemetry-sdk, sqlalchemy, pydantic |
| trulens-feedback | src/feedback/ | Feedback, LLMProvider, Groundedness, Relevance | trulens-core, nltk, scikit-learn |
| trulens-dashboard | src/dashboard/ | run_dashboard(), trulens_leaderboard() | trulens-core, streamlit, plotly |
| trulens-apps-langchain | src/apps/langchain/ | TruChain | trulens-core, langchain, langchain-core |
| trulens-apps-langgraph | src/apps/langgraph/ | TruGraph | trulens-core, trulens-apps-langchain, langgraph |
| trulens-apps-llamaindex | src/apps/llamaindex/ | TruLlama | trulens-core, llama-index |
| trulens-providers-openai | src/providers/openai/ | OpenAI | trulens-core, trulens-feedback, openai |
| trulens-providers-cortex | src/providers/cortex/ | Cortex | trulens-core, trulens-feedback, snowflake-snowpark-python |
| trulens-connectors-snowflake | src/connectors/snowflake/ | SnowflakeConnector, SnowflakeEventTableDB | trulens-core, snowflake-snowpark-python, snowflake-sqlalchemy |
| trulens | src/trulens/ | Meta-package (no code, aggregates dependencies) | trulens-core, trulens-feedback, trulens-dashboard |
Sources: README.md39-43
The following diagram illustrates the execution flow from application invocation through evaluation to storage:
Execution Phases:
1. The developer wraps the application in a TruApp wrapper with the application and a list of Feedback objects
2. The with_record() context creates a RECORD_ROOT span via the OTEL TracerProvider
3. The @instrument decorator generates child spans (GENERATION, RETRIEVAL) during execution
4. force_flush() sends spans to the SpanExporter, which calls DBConnector.insert_record()
5. The Evaluator thread retrieves spans, executes Feedback.imp() functions via LLMProvider.generate_score(), and stores results with insert_feedback()
6. TruSession.get_records_and_feedback() queries results as a pandas DataFrame

Sources: README.md45-51
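The execution phases can be mimicked with plain-Python stand-ins (hypothetical names; not the real TruLens or OpenTelemetry classes) to show how a root span, child spans, and a flush-to-store step fit together:

```python
import contextlib

store = []  # stand-in for the database behind DBConnector.insert_record()
spans = []  # stand-in for the in-memory span buffer before force_flush()

class Span:
    def __init__(self, kind, parent=None):
        self.kind, self.parent, self.attributes = kind, parent, {}

@contextlib.contextmanager
def record_root():
    """Stand-in for with_record(): opens a RECORD_ROOT span for the call."""
    root = Span("RECORD_ROOT")
    spans.append(root)
    yield root

def instrument(kind):
    """Stand-in for @instrument: wraps a function so each call emits a child span."""
    def decorator(fn):
        def wrapper(*args, **kwargs):
            spans.append(Span(kind, parent=spans[0]))
            return fn(*args, **kwargs)
        return wrapper
    return decorator

@instrument("RETRIEVAL")
def retrieve(query):
    return ["doc1"]

@instrument("GENERATION")
def generate(query, docs):
    return f"answer to {query}"

with record_root():
    docs = retrieve("q")
    answer = generate("q", docs)

store.extend(spans)  # stand-in for force_flush() -> SpanExporter -> insert_record()
```

After the "flush", the store holds one RECORD_ROOT span followed by its RETRIEVAL and GENERATION children, which is the shape the Evaluator thread later reads back.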
TruLens supports multiple LLM frameworks, evaluation providers, and storage backends through pluggable interfaces:
Framework Integration Table:
| Framework | Package | Wrapper Class | Module Path |
|---|---|---|---|
| LangChain | trulens-apps-langchain | TruChain | trulens.apps.langchain.TruChain |
| LangGraph | trulens-apps-langgraph | TruGraph | trulens.apps.langgraph.TruGraph |
| LlamaIndex | trulens-apps-llamaindex | TruLlama | trulens.apps.llamaindex.TruLlama |
| Custom Python | trulens-core | TruBasicApp | trulens.core.app.basic.TruBasicApp |
| Custom Objects | trulens-core | TruCustomApp | trulens.core.app.custom.TruCustomApp |
LLM Provider Table:
| Provider | Package | Class | Module Path | Initialization |
|---|---|---|---|---|
| OpenAI | trulens-providers-openai | OpenAI | trulens.providers.openai.OpenAI | OpenAI(model_engine="gpt-4") |
| AWS Bedrock | trulens-providers-bedrock | Bedrock | trulens.providers.bedrock.Bedrock | Bedrock(model_id="anthropic.claude-v2") |
| Snowflake Cortex | trulens-providers-cortex | Cortex | trulens.providers.cortex.Cortex | Cortex(model_engine="mistral-large") |
| HuggingFace | trulens-providers-huggingface | HuggingFace | trulens.providers.huggingface.HuggingFace | HuggingFace(model="meta-llama/Llama-2-7b") |
| LiteLLM | trulens-providers-litellm | LiteLLM | trulens.providers.litellm.LiteLLM | LiteLLM(model_engine="azure/gpt-4") |
| LangChain | trulens-providers-langchain | LangChainLLMProvider | trulens.providers.langchain | LangChainLLMProvider(llm=ChatOpenAI()) |
Database Connector Table:
| Database | Class | Module Path | Initialization |
|---|---|---|---|
| SQLite | SQLAlchemyDB | trulens.core.database.sqlalchemy.SQLAlchemyDB | TruSession(database_url="sqlite:///trulens.sqlite") |
| PostgreSQL | SQLAlchemyDB | trulens.core.database.sqlalchemy.SQLAlchemyDB | TruSession(database_url="postgresql://user:pass@host/db") |
| MySQL | SQLAlchemyDB | trulens.core.database.sqlalchemy.SQLAlchemyDB | TruSession(database_url="mysql://user:pass@host/db") |
| Snowflake | SnowflakeConnector | trulens.connectors.snowflake.SnowflakeConnector | SnowflakeConnector(snowpark_session=session) |
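All four backends sit behind the same DBConnector interface. A minimal plain-Python sketch of that abstraction (only the interface and method names come from the tables above; the in-memory implementation is hypothetical) might look like:

```python
from abc import ABC, abstractmethod

class DBConnector(ABC):
    """Sketch of the abstract interface: implementations persist and query records."""
    @abstractmethod
    def insert_record(self, record: dict) -> None: ...
    @abstractmethod
    def get_records(self) -> list[dict]: ...

class InMemoryDB(DBConnector):
    """Hypothetical in-memory implementation standing in for SQLAlchemyDB et al."""
    def __init__(self):
        self._records = []
    def insert_record(self, record: dict) -> None:
        self._records.append(record)
    def get_records(self) -> list[dict]:
        return list(self._records)

db: DBConnector = InMemoryDB()
db.insert_record({"app": "demo", "score": 0.9})
records = db.get_records()
```

Because callers depend only on the abstract interface, switching from SQLite to Snowflake is a matter of constructing a different implementation.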
Sources: README.md39-43
TruLens follows a standard instrumentation workflow:
Installation: pip install trulens (the meta-package), or install individual packages as described under Modular Package Dependencies below.
Typical Usage Code Pattern:
1. Initialize a TruSession with database configuration
2. Wrap the application (TruChain, TruLlama, etc.)
3. Define Feedback functions using LLMProvider instances
4. Invoke the application within a with_record() context
5. Retrieve results with get_records_and_feedback()
6. Launch the dashboard with run_dashboard()
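Put together, the steps follow this shape. This is an illustrative sketch assembled from the class and method names in the tables on this page; exact signatures vary by TruLens version and are not verified here, and `chain` stands for a pre-built LangChain runnable.

```python
from trulens.core import TruSession, Feedback
from trulens.apps.langchain import TruChain
from trulens.providers.openai import OpenAI

session = TruSession(database_url="sqlite:///trulens.sqlite")  # step 1: session + database
provider = OpenAI(model_engine="gpt-4")                        # step 3: LLM provider
relevance = Feedback(provider.relevance)                       # step 3: feedback definition
tru_app = TruChain(chain, feedbacks=[relevance])               # step 2: wrap the app

with tru_app as recording:                                     # step 4: record a call
    chain.invoke("What is TruLens?")

records, feedback = session.get_records_and_feedback()         # step 5: query results
```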
Key API Methods:
| Method | Class | Purpose |
|---|---|---|
| TruSession.__init__(database_url) | TruSession | Initialize session with database connection string |
| TruChain(app, feedbacks) | TruChain | Wrap LangChain application with feedback definitions |
| with_record(app) | App | Context manager that creates RECORD_ROOT span |
| get_records_and_feedback() | TruSession | Query all records and feedback results as DataFrame |
| run_dashboard() | Module function | Launch Streamlit dashboard on localhost |
| start_evaluator() | TruSession | Start background thread for computing feedback |
Sources: README.md39-51
The TruLens architecture implements five core design principles:
1. OpenTelemetry Standard Compliance
All instrumentation generates OTEL-compliant spans with semantic conventions defined in trulens-otel-semconv. Span types include RECORD_ROOT, GENERATION, RETRIEVAL, EVAL. The system uses standard TracerProvider from opentelemetry.sdk.trace.
2. Framework Abstraction
The App base class in trulens.core.app provides generic instrumentation via the @instrument decorator. Framework-specific subclasses (TruChain, TruLlama, TruGraph) extend this base with framework-specific method mappings.
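This base-class/subclass split can be sketched in plain Python (conceptual stand-ins; App, FakeChain, and TruChainLike here are not the real TruLens classes): the base class knows how to wrap methods, and each subclass only declares which framework methods to instrument.

```python
class App:
    """Conceptual stand-in for a generic instrumenting base class."""
    # Subclasses declare which methods of the wrapped framework to instrument.
    instrumented_methods: tuple = ()

    def __init__(self, app):
        self.app = app
        self.calls = []
        for name in self.instrumented_methods:
            original = getattr(app, name)
            # Default args bind the current values per iteration.
            def wrapper(*args, _orig=original, _name=name, **kwargs):
                self.calls.append(_name)  # stand-in for emitting an OTEL span
                return _orig(*args, **kwargs)
            setattr(app, name, wrapper)

class FakeChain:
    """Hypothetical framework object standing in for a LangChain runnable."""
    def invoke(self, query):
        return f"result for {query}"

class TruChainLike(App):
    """Framework-specific subclass mapping, analogous to TruChain for LangChain."""
    instrumented_methods = ("invoke",)

wrapped = TruChainLike(FakeChain())
out = wrapped.app.invoke("hello")
```

The wrapped method still returns its normal result; the wrapper only adds the side effect of recording the call, which is the essence of the instrumentation pattern.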
3. Provider Interface Abstraction
The LLMProvider interface in trulens.feedback.llm_provider defines generate_score() method. Implementations (OpenAI, Bedrock, Cortex, etc.) handle provider-specific API calls, authentication, and response parsing.
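A conceptual sketch of that interface in plain Python (only the generate_score name comes from the description above; the canned provider and its helper are hypothetical):

```python
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Sketch of the abstract provider interface described above."""
    @abstractmethod
    def generate_score(self, prompt: str) -> float:
        """Return a score in [0, 1] produced by prompting an LLM."""

class CannedProvider(LLMProvider):
    """Hypothetical provider returning a canned completion instead of calling an API."""
    def _complete(self, prompt: str) -> str:
        # A real implementation would call OpenAI/Bedrock/Cortex here
        # and handle authentication and response parsing.
        return "Score: 7"
    def generate_score(self, prompt: str) -> float:
        # Parse the model's 0-10 rating and normalize to [0, 1].
        rating = int(self._complete(prompt).split(":")[1])
        return rating / 10

provider: LLMProvider = CannedProvider()
score = provider.generate_score("Rate the groundedness of this answer from 0 to 10.")
```

Each concrete provider differs only in how _complete-style API calls are made and parsed; feedback functions depend solely on generate_score.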
4. Storage Abstraction
The DBConnector interface in trulens.core.database.connector defines persistence methods. Implementations include:
- SQLAlchemyDB for SQL databases (SQLite, PostgreSQL, MySQL)
- SnowflakeEventTableDB for Snowflake account event tables

5. Modular Package Dependencies
Each package in src/ directory is independently installable. Users install only required packages:
- pip install trulens-core for basic instrumentation
- pip install trulens-apps-langchain adds LangChain support
- pip install trulens-providers-openai adds OpenAI evaluation
- pip install trulens aggregates common packages

Sources: README.md19-21
This wiki is organized into the following sections:
Sources: README.md, all meta.yaml files