This repository is a comprehensive resource for experimenting with DSPy—a framework for declarative language model pipelines. It includes code, Jupyter notebooks, runnable demos, tests, and documentation for evaluating, orchestrating, and optimizing LLM workflows. The materials support research and practical exploration of DSPy’s approach to efficient, scalable, and self-improving AI systems.
Traditional prompt engineering for LLMs is limited by manual trial-and-error, lack of reproducibility, and poor scalability. DSPy reframes LLM orchestration as a machine learning problem—using datasets, evaluation metrics, and hyperparameter optimization to automate and improve pipeline design.
Key Insights (YouTube walkthrough):
- Limitations of Prompt Engineering: Manual, brittle, hard to scale, and resource-intensive.
- DSPy’s ML Mindset: Treats LLM pipelines as trainable programs, enabling systematic evaluation, optimization, and reproducibility.
- Benefits:
- Higher efficiency and scalability.
- Lower resource requirements (fewer API calls, less manual tuning).
- Modular composition for building complex systems.
Explore the following system architecture diagrams for DSPy pipelines:
- Basic Company Analysis Pipeline (No Compilation):
This diagram shows a straightforward DSPy pipeline for company analysis using declarative modules and prompt engineering. The workflow processes inputs and generates outputs without automated optimization, illustrating the baseline approach before leveraging DSPy’s compilation features.
- Compiled & Optimized Company Analysis Pipeline:
Here, the pipeline incorporates DSPy’s compilation and optimization capabilities. Prompts and module parameters are automatically tuned using datasets and evaluation metrics, resulting in improved accuracy and efficiency for company analysis tasks.
- Comprehensive Multi-Stage Pipeline with Composable Compiled Programs:
This diagram presents an advanced DSPy workflow for company and market analysis. Multiple compiled and optimized modules are composed to handle diverse subtasks—such as fact retrieval, social media analysis, comparative evaluation, and scoring. The architecture demonstrates scalable composition, iterative improvement, and robust evaluation across the pipeline.
- DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines – Introduces DSPy’s core concepts and compilation approach.
- DSPy Assertions: Computational Constraints for Self-Refining Language Model Pipelines – Explores evaluation and self-refinement in DSPy pipelines.
- notebooks/ – Jupyter notebooks with DSPy examples:
- Financial info retrieval
- Company valuation
- Market analysis
- RAG (Retrieval-Augmented Generation)
- Program composition and evaluation
- dspy/ – Core DSPy framework modules, primitives, adapters, and utilities.
- examples/ – Runnable Python demos for DSPy modules and pipelines.
- tests/ – Validation code for modules, signatures, and optimizers.
- Supporting configs:
- requirements.txt, pyproject.toml, setup.py – Dependency and environment management.
- docs/ – Documentation, diagrams, and API references.
- Python Environment
- Recommended: Python 3.8+
- Create a virtual environment:
python -m venv .venv
source .venv/bin/activate
- Install Dependencies
- With pip:
pip install -r requirements.txt
- Or with Poetry:
poetry install
- Jupyter/Conda Setup (Optional for notebooks):
- Install Jupyter:
pip install jupyter
- Or use Conda:
conda create -n dspy python=3.8
conda activate dspy
conda install jupyter
Run an Example Notebook:
jupyter notebook notebooks/company-valuation.ipynb
Basic DSPy Pipeline (Python):
import dspy

# Assumes an LM has been configured beforehand, e.g. dspy.configure(lm=...).
# Example: simple prediction module with a declarative signature.
predictor = dspy.Predict("question -> answer")
result = predictor(question="What is the capital of France?")
print(result.answer)
- Modules:
- Predict – Direct LLM calls for prediction tasks.
- ChainOfThought – Stepwise reasoning with intermediate outputs.
- ReAct – Combines reasoning and action for interactive tasks.
- Signatures & Dataset Orchestration:
- Declarative task definitions and automated dataset management.
- Evaluation Metrics Abstraction:
- Built-in support for accuracy, F1, and custom metrics.
- Optimizers & Compilation:
- Automated tuning of pipeline parameters for improved performance.
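The metric abstraction can be sketched in plain Python. The helper names below (accuracy, token_f1) are illustrative, not DSPy's actual API — in DSPy, a metric is simply a callable handed to evaluators and optimizers:

```python
def accuracy(preds, golds):
    """Fraction of predictions that exactly match the gold answers."""
    assert len(preds) == len(golds) and preds
    return sum(p == g for p, g in zip(preds, golds)) / len(preds)

def token_f1(pred, gold):
    """Token-overlap F1 between a predicted and a gold answer string."""
    p_tokens = set(pred.lower().split())
    g_tokens = set(gold.lower().split())
    common = p_tokens & g_tokens
    if not common:
        return 0.0
    precision = len(common) / len(p_tokens)
    recall = len(common) / len(g_tokens)
    return 2 * precision * recall / (precision + recall)

print(accuracy(["Paris", "Berlin"], ["Paris", "Madrid"]))  # 0.5
print(token_f1("the capital is Paris", "Paris"))           # 0.4
```

Exact match is strict; token F1 gives partial credit when a verbose answer still contains the gold tokens.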
- Initial Program:
- Simple retrieval using Predict.
- Adding Evaluation Metrics:
- Integrate accuracy/F1 scoring for outputs.
- Introducing RAG:
- Use retrieval-augmented generation for better factuality.
- Compiling/Optimizing:
- Apply DSPy optimizers to tune pipeline and improve scores.
- Final Score Improvement:
- Demonstrate measurable gains in evaluation metrics.
- Manual Sample Validation:
- Review outputs for correctness and reliability.
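The compile/optimize step in this workflow can be viewed as a search over prompt candidates scored by a metric on a dev set. The sketch below is a toy stand-in, not DSPy's optimizer API; fake_lm and compile_prompt are hypothetical names:

```python
def fake_lm(prompt):
    """Stand-in LM: answers correctly only when asked to reason step by step."""
    return "42" if "step by step" in prompt else "unsure"

def compile_prompt(candidates, devset, metric):
    """Score each candidate template on the dev set; return the best scorer."""
    scored = []
    for template in candidates:
        preds = [fake_lm(template.format(question=q)) for q, _ in devset]
        score = sum(metric(p, g) for p, (_, g) in zip(preds, devset)) / len(devset)
        scored.append((score, template))
    return max(scored)  # (best_score, best_template)

candidates = [
    "Answer the question: {question}",
    "Think step by step, then answer: {question}",
]
devset = [("What is 6 * 7?", "42"), ("What is 20 + 22?", "42")]
best_score, best_template = compile_prompt(candidates, devset, lambda p, g: float(p == g))
print(best_score, best_template)  # 1.0 with the step-by-step template
```

Real DSPy optimizers search a richer space (instructions, few-shot demonstrations, module parameters), but the loop structure — propose, evaluate against a metric, keep the winner — is the same.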
DSPy enables composition of programs—building larger, more capable systems by integrating smaller modules (e.g., chaining Predict, ChainOfThought, and ReAct). This supports scalable, maintainable, and extensible LLM workflows.
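A minimal sketch of such composition, using toy functions rather than real DSPy modules (the retrieve and answer stages are hypothetical stand-ins for Predict-style components in a RAG-like chain):

```python
def compose(*stages):
    """Chain stages into a pipeline: each stage's output feeds the next."""
    def pipeline(x):
        for stage in stages:
            x = stage(x)
        return x
    return pipeline

CORPUS = [
    "ACME Corp reported revenue of $10M in 2023.",
    "Globex focuses on logistics services.",
]

def retrieve(query):
    """Toy retrieval: pick the passage sharing the most words with the query."""
    q = set(query.lower().split())
    best = max(CORPUS, key=lambda p: len(q & set(p.lower().split())))
    return (query, best)

def answer(state):
    """Toy generation: fold the retrieved context into a response."""
    query, context = state
    return f"Q: {query} | Context: {context}"

rag = compose(retrieve, answer)
print(rag("What revenue did ACME Corp report?"))
```

Swapping a stage (say, replacing retrieve with a stronger retriever, or answer with a ChainOfThought module) leaves the rest of the pipeline untouched — the property that makes composed DSPy programs maintainable and extensible.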