
Tags: dspy, prompt-engineering, prompt-optimization, llm-compiler, python, ai-framework, llm-pipeline

Build Self-Optimizing LLM Pipelines with DSPy

Stop hand-crafting prompts. DSPy compiles declarative Python code into optimized prompts for your LLM. This guide shows you how to define a task, provide examples, and let DSPy's teleprompters automatically build a high-performance, few-shot prompt.

Intermediate, 30 min, 6 steps

The play

Install and Configure DSPy
First, install the `dspy-ai` package (for example, with `pip install dspy-ai`). Then configure DSPy to use a language model, such as one of OpenAI's models. Set your API key in the `OPENAI_API_KEY` environment variable for authentication rather than hardcoding it in the script. This setup tells DSPy which language model to use for all subsequent operations.
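The starter code falls back to a placeholder key and only prints a warning, so a missing key surfaces later as a failed API call. A minimal sketch of a fail-fast check is shown here; `require_api_key` is a hypothetical helper and not part of DSPy.

```python
import os


def require_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in the given environment variable, or raise if it is missing."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        # Stop early with a clear message instead of failing on the first LLM call.
        raise RuntimeError(f"{var_name} is not set; export it before running the pipeline.")
    return key
```

The returned value could then be passed as `api_key=` when constructing the language model client.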
Define a Declarative Signature
A DSPy Signature defines the inputs and outputs of a task without specifying the prompt. It's a declarative way to describe what you want the LLM to do. Here, we create a simple signature for a question-answering task.
Create a Program Module
A DSPy Module is a Python class that uses Signatures to structure a program. The `dspy.Predict` module is the simplest building block, taking a signature and generating a prediction. This represents your uncompiled, zero-shot program.
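The module pattern can be illustrated without calling an LLM. The sketch below assumes only that calling a module object runs its `forward` method; `ToyModule`, `ToyQA`, and `fake_predictor` are hypothetical stand-ins, and the real `dspy.Module` does more (for example, it keeps track of its sub-predictors so an optimizer can find them).

```python
from types import SimpleNamespace


class ToyModule:
    """Stand-in base class: calling the object delegates to forward()."""

    def __call__(self, *args, **kwargs):
        return self.forward(*args, **kwargs)


class ToyQA(ToyModule):
    def __init__(self, predictor):
        # predictor: any callable that accepts question=... and returns an object with .answer
        self.predictor = predictor

    def forward(self, question):
        return self.predictor(question=question)


def fake_predictor(question):
    # Hypothetical stub used in place of a real LLM call.
    return SimpleNamespace(answer=f"(stub answer to: {question})")


qa = ToyQA(fake_predictor)
result = qa(question="What is 2 + 2?")
print(result.answer)
```

Because the program logic lives in `forward`, the predictor inside can be swapped or optimized without changing how the module is called.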
Prepare a Training Set
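DSPy's optimizers (Teleprompters) need data to learn how to generate effective prompts. Create a small list of `dspy.Example` objects, each containing an input (`question`) and the desired output (`answer`), and call `.with_inputs('question')` on each one so DSPy knows which field is the input and which is the label.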
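Before wrapping rows in `dspy.Example`, it can help to validate them and hold a few back for evaluation. The following is a plain-Python sketch; `split_examples`, `dev_fraction`, and `seed` are hypothetical names chosen for illustration, not DSPy API.

```python
import random


def split_examples(rows, dev_fraction=0.25, seed=0):
    """Validate question/answer rows, then split them into (train, dev) lists."""
    for i, row in enumerate(rows):
        missing = {"question", "answer"} - set(row)
        if missing:
            raise ValueError(f"row {i} is missing fields: {sorted(missing)}")
    shuffled = list(rows)
    # A seeded shuffle keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_dev = max(1, round(len(shuffled) * dev_fraction))
    return shuffled[n_dev:], shuffled[:n_dev]
```

Each row in the training part can then be converted with `dspy.Example(**row).with_inputs('question')`, as the starter code does, while the held-out rows stay unseen by the optimizer.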
Compile the Program
This is the core of DSPy. Use a Teleprompter like `BootstrapFewShot` to compile your module. The compiler runs your program on the training examples, scores each prediction with the metric you supply, and keeps examples that pass the metric as few-shot demonstrations in the optimized prompt.
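The idea behind bootstrapping demonstrations can be sketched in a few lines of plain Python. This is a simplified illustration, not DSPy's implementation: `bootstrap_demos` and `stub_program` are hypothetical, and the stub's canned answers are made up so the example runs without an LLM.

```python
from types import SimpleNamespace


def bootstrap_demos(program, trainset, metric, max_demos=2):
    """Run the program on training examples and keep those whose output passes the metric."""
    demos = []
    for example in trainset:
        if len(demos) >= max_demos:
            break
        pred = program(question=example.question)
        if metric(example, pred):
            demos.append({"question": example.question, "answer": pred.answer})
    return demos


def stub_program(question):
    # Made-up canned outputs standing in for LLM predictions; one is deliberately wrong.
    canned = {
        "What is the capital of Japan?": "Tokyo",
        "What is 2 + 2?": "5",
        'Who wrote "Hamlet"?': "William Shakespeare",
    }
    return SimpleNamespace(answer=canned.get(question, "unknown"))


trainset = [
    SimpleNamespace(question="What is the color of the sky on a clear day?", answer="blue"),
    SimpleNamespace(question="What is the capital of Japan?", answer="Tokyo"),
    SimpleNamespace(question="What is 2 + 2?", answer="4"),
    SimpleNamespace(question='Who wrote "Hamlet"?', answer="William Shakespeare"),
]

demos = bootstrap_demos(
    stub_program,
    trainset,
    metric=lambda example, pred: example.answer == pred.answer,
    max_demos=2,
)
print(demos)
```

Only the examples the stub answers correctly are kept, which mirrors why the choice of metric matters: it decides which traces are trusted as demonstrations.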
Run the Optimized Program
Now you can call your new `optimized_qa` module just like the original one. When you call it, DSPy sends the few-shot prompt that was generated during compilation, which typically improves answer quality over the zero-shot version. Check the gain on held-out examples rather than assuming it.
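To make the term "few-shot prompt" concrete, here is a rough sketch of how demonstrations might be laid out ahead of a new question. The layout and the `build_few_shot_prompt` helper are assumptions for illustration; DSPy's actual prompt template differs and depends on the installed version.

```python
def build_few_shot_prompt(instructions, demos, question):
    """Assemble instructions, worked demonstrations, and the new question into one prompt."""
    parts = [instructions, ""]
    for demo in demos:
        parts += [f"Question: {demo['question']}", f"Answer: {demo['answer']}", ""]
    # The final answer slot is left open for the model to complete.
    parts += [f"Question: {question}", "Answer:"]
    return "\n".join(parts)


prompt = build_few_shot_prompt(
    "Answer questions with short, factual answers.",
    [{"question": "What is the capital of Japan?", "answer": "Tokyo"}],
    "What is the primary language spoken in Brazil?",
)
print(prompt)
```

The prompt DSPy actually sent can be compared against this sketch by inspecting the LLM history, as the starter code does at the end.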

Starter code

import dspy
import os
from dspy.teleprompt import BootstrapFewShot

# --- 1. Setup --- 
# IMPORTANT: Set your OpenAI API key as an environment variable: `export OPENAI_API_KEY=...`
# The placeholder below only lets the script start; LLM calls will fail until a real key is set.
api_key = os.getenv("OPENAI_API_KEY", "sk-YourSecretKey")
if api_key == "sk-YourSecretKey":
    print("Warning: OPENAI_API_KEY not set. Using a placeholder; LLM calls will fail.")

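# Configure the default LLM for DSPy.
# Note: dspy.OpenAI is the client class in older dspy-ai releases; if your installed
# version does not provide it, check the DSPy docs for the current client (dspy.LM).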
llm = dspy.OpenAI(model='gpt-3.5-turbo', api_key=api_key)
dspy.settings.configure(lm=llm)

# --- 2. Define Signature --- 
class GenerateAnswer(dspy.Signature):
    """Answer questions with short, factual answers."""
    question = dspy.InputField()
    answer = dspy.OutputField(desc="Often a single word or phrase.")

# --- 3. Define Module --- 
class BasicQA(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate_answer = dspy.Predict(GenerateAnswer)

    def forward(self, question):
        return self.generate_answer(question=question)

# --- 4. Prepare Data --- 
train_data = [
    {'question': 'What is the color of the sky on a clear day?', 'answer': 'blue'},
    {'question': 'What is the capital of Japan?', 'answer': 'Tokyo'},
    {'question': 'What is 2 + 2?', 'answer': '4'},
    {'question': 'Who wrote "Hamlet"?', 'answer': 'William Shakespeare'}
]
trainset = [dspy.Example(**x).with_inputs('question') for x in train_data]

# --- 5. Compile --- 
# Configure the teleprompter/optimizer
teleprompter = BootstrapFewShot(metric=dspy.evaluate.answer_exact_match, max_bootstrapped_demos=2)

# Compile the BasicQA module into an optimized version
optimized_qa = teleprompter.compile(BasicQA(), trainset=trainset)

# --- 6. Execute --- 
my_question = "What is the primary language spoken in Brazil?"

# Get the prediction from the optimized module
pred = optimized_qa(question=my_question)

print(f"Question: {my_question}")
print(f"Predicted Answer: {pred.answer}")

# You can inspect the prompt that was automatically generated and sent to the LLM
print("\n--- Last LLM Request ---")
llm.inspect_history(n=1)
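
A single prediction says little about whether compilation helped. The sketch below shows one way to score any question-answering callable on held-out examples with a lenient metric. It assumes the `(example, pred, trace=None)` metric signature used for DSPy metrics; `normalize`, `lenient_match`, `accuracy`, and the stub data are hypothetical.

```python
import string
from types import SimpleNamespace


def normalize(text):
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(text.split())


def lenient_match(example, pred, trace=None):
    # Treats "Portuguese." and "portuguese" as the same answer.
    return normalize(example.answer) == normalize(pred.answer)


def accuracy(program, devset, metric):
    """Fraction of dev examples for which the program's prediction passes the metric."""
    if not devset:
        return 0.0
    hits = sum(bool(metric(ex, program(question=ex.question))) for ex in devset)
    return hits / len(devset)


def stub_program(question):
    # Made-up outputs standing in for a real module such as optimized_qa.
    answers = {
        "What is the primary language spoken in Brazil?": "Portuguese.",
        "What is the capital of France?": "berlin",
    }
    return SimpleNamespace(answer=answers[question])


devset = [
    SimpleNamespace(question="What is the primary language spoken in Brazil?", answer="Portuguese"),
    SimpleNamespace(question="What is the capital of France?", answer="Paris"),
]
score = accuracy(stub_program, devset, lenient_match)
print(f"Accuracy: {score:.2f}")
```

Running the same function on both the uncompiled and the compiled module, with the same held-out examples, gives a like-for-like comparison.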