Skip to content

Configure Models

This guide covers how to configure DSPy language models for use with DSPydantic. You must configure DSPy with dspy.configure(lm=lm) before creating prompters or running optimization.

DSPy Model Configuration

DSPydantic uses DSPy's language model configuration. You must configure DSPy before creating prompters:

import dspy

# Configure DSPy with your language model
# Pass temperature and max_tokens for generation behavior
lm = dspy.LM(
    "openai/gpt-4o",
    api_key="your-api-key",
    temperature=1.0,
    max_tokens=16000,
)
dspy.configure(lm=lm)

# Now create prompters - they will use the configured model
from dspydantic import Prompter
prompter = Prompter(model=MyModel)

Important: Model configuration is not saved with prompters. You must configure DSPy before loading saved prompters.

Cloud Providers

OpenAI

Configure OpenAI models:

import dspy
import os

# Set API key via environment variable
os.environ["OPENAI_API_KEY"] = "your-api-key"

# Or pass directly (with temperature and max_tokens)
lm = dspy.LM(
    "openai/gpt-4o",
    api_key="your-api-key",
    temperature=1.0,
    max_tokens=16000,
)
dspy.configure(lm=lm)

Available Models: - openai/gpt-4o - Best quality, slower - openai/gpt-4o-mini - Faster, good quality - openai/gpt-4-turbo - Balanced option - openai/gpt-3.5-turbo - Fastest, lower quality

Anthropic

Configure Anthropic Claude models:

import dspy
import os

# Set API key via environment variable
os.environ["ANTHROPIC_API_KEY"] = "your-api-key"

# Or pass directly
lm = dspy.LM("anthropic/claude-sonnet-4-5-20250929", api_key="your-api-key")
dspy.configure(lm=lm)

Available Models: - anthropic/claude-sonnet-4-5-20250929 - Latest Claude Sonnet - anthropic/claude-opus-3 - High quality - anthropic/claude-haiku-3 - Fast, cost-effective

Google Gemini

Configure Google Gemini models:

import dspy
import os

# Set API key via environment variable
os.environ["GEMINI_API_KEY"] = "your-api-key"

# Or pass directly
lm = dspy.LM("gemini/gemini-2.5-pro-preview-03-25", api_key="your-api-key")
dspy.configure(lm=lm)

Available Models: - gemini/gemini-2.5-pro-preview-03-25 - Latest Gemini Pro - gemini/gemini-1.5-pro - Stable version

Databricks

Configure Databricks models:

import dspy
import os

# Set API key and base URL
os.environ["DATABRICKS_API_KEY"] = "your-api-key"
os.environ["DATABRICKS_API_BASE"] = "https://your-workspace.cloud.databricks.com"

# Or pass directly
lm = dspy.LM("databricks/model-name", api_key="your-api-key", api_base="https://...")
dspy.configure(lm=lm)

Local Models

Ollama

Run models locally with Ollama:

import dspy

# 1. Install Ollama: https://ollama.ai
# 2. Run model: ollama run llama3.2

# Configure DSPy to use Ollama
lm = dspy.LM(
    "ollama_chat/llama3.2",
    api_base="http://localhost:11434",
    api_key=""  # Not needed for local
)
dspy.configure(lm=lm)

Setup Steps: 1. Install Ollama from https://ollama.ai 2. Pull a model: ollama pull llama3.2 3. Run the model: ollama run llama3.2 4. Configure DSPy as shown above

SGLang

Use SGLang for fast local inference:

import dspy

# 1. Install SGLang and launch server with your model
# 2. Server runs on http://localhost:7501/v1

# Configure DSPy to use SGLang (OpenAI-compatible endpoint)
lm = dspy.LM(
    "openai/model-name",  # Use any model name
    api_base="http://localhost:7501/v1",
    model_type="chat"
)
dspy.configure(lm=lm)

Setup Steps: 1. Install SGLang: pip install sglang 2. Launch server with your model 3. Configure DSPy to use the OpenAI-compatible endpoint

vLLM

Use vLLM for high-performance local inference:

import dspy

# 1. Install vLLM and launch server
# 2. Server runs on http://localhost:8000/v1

# Configure DSPy to use vLLM (OpenAI-compatible endpoint)
lm = dspy.LM(
    "openai/model-name",  # Use any model name
    api_base="http://localhost:8000/v1"
)
dspy.configure(lm=lm)

Setup Steps: 1. Install vLLM: pip install vllm 2. Launch server with your model 3. Configure DSPy to use the OpenAI-compatible endpoint

Model Selection Guide

Speed vs Accuracy

Model Type Speed Accuracy Cost Use Case
gpt-4o-mini Fast Good Low Simple tasks, quick iterations
gpt-4o Slower Better High Complex tasks, production
claude-haiku Fast Good Low Fast extraction
claude-sonnet Medium Best High High-quality extraction
Local (Ollama) Varies Good Free Privacy-sensitive, offline

Provider Comparison

Provider Speed Quality Cost Best For
OpenAI Fast High Medium General use, production
Anthropic Medium Very High High Complex reasoning
Gemini Fast High Low Cost-effective
Local Varies Good Free Privacy, offline

API Key Management

Store API keys securely:

import os
import dspy

# Set in environment (never commit to git)
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")

lm = dspy.LM("openai/gpt-4o")
dspy.configure(lm=lm)

Direct Configuration

For testing only:

import dspy

lm = dspy.LM("openai/gpt-4o", api_key="your-api-key")
dspy.configure(lm=lm)

Security Best Practices: - Never commit API keys to version control - Use environment variables or secrets management - Rotate keys regularly - Use different keys for development and production

Model Selection for Optimization

During Optimization

Use a capable model for optimization:

import dspy

# Use a strong model for optimization
lm = dspy.LM("openai/gpt-4o", api_key="your-api-key")
dspy.configure(lm=lm)

prompter = Prompter(model=MyModel)
result = prompter.optimize(examples=examples)  # Uses gpt-4o

During Extraction

You can use a faster model for extraction:

import dspy

# Switch to faster model for extraction
lm_fast = dspy.LM("openai/gpt-4o-mini", api_key="your-api-key")
dspy.configure(lm=lm_fast)

# Load optimized prompter
prompter = Prompter.load("./saved_prompter")
data = prompter.run("text")  # Uses gpt-4o-mini

Performance Impact

Configuration Speed Accuracy Cost
gpt-4o Baseline Best High
gpt-4o-mini 2-3x faster Good Low
Local (Ollama) Varies Good Free

Tips

  • Configure DSPy before creating prompters
  • Use environment variables for API keys
  • Choose model based on speed vs accuracy needs
  • Use stronger models for optimization, faster models for extraction
  • Test with different models to find the best balance
  • Monitor API usage and costs

See Also