Types¶
Core types and data structures.
Example ¶
Example(expected_output=None, text=None, image_path=None, image_base64=None, pdf_path=None, pdf_dpi=300)
Example data for optimization.
This class automatically prepares input data from various input types:
- Plain text
- Images (from file path or base64 string)
- PDFs (converted to images at specified DPI)
Examples:
# Plain text
Example(
    text="John Doe, 30 years old",
    expected_output={"name": "John Doe", "age": 30}
)

# Text dict for template formatting
Example(
    text={"name": "John Doe", "location": "New York"},
    expected_output={"name": "John Doe", "age": 30}
)

# Image from file
Example(
    image_path="document.png",
    expected_output={"name": "John Doe", "age": 30}
)

# PDF (converted to 300 DPI images)
Example(
    pdf_path="document.pdf",
    pdf_dpi=300,
    expected_output={"name": "John Doe", "age": 30}
)

# Combined text and image
Example(
    text="Extract information from this document",
    image_path="document.png",
    expected_output={"name": "John Doe", "age": 30}
)

# Image from base64 string
Example(
    image_base64="iVBORw0KG...",
    expected_output={"name": "John Doe", "age": 30}
)

# Without expected_output (uses LLM judge for evaluation)
Example(
    text="John Doe, 30 years old",
    expected_output=None
)
Attributes:
| Name | Type | Description |
|---|---|---|
| input_data | | Input data dictionary (automatically generated from input parameters). |
| text_dict | | Dictionary of text values for template formatting. Used to format instruction prompt templates with placeholders like "{key}". Set automatically when the text parameter is a dict. |
| expected_output | | Expected output. Can be a str, dict, or Pydantic model matching the target schema. If a string, it will be wrapped in a single-field model with field name "output". If a Pydantic model, it will be converted to a dict for comparison. If None, evaluation will use an LLM judge or custom evaluation function instead of comparing against expected output. |
Initialize an Example.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| expected_output | str \| dict[str, Any] \| BaseModel \| None | Expected output. Can be a str, dict, or Pydantic model. If a string, it will be wrapped in a single-field model with field name "output". If None, evaluation will use an LLM judge or custom evaluation function. | None |
| text | str \| dict[str, str] \| None | Plain text input (str) or dictionary of text values for template formatting (dict). If a dict, keys correspond to placeholders in instruction prompt templates (e.g., {"key": "value"}). For input_data text extraction, known keys "text", "review", "content", "input" are checked first. If none match, values are joined with spaces as fallback. | None |
| image_path | str \| Path \| None | Path to an image file to convert to base64. | None |
| image_base64 | str \| None | Base64-encoded image string. | None |
| pdf_path | str \| Path \| None | Path to a PDF file to convert to images. | None |
| pdf_dpi | int | DPI for PDF conversion (default: 300). | 300 |
Raises:
| Type | Description |
|---|---|
| ValueError | If no input parameters are provided. |
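The normalization rules documented for text and expected_output can be sketched in plain Python. This is a minimal illustration of the documented behavior, not the library's code: resolve_text and normalize_expected_output are hypothetical helper names, the priority among the known keys is assumed to follow the order in which they are listed, and the single-field "output" model is simplified to a dict.

```python
from typing import Any, Optional, Union

# Keys the documentation says are checked first when text is a dict.
KNOWN_TEXT_KEYS = ("text", "review", "content", "input")


def resolve_text(text: Union[str, dict]) -> str:
    """Hypothetical helper: pick the text that ends up in input_data."""
    if isinstance(text, str):
        return text
    for key in KNOWN_TEXT_KEYS:  # assumed priority: the listed order
        if key in text:
            return text[key]
    # Fallback: no known key matched, so join all values with spaces.
    return " ".join(text.values())


def normalize_expected_output(value: Any) -> Optional[dict]:
    """Hypothetical helper: bring expected_output into dict form."""
    if value is None:
        return None  # evaluation falls back to an LLM judge or custom function
    if isinstance(value, str):
        return {"output": value}  # simplified stand-in for the "output" model
    if hasattr(value, "model_dump"):  # duck-typed Pydantic v2 model
        return value.model_dump()
    return dict(value)
```

With these rules, a dict such as {"name": "John Doe", "location": "New York"} contains none of the known keys, so the extracted text would be the space-joined values.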
Source code in src/dspydantic/types.py
OptimizationResult (dataclass) ¶
OptimizationResult(optimized_descriptions, optimized_system_prompt, optimized_instruction_prompt, metrics, baseline_score, optimized_score, optimized_demos=None, api_calls=0, total_tokens=0, estimated_cost_usd=None)
Result of Pydantic model optimization.
Attributes:
| Name | Type | Description |
|---|---|---|
| optimized_descriptions | dict[str, str] | Dictionary mapping field paths to optimized descriptions. |
| optimized_system_prompt | str \| None | Optimized system prompt (if provided). |
| optimized_instruction_prompt | str \| None | Optimized instruction prompt (if provided). |
| optimized_demos | list[dict[str, Any]] \| None | Few-shot examples (input_data, expected_output) for the extraction prompt. |
| metrics | dict[str, Any] | Dictionary containing optimization metrics (score, improvement, etc.). |
| baseline_score | float | Baseline score before optimization. |
| optimized_score | float | Score after optimization. |
| api_calls | int | Total number of API calls made during optimization. |
| total_tokens | int | Total tokens used during optimization (if available). |
| estimated_cost_usd | float \| None | Estimated cost in USD (if available). |
ExtractionResult (dataclass) ¶
Result of extraction with optional metadata.
Attributes:
| Name | Type | Description |
|---|---|---|
| data | BaseModel | The extracted Pydantic model instance. |
| confidence | float \| None | Confidence score (0.0-1.0) if requested. |
| raw_output | str \| None | Raw LLM output text. |
PrompterState (dataclass) ¶
PrompterState(model_schema, optimized_descriptions, optimized_system_prompt, optimized_instruction_prompt, model_id, model_config, version, metadata, optimized_demos=None)
State of a Prompter instance for serialization.
This class contains all the information needed to save and restore a Prompter instance.
Attributes:
| Name | Type | Description |
|---|---|---|
| model_schema | dict[str, Any] | JSON schema of the Pydantic model. |
| optimized_descriptions | dict[str, str] | Dictionary of optimized field descriptions. |
| optimized_system_prompt | str \| None | Optimized system prompt (if any). |
| optimized_instruction_prompt | str \| None | Optimized instruction prompt (if any). |
| model_id | str | LLM model identifier. |
| model_config | dict[str, Any] | Model configuration (API base, version, etc.). |
| version | str | dspydantic version for compatibility checking. |
| metadata | dict[str, Any] | Additional metadata (timestamp, optimization metrics, etc.). |
FieldOptimizationProgress (dataclass) ¶
FieldOptimizationProgress(phase, score_before, score_after, improved, total_fields, field_path=None, field_index=None, elapsed_seconds=0.0, optimized_value=None)
Progress update emitted during field-by-field optimization.
Attributes:
| Name | Type | Description |
|---|---|---|
| phase | str | Current optimization phase. Valid values: "baseline" (initial evaluation before optimization); "fields" (field description optimization); "skipped" (field was skipped because it was already above threshold); "system_prompt" (system prompt optimization); "instruction_prompt" (instruction prompt optimization); "complete" (optimization finished). |
| score_before | float | Score before this optimization step. |
| score_after | float | Score after this optimization step. |
| improved | bool | True if score improved. |
| total_fields | int | Total number of fields being optimized. |
| field_path | str \| None | Dot-notation path of the field just optimized (None for non-field phases). |
| field_index | int \| None | 1-based index of the field (None for non-field phases). |
| elapsed_seconds | float | Wall-clock seconds elapsed since optimization started. |
| optimized_value | str \| None | The optimized description or prompt text (None for non-field/non-prompt phases). |
Example¶
The Example class represents a single example for optimization. It supports multiple input types:
- Text: Plain text string or dictionary for prompt templates
- Images: File path (image_path) or base64-encoded string (image_base64)
- PDFs: File path (pdf_path), automatically converted to images at specified DPI (default: 300)
PDFs are converted to images page by page for processing. Use the pdf_dpi parameter to control conversion quality (default: 300 DPI); higher values produce larger, more detailed images.
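To get a feel for what a DPI setting means in pixels, the arithmetic can be sketched as follows. PDF page sizes are measured in points (72 per inch), so the pixel size scales linearly with the DPI. The helper name rendered_size_px is hypothetical, and the rasterizer the library uses may round slightly differently.

```python
def rendered_size_px(width_pt: float, height_pt: float, dpi: int = 300) -> tuple[int, int]:
    """Pixel size of a page rasterized at `dpi` (PDF units: 72 points per inch)."""
    return round(width_pt * dpi / 72), round(height_pt * dpi / 72)


# US Letter is 612 x 792 points (8.5 x 11 inches).
print(rendered_size_px(612, 792))       # (2550, 3300)
print(rendered_size_px(612, 792, 150))  # (1275, 1650)
```

Halving the DPI halves each dimension, so the image holds a quarter of the pixels. That can matter when many pages are sent to a model.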
OptimizationResult¶
The OptimizationResult dataclass contains the results of optimization:
- optimized_descriptions: Dictionary mapping field paths to optimized descriptions
- optimized_system_prompt: Optimized system prompt (if provided)
- optimized_instruction_prompt: Optimized instruction prompt (if provided)
- metrics: Dictionary containing optimization metrics
- baseline_score: Baseline score before optimization
- optimized_score: Score after optimization
- api_calls: Total API calls made during optimization
- total_tokens: Total tokens used during optimization
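As an illustration of how these fields might be read together, here is a small sketch that reports the score change and usage of a run. OptimizationResultSketch is a stand-in that mirrors the documented constructor signature so the snippet runs without the library. The summarize helper and the sample values are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OptimizationResultSketch:
    """Stand-in with the documented fields; not the library class."""
    optimized_descriptions: dict
    optimized_system_prompt: Optional[str]
    optimized_instruction_prompt: Optional[str]
    metrics: dict
    baseline_score: float
    optimized_score: float
    optimized_demos: Optional[list] = None
    api_calls: int = 0
    total_tokens: int = 0
    estimated_cost_usd: Optional[float] = None


def summarize(result) -> str:
    """Hypothetical helper: one-line report of an optimization run."""
    gain = result.optimized_score - result.baseline_score
    line = (
        f"score {result.baseline_score:.2f} -> {result.optimized_score:.2f} "
        f"({gain:+.2f}), {result.api_calls} API calls, "
        f"{result.total_tokens} tokens"
    )
    if result.estimated_cost_usd is not None:  # cost is only set when available
        line += f", ~${result.estimated_cost_usd:.2f}"
    return line


# Illustrative values only.
result = OptimizationResultSketch(
    optimized_descriptions={"name": "Full name of the person"},
    optimized_system_prompt=None,
    optimized_instruction_prompt=None,
    metrics={},
    baseline_score=0.5,
    optimized_score=0.75,
    api_calls=40,
    total_tokens=12000,
)
print(summarize(result))
```

Because summarize only reads attributes, it should work equally on a real OptimizationResult, assuming the attribute names listed above.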
ExtractionResult¶
The ExtractionResult dataclass is returned by predict_with_confidence():
- data: The extracted Pydantic model instance
- confidence: Confidence score (0.0-1.0)
- raw_output: Raw LLM output text (optional)
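One common way to use the confidence score is to gate results on a threshold. The sketch below shows that pattern with a stand-in dataclass. ExtractionResultSketch, its default values, the is_trusted helper, and the 0.8 threshold are all assumptions for illustration, and in the library data would be a Pydantic model rather than a dict.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ExtractionResultSketch:
    """Stand-in with the documented fields; defaults are assumed."""
    data: Any
    confidence: Optional[float] = None
    raw_output: Optional[str] = None


def is_trusted(result, threshold: float = 0.8) -> bool:
    """Hypothetical policy: accept only results whose reported
    confidence is at or above the threshold; a missing score is rejected."""
    return result.confidence is not None and result.confidence >= threshold


confident = ExtractionResultSketch(data={"name": "John Doe"}, confidence=0.92)
unscored = ExtractionResultSketch(data={"name": "John Doe"})
print(is_trusted(confident), is_trusted(unscored))  # True False
```

Treating a missing confidence as untrusted is a conservative choice. A pipeline that does not request confidence scores would need a different default.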
PrompterState¶
The PrompterState dataclass contains all information needed to save and restore a Prompter instance.
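Since every documented field is a plain string, dict, or list, a state object of this shape can be round-tripped through JSON. The sketch below shows that idea with a stand-in dataclass that mirrors the documented signature. dump_state and load_state are hypothetical helpers, the sample values are placeholders, and the library's own save and restore mechanism may store the state differently.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class PrompterStateSketch:
    """Stand-in with the documented fields; not the library class."""
    model_schema: dict
    optimized_descriptions: dict
    optimized_system_prompt: Optional[str]
    optimized_instruction_prompt: Optional[str]
    model_id: str
    model_config: dict
    version: str
    metadata: dict
    optimized_demos: Optional[list] = None


def dump_state(state) -> str:
    """Serialize the state to a JSON string."""
    return json.dumps(asdict(state), indent=2)


def load_state(payload: str) -> PrompterStateSketch:
    """Rebuild the state from a JSON string produced by dump_state."""
    return PrompterStateSketch(**json.loads(payload))


# Placeholder values for illustration.
state = PrompterStateSketch(
    model_schema={"type": "object", "properties": {"name": {"type": "string"}}},
    optimized_descriptions={"name": "Full name of the person"},
    optimized_system_prompt=None,
    optimized_instruction_prompt="Extract the fields from: {text}",
    model_id="example-model",
    model_config={},
    version="0.0.0",
    metadata={"note": "illustrative"},
)
assert load_state(dump_state(state)) == state
```

Storing the version alongside the schema is what makes a compatibility check possible when the state is loaded by a different release.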
FieldOptimizationProgress¶
The FieldOptimizationProgress dataclass is emitted by the on_progress callback during optimization to track progress:
- phase: Current optimization phase ("baseline", "fields", "skipped", "system_prompt", "instruction_prompt", "complete")
- score_before: Score before this optimization step
- score_after: Score after this optimization step
- improved: Whether the score improved
- total_fields: Total number of fields being optimized
- field_path: Dot-notation path of the field being optimized (None for non-field phases)
- field_index: 1-based index of the field (None for non-field phases)
- elapsed_seconds: Wall-clock seconds elapsed since optimization started
- optimized_value: The actual optimized description or prompt text (new in v0.1.3+)
Usage with Callbacks¶
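# Assumed import path, based on the types module documented on this page.
from dspydantic.types import FieldOptimizationProgress

def my_progress_callback(progress: FieldOptimizationProgress):
    # Report each field as it is optimized.
    if progress.phase == "fields":
        print(f"{progress.field_path}: {progress.score_before:.0%} → {progress.score_after:.0%}")
        if progress.optimized_value:
            print(f" Optimized to: {progress.optimized_value!r}")

# `optimizer` is an existing optimizer instance created elsewhere.
optimizer.on_progress = my_progress_callback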
When verbose=True, callbacks are invoked automatically and produce rich-formatted output that shows the optimized values.
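Beyond printing, a callback can also accumulate results for later inspection. The sketch below records the score gain of every field that improved. ProgressSketch is a stand-in that mirrors the documented fields so the snippet runs on its own, and ImprovementLog is a hypothetical name. The events fed to it here are hand-written, not real optimizer output.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProgressSketch:
    """Stand-in with the documented fields of FieldOptimizationProgress."""
    phase: str
    score_before: float
    score_after: float
    improved: bool
    total_fields: int
    field_path: Optional[str] = None
    field_index: Optional[int] = None
    elapsed_seconds: float = 0.0
    optimized_value: Optional[str] = None


class ImprovementLog:
    """Hypothetical callable callback: remembers which fields improved."""

    def __init__(self) -> None:
        self.gains = {}

    def __call__(self, progress) -> None:
        # Only field-level steps that raised the score are recorded.
        if progress.phase == "fields" and progress.improved:
            self.gains[progress.field_path] = progress.score_after - progress.score_before


log = ImprovementLog()
log(ProgressSketch("baseline", 0.5, 0.5, False, 2))
log(ProgressSketch("fields", 0.5, 0.75, True, 2, field_path="name", field_index=1))
log(ProgressSketch("skipped", 0.75, 0.75, False, 2, field_path="age", field_index=2))
print(log.gains)  # {'name': 0.25}
```

Because the object is callable, it could presumably be assigned to on_progress in the same way as a plain function.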