Types¶

Core types and data structures.

Example ¶

Example(expected_output=None, text=None, image_path=None, image_base64=None, pdf_path=None, pdf_dpi=300)

Example data for optimization.

This class automatically prepares input data from various input types:

- Plain text
- Images (from file path or base64 string)
- PDFs (converted to images at specified DPI)

Examples:

# Plain text
Example(
    text="John Doe, 30 years old",
    expected_output={"name": "John Doe", "age": 30}
)

# Text dict for template formatting
Example(
    text={"name": "John Doe", "location": "New York"},
    expected_output={"name": "John Doe", "age": 30}
)

# Image from file
Example(
    image_path="document.png",
    expected_output={"name": "John Doe", "age": 30}
)

# PDF (converted to 300 DPI images)
Example(
    pdf_path="document.pdf",
    pdf_dpi=300,
    expected_output={"name": "John Doe", "age": 30}
)

# Combined text and image
Example(
    text="Extract information from this document",
    image_path="document.png",
    expected_output={"name": "John Doe", "age": 30}
)

# Image from base64 string
Example(
    image_base64="iVBORw0KG...",
    expected_output={"name": "John Doe", "age": 30}
)

# Without expected_output (uses LLM judge for evaluation)
Example(
    text="John Doe, 30 years old",
    expected_output=None
)

Attributes:

Name	Type	Description
`input_data`		Input data dictionary (automatically generated from input parameters).
`text_dict`		Dictionary of text values for template formatting. Used to format instruction prompt templates with placeholders like "{key}". Set automatically when text parameter is a dict.
`expected_output`		Expected output. Can be a str, dict, or Pydantic model matching the target schema. If a string, it will be wrapped in a single-field model with field name "output". If a Pydantic model, it will be converted to a dict for comparison. If None, evaluation will use an LLM judge or custom evaluation function instead of comparing against expected output.
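As a rough illustration of how `text_dict` might feed a prompt template, the sketch below fills placeholders with `str.format`. The template string and the use of `str.format` are assumptions made for illustration; the documentation above only states that placeholders look like "{key}".

```python
# Hypothetical instruction prompt template; placeholder names must match the dict keys.
template = "Extract details about {name}, who lives in {location}."

# Same shape as the dict passed via Example(text={...}).
text_dict = {"name": "John Doe", "location": "New York"}

# Assumption: placeholders are filled with str.format-style substitution.
prompt = template.format(**text_dict)
print(prompt)
```

A key that is missing from the dict would raise `KeyError` under this assumption, so the dict keys and the template placeholders need to stay in sync.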

Initialize an Example.

Parameters:

Name	Type	Description	Default
`expected_output`	`str \| dict[str, Any] \| BaseModel \| None`	Expected output. Can be a str, dict, or Pydantic model. If a string, it will be wrapped in a single-field model with field name "output". If None, evaluation will use an LLM judge or custom evaluation function.	`None`
`text`	`str \| dict[str, str] \| None`	Plain text input (str) or dictionary of text values for template formatting (dict). If a dict, keys correspond to placeholders in instruction prompt templates (e.g., {"key": "value"}). For input_data text extraction, known keys "text", "review", "content", "input" are checked first. If none match, values are joined with spaces as fallback.	`None`
`image_path`	`str \| Path \| None`	Path to an image file to convert to base64.	`None`
`image_base64`	`str \| None`	Base64-encoded image string.	`None`
`pdf_path`	`str \| Path \| None`	Path to a PDF file to convert to images.	`None`
`pdf_dpi`	`int`	DPI for PDF conversion (default: 300).	`300`

Raises:

Type	Description
`ValueError`	If no input parameters are provided.

Source code in src/dspydantic/types.py

def __init__(
    self,
    expected_output: str | dict[str, Any] | BaseModel | None = None,
    text: str | dict[str, str] | None = None,
    image_path: str | Path | None = None,
    image_base64: str | None = None,
    pdf_path: str | Path | None = None,
    pdf_dpi: int = 300,
) -> None:
    """Initialize an Example.

    Args:
        expected_output: Expected output. Can be a str, dict, or Pydantic model.
            If a string, it will be wrapped in a single-field model with field name "output".
            If None, evaluation will use an LLM judge or custom evaluation function.
        text: Plain text input (str) or dictionary of text values for template
            formatting (dict). If a dict, keys correspond to placeholders in
            instruction prompt templates (e.g., {"key": "value"}). For input_data
            text extraction, known keys "text", "review", "content", "input" are
            checked first. If none match, values are joined with spaces as fallback.
        image_path: Path to an image file to convert to base64.
        image_base64: Base64-encoded image string.
        pdf_path: Path to a PDF file to convert to images.
        pdf_dpi: DPI for PDF conversion (default: 300).

    Raises:
        ValueError: If no input parameters are provided.
    """
    self.expected_output = expected_output

    if isinstance(text, dict):
        self.text_dict = text
        text_string = (
            text.get("text")
            or text.get("review")
            or text.get("content")
            or text.get("input")
            or None
        )
        if text_string is None and text:
            text_string = " ".join(str(v) for v in text.values())
    else:
        self.text_dict = {}
        text_string = text

    # Use prepare_input_data to create input_data from parameters
    # If text_string is None and no other inputs, we'll set input_data manually later
    try:
        self.input_data = prepare_input_data(
            text=text_string,
            image_path=image_path,
            image_base64=image_base64,
            pdf_path=pdf_path,
            pdf_dpi=pdf_dpi,
        )
    except ValueError:
        # If no inputs provided and text is a dict, create empty input_data
        # It can be set manually later if needed
        if isinstance(text, dict):
            self.input_data = {}
        else:
            raise
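The dict-handling branch above can be tried in isolation. The following is a minimal sketch that mirrors that logic in a standalone helper; `resolve_text` is a hypothetical name introduced here and is not part of dspydantic.

```python
from typing import Optional, Union


def resolve_text(text: Union[str, dict, None]) -> Optional[str]:
    """Mirror the text-resolution branch of Example.__init__ (illustrative helper)."""
    if not isinstance(text, dict):
        # Plain strings (or None) pass through unchanged.
        return text
    # Known keys are checked in order; the first truthy value wins.
    for key in ("text", "review", "content", "input"):
        value = text.get(key)
        if value:
            return value
    # Fallback: join all values with spaces (only for a non-empty dict).
    if text:
        return " ".join(str(v) for v in text.values())
    return None


print(resolve_text({"review": "Great product", "id": "42"}))
print(resolve_text({"name": "John Doe", "location": "New York"}))
print(resolve_text({}))
```

Note that an empty string under a known key is falsy, so it is skipped in the same way as a missing key.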

OptimizationResult `dataclass` ¶

OptimizationResult(optimized_descriptions, optimized_system_prompt, optimized_instruction_prompt, metrics, baseline_score, optimized_score, optimized_demos=None, api_calls=0, total_tokens=0, estimated_cost_usd=None)

Result of Pydantic model optimization.

Attributes:

Name	Type	Description
`optimized_descriptions`	`dict[str, str]`	Dictionary mapping field paths to optimized descriptions.
`optimized_system_prompt`	`str \| None`	Optimized system prompt (if provided).
`optimized_instruction_prompt`	`str \| None`	Optimized instruction prompt (if provided).
`optimized_demos`	`list[dict[str, Any]] \| None`	Few-shot examples (input_data, expected_output) for the extraction prompt.
`metrics`	`dict[str, Any]`	Dictionary containing optimization metrics (score, improvement, etc.).
`baseline_score`	`float`	Baseline score before optimization.
`optimized_score`	`float`	Score after optimization.
`api_calls`	`int`	Total number of API calls made during optimization.
`total_tokens`	`int`	Total tokens used during optimization (if available).
`estimated_cost_usd`	`float \| None`	Estimated cost in USD (if available).
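The fields above lend themselves to a short report after an optimization run. The sketch below uses a `SimpleNamespace` stand-in with made-up values instead of a real `OptimizationResult`; `summarize` is a hypothetical helper, and the scores are assumed to be fractions between 0 and 1.

```python
from types import SimpleNamespace


def summarize(result) -> str:
    """Build a one-line report from the documented OptimizationResult fields."""
    gain = result.optimized_score - result.baseline_score
    # estimated_cost_usd may be None when no cost estimate is available.
    cost = (
        "n/a"
        if result.estimated_cost_usd is None
        else f"${result.estimated_cost_usd:.2f}"
    )
    return (
        f"score {result.baseline_score:.1%} -> {result.optimized_score:.1%} "
        f"({gain:+.1%}), {result.api_calls} API calls, "
        f"{result.total_tokens} tokens, cost {cost}"
    )


# Stand-in object with made-up values; a real run would supply an OptimizationResult.
result = SimpleNamespace(
    baseline_score=0.62,
    optimized_score=0.81,
    api_calls=40,
    total_tokens=12000,
    estimated_cost_usd=None,
)
print(summarize(result))
```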

ExtractionResult `dataclass` ¶

ExtractionResult(data, confidence=None, raw_output=None)

Result of extraction with optional metadata.

Attributes:

Name	Type	Description
`data`	`BaseModel`	The extracted Pydantic model instance.
`confidence`	`float \| None`	Confidence score (0.0-1.0) if requested.
`raw_output`	`str \| None`	Raw LLM output text.
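Because `confidence` is optional, downstream code has to handle `None` before comparing it to a threshold. A minimal sketch follows, using a `SimpleNamespace` stand-in rather than a real `ExtractionResult`; `needs_review` and the 0.8 threshold are hypothetical choices for illustration.

```python
from types import SimpleNamespace


def needs_review(result, threshold: float = 0.8) -> bool:
    """Flag results whose confidence is missing or below the threshold."""
    return result.confidence is None or result.confidence < threshold


# Stand-in objects; `data` would normally be a Pydantic model instance.
confident = SimpleNamespace(data={"name": "John Doe"}, confidence=0.93, raw_output=None)
uncertain = SimpleNamespace(data={"name": "J. D."}, confidence=0.41, raw_output=None)
unscored = SimpleNamespace(data={"name": "Jane"}, confidence=None, raw_output=None)

for item in (confident, uncertain, unscored):
    print(item.data["name"], needs_review(item))
```

Treating a missing score as "needs review" is a conservative design choice; other pipelines may prefer to accept unscored results.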

PrompterState `dataclass` ¶

PrompterState(model_schema, optimized_descriptions, optimized_system_prompt, optimized_instruction_prompt, model_id, model_config, version, metadata, optimized_demos=None)

State of a Prompter instance for serialization.

This class contains all the information needed to save and restore a Prompter instance.

Attributes:

Name	Type	Description
`model_schema`	`dict[str, Any]`	JSON schema of the Pydantic model.
`optimized_descriptions`	`dict[str, str]`	Dictionary of optimized field descriptions.
`optimized_system_prompt`	`str \| None`	Optimized system prompt (if any).
`optimized_instruction_prompt`	`str \| None`	Optimized instruction prompt (if any).
`model_id`	`str`	LLM model identifier.
`model_config`	`dict[str, Any]`	Model configuration (API base, version, etc.).
`version`	`str`	dspydantic version for compatibility checking.
`metadata`	`dict[str, Any]`	Additional metadata (timestamp, optimization metrics, etc.).
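Since every documented field is a plain string, dict, or list, a state object of this shape can be round-tripped through JSON. The sketch below uses a local stand-in dataclass, `PrompterStateSketch`, that copies the documented field names; it is not the library class, all values are placeholders, and the library's own save and load mechanism may work differently.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class PrompterStateSketch:
    """Local stand-in mirroring the documented fields (not the library class)."""

    model_schema: dict[str, Any]
    optimized_descriptions: dict[str, str]
    optimized_system_prompt: Optional[str]
    optimized_instruction_prompt: Optional[str]
    model_id: str
    model_config: dict[str, Any]
    version: str
    metadata: dict[str, Any]
    optimized_demos: Optional[list[dict[str, Any]]] = None


state = PrompterStateSketch(
    model_schema={"type": "object", "properties": {"name": {"type": "string"}}},
    optimized_descriptions={"name": "Full name of the person"},
    optimized_system_prompt=None,
    optimized_instruction_prompt="Extract the person's name.",
    model_id="example-model",  # placeholder identifier
    model_config={},
    version="0.0.0",  # placeholder version string
    metadata={"note": "illustrative values only"},
)

# Serialize to JSON text, then rebuild an equal object from it.
payload = json.dumps(asdict(state), indent=2)
restored = PrompterStateSketch(**json.loads(payload))
print(restored == state)
```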

FieldOptimizationProgress `dataclass` ¶

FieldOptimizationProgress(phase, score_before, score_after, improved, total_fields, field_path=None, field_index=None, elapsed_seconds=0.0, optimized_value=None)

Progress update emitted during field-by-field optimization.

Attributes:

Name	Type	Description
`phase`	`str`	Current optimization phase. Valid values: "baseline" (initial evaluation before optimization), "fields" (field description optimization), "skipped" (field was skipped because it was already above the threshold), "system_prompt" (system prompt optimization), "instruction_prompt" (instruction prompt optimization), "complete" (optimization finished).
`score_before`	`float`	Score before this optimization step.
`score_after`	`float`	Score after this optimization step.
`improved`	`bool`	True if score improved.
`total_fields`	`int`	Total number of fields being optimized.
`field_path`	`str \| None`	Dot-notation path of the field just optimized (None for non-field phases).
`field_index`	`int \| None`	1-based index of the field (None for non-field phases).
`elapsed_seconds`	`float`	Wall-clock seconds elapsed since optimization started.
`optimized_value`	`str \| None`	The optimized description or prompt text (None for non-field/non-prompt phases).

Example¶

The Example class represents a single example for optimization. It supports multiple input types:

Text: Plain text string or dictionary for prompt templates
Images: File path (image_path) or base64-encoded string (image_base64)
PDFs: File path (pdf_path) - automatically converted to images at specified DPI (default: 300)

PDFs are converted to images page by page for processing. Use the pdf_dpi parameter to control conversion quality (default: 300 DPI).
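To get a feel for what a DPI setting means in image size, the arithmetic is simply page size in inches multiplied by DPI. The sketch below is generic arithmetic and does not call the library; `page_pixels` is a hypothetical helper, and the actual rasterizer used for PDF conversion is not shown in this documentation.

```python
def page_pixels(width_in: float, height_in: float, dpi: int = 300) -> tuple[int, int]:
    """Return the pixel dimensions of a page rendered at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


# US Letter is 8.5 x 11 inches.
print(page_pixels(8.5, 11.0))       # (2550, 3300)
print(page_pixels(8.5, 11.0, 150))  # (1275, 1650)
```

Halving the DPI quarters the pixel count per page, which reduces image payload size at the expense of legibility for small print.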

OptimizationResult¶

The OptimizationResult dataclass contains the results of optimization:

optimized_descriptions: Dictionary mapping field paths to optimized descriptions
optimized_system_prompt: Optimized system prompt (if provided)
optimized_instruction_prompt: Optimized instruction prompt (if provided)
metrics: Dictionary containing optimization metrics
baseline_score: Baseline score before optimization
optimized_score: Score after optimization
api_calls: Total API calls made during optimization
total_tokens: Total tokens used during optimization

ExtractionResult¶

The ExtractionResult dataclass is returned by predict_with_confidence():

data: The extracted Pydantic model instance
confidence: Confidence score (0.0-1.0)
raw_output: Raw LLM output text (optional)

PrompterState¶

The PrompterState dataclass contains all information needed to save and restore a Prompter instance.

FieldOptimizationProgress¶

The FieldOptimizationProgress dataclass is emitted by the on_progress callback during optimization to track progress:

phase: Current optimization phase ("baseline", "fields", "skipped", "system_prompt", "instruction_prompt", "complete")
score_before: Score before this optimization step
score_after: Score after this optimization step
improved: Whether the score improved
total_fields: Total number of fields being optimized
field_path: Dot-notation path of the field being optimized (None for non-field phases)
field_index: 1-based index of the field (None for non-field phases)
elapsed_seconds: Wall-clock seconds elapsed since optimization started
optimized_value: The optimized description or prompt text (available since v0.1.3)

Usage with Callbacks¶

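from dspydantic.types import FieldOptimizationProgress

def my_progress_callback(progress: FieldOptimizationProgress) -> None:
    # Report each field optimization step as a before/after score.
    if progress.phase == "fields":
        print(f"{progress.field_path}: {progress.score_before:.0%} → {progress.score_after:.0%}")
        if progress.optimized_value:
            print(f"  Optimized to: {progress.optimized_value!r}")

# `optimizer` is an existing optimizer instance created elsewhere.
optimizer.on_progress = my_progress_callback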

When verbose=True, callbacks are invoked automatically and produce rich-formatted output that shows the optimized values.
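A callback can also collect events for later analysis instead of printing them. The following sketch simulates a few progress events with `SimpleNamespace` stand-ins that carry the documented attribute names; `record_progress` and `improved_fields` are hypothetical helpers, and no optimizer is actually run.

```python
from types import SimpleNamespace

history = []


def record_progress(progress) -> None:
    """Store every progress event for later inspection."""
    history.append(progress)


def improved_fields(events) -> dict:
    """Map each improved field path to its score gain."""
    return {
        e.field_path: e.score_after - e.score_before
        for e in events
        if e.phase == "fields" and e.improved and e.field_path is not None
    }


# Simulated events with made-up scores; a real optimizer would emit these.
simulated = [
    SimpleNamespace(phase="baseline", field_path=None,
                    score_before=0.5, score_after=0.5, improved=False),
    SimpleNamespace(phase="fields", field_path="name",
                    score_before=0.5, score_after=0.75, improved=True),
    SimpleNamespace(phase="fields", field_path="address.city",
                    score_before=0.75, score_after=0.75, improved=False),
]
for event in simulated:
    record_progress(event)

print(improved_fields(history))  # {'name': 0.25}
```

In real use, `record_progress` would be assigned to `on_progress` in the same way as the printing callback above.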
