feat: support simple conditional column generation #479

@nabinchha

Problem

DataDesigner's DAG executes every column for every row unconditionally. In multi-stage synthesis pipelines, expensive downstream generation (LLM calls, segmentation, etc.) runs even when an earlier gate column indicates the row should be filtered out.

Today the only workarounds are:

  1. Generate all columns unconditionally and post-filter — wasting LLM calls on rows that will be discarded
  2. Split into multiple DataDesigner.create() calls with intermediate filtering — losing single-pipeline ergonomics

Proposed Feature

Add a skip_when field to column configs that accepts a Jinja2 expression. When the expression evaluates to a truthy value for a row, generation of that column is skipped for that row and the cell is set to None. Skips should propagate automatically through the DAG: any downstream column that depends on a skipped column is also skipped, with no explicit configuration required.
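To make the intended semantics concrete, here is a minimal sketch of per-row skip evaluation and propagation. It is not DataDesigner's implementation: `Column`, `run_row`, `depends_on`, and `fake_generator` are hypothetical names, the columns are assumed to be listed in dependency order already, and a plain Python callable stands in for the Jinja2 `skip_when` expression.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence

Row = dict[str, Any]


@dataclass
class Column:
    """Hypothetical stand-in for a column config (not the real DataDesigner class)."""
    name: str
    generate: Callable[[Row], Any]
    depends_on: Sequence[str] = ()
    # Stand-in for the proposed Jinja2 expression: returns True to skip.
    skip_when: Optional[Callable[[Row], bool]] = None


def run_row(columns: Sequence[Column], seed: Row) -> tuple[Row, set[str]]:
    """Generate one row; `columns` is assumed to be in dependency order."""
    row = dict(seed)
    skipped: set[str] = set()
    for col in columns:
        # A column inherits a skip if any of its dependencies was skipped.
        inherited = any(dep in skipped for dep in col.depends_on)
        if inherited or (col.skip_when is not None and col.skip_when(row)):
            row[col.name] = None
            skipped.add(col.name)
        else:
            row[col.name] = col.generate(row)
    return row, skipped


calls: list[str] = []  # records which generators actually ran


def fake_generator(name: str, value_fn: Callable[[Row], Any]) -> Callable[[Row], Any]:
    """Wrap a value function so each (pretend) expensive call is recorded."""
    def _generate(row: Row) -> Any:
        calls.append(name)
        return value_fn(row)
    return _generate


columns = [
    Column(
        "complexity_score",
        fake_generator(
            "complexity_score",
            lambda r: {"overall_complexity_score": r["difficulty"]},
        ),
    ),
    Column(
        "categories",
        fake_generator("categories", lambda r: ["example-category"]),
        depends_on=("complexity_score",),
        skip_when=lambda r: r["complexity_score"]["overall_complexity_score"] < 6,
    ),
    Column(
        "instances",
        fake_generator("instances", lambda r: len(r["categories"])),
        depends_on=("categories",),
    ),
]

easy_row, easy_skipped = run_row(columns, {"difficulty": 3})
hard_row, hard_skipped = run_row(columns, {"difficulty": 8})
```

For the low-complexity row, only the `complexity_score` generator runs; `categories` is skipped by its own condition and `instances` is skipped by propagation, so both cells are None. One design question this sketch leaves open is whether a column should be able to opt out of an inherited skip.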

Example Use Case

config_builder.add_column(
    name="complexity_score", column_type="llm-structured", ...
)
config_builder.add_column(
    name="categories",
    column_type="llm-structured",
    skip_when="{{ complexity_score.overall_complexity_score < 6 }}",
    ...
)
# Everything downstream of categories auto-skips — no extra config needed
config_builder.add_column(name="instances", ...)
config_builder.add_column(name="multi_hop_query", ...)

Metadata

Labels: enhancement (New feature or request)
