feat: support simple conditional column generation #479
Open
Labels
enhancement (New feature or request)
Description
Problem
DataDesigner's DAG executes every column for every row unconditionally. In multi-stage synthesis pipelines, expensive downstream generation (LLM calls, segmentation, etc.) runs even when an earlier gate column indicates the row should be filtered out.
Today the only workarounds are:
- Generate all columns unconditionally and post-filter, wasting LLM calls on rows that will be discarded
- Split into multiple DataDesigner.create() calls with intermediate filtering, losing single-pipeline ergonomics
Proposed Feature
Add a skip_when field to column configs that accepts a Jinja2 expression. When the expression evaluates truthy for a row, generation is skipped and the cell is set to None. Skips should auto-propagate through the DAG — downstream columns that depend on a skipped column also skip without requiring explicit configuration.
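The intended semantics can be sketched in plain Python. This is a minimal illustration, not DataDesigner's implementation: `Column`, `run_row`, and `make_pipeline` are hypothetical names, a Python callable stands in for the Jinja2 expression, and the dependency edges (`instances` on `categories`, `multi_hop_query` on `instances`) are assumed for the sake of the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

Row = dict[str, Any]


@dataclass
class Column:
    """Hypothetical stand-in for a column config (not DataDesigner's class)."""
    name: str
    generate: Callable[[Row], Any]
    depends_on: tuple[str, ...] = ()
    # Stand-in for the proposed Jinja2 `skip_when` expression.
    skip_when: Optional[Callable[[Row], bool]] = None


def run_row(columns: list[Column]) -> Row:
    """Generate one row; `columns` must already be in topological order."""
    row: Row = {}
    skipped: set[str] = set()
    for col in columns:
        # Auto-propagation: skip if any upstream dependency was skipped.
        upstream_skipped = any(dep in skipped for dep in col.depends_on)
        # Explicit skip: evaluate the column's own condition against the row.
        explicitly_skipped = (
            not upstream_skipped
            and col.skip_when is not None
            and bool(col.skip_when(row))
        )
        if upstream_skipped or explicitly_skipped:
            skipped.add(col.name)
            row[col.name] = None  # skipped cells are set to None
            continue
        row[col.name] = col.generate(row)
    return row


def make_pipeline(score: int, calls: list[str]) -> list[Column]:
    """Build the four-column example; `calls` records which generators ran."""
    def tracked(name: str, value: Any) -> Callable[[Row], Any]:
        def generate(row: Row) -> Any:
            calls.append(name)  # stands in for an expensive LLM call
            return value
        return generate

    return [
        Column("complexity_score",
               tracked("complexity_score", {"overall_complexity_score": score})),
        Column("categories", tracked("categories", ["example"]),
               depends_on=("complexity_score",),
               skip_when=lambda r: r["complexity_score"]["overall_complexity_score"] < 6),
        Column("instances", tracked("instances", ["example"]),
               depends_on=("categories",)),
        Column("multi_hop_query", tracked("multi_hop_query", "example"),
               depends_on=("instances",)),
    ]
```

With a score below 6, only the `complexity_score` generator runs and the three downstream cells are None. With a score of 6 or higher, all four generators run.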
Example Use Case
config_builder.add_column(
    name="complexity_score", column_type="llm-structured", ...
)
config_builder.add_column(
    name="categories",
    column_type="llm-structured",
    skip_when="{{ complexity_score.overall_complexity_score < 6 }}",
    ...
)
# Everything downstream of categories auto-skips; no extra config needed
config_builder.add_column(name="instances", ...)
config_builder.add_column(name="multi_hop_query", ...)