Tracking Issue: expression projection pushdown #8190

@myrrc

One thing blocking us in benchmarks such as random access (and some ClickBench queries) is that DuckDB doesn't push down scalar projection expressions (SELECT len(str), list_sum(list)) or aggregate expressions (SELECT min(v), max(v)). We need a DuckDB optimizer pass (possibly contributed upstream) that collapses PROJECTION -> [FILTER] -> GET (where GET is the Vortex scan) into a call to a new Vortex function.

To keep it simple, the first implementation should push only chains of scalar functions that end in a BoundColumnRef:

void pushdown_projection_expression(const std::vector<std::pair<column_t, Expression>>& exprs)

column_t is the projection index of the column, or the column index if projections are empty.

For an expression to be pushed down, it must be supported by the bool pushdown_expression() function.
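
The "chain of scalar functions pointing to a BoundColumnRef" rule can be sketched as a small check. This is only an illustration under assumptions: Expr, column_ref, call, is_pushable_chain and the supported set are hypothetical stand-ins, not DuckDB or Vortex APIs, and the supported set plays the role of pushdown_expression().

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a bound expression tree; not the real DuckDB classes.
struct Expr {
    enum class Kind { ColumnRef, Function };
    Kind kind;
    std::string name;                         // function name, empty for column refs
    std::size_t column;                       // column index, meaningful for ColumnRef only
    std::vector<std::shared_ptr<Expr>> args;  // children, meaningful for Function only
};

std::shared_ptr<Expr> column_ref(std::size_t column) {
    return std::make_shared<Expr>(Expr{Expr::Kind::ColumnRef, "", column, {}});
}

std::shared_ptr<Expr> call(std::string name, std::vector<std::shared_ptr<Expr>> args) {
    return std::make_shared<Expr>(
        Expr{Expr::Kind::Function, std::move(name), 0, std::move(args)});
}

// True if expr has the shape f1(f2(...fn(column)...)): every function takes
// exactly one argument, every function name is in `supported`, and the chain
// ends in a column reference. A bare column ref returns false (nothing to push).
bool is_pushable_chain(const Expr& expr, const std::set<std::string>& supported) {
    const Expr* node = &expr;
    bool saw_function = false;
    while (node->kind == Expr::Kind::Function) {
        if (node->args.size() != 1 || supported.count(node->name) == 0) {
            return false;  // multi-argument or unsupported function breaks the chain
        }
        saw_function = true;
        node = node->args[0].get();
        assert(node != nullptr);  // the sketch assumes children are never null
    }
    return saw_function;
}
```

Under these assumptions, len(col) and sum(abs(col)) pass the check, while mean(col1 + col2) fails because + takes two arguments, even if both mean and + were supported.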

Example: select len(col) from table
projections = [0]
pushdown_projection_expression({0, Expr{len}})
Example: select col, len(col) from table
projections = [0, 0]
pushdown_projection_expression({1, Expr{len}})
Example: select min(col), max(col) from table (future work)
projections = [0, 0]
pushdown_projection_expression({0, Expr{min}}, {1, Expr{max}})
Example: select min(col), col, max(col) from table (future work)
projections = [0, 0, 0]
pushdown_projection_expression({0, Expr{min}}, {2, Expr{max}})
Example: select min(col), col, cantpush(col) from table
projections = [0, 0] // cantpush isn't supported, so duckdb requests only "col", inserts a #0, #1, #1 projection and calculates cantpush() on the second #1
pushdown_projection_expression({0, Expr{min}})
Example: select mean(col1 + col2), sum(abs(col2)) from table
No pushdown for mean() because its argument is an expression rather than a single column.
sum(abs(bound column)) is pushed:
pushdown_projection_expression({1, Expr{sum(abs())}})
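
One possible reading of how the projection indices in the single-column examples come about is sketched below. It is an interpretation, not a specification: SelectItem, Plan and plan_pushdown are hypothetical names, the pushable flag stands in for the result of pushdown_expression(), and items that read more than one column (such as mean(col1 + col2)) are deliberately not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified select item: one expression over exactly one column.
struct SelectItem {
    std::size_t column;  // table column the item reads
    std::string expr;    // "" for a bare column, otherwise e.g. "len" or "min"
    bool pushable;       // stand-in for the result of pushdown_expression()
};

struct Plan {
    std::vector<std::size_t> projections;                     // columns requested from the scan
    std::vector<std::pair<std::size_t, std::string>> pushed;  // (projection index, expression)
    std::vector<std::size_t> output;                          // projection slot read by each select item
};

Plan plan_pushdown(const std::vector<SelectItem>& items) {
    Plan plan;
    std::map<std::size_t, std::size_t> raw_slot;  // column -> slot holding its raw values
    for (const SelectItem& item : items) {
        assert(!(item.expr.empty() && item.pushable));  // a bare column has nothing to push
        if (item.pushable) {
            // A pushed expression gets its own slot; the scan evaluates it there.
            const std::size_t slot = plan.projections.size();
            plan.projections.push_back(item.column);
            plan.pushed.emplace_back(slot, item.expr);
            plan.output.push_back(slot);
            continue;
        }
        // Bare columns and unsupported expressions share one raw slot per column;
        // the engine evaluates any unsupported expression on top of that slot.
        auto found = raw_slot.find(item.column);
        if (found == raw_slot.end()) {
            found = raw_slot.emplace(item.column, plan.projections.size()).first;
            plan.projections.push_back(item.column);
        }
        plan.output.push_back(found->second);
    }
    return plan;
}
```

With this reading, select min(col), col, cantpush(col) yields projections [0, 0], a single pushed pair (0, min), and output slots #0, #1, #1, which matches the corresponding example above.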

Labels: ext/duckdb (Relates to the DuckDB integration), tracking-issue (Shared implementation context for work likely to span multiple PRs)