fix(query): fold constant subquery to build filter plan instead of join plan #17448

Open: 5 commits to be merged into main

b41sh (Member) commented Feb 13, 2025

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

Fold a subquery that returns constant values into constant scalars, so that the planner builds a filter plan instead of a join plan. The resulting filter can be pushed down to the table scan and evaluated against indexes, which avoids reading the full data set and speeds up the query.
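
To illustrate why a pushed-down filter can avoid reading the full data set, here is a minimal Python sketch of min/max pruning. It is not Databend's pruner: `BlockStats`, `may_contain`, and `prune_blocks` are hypothetical names, and the per-block min/max statistics are an assumed index format chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlockStats:
    """Hypothetical per-block statistics for a single integer column."""
    name: str
    min_value: int
    max_value: int


def may_contain(block: BlockStats, value: int) -> bool:
    # A block can only hold `value` if it lies inside the block's [min, max] range.
    return block.min_value <= value <= block.max_value


def prune_blocks(blocks: List[BlockStats], constants: List[int]) -> List[BlockStats]:
    # Keep a block if at least one constant from the folded filter could be in it.
    # Blocks that cannot match any constant are never read.
    return [b for b in blocks if any(may_contain(b, v) for v in constants)]


blocks = [
    BlockStats("b0", 0, 99),
    BlockStats("b1", 100, 199),
    BlockStats("b2", 200, 299),
]
kept = prune_blocks(blocks, [1, 2])
print([b.name for b in kept])
```

With a join plan the scan has no constants to compare against such statistics, so every block would be read and the filtering would happen later in the join.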

For example, on main:

root@0.0.0.0:48000/default> explain select * from numbers(10) as t where t.number in (select * from unnest([1,2]));
-[ EXPLAIN ]-----------------------------------
HashJoin
├── output columns: [t.number (#0)]
├── join type: LEFT SEMI
├── build keys: [CAST(subquery_2 (#2) AS UInt64 NULL)]
├── probe keys: [CAST(t.number (#0) AS UInt64 NULL)]
├── filters: []
├── estimated rows: 10.00
├── EvalScalar(Build)
│   ├── output columns: [unnest([1, 2]) (#2)]
│   ├── expressions: [get(1)(unnest([1, 2]) (#1))]
│   ├── estimated rows: 3.00
│   └── ProjectSet
│       ├── output columns: [unnest([1, 2]) (#1)]
│       ├── estimated rows: 3.00
│       ├── set returning functions: unnest([1, 2])
│       └── DummyTableScan
└── TableScan(Probe)
    ├── table: default.system.numbers
    ├── output columns: [number (#0)]
    ├── read rows: 10
    ├── read size: < 1 KiB
    ├── partitions total: 1
    ├── partitions scanned: 1
    ├── push downs: [filters: [], limit: NONE]
    └── estimated rows: 10.00

25 rows explain in 0.005 sec. Processed 0 rows, 0 B (0 row/s, 0 B/s)

With this PR:

root@0.0.0.0:48000/default> explain select * from numbers(10) as t where t.number in (select * from unnest([1,2]));
-[ EXPLAIN ]-----------------------------------
Filter
├── output columns: [t.number (#0)]
├── filters: [(t.number (#0) = 1 OR t.number (#0) = 2)]
├── estimated rows: 0.01
└── TableScan
    ├── table: default.system.numbers
    ├── output columns: [number (#0)]
    ├── read rows: 10
    ├── read size: < 1 KiB
    ├── partitions total: 1
    ├── partitions scanned: 1
    ├── push downs: [filters: [(numbers.number (#0) = 1 OR numbers.number (#0) = 2)], limit: NONE]
    └── estimated rows: 10.00

13 rows explain in 0.007 sec. Processed 0 rows, 0 B (0 row/s, 0 B/s)
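
The rewrite shown in the two plans above can be sketched as follows. This is a simplified Python illustration, not the code in this PR: `Column`, `Eq`, `Or`, `fold_in_subquery`, and `render` are hypothetical names, and the sketch assumes the subquery has already been evaluated to a list of integer constants. Returning `None` stands for "keep the join plan", which this sketch also does for an empty result.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Eq:
    column: Column
    value: int


@dataclass(frozen=True)
class Or:
    left: "Predicate"
    right: "Predicate"


Predicate = Union[Eq, Or]


def fold_in_subquery(column: Column, constant_rows: Optional[List[int]]) -> Optional[Predicate]:
    """Turn `column IN (<constant subquery>)` into a chain of OR-ed equalities.

    `constant_rows` is None when the subquery is not constant (assumption:
    the caller decides this). In that case, and for an empty result,
    nothing is folded and the caller keeps the join plan.
    """
    if not constant_rows:
        return None
    predicate: Predicate = Eq(column, constant_rows[0])
    for value in constant_rows[1:]:
        predicate = Or(predicate, Eq(column, value))
    return predicate


def render(predicate: Predicate) -> str:
    # Render the predicate in a form similar to the EXPLAIN output.
    if isinstance(predicate, Eq):
        return f"{predicate.column.name} = {predicate.value}"
    return f"({render(predicate.left)} OR {render(predicate.right)})"


folded = fold_in_subquery(Column("t.number"), [1, 2])
print(render(folded))
```

Because the result is an ordinary scalar predicate, it can be placed in a Filter node and pushed down to the TableScan, as the `push downs` line in the second plan shows.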

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):

github-actions bot added the pr-bugfix label (this PR patches a bug in codebase) on Feb 13, 2025
b41sh requested a review from sundy-li on February 13, 2025 07:28
b41sh marked this pull request as ready for review on February 13, 2025 07:29
b41sh requested a review from Dousir9 on February 13, 2025 07:29
b41sh requested a review from sundy-li on February 14, 2025 08:12