Reorder boolean expressions (including filter predicates) according to evaluation cost / selectivity #11262

Open
Dandandan opened this issue Jul 4, 2024 · 3 comments
Labels
enhancement New feature or request

Comments

@Dandandan
Contributor

Dandandan commented Jul 4, 2024

Is your feature request related to a problem or challenge?

After #11247 is merged we can look at ordering the boolean expressions according to a measure of evaluation cost.

Describe the solution you'd like

We can reorder expressions:

For example, take an expression like the following:
URL LIKE '%google%' AND code = 404

It would likely be better reordered to code = 404 AND URL LIKE '%google%' to benefit most from short-circuiting, since code = 404 is less expensive to evaluate.
One could also combine this with an estimate of selectivity to further optimize the order: with low selectivity, batches are more likely to be all false; with high selectivity, batches are more likely to be all true.
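
To make the cost/selectivity idea concrete, here is a minimal sketch of one possible ranking. It is not DataFusion code: the `Conjunct` struct, the `rank` formula and all the numbers are assumptions for illustration. It uses the textbook per-row model (cost divided by the fraction of rows rejected, assuming independent predicates), whereas the short-circuiting discussed here works per batch, so treat it as an approximation only.

```rust
/// Hypothetical description of one conjunct in an AND chain.
/// Neither the type nor the numbers come from DataFusion.
#[derive(Debug, Clone, PartialEq)]
struct Conjunct {
    sql: &'static str,
    /// Assumed relative cost of evaluating the predicate on one row.
    cost: f64,
    /// Assumed fraction of rows for which the predicate is true (0.0..=1.0).
    selectivity: f64,
}

impl Conjunct {
    /// Cost paid per rejected row; lower values should be evaluated first.
    fn rank(&self) -> f64 {
        let rejected = 1.0 - self.selectivity;
        if rejected <= 0.0 {
            // A predicate that never rejects anything gains nothing from going first.
            f64::INFINITY
        } else {
            self.cost / rejected
        }
    }
}

/// Orders conjuncts by ascending rank. The sort is stable, so predicates
/// with equal rank keep the order the user wrote them in.
fn reorder_conjuncts(mut conjuncts: Vec<Conjunct>) -> Vec<Conjunct> {
    conjuncts.sort_by(|a, b| a.rank().total_cmp(&b.rank()));
    conjuncts
}

fn main() {
    let ordered = reorder_conjuncts(vec![
        // Made-up estimates: LIKE is assumed to be ~20x the cost of an equality check.
        Conjunct { sql: "URL LIKE '%google%'", cost: 20.0, selectivity: 0.10 },
        Conjunct { sql: "code = 404", cost: 1.0, selectivity: 0.05 },
    ]);
    for c in &ordered {
        println!("{}", c.sql);
    }
}
```

With these made-up estimates the cheap equality check sorts ahead of the LIKE predicate, matching the reordering suggested above.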

Describe alternatives you've considered

No response

Additional context

No response

@Dandandan Dandandan added the enhancement New feature or request label Jul 4, 2024
@Dandandan Dandandan changed the title Reorder filter predicates according to evaluation cost Reorder boolean expressions (including filter predicates) according to evaluation cost Jul 4, 2024
@Dandandan Dandandan changed the title Reorder boolean expressions (including filter predicates) according to evaluation cost Reorder boolean expressions (including filter predicates) according to evaluation cost / selectivity Jul 4, 2024
@suibianwanwank
Contributor

I've seen discussions about predicate reordering in the Calcite community before. One of the big problems is that when the engine reorders predicates, it overrides the order the user chose. A user who understands our short-circuit optimisation may deliberately write the SQL with the predicates in a better order, and engine reordering would undo that effort.

@Dandandan
Contributor Author

> I've seen discussions about predicate reordering in the Calcite community before. One of the big problems is that when the engine reorders predicates, it overrides the order the user chose. A user who understands our short-circuit optimisation may deliberately write the SQL with the predicates in a better order, and engine reordering would undo that effort.

Good call. If we do it, it needs to be configurable so users/engines can disable the optimization.

@alamb
Contributor

alamb commented Jul 5, 2024

We could potentially do some simple heuristics that would catch the common case -- like "treat regexp as very slow and do them after other predicates"
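
A rough sketch of what such a heuristic might look like, combined with the opt-out discussed earlier in the thread. Everything here is hypothetical: the `Predicate` enum stands in for a real expression type, the cost tiers are guesses, and the `enabled` flag is an assumed configuration switch, not an existing DataFusion option.

```rust
/// Hypothetical, simplified predicate kinds; this is not DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Predicate {
    Compare(&'static str),
    Like(&'static str),
    Regexp(&'static str),
}

/// Assumed cost tiers: lower tiers are evaluated first.
fn cost_tier(p: &Predicate) -> u8 {
    match p {
        Predicate::Compare(_) => 0, // cheap comparisons
        Predicate::Like(_) => 1,    // pattern matching, assumed slower
        Predicate::Regexp(_) => 2,  // treated as very slow, evaluated last
    }
}

/// Moves expensive predicates behind cheap ones. `enabled` is an assumed
/// switch that lets users keep their hand-written order.
fn reorder(mut conjuncts: Vec<Predicate>, enabled: bool) -> Vec<Predicate> {
    if enabled {
        // Stable sort: predicates in the same tier keep the user's order.
        conjuncts.sort_by_key(|p| cost_tier(p));
    }
    conjuncts
}

fn main() {
    let input = vec![
        Predicate::Regexp("URL ~ 'goo+gle'"),
        Predicate::Compare("code = 404"),
        Predicate::Like("URL LIKE '%google%'"),
    ];
    println!("{:?}", reorder(input.clone(), true));
    println!("{:?}", reorder(input, false));
}
```

Because the sort is stable, a user's hand-tuned order is preserved within each tier, and turning the flag off leaves the expression exactly as written.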
