-
Notifications
You must be signed in to change notification settings - Fork 0
20097: Improve performance of CASE WHEN x THEN y ELSE NULL expressions
#225
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThis change modifies the CASE expression evaluation logic to handle scenarios where no else clause is present. The implementation introduces a scatter-based approach for single-branch CASE expressions without an else, consolidates optional else-branch handling across multiple evaluation paths, and reduces unnecessary else-expression evaluation. The code now branches on the presence or absence of an else expression, directing evaluation to either the existing merge-based logic (when else is present) or a new scatter-based result path (when else is absent). A new import for the scatter utility is added to support this change. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @martin-augment, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the performance and robustness of Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
PR Review: Improve performance of
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a performance improvement for CASE WHEN ... THEN ... END expressions by adding a specialized evaluation path that avoids an ELSE clause. The changes correctly handle the absence of an ELSE expression in several parts of the CASE expression evaluation logic, preventing potential panics and using the efficient scatter utility for constructing the result. The overall implementation is sound and enhances performance for this common SQL pattern. I have one suggestion for improving code consistency and robustness.
| let else_expr = | ||
| try_cast(Arc::clone(else_expr), &batch.schema(), return_type.clone()) | ||
| .unwrap_or_else(|_| Arc::clone(else_expr)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using unwrap_or_else here to ignore a potential error from try_cast seems risky. If try_cast fails, it likely indicates a bug in the planner where types were not correctly coerced. By ignoring the error and using the original expression, a subsequent error might occur during merge which would be harder to debug.
Other parts of this file handle the result of try_cast using ? to propagate the error (e.g., lines 745, 848, 936). For consistency and robustness, it would be better to do the same here.
| let else_expr = | |
| try_cast(Arc::clone(else_expr), &batch.schema(), return_type.clone()) | |
| .unwrap_or_else(|_| Arc::clone(else_expr)); | |
| let else_expr = try_cast(Arc::clone(else_expr), &batch.schema(), return_type.clone())?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
value:good-to-have; category:bug; feedback: The Gemini AI reviewer is correct! The type of the else expression is expected to match the batch's schema, so there is no need to fallback.
🤖 Augment PR SummarySummary: Improves physical evaluation of single-branch Changes:
Technical Notes: Keeps the existing merge-based path when an 🤖 Was this summary useful? React with 👍 or 👎 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| { | ||
| EvalMethod::ScalarOrScalar | ||
| } else if body.when_then_expr.len() == 1 && body.else_expr.is_some() { | ||
| } else if body.when_then_expr.len() == 1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change makes EvalMethod::ExpressionOrExpression apply even when else_expr is None (i.e. CASE WHEN … THEN … END). If that’s intentional, consider updating the ExpressionOrExpression doc comment to reflect that ELSE is now optional for this eval method.
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
value:good-to-have; category:bug; feedback: The Augment AI reviewer is correct! The docstring of ExpressionOrExpression mentions both the then and the else expressions. The proposed change makes the else expression optional and this requires an update of the docstring too.
| match &self.else_expr { | ||
| None => { | ||
| let then_array = then_value.to_array(when_value.true_count())?; | ||
| scatter(&when_value, then_array.as_ref()).map(ColumnarValue::Array) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new else_expr == None fast-path relies on scatter to reconstruct the full output, but I don’t see a test that specifically exercises this path when the then expression is not eligible for InfallibleExprOrNull. Consider adding a regression test for CASE WHEN <cond> THEN <non-infallible expr> END (or ELSE NULL) to cover mask/null semantics.
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
value:useful; category:bug; feedback: The Augment AI reviewer is correct! There are neither old unit tests for this case nor new ones. Prevents adding new functionality without test cases which could lead to unnoticed regressions in the future.
|
@CodeRabbit review |
✅ Actions performedReview triggered.
|
value:useful; category:bug; feedback: The Claude AI reviewer is correct! There are neither old unit tests for this case nor new ones. Prevents adding new functionality without test cases which could lead to unnoticed regressions in the future. |
20097: To review by AI