-
Notifications
You must be signed in to change notification settings - Fork 0
20070: Standardize UDF coercion error messages #221
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThis pull request refactors error handling for function type coercion in DataFusion. Core changes include introducing a new generic ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @martin-augment, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the user experience when encountering type coercion errors with functions, particularly User-Defined Functions (UDFs). By standardizing and enriching the error messages, it provides clearer feedback to users about why their function calls failed due to type mismatches. The new approach distinguishes between user-defined and built-in functions, offering specific guidance like candidate signatures for the latter, and presenting the original, unadulterated error for the former, thereby streamlining the debugging process and reducing user frustration. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
Code Review: Standardize UDF coercion error messagesThank you for this PR! This is a solid improvement to error messaging consistency across DataFusion. Here's my review: ✅ Strengths
🔍 Observations
🤔 Suggestions
🔒 Security & Performance
📊 Overall AssessmentThis is a well-executed PR that improves user experience through better error messages. The refactoring reduces code duplication and centralizes error formatting logic. The comprehensive test updates demonstrate thoroughness. Recommendation: ✅ Approve with minor suggestion to address the The single improvement I'd suggest is making the error prefix stripping more defensive, but this is a minor issue that doesn't block the PR. |
🤖 Augment PR SummarySummary: This PR standardizes type-coercion error messages for UDF-like functions across planning paths. Changes:
Technical Notes: Candidate signatures are rendered via 🤖 Was this summary useful? React with 👍 or 👎 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| err @ DataFusionError::Plan(_) => err | ||
| .strip_backtrace() | ||
| .strip_prefix("Error during planning: ") | ||
| .unwrap() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
strip_prefix("Error during planning: ").unwrap() can panic if the string format ever differs (or if the prefix is already stripped), turning an error-reporting path into a crash. Consider avoiding unwrap here (e.g., extract the inner Plan message directly) so planning failures remain non-panicking.
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
value:useful; category:bug; feedback: The Augment AI reviewer is correct! Error reporting should not cause crashes, ever. It is good to make a an attempt to cleanup the message but it should be in a safer way, as the proposed change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@datafusion/expr/src/utils.rs`:
- Around line 994-1002: The match arm handling DataFusionError::Plan can panic
because strip_prefix(...).unwrap() assumes the prefix exists; update the
DataFusionError::Plan branch (where original_error is matched) to call
strip_backtrace(), then attempt strip_prefix("Error during planning: ") safely
by using map/and_then or an if let to take the stripped result when present and
otherwise fall back to the unstripped string (i.e., replace the
.strip_prefix(...).unwrap().to_string() with logic that uses
.strip_prefix(...).map(|s| s.to_string()).unwrap_or_else(|| <original stripped
value>.to_string())). Keep the other branch using err.strip_backtrace()
unchanged and ensure the final original_error remains a String.
| let original_error = match original_error { | ||
| // Since we're returning a Plan error we don't want any double nesting. | ||
| // TODO: is there a better way to strip backtrace & the planning prefix? | ||
| err @ DataFusionError::Plan(_) => err | ||
| .strip_backtrace() | ||
| .strip_prefix("Error during planning: ") | ||
| .unwrap() | ||
| .to_string(), | ||
| err => err.strip_backtrace(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid potential panic when stripping the planning prefix.
strip_prefix(...).unwrap() will panic if the prefix is absent (e.g., a DataFusionError::Plan carrying a raw message without the standard prefix). Safer to fall back to the unstripped message.
🔧 Suggested fix
- let original_error = match original_error {
- // Since we're returning a Plan error we don't want any double nesting.
- // TODO: is there a better way to strip backtrace & the planning prefix?
- err @ DataFusionError::Plan(_) => err
- .strip_backtrace()
- .strip_prefix("Error during planning: ")
- .unwrap()
- .to_string(),
- err => err.strip_backtrace(),
- };
+ let original_error = match original_error {
+ // Since we're returning a Plan error we don't want any double nesting.
+ // TODO: is there a better way to strip backtrace & the planning prefix?
+ err @ DataFusionError::Plan(_) => {
+ let msg = err.strip_backtrace();
+ msg.strip_prefix("Error during planning: ")
+ .unwrap_or(&msg)
+ .to_string()
+ }
+ err => err.strip_backtrace(),
+ };🤖 Prompt for AI Agents
In `@datafusion/expr/src/utils.rs` around lines 994 - 1002, The match arm handling
DataFusionError::Plan can panic because strip_prefix(...).unwrap() assumes the
prefix exists; update the DataFusionError::Plan branch (where original_error is
matched) to call strip_backtrace(), then attempt strip_prefix("Error during
planning: ") safely by using map/and_then or an if let to take the stripped
result when present and otherwise fall back to the unstripped string (i.e.,
replace the .strip_prefix(...).unwrap().to_string() with logic that uses
.strip_prefix(...).map(|s| s.to_string()).unwrap_or_else(|| <original stripped
value>.to_string())). Keep the other branch using err.strip_backtrace()
unchanged and ensure the final original_error remains a String.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
value:useful; category:bug; feedback: The CodeRabbit AI reviewer is correct! Error reporting should not cause crashes, ever. It is good to make a an attempt to cleanup the message but it should be in a safer way, as the proposed change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request standardizes the error messages for UDF type coercion, which is a great improvement for user experience. The core of the change is centralizing error message generation in a new utils::generate_signature_error_message function. This provides more informative and consistent errors. The related refactoring to simplify error propagation using the ? operator is also a good cleanup. I have one suggestion to improve the robustness of the new error handling logic.
| err @ DataFusionError::Plan(_) => err | ||
| .strip_backtrace() | ||
| .strip_prefix("Error during planning: ") | ||
| .unwrap() | ||
| .to_string(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of .unwrap() here could cause a panic if the prefix is not found in the error string. While it seems safe with the current implementation of DataFusionError::Plan, it's better to avoid unwrap() in production code to make it more robust against future changes. The TODO comment also indicates this might be a concern. I suggest handling the None case gracefully.
err @ DataFusionError::Plan(_) => {
let s = err.strip_backtrace();
if let Some(unprefixed) = s.strip_prefix("Error during planning: ") {
unprefixed.to_string()
} else {
s
}
},There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
value:useful; category:bug; feedback: The Gemini AI reviewer is correct! Error reporting should not cause crashes, ever. It is good to make a an attempt to cleanup the message but it should be in a safer way, as the proposed change.
value:useful; category:bug; feedback: The Claude AI reviewer is correct! Error reporting should not cause crashes, ever. It is good to make a an attempt to cleanup the message but it should be in a safer way, as the proposed change. |
20070: To review by AI