-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Refactor duplicate code in type_coercion/functions.rs
#19518
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| let arg_fields = arg_types | ||
| .iter() | ||
| .map(|dt| Field::new("f", dt.clone(), true)) | ||
| .map(Arc::new) | ||
| .collect::<Vec<_>>(); | ||
| let return_types = | ||
| rresult_return!(data_types_with_scalar_udf(&arg_types, udf.inner())); | ||
| rresult_return!(fields_with_udf(&arg_fields, udf.inner().as_ref())) | ||
| .into_iter() | ||
| .map(|f| f.data_type().to_owned()) | ||
| .collect::<Vec<_>>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as how UDAF handles this:
datafusion/datafusion/ffi/src/udaf/mod.rs
Lines 329 to 350 in e5ca510
| unsafe extern "C" fn coerce_types_fn_wrapper( | |
| udaf: &FFI_AggregateUDF, | |
| arg_types: RVec<WrappedSchema>, | |
| ) -> FFIResult<RVec<WrappedSchema>> { | |
| unsafe { | |
| let udaf = udaf.inner(); | |
| let arg_types = rresult_return!(rvec_wrapped_to_vec_datatype(&arg_types)); | |
| let arg_fields = arg_types | |
| .iter() | |
| .map(|dt| Field::new("f", dt.clone(), true)) | |
| .map(Arc::new) | |
| .collect::<Vec<_>>(); | |
| let return_types = rresult_return!(fields_with_aggregate_udf(&arg_fields, udaf)) | |
| .into_iter() | |
| .map(|f| f.data_type().to_owned()) | |
| .collect::<Vec<_>>(); | |
| rresult!(vec_datatype_to_rvec_wrapped(&return_types)) | |
| } | |
| } |
| /// | ||
| /// For more details on coercion in general, please see the | ||
| /// [`type_coercion`](crate::type_coercion) module. | ||
| #[deprecated(since = "52.0.0", note = "use fields_with_udf")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I'm of the mind to just remove these functions completely
| /// For more details on coercion in general, please see the | ||
| /// [`type_coercion`](crate::type_coercion) module. | ||
| #[deprecated(since = "52.0.0", note = "use fields_with_udf")] | ||
| pub fn data_types( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function was only ever used in unit tests below 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if its pub it still might be used elsewhere beyond DF engine
|
|
||
| matches!( | ||
| type_signature, | ||
| match type_signature { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using match here makes it more obvious which variants are not well supported
| ); | ||
| } | ||
| }, | ||
| TypeSignature::OneOf(signatures) => signatures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The scalar fn version actually has extra functionality in that it collects the errors together for oneof; unifying this code with aggregate/window udfs brings in this functionality for them, hence SLT fix was needed
| } | ||
|
|
||
| let current_types = expressions | ||
| let current_fields = expressions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
| } | ||
|
|
||
| impl UDFCoercionExt for ScalarUDF { | ||
| fn name(&self) -> &str { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its bad the specialization feature is in nightly releases, so we could get rid of those parts
comphead
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Jefffrey
|
|
||
| # WindowFunction wrong signature | ||
| statement error DataFusion error: Error during planning: Failed to coerce arguments to satisfy a call to 'nth_value' function: coercion from Int32, Int64, Int64 to the signature OneOf\(\[Any\(0\), Any\(1\), Any\(2\)\]\) failed | ||
| statement error DataFusion error: Error during planning: Internal error: Function 'nth_value' failed to match any signature |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it really an Internal error ?
It is an error caused by the user's input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what I aim to tackle in #19004
| // Every signature failed, return the joined error | ||
| if res.is_empty() { | ||
| internal_err!( | ||
| return internal_err!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be a plan_err!()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what I aim to tackle in #19004
Which issue does this PR close?
Rationale for this change
Found lots of code duplicated here, so unifying them.
What changes are included in this PR?
Introduce new trait
UDFCoercionExtto unify functions across scalar/aggregate/window UDFs for use in a single generic implementation. New unified functionsfields_with_udfandget_valid_types_with_udfwhich are generic across this new trait.Are these changes tested?
Existing tests.
Are there any user-facing changes?
Yes, deprecating functions.