Skip to content

Conversation

@Jefffrey
Copy link
Contributor

@Jefffrey Jefffrey commented Dec 28, 2025

Which issue does this PR close?

Rationale for this change

Found lots of code duplicated here, so unifying them.

What changes are included in this PR?

Introduce new trait UDFCoercionExt to unify functions across scalar/aggregate/window UDFs for use in a single generic implementation. New unified functions fields_with_udf and get_valid_types_with_udf which are generic across this new trait.

Are these changes tested?

Existing tests.

Are there any user-facing changes?

Yes, deprecating functions.

@github-actions github-actions bot added logical-expr Logical plan and expressions physical-expr Changes to the physical-expr crates optimizer Optimizer rules sqllogictest SQL Logic Tests (.slt) ffi Changes to the ffi crate labels Dec 28, 2025
Comment on lines 143 to 152
let arg_fields = arg_types
.iter()
.map(|dt| Field::new("f", dt.clone(), true))
.map(Arc::new)
.collect::<Vec<_>>();
let return_types =
rresult_return!(data_types_with_scalar_udf(&arg_types, udf.inner()));
rresult_return!(fields_with_udf(&arg_fields, udf.inner().as_ref()))
.into_iter()
.map(|f| f.data_type().to_owned())
.collect::<Vec<_>>();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as how UDAF handles this:

unsafe extern "C" fn coerce_types_fn_wrapper(
udaf: &FFI_AggregateUDF,
arg_types: RVec<WrappedSchema>,
) -> FFIResult<RVec<WrappedSchema>> {
unsafe {
let udaf = udaf.inner();
let arg_types = rresult_return!(rvec_wrapped_to_vec_datatype(&arg_types));
let arg_fields = arg_types
.iter()
.map(|dt| Field::new("f", dt.clone(), true))
.map(Arc::new)
.collect::<Vec<_>>();
let return_types = rresult_return!(fields_with_aggregate_udf(&arg_fields, udaf))
.into_iter()
.map(|f| f.data_type().to_owned())
.collect::<Vec<_>>();
rresult!(vec_datatype_to_rvec_wrapped(&return_types))
}
}

///
/// For more details on coercion in general, please see the
/// [`type_coercion`](crate::type_coercion) module.
#[deprecated(since = "52.0.0", note = "use fields_with_udf")]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I'm of the mind to just remove these functions completely

/// For more details on coercion in general, please see the
/// [`type_coercion`](crate::type_coercion) module.
#[deprecated(since = "52.0.0", note = "use fields_with_udf")]
pub fn data_types(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function was only ever used in unit tests below 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if its pub it still might be used elsewhere beyond DF engine


matches!(
type_signature,
match type_signature {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using match here makes it more obvious which variants are not well supported

);
}
},
TypeSignature::OneOf(signatures) => signatures
Copy link
Contributor Author

@Jefffrey Jefffrey Dec 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scalar fn version actually has extra functionality in that it collects the errors together for oneof; unifying this code with aggregate/window udfs brings in this functionality for them, hence SLT fix was needed

@github-actions github-actions bot added the core Core DataFusion crate label Dec 28, 2025
@Jefffrey Jefffrey marked this pull request as ready for review December 28, 2025 10:50
}

let current_types = expressions
let current_fields = expressions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

}

impl UDFCoercionExt for ScalarUDF {
fn name(&self) -> &str {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its bad the specialization feature is in nightly releases, so we could get rid of those parts

Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Jefffrey


# WindowFunction wrong signature
statement error DataFusion error: Error during planning: Failed to coerce arguments to satisfy a call to 'nth_value' function: coercion from Int32, Int64, Int64 to the signature OneOf\(\[Any\(0\), Any\(1\), Any\(2\)\]\) failed
statement error DataFusion error: Error during planning: Internal error: Function 'nth_value' failed to match any signature
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really an Internal error ?
It is an error caused by the user's input.

Copy link
Contributor Author

@Jefffrey Jefffrey Dec 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I aim to tackle in #19004

// Every signature failed, return the joined error
if res.is_empty() {
internal_err!(
return internal_err!(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be a plan_err!()

Copy link
Contributor Author

@Jefffrey Jefffrey Dec 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what I aim to tackle in #19004

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate ffi Changes to the ffi crate logical-expr Logical plan and expressions optimizer Optimizer rules physical-expr Changes to the physical-expr crates sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants