Kontinuation
Member

Which issue does this PR close?

Rationale for this change

The metadata of fields created by a SQL VALUES expression was lost when the LogicalPlan::Values logical plan node was created. This patch preserves the field metadata.

What changes are included in this PR?

This patch takes the metadata of the fields in a VALUES expression into account, so the schema of the VALUES expression contains fields with the proper metadata.
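To illustrate the idea, here is a minimal sketch of carrying literal metadata into the derived schema. It is a toy model, not DataFusion's code: `Metadata`, `Literal`, `Field`, and `values_fields` are hypothetical stand-ins, and the `columnN` naming is an assumption made for the example.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for Arrow field metadata (string key/value pairs).
type Metadata = BTreeMap<String, String>;

/// Hypothetical stand-in for a literal expression in a VALUES row.
#[derive(Debug, Clone)]
struct Literal {
    data_type: &'static str,
    metadata: Metadata,
}

/// Hypothetical stand-in for a schema field.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: &'static str,
    metadata: Metadata,
}

/// Builds one field per column from a VALUES row.
/// The point of the sketch: the literal's metadata is copied onto the field
/// instead of being dropped.
fn values_fields(row: &[Literal]) -> Vec<Field> {
    row.iter()
        .enumerate()
        .map(|(i, lit)| Field {
            // Assumed naming scheme for the example only.
            name: format!("column{}", i + 1),
            data_type: lit.data_type,
            metadata: lit.metadata.clone(),
        })
        .collect()
}
```

With this model, a literal tagged with an `ARROW:extension:name` entry yields a field that still carries that entry, while a plain literal yields a field with empty metadata.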

Are these changes tested?

  1. Added a unit test
  2. Added the repro in the issue to the user_defined_scalar_functions test

Are there any user-facing changes?

No public API changes.

@github-actions github-actions bot added the logical-expr (Logical plan and expressions) and core (Core DataFusion crate) labels Sep 11, 2025
@Kontinuation Kontinuation marked this pull request as ready for review September 11, 2025 17:48
@Kontinuation
Member Author

CC @paleolimbot

Member

@paleolimbot paleolimbot left a comment

Thank you for looking into this!

Comment on lines 312 to 315
common_metadata = FieldMetadata::merge_options(
    common_metadata.as_ref(),
    Some(&metadata),
);
Member

I am not sure this is the exact merge operation we want here, since it will happily merge two completely unrelated extensions (I think) by overwriting the metadata of the first with whatever was encountered last.

It is not perfect, but a potentially safer solution might be to error for any mismatched metadata (i.e., if there's any metadata on these expressions, it has to be identical here or it will error).
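The concern can be sketched with a toy last-wins merge. This is only an assumption about the kind of behavior being described, not the actual implementation of `FieldMetadata::merge_options`; `Metadata` and `merge_last_wins` are hypothetical names used for illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for field metadata (string key/value pairs).
type Metadata = BTreeMap<String, String>;

/// Last-wins merge: on a key collision, the value from `next` replaces
/// the value accumulated so far in `acc`.
fn merge_last_wins(acc: Option<&Metadata>, next: Option<&Metadata>) -> Option<Metadata> {
    match (acc, next) {
        (None, None) => None,
        (Some(m), None) | (None, Some(m)) => Some(m.clone()),
        (Some(a), Some(b)) => {
            let mut merged = a.clone();
            // Entries from `b` silently overwrite entries from `a`.
            merged.extend(b.iter().map(|(k, v)| (k.clone(), v.clone())));
            Some(merged)
        }
    }
}
```

Under this model, if one row's literal is tagged with one extension name and a later row's literal with a different one, the merged result reports only the later name and no error is raised, which is the behavior the comment above warns about.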

Member Author

I agree that all the metadata in VALUES should be required to be identical. This has the least ambiguous semantics and does not place too many restrictions on typical use cases of VALUES. I'll submit commits to fix this later.

Member Author

I have updated LogicalPlanBuilder::values to require that all metadata be identical. I'm not sure whether a more permissive approach, such as computing the intersection of the metadata, would work, but this stricter rule is a good start anyway.
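The two rules mentioned here can be contrasted in a small sketch. It models the metadata of each cell in one VALUES column as a plain map; `Metadata`, `require_identical`, and `intersection` are hypothetical names, and the error type and message are assumptions rather than what `LogicalPlanBuilder::values` actually returns.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for field metadata (string key/value pairs).
type Metadata = BTreeMap<String, String>;

/// Strict rule: every cell in the column must carry exactly the same metadata.
fn require_identical(cells: &[Metadata]) -> Result<Metadata, String> {
    let mut iter = cells.iter();
    let first = match iter.next() {
        Some(m) => m,
        None => return Ok(Metadata::new()),
    };
    for m in iter {
        if m != first {
            return Err(format!(
                "inconsistent metadata in VALUES column: {:?} vs {:?}",
                first, m
            ));
        }
    }
    Ok(first.clone())
}

/// Permissive alternative: keep only the key/value pairs shared by all cells.
fn intersection(cells: &[Metadata]) -> Metadata {
    let mut iter = cells.iter();
    let mut common = match iter.next() {
        Some(m) => m.clone(),
        None => return Metadata::new(),
    };
    for m in iter {
        // Drop any entry that is missing from `m` or has a different value there.
        common.retain(|k, v| m.get(k) == Some(&*v));
    }
    common
}
```

The strict rule rejects any mismatch outright, while the intersection can quietly drop entries, so a column could end up with partial metadata that no single row actually declared.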

Contributor

@alamb alamb left a comment

Thank you @Kontinuation and @paleolimbot -- this looks good to me.

@alamb
Contributor

alamb commented Sep 17, 2025

I merged up to resolve a conflict

@alamb
Contributor

alamb commented Sep 18, 2025

And there is now another one! I resolved it and merged up

@alamb alamb added this pull request to the merge queue Sep 18, 2025
@alamb
Contributor

alamb commented Sep 18, 2025

🚀

Merged via the queue into apache:main with commit 293bf3e Sep 18, 2025
28 checks passed
Labels
core Core DataFusion crate logical-expr Logical plan and expressions
Development

Successfully merging this pull request may close these issues.

Extension metadata dropped from literals in SQL VALUES clause