-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Refactor PartitionedFile: add ordering field and new_from_meta constructor #19596
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
96de14b to
aeda804
Compare
…uctor Add infrastructure to track ordering information per file in preparation for ordering inference from Parquet metadata. Changes: - Add `ordering: Option<LexOrdering>` field to `PartitionedFile` struct - Add `new_from_meta(ObjectMeta)` constructor for creating files from metadata - Add `with_ordering()` and `with_partition_values()` builder methods - Update all PartitionedFile constructors to initialize `ordering: None` - Update callsites in test_util, proto, and substrait to use new_from_meta 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
Update all PartitionedFile struct literals to use new_from_meta() builder pattern with appropriate builder methods like with_partition_values(), with_extensions(), with_range(), and with_statistics(). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
aeda804 to
bf51e52
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds infrastructure to support per-file ordering information in preparation for ordering inference from Parquet metadata. The changes introduce a new ordering field to PartitionedFile along with a cleaner constructor pattern using new_from_meta() and builder methods.
Key Changes:
- Added
ordering: Option<LexOrdering>field toPartitionedFilestruct with correspondingwith_ordering()builder method - Introduced
new_from_meta(ObjectMeta)constructor andwith_partition_values()builder method for cleaner object construction - Refactored all existing
PartitionedFileinstantiations across the codebase to use the new constructor pattern
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| datafusion/datasource/src/mod.rs | Core changes: added ordering field, new_from_meta() constructor, with_ordering() and with_partition_values() builder methods, updated all constructors to initialize ordering |
| datafusion/substrait/src/physical_plan/consumer.rs | Refactored to use new_from_meta() instead of manual struct construction |
| datafusion/proto/src/physical_plan/from_proto.rs | Refactored protobuf conversion to use new_from_meta() with builder pattern, updated test helper |
| datafusion/datasource/src/file_scan_config.rs | Refactored test helper to use new_from_meta() with builder methods |
| datafusion/datasource/src/display.rs | Simplified test utility to use new_from_meta() |
| datafusion/datasource-parquet/src/row_group_filter.rs | Refactored test setup to use new_from_meta() |
| datafusion/core/tests/parquet/page_pruning.rs | Simplified test setup to use new_from_meta() |
| datafusion/core/tests/parquet/custom_reader.rs | Refactored to use new_from_meta() with with_extensions() builder |
| datafusion/core/src/test_util/parquet.rs | Simplified FileScanConfigBuilder setup to use new_from_meta() |
| datafusion/core/src/datasource/physical_plan/parquet.rs | Refactored multiple test helpers to use new_from_meta() with builder pattern |
| datafusion/core/src/datasource/mod.rs | Simplified test utility to use new_from_meta() |
| datafusion/core/src/datasource/file_format/mod.rs | Simplified test utility to use new_from_meta() |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <[email protected]>
Which issue does this PR close?
Part of #19433
Rationale for this change
In preparation for ordering inference from Parquet metadata, we need to be able to store per-file ordering information on
PartitionedFile. This PR adds the necessary infrastructure.What changes are included in this PR?
ordering: Option<LexOrdering>field toPartitionedFilestructnew_from_meta(ObjectMeta)constructor for creating files from metadata (cleaner than manually constructing)with_ordering()builder method to set ordering informationwith_partition_values()builder method for consistencyPartitionedFileconstructors to initializeordering: Nonenew_from_metaAre these changes tested?
Yes, existing tests pass. The new field is currently always
Noneso no new tests are needed yet. Tests for ordering inference will come in a follow-up PR.Are there any user-facing changes?
No user-facing API changes. The
orderingfield is public but users typically constructPartitionedFilevia the provided constructors which handle this automatically.🤖 Generated with Claude Code