Support reverse parquet scan and fast parquet order inversion at row group level #18817
Conversation
Also cc @suremarc. We're finally contributing our reversed parquet optimization upstream; I thought you might be interested in seeing it.
Thank you @xudong963 @suremarc. I made a lot of changes in this PR compared to our internal implementation, but the major design is similar to our internal version: row-group-level reversal. More follow-up PRs are needed to improve it further; for example, we should support custom output-order sources so it can be integrated with an ordered-partition source, etc.
Thanks -- I'll try and review this tomorrow.
Thank you @alamb ! |
datafusion/common/src/config.rs
Outdated
```rust
/// are read in reverse order to eliminate sort operations.
/// Note: This buffers one row group at a time (typically ~128MB).
/// Default: true
pub enable_reverse_scan: bool, default = true
```
Note: I default this to true for the reverse optimization; we can default to false if you think it's risky for some cases.
The key risk is memory overhead. Because the reversal happens at row-group level, we need to cache one row group's batches at a time, so setting a large max row group size leads to high memory usage.
```rust
/// Remove unnecessary sort based on the logic from EnforceSorting::analyze_immediate_sort_removal
fn remove_unnecessary_sort(
```
Note: I added this to the reverse-order rule because, after reversing the order, we can often remove the sort as well, so we don't need to run the enforce-sorting optimization again after this one.
How about putting pushdown_sort before enforce_sorting?
Thanks @xudong963. I tried this before, but it seemed to cause problems for other optimizer rules; I can test again.
The best solution may be that we don't need a new pushdown_sort optimizer at all: we could just enhance the existing optimizer to support this. I will try that later.
```
----
physical_plan
01)SortExec: TopK(fetch=3), expr=[number@0 ASC NULLS LAST], preserve_partitioning=[false]
02)--DataSourceExec: file_groups={1 group: [[WORKSPACE_ROOT/datafusion/sqllogictest/test_files/scratch/topk/partial_sorted/1.parquet]]}, projection=[number, letter, age], output_ordering=[number@0 DESC, letter@1 ASC NULLS LAST], file_type=parquet, predicate=DynamicFilter [ empty ]
```
With reverse scan, no sort is needed here, and it's very fast.
Supporting scanning Parquet files in reverse order is an absolutely great idea. I have a few questions. Let me first rephrase, to make sure I understand correctly what this PR does:

This implementation is quite aggressive; I think it can get tricky to tune it right, to avoid excessive caching or batch-by-batch row reversal becoming too expensive. What if we limit the initial implementation to only reversing the row-group order, similar to what @adriangb is planning to do at the file level in #17271?
Thank you @2010YOUY01 for the review and the valid concern. We've been running the full implementation (row-group plus row-level reversal) in production for a very long time with excellent results: 10-100x speedups for time-series queries and well-controlled memory usage (roughly one row group cached at a time). Note that we should not make the row group size large if we enable this feature, and with a very small limit the period of high memory usage is very short. The reversal time is also very small compared to the benefit of removing the sort entirely.

If we want to improve the original scan to only reverse the row-group order, I think we can add follow-up PRs, because that is a separate optimization which cannot remove the sort, so it needs its own PR.

Regarding native arrow-rs support for page-level reversal: as discussed in arrow-rs#3922, implementing true page-level reverse decoding is
While arrow-rs may eventually support this (as proposed in #17172), it requires
Once arrow-rs implements native page-level reversal, we can easily migrate to it. What's your opinion?
I haven't looked into all of this discussion and code (I just got tagged). I've been looking into optimizing sorted scanning in DataFusion and IMO where we should land is:
I hope that is helpful.
Thank you @adriangb, that's helpful for future optimization. I think these approaches are complementary: my PR handles the reverse-scan optimization, while your vision provides a framework for broader sorted-scan optimizations using file-level statistics and metadata. It would be great to build toward that architecture incrementally.
Did you mean 'cannot eliminate the SortExec(TopK)'? Just to confirm: there is no global sort, but it is true that we have to do a
I have an intuition that for this kind of workload the bottleneck is the parquet decoding speed, and an extra
It makes a lot of sense that it's very hard to implement page/row-level reversal in
Summary: perhaps we can start by adding a few end-to-end benchmarks that reflect your typical production workload. If this PR's approach shows a clear improvement over the naive approach in #18817 (comment) (I'm happy to do a quick prototype), we should definitely move forward.

Nice point @2010YOUY01. I agree that most of the time will be spent decoding pages. I can change this PR to add a config option implementing #18817 (comment), or create another PR for it, so we have more options to compare; I agree the simpler solution is better. And a benchmark would be really helpful, thanks!
FYI, I'll start reviewing the PR tomorrow.
Thanks @xudong963 ! |
My point is that instead of
I'm not opposed to this as a step towards that, but I'm not sure how helpful it is. Seeing something more concrete w.r.t. how this interacts with the bigger picture would be helpful IMO.
Having high-level sort pushdown is a great idea, @adriangb, and reverse scan is one of its policies. I will refactor this PR to use that approach, thanks!

Update: I have already changed to high-level sort pushdown in this commit:
```rust
// Successfully pushed down sort, now handle the limit
let total_fetch = limit_exec.skip() + limit_exec.fetch().unwrap_or(0);

// Try to push limit down as well if the source supports it
```
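As a side note on the arithmetic above: when a limit is pushed below an operator that skips rows, the source must produce `skip + fetch` rows so the skip still has enough input. A minimal self-contained sketch (hypothetical helper name, not DataFusion's API):

```rust
// A source feeding an operator that skips `skip` rows must emit
// skip + fetch rows; otherwise the skip would consume rows that
// the fetch still needs.
fn total_fetch(skip: usize, fetch: Option<usize>) -> usize {
    skip + fetch.unwrap_or(0)
}

fn main() {
    // OFFSET 10 LIMIT 5: the scan must produce 15 rows.
    assert_eq!(total_fetch(10, Some(5)), 15);
    // No explicit fetch: only the skipped rows are required.
    assert_eq!(total_fetch(10, None), 10);
    println!("ok");
}
```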
I think the current limit_pushdown physical optimizer rule can do this. So do we still need to distinguish the sort and limit + sort pattern?
I added this logic here because I found that, with the current optimizer ordering, we would always need to run some of the rules more than once if this logic were removed.
So I will try to fold our optimizer into the existing optimizers instead.
```rust
/// Try to create a new execution plan that satisfies the given sort ordering.
///
/// Default implementation returns `Ok(None)`.
fn try_pushdown_sort(
```
Do we need to add this API to ExecutionPlan? Is it possible to push the sort down within the pushdown-sort optimizer itself? Since the rule already traverses the plan, it seems possible to find the target node and hand it the ordering directly.
This is a good point.

I plan to review this PR carefully tomorrow.
Which issue does this PR close?
Closes #17172
Overview
This PR implements reverse scanning for Parquet files to optimize `ORDER BY ... DESC LIMIT N` queries on sorted data. When DataFusion detects that reversing the scan order would eliminate the need for a separate sort operation, it can now directly read Parquet files in reverse order.

Implementation Note: This PR implements Part 1 of the vision outlined in #17172 (Order Inversion at the DataFusion level).
Current implementation:
Future improvements (requires arrow-rs changes):
(`take` kernel overhead)

These enhancements would further optimize memory usage and latency, but the current implementation already provides substantial benefits for most workloads.
Rationale for this change
Motivation
Currently, queries like `SELECT * FROM table ORDER BY sorted_column DESC LIMIT 100` require DataFusion to:

For files that are already sorted in ascending order, this is inefficient. With this optimization, DataFusion can:
Performance Benefits:
Scope and Limitations
This optimization applies to:
- `SortPreservingMerge` is still required
- `ORDER BY ... DESC` on pre-sorted columns
- `LIMIT` clauses (most beneficial for single-partition)

This optimization does NOT apply to:
Single-partition vs Multi-partition:
`SortPreservingMergeExec` is needed to combine streams. Limit cannot be pushed to individual partitions.

Performance comparison:
- Single-partition `ORDER BY DESC LIMIT N` → Direct reverse scan with limit pushed down to DataSource
- Multi-partition `ORDER BY DESC LIMIT N` → Reverse scan per partition + `LocalLimitExec` + `SortPreservingMergeExec`

While multi-partition scans still require a merge operation, they benefit significantly from:
- `LocalLimitExec`

Configuration
This optimization is enabled by default but can be controlled via:
SQL:
Rust API:
When to disable:
Default: Enabled (true)
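The option introduced by this PR can be toggled per session; a minimal SQL sketch using the config key from this PR:

```sql
-- Disable row-group-level reverse scanning for the current session
SET datafusion.execution.parquet.enable_reverse_scan = false;

-- Re-enable it (the default)
SET datafusion.execution.parquet.enable_reverse_scan = true;
```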
Implementation Details
Architecture
The implementation consists of four main components:
1. ParquetSource API (`source.rs`)
- Added `reverse_scan: bool` field to `ParquetSource`
- Added `with_reverse_scan()` and `reverse_scan()` methods

2. ParquetOpener (`opener.rs`)
- Added `reverse_scan: bool` field
- Reverses the row group order via `row_group_indexes.reverse()`
- Depending on the `reverse_scan` flag, creates either a plain `RecordBatchStreamAdapter` or a `ReversedParquetStream` with row-group-level buffering

3. ReversedParquetStream (`opener.rs`)
A custom stream implementation that performs two-stage reversal with optional limit support:
- Stage 1 - Row Reversal: Reverse rows within each batch using Arrow's `take` kernel
- Stage 2 - Batch Reversal: Reverse the order of batches within each row group

Key Properties:
- Reports `row_groups_reversed`, `batches_reversed`, and `reverse_time` metrics

4. Physical Optimizer (`reverse_order.rs`)
- New `ReverseOrder` optimization rule
- Detects `SortExec` with a reversible input ordering
- Handles `GlobalLimitExec -> SortExec` patterns (the most beneficial case)
- Uses a `TreeNodeRewriter` to push the reverse flag down to `ParquetSource`
- Applies only to single-partition `DataSourceExec` to avoid correctness issues with multi-partition scans

Why Row-Group-Level Buffering?
Row group buffering is necessary for correctness:
This is the minimal buffering granularity that ensures correct results while still being compatible with arrow-rs's existing parquet reader architecture.
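The two-stage reversal can be illustrated with a self-contained sketch, using plain `Vec<i32>` values to stand in for Arrow record batches (hypothetical helper name; the real stream operates on `RecordBatch`es and uses the `take` kernel for stage 1):

```rust
// Sketch of the two-stage reversal performed on one buffered row group.
// A `Vec<i32>` stands in for a RecordBatch; a row group is a Vec of batches.
fn reverse_row_group(batches: Vec<Vec<i32>>) -> Vec<Vec<i32>> {
    batches
        .into_iter()
        // Stage 1: reverse the rows within each batch (the real code uses
        // Arrow's `take` kernel with descending indices).
        .map(|mut batch| {
            batch.reverse();
            batch
        })
        // Stage 2: reverse the order of batches within the row group.
        .rev()
        .collect()
}

fn main() {
    // A row group sorted ascending, split into two batches.
    let rg = vec![vec![1, 2, 3], vec![4, 5]];
    let reversed = reverse_row_group(rg);
    // Concatenated output is now in fully descending order.
    assert_eq!(reversed, vec![vec![5, 4], vec![3, 2, 1]]);
    println!("{:?}", reversed);
}
```

Concatenating the output batches yields the row group's rows in fully descending order, which is why buffering a whole row group is sufficient for correctness.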
Memory Characteristics:
Why this is necessary:
Future Optimization: Page-level reverse scanning in arrow-rs could further reduce memory usage and improve latency by eliminating row-group buffering entirely.
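The memory characteristics above can be sanity-checked with a back-of-envelope sketch (assumed numbers; actual usage depends on the writer's max row group size and the decoded batch width):

```rust
// Peak extra memory of the reversed stream is roughly one decoded row
// group, since batches are buffered until the row group is complete.
fn peak_buffer_bytes(row_group_rows: usize, avg_row_bytes: usize) -> usize {
    row_group_rows * avg_row_bytes
}

fn main() {
    // e.g. 1M rows per row group at ~128 bytes per decoded row = 128 MB,
    // consistent with the "typically ~128MB" note in the config docs.
    let bytes = peak_buffer_bytes(1_000_000, 128);
    assert_eq!(bytes, 128_000_000);
    println!("~{} MB buffered", bytes / 1_000_000);
}
```

This is why the discussion above recommends keeping the row group size modest when the feature is enabled.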
What changes are included in this PR?
Core Implementation:
- `ParquetSource`: Added reverse scan flag and methods
- `ParquetOpener`: Row group reversal and stream creation logic
- `ReversedParquetStream`: Unified stream implementation with optional limit support

Physical Optimization:
- `ReverseOrder`: New optimizer rule for detecting and applying the reverse scan optimization
- Handles `SortExec` and `GlobalLimitExec -> SortExec`

Configuration:
- `enable_reverse_scan` config option (default: true)

Metrics:
- `row_groups_reversed`: Count of reversed row groups
- `batches_reversed`: Count of reversed batches
- `reverse_time`: Time spent reversing data

Are these changes tested?
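The plan patterns the optimizer rule rewrites can be sketched with a toy plan tree (hypothetical types; the real rule works on `ExecutionPlan` nodes via `TreeNodeRewriter` and checks that the scan's ordering is the exact inverse of the sort):

```rust
// Toy plan tree illustrating the Sort and Limit -> Sort patterns.
#[derive(Debug, PartialEq)]
enum Plan {
    Scan { reversed: bool },
    Sort(Box<Plan>),
    Limit(Box<Plan>),
}

// Rewrite `Sort -> Scan` (optionally under a Limit) into a reversed scan,
// dropping the sort; everything else is left unchanged.
fn apply_reverse_order(plan: Plan) -> Plan {
    match plan {
        Plan::Limit(inner) => Plan::Limit(Box::new(apply_reverse_order(*inner))),
        Plan::Sort(inner) => match *inner {
            // The sort is redundant once the scan runs in reverse.
            Plan::Scan { .. } => Plan::Scan { reversed: true },
            other => Plan::Sort(Box::new(apply_reverse_order(other))),
        },
        other => other,
    }
}

fn main() {
    let plan = Plan::Limit(Box::new(Plan::Sort(Box::new(Plan::Scan {
        reversed: false,
    }))));
    let rewritten = apply_reverse_order(plan);
    // The sort is gone; the reverse flag was pushed into the scan.
    assert_eq!(
        rewritten,
        Plan::Limit(Box::new(Plan::Scan { reversed: true }))
    );
    println!("{:?}", rewritten);
}
```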
Yes, comprehensive tests added:
Unit Tests (`opener.rs`):

Integration Tests (`reverse_order.rs`):

SQL Logic Tests (`.slt` files):

Are there any user-facing changes?
New Configuration Option:
- `datafusion.execution.parquet.enable_reverse_scan` (default: true)

Behavioral Changes:
- `ORDER BY ... DESC LIMIT N` on sorted single-partition Parquet files will automatically use reverse scanning when beneficial

Breaking Changes: