Making scan limit of RecordBatchReaderExec handle row limit efficiently #36
Merged
jiayuasu merged 2 commits into apache:main on Sep 8, 2025
Pull Request Overview
This PR fixes two issues in `RecordBatchReaderExec`: the `limit` parameter is now applied to the row count instead of the batch count, and the inappropriate `RwLock` is replaced with a `Mutex` for better thread safety. The changes improve both the correctness and the performance of the record batch reader.
- Implements proper row-based limiting with `RowLimitedIterator`, which tracks consumed rows and handles batch truncation
- Replaces `RwLock` with `Mutex` and removes the unsafe `Sync` implementation for better thread safety
- Adds comprehensive test coverage for various row limit scenarios, including edge cases with empty batches
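
As a rough illustration of the row-based limiting described above, here is a minimal sketch. It is not the code from this PR: `Batch` is a hypothetical stand-in for an Arrow `RecordBatch` (a plain `Vec<i32>` so the example needs no external crates), and `RowLimited` is an illustrative name. The sketch also assumes that empty batches are passed through unchanged, which may differ from what the PR does.

```rust
/// Hypothetical stand-in for an Arrow `RecordBatch`: a plain vector of rows.
type Batch = Vec<i32>;

/// Wraps an iterator of batches and stops once `limit` rows have been yielded.
/// (Illustrative only; not the `RowLimitedIterator` from the PR.)
struct RowLimited<I: Iterator<Item = Batch>> {
    inner: I,
    /// Rows that may still be yielded before the limit is reached.
    remaining: usize,
}

impl<I: Iterator<Item = Batch>> RowLimited<I> {
    fn new(inner: I, limit: usize) -> Self {
        Self { inner, remaining: limit }
    }
}

impl<I: Iterator<Item = Batch>> Iterator for RowLimited<I> {
    type Item = Batch;

    fn next(&mut self) -> Option<Batch> {
        // Stop early: once the limit is reached, the underlying reader
        // is not asked for another batch.
        if self.remaining == 0 {
            return None;
        }
        let mut batch = self.inner.next()?;
        // Truncate the batch that crosses the limit.
        if batch.len() > self.remaining {
            batch.truncate(self.remaining);
        }
        self.remaining -= batch.len();
        Some(batch)
    }
}
```

With a limit of 5 over batches of 3, 0, 3 and 2 rows, this sketch yields the first batch whole, the empty batch as-is, and the third batch truncated to 2 rows; the fourth batch is never pulled from the underlying iterator.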
`limit` in `RecordBatchReaderExec` is supposed to be a number of rows. However, it is currently being applied as a number of batches. This patch fixes this by implementing a structure that keeps track of the number of iterated rows and stops early.

This patch also fixes another problem: the usage of `RwLock` in `RecordBatchReaderExec` is inappropriate, since we only access the underlying reader protected by the lock in write mode. We use a `Mutex` here instead, and removed `unsafe impl Sync for RecordBatchReaderExec`.
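
For context on the locking change, `std::sync::Mutex<T>` is `Sync` whenever `T: Send`, whereas `std::sync::RwLock<T>` additionally requires `T: Sync`. A reader that is only `Send` can therefore be shared through a `Mutex` without any `unsafe impl Sync`. The sketch below illustrates this under stated assumptions: `ReaderHolder` and `BoxedReader` are hypothetical names, the boxed iterator stands in for a record batch reader, and the take-once behaviour is an assumption for the example rather than a description of the actual `RecordBatchReaderExec`.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for a reader that is `Send` but not necessarily `Sync`.
type BoxedReader = Box<dyn Iterator<Item = Vec<i32>> + Send>;

/// Hypothetical holder that owns a reader behind a `Mutex`.
/// No `unsafe impl Sync` is needed: the compiler derives `Sync`
/// because `Mutex<T>: Sync` only requires `T: Send`.
struct ReaderHolder {
    reader: Mutex<Option<BoxedReader>>,
}

impl ReaderHolder {
    fn new(reader: BoxedReader) -> Self {
        Self { reader: Mutex::new(Some(reader)) }
    }

    /// Takes the reader out of the holder; later calls return `None`.
    /// Every access needs exclusive access, so a read/write lock adds nothing.
    fn take_reader(&self) -> Option<BoxedReader> {
        self.reader.lock().unwrap().take()
    }
}

/// Compile-time helper: only compiles when `T` is `Sync`.
fn assert_sync<T: Sync>() {}
```

Calling `assert_sync::<ReaderHolder>()` compiles as written; swapping the `Mutex` for an `RwLock` in this sketch would make that call fail to compile, because the boxed reader is not `Sync`.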