@2010YOUY01
Contributor

Which issue does this PR close?

  • Closes #.

Rationale for this change

ParquetOpener::open() is a critical function for Parquet planning: it is the entry point for many major steps, such as row-group and file pruning.

It now has almost 400 lines of code. This PR adds markers to the code blocks and important steps to make the function easier to navigate (though I may have overlooked some critical steps).

Ideally, we should break these blocks into utilities. I tried extracting some of them with AI, but the resulting utilities still had unclear semantics, with many input arguments and output items, so the overall complexity did not seem to go down. I think it's possible to factor them into helper functions with clear semantics, but that likely requires someone who understands the implementation details very well.

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the datasource Changes to the datasource crate label Jan 7, 2026
@2010YOUY01 2010YOUY01 added this pull request to the merge queue Jan 8, 2026
Merged via the queue into apache:main with commit 102caeb Jan 8, 2026
28 checks passed
