19948: fix: maintain inner list nullability for `array_sort` #209

martin-augment · 2026-01-25T21:18:20Z

19948: To review by AI

gemini-code-assist · 2026-01-25T21:18:34Z

Summary of Changes

Hello @martin-augment, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request fixes a bug in the array_sort function where the nullability of the inner elements of list arrays was not correctly preserved. The changes ensure that the sorted list array retains the exact nullability characteristics of its input, preventing unintended schema modifications and improving data integrity.

Highlights

Inner List Nullability Preservation: The array_sort function now correctly preserves the nullability of the inner elements within list arrays, ensuring that the output schema accurately reflects the input's nullability.
Simplified Return Type Logic: The return_type method for array_sort has been simplified to directly clone the input argument's data type, which inherently maintains the inner list's nullability without explicit re-declaration.
Null Buffer Handling Refinement: The internal array_sort_generic function has been refactored to reuse the original list array's null buffer, rather than constructing a new one, thereby ensuring that existing nullability is carried over.
New Test Cases: New SQL logic tests have been added to verify that array_sort correctly handles and maintains the non-null property of inner list elements across List, LargeList, and FixedSizeList types.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

coderabbitai · 2026-01-25T21:19:08Z

Walkthrough

This pull request modifies the array sorting implementation and adds comprehensive test coverage. The primary change refactors the sort.rs file to simplify the return type handling, remove validity buffer tracking, and adjust parameter passing for List and LargeList types to use FieldRef by value. Additionally, new tests are introduced to verify array sorting behavior across various list element types, inner nullability preservation, complex types, and error handling scenarios.

✨ Finishing touches

📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

claude · 2026-01-25T21:19:42Z

Pull Request Review: Fix `array_sort` Inner List Nullability

Summary

This PR fixes a regression introduced in apache#17657 where array_sort was incorrectly hardcoding inner field nullability to true in the return_type function while preserving it in the execution path. The fix correctly maintains the input field's nullability specification.

Code Quality & Best Practices ✅

Strengths:

Simplified return_type implementation (sort.rs:134-136): The change from a complex match statement to simply cloning the input type is elegant and correct. This properly preserves all aspects of the input type including nullability.
Cleaner null handling (sort.rs:256): Replacing NullBufferBuilder with direct cloning of the input's null buffer (list_array.nulls().cloned()) is more efficient and ensures correctness by preserving the exact null pattern from the input.
Removed unused imports: Properly cleaned up NullBufferBuilder and plan_err imports that are no longer needed.
Better field ownership semantics (sort.rs:196, 200): Changed from passing &FieldRef to FieldRef which is more idiomatic since FieldRef is already an Arc - cloning the Arc is cheap and avoids unnecessary dereferencing.

Test Coverage ✅

Excellent test additions:

Tests cover the core bug fix - maintaining non-null inner types (test_files/array.slt:2580-2591)
Tests verify behavior across different list types: List, LargeList, and FixedSizeList (test_files/array.slt:2593-2603)
Tests check both NULL and non-NULL outer values with non-nullable inner types
Uses arrow_typeof to verify type preservation - critical for catching this class of bug

Potential Issues & Concerns

Minor Concern: Empty Array Handling

When elements.is_empty() (sort.rs:249-250), the code creates a new null array. This is correct behavior, but worth noting that this occurs when all rows in the batch are NULL. The function preserves the field metadata correctly in this case.

Logic Verification ✅

The refactored null handling logic is correct:

Before: Manually built a null buffer by appending null/non-null for each row
After: Directly clones the input's null buffer

This is semantically identical since the function doesn't change which rows are null - it only sorts the contents of non-null rows. The new approach is simpler and more efficient.

Performance Considerations ✅

Improvements:

Eliminated redundant null buffer construction: No longer builds a new NullBufferBuilder and populates it row-by-row. Instead, directly clones the existing buffer.
Reduced memory allocations: One less vector allocation (valid builder is removed)
Better Arc handling: Using Arc::clone(field) is explicit and idiomatic

Security Concerns ✅

No security concerns identified. The changes:

Don't introduce new unsafe code
Don't change error handling in a way that could leak information
Properly handle null values without risk of null pointer dereference

Correctness Verification

The fix addresses the core issue:

Problem: return_type was hardcoding .new_list(field.data_type().clone(), true) - the true forces nullability regardless of input
Solution: return_type now returns arg_types[0].clone() which preserves the original type exactly
Runtime: Already preserved field metadata via Arc::clone(field), so execution was correct
Result: Type system and runtime are now consistent

Recommendations

Optional Cleanup Opportunity

At sort.rs:216-218, when a row is null, the code appends a zero-length entry. Consider adding a comment explaining this behavior:

if list_array.is_null(i) {
    // Null rows contribute zero-length entries to maintain offset alignment
    array_lengths.push(0);

This is already correct, but the comment would help future maintainers understand why we track lengths for null rows.

Final Verdict: LGTM ✅

This is a high-quality fix that:

✅ Correctly addresses the regression
✅ Simplifies the code
✅ Improves performance
✅ Includes comprehensive tests
✅ Follows Rust best practices
✅ Has no security concerns

The changes are well-reasoned and the implementation is clean. Great work fixing this type preservation bug!

gemini-code-assist

Code Review

This pull request addresses a bug in the array_sort function where the nullability of inner list elements was not preserved. The return_type implementation was incorrectly forcing elements to become nullable.

The fix correctly changes return_type to return the input type, preserving the original nullability. The implementation of array_sort has also been refactored to be cleaner and more robust. It now reuses the null buffer from the input array instead of manually reconstructing it, which is a significant improvement.

The added tests in array.slt effectively verify the fix for List, LargeList, and FixedSizeList.

Overall, the changes are excellent, addressing the bug correctly and improving code quality. I have no further suggestions.

augmentcode · 2026-01-25T21:24:13Z

🤖 Augment PR Summary

Summary: Fixes array_sort to preserve the input list’s element nullability (e.g. List(non-null Int32)) rather than forcing list items to be nullable.

Changes:

ArraySort::return_type now returns the (coerced) first argument type unchanged.
List output construction now reuses the input list’s null buffer to keep outer nulls consistent.
Added sqllogictest coverage for List/LargeList/FixedSizeList inputs with non-null elements.

Technical Notes: Fixed-size lists are still coerced to List via ListCoercion::FixedSizedListToList; sorting behavior is unchanged aside from type/nullability preservation.

_{🤖 Was this summary useful? React with 👍 or 👎}

augmentcode

Review completed. No suggestions at this time.

Comment augment review to trigger a new review at any time.

Jefffrey added 2 commits January 23, 2026 12:30

fix: maintain inner list nullability for array_sort

1278790

refactor

4364121

gemini-code-assist bot reviewed Jan 25, 2026

View reviewed changes

augmentcode bot reviewed Jan 25, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

19948: fix: maintain inner list nullability for `array_sort` #209

19948: fix: maintain inner list nullability for `array_sort` #209

martin-augment commented Jan 25, 2026

Uh oh!

gemini-code-assist bot commented Jan 25, 2026

Uh oh!

coderabbitai bot commented Jan 25, 2026 •

edited

Loading

Uh oh!

claude bot commented Jan 25, 2026

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

augmentcode bot commented Jan 25, 2026

Uh oh!

augmentcode bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

19948: fix: maintain inner list nullability for array_sort #209

Are you sure you want to change the base?

19948: fix: maintain inner list nullability for array_sort #209

Conversation

martin-augment commented Jan 25, 2026

Uh oh!

gemini-code-assist bot commented Jan 25, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

coderabbitai bot commented Jan 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Uh oh!

claude bot commented Jan 25, 2026

Pull Request Review: Fix array_sort Inner List Nullability

Summary

Code Quality & Best Practices ✅

Test Coverage ✅

Potential Issues & Concerns

Minor Concern: Empty Array Handling

Logic Verification ✅

Performance Considerations ✅

Security Concerns ✅

Correctness Verification

Recommendations

Optional Cleanup Opportunity

Final Verdict: LGTM ✅

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

augmentcode bot commented Jan 25, 2026

Uh oh!

augmentcode bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

19948: fix: maintain inner list nullability for `array_sort` #209

19948: fix: maintain inner list nullability for `array_sort` #209

coderabbitai bot commented Jan 25, 2026 •

edited

Loading

Pull Request Review: Fix `array_sort` Inner List Nullability