8822: Add support for using ListView arrays and types through FFI #19
martin-augment wants to merge 4 commits into main
Walkthrough
This pull request adds support for ListView and LargeListView data type variants across the Apache Arrow Rust codebase. Changes include introducing new enum variants in the schema crate with FFI serialization and deserialization support for formats "+vl" and "+vL"; extending buffer handling and null array construction in array and data modules; and adding corresponding tests for round-trip conversions. PyArrow integration testing was updated to support list view arrays in PyArrow 16 and later.
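As a rough illustration of the format-string round trip the walkthrough describes, here is a minimal, standalone sketch. `ListViewKind`, `to_ffi_format`, and `from_ffi_format` are hypothetical names invented for this example; they are not the arrow-schema types or functions touched by the PR. Only the two format strings, "+vl" and "+vL", come from the walkthrough.

```rust
/// Hypothetical stand-in for the two list-view variants.
/// This is NOT the arrow-schema `DataType` enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ListViewKind {
    /// 32-bit offsets and sizes.
    ListView,
    /// 64-bit offsets and sizes.
    LargeListView,
}

/// Serialize a list-view kind to its FFI format string.
pub fn to_ffi_format(kind: ListViewKind) -> &'static str {
    match kind {
        ListViewKind::ListView => "+vl",
        ListViewKind::LargeListView => "+vL",
    }
}

/// Deserialize an FFI format string; any other string yields `None`.
pub fn from_ffi_format(format: &str) -> Option<ListViewKind> {
    match format {
        "+vl" => Some(ListViewKind::ListView),
        "+vL" => Some(ListViewKind::LargeListView),
        _ => None,
    }
}
```

The two strings differ only in the case of the last letter, so the match has to be case-sensitive for the round trip to hold.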
Code Review: Add ListView/LargeListView FFI Support
Thank you for implementing FFI support for ListView arrays! This is a solid implementation that properly extends the FFI interface to support list-view layouts. Below is my comprehensive review:

✅ Code Quality & Best Practices
Strengths:

🐛 Potential Issues
Critical: Incorrect variadic Flag Change (arrow-data/src/data.rs:1786)
    - variadic: true,
    + variadic: false,
Concern: This changes the
Questions:
Action Required: Please clarify this change. If it's unrelated to ListView, it should be in a separate PR or needs justification.

🔍 Areas for Consideration
1. Buffer Length Calculation (arrow-array/src/ffi.rs:480-487)
    (DataType::ListView(_), 1)
    | (DataType::ListView(_), 2)
    | (DataType::LargeListView(_), 1)
    | (DataType::LargeListView(_), 2) => {
        let bits = bit_width(data_type, i)?;
        debug_assert_eq!(bits % 8, 0);
        length * (bits / 8)
    }
✅ Correct: Uses
2. Bit Width Specification (arrow-array/src/ffi.rs:183-186)
    // List views have two i32 buffers, offsets and sizes
    (DataType::ListView(_), 1) | (DataType::ListView(_), 2) => i32::BITS as _,
    // Large list views have two i64 buffers, offsets and sizes
    (DataType::LargeListView(_), 1) | (DataType::LargeListView(_), 2) => i64::BITS as _,
✅ Correct: Properly handles both buffer 1 (offsets) and buffer 2 (sizes) with appropriate bit widths.
3. Null Data Creation (arrow-data/src/data.rs:623-632)
    DataType::ListView(f) => (
        vec![zeroed(len * 4), zeroed(len * 4)],
        vec![ArrayData::new_empty(f.data_type())],
        true,
    ),
✅ Correct: Uses
Note: This differs from FixedSizeList which uses

⚡ Performance Considerations
✅ No Performance Concerns:
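Returning to the three items under Areas for Consideration, the buffer arithmetic they rely on can be sketched in isolation. This is a minimal sketch, not the arrow-rs code: the function names and the `large` flag are hypothetical, and plain `Vec<u8>` stands in for Arrow buffers. It assumes, as the quoted snippets suggest, that the offsets and sizes buffers each hold exactly `length` entries of 4 bytes (ListView) or 8 bytes (LargeListView).

```rust
/// Bit width of buffer 1 (offsets) and buffer 2 (sizes) of a list view.
/// `large` selects the 64-bit LargeListView layout (hypothetical flag).
pub fn view_buffer_bit_width(large: bool) -> usize {
    if large {
        i64::BITS as usize
    } else {
        i32::BITS as usize
    }
}

/// Byte length of one offsets or sizes buffer holding `length` entries.
pub fn view_buffer_byte_len(large: bool, length: usize) -> usize {
    let bits = view_buffer_bit_width(large);
    // Both widths are whole bytes, so the division below is exact.
    debug_assert_eq!(bits % 8, 0);
    length * (bits / 8)
}

/// Zero-filled (offsets, sizes) buffers for an all-null list view with `len` slots.
pub fn null_view_buffers(large: bool, len: usize) -> (Vec<u8>, Vec<u8>) {
    let n = view_buffer_byte_len(large, len);
    (vec![0u8; n], vec![0u8; n])
}
```

For example, `view_buffer_byte_len(false, 3)` gives 12 bytes, matching the `len * 4` used in the null-data snippet for ListView.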

🔒 Security Concerns
✅ No Security Issues Identified:
Recommendation: Ensure that the receiving side (when importing from FFI) properly validates:
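As an illustration of the kind of import-side validation this recommendation points at, the sketch below checks one property. It is an assumption drawn from the Arrow columnar format's list-view layout rather than from this review: each offset plus its size must stay within the child array, while offsets themselves may appear in any order. `validate_list_view` is a hypothetical helper, not an arrow-rs function, and it covers only the 32-bit ListView case.

```rust
/// Hypothetical bounds check for a 32-bit list view.
/// `child_len` is the number of elements in the child (values) array.
pub fn validate_list_view(
    offsets: &[i32],
    sizes: &[i32],
    child_len: usize,
) -> Result<(), String> {
    if offsets.len() != sizes.len() {
        return Err(format!(
            "offsets has {} entries but sizes has {}",
            offsets.len(),
            sizes.len()
        ));
    }
    for (i, (&offset, &size)) in offsets.iter().zip(sizes).enumerate() {
        if offset < 0 || size < 0 {
            return Err(format!("slot {i}: negative offset or size"));
        }
        // Widen to i64 so the sum cannot overflow i32.
        let end = offset as i64 + size as i64;
        if end > child_len as i64 {
            return Err(format!(
                "slot {i}: offset {offset} + size {size} exceeds child length {child_len}"
            ));
        }
    }
    Ok(())
}
```

Because list-view offsets need not be monotonic, the check is per slot rather than a single comparison against the last offset, which is what a regular list layout would allow.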

🧪 Test Coverage
Current Coverage (Good):
Suggested Additional Tests:
📝 Minor Style Notes

📋 Summary
Overall Assessment: ⭐⭐⭐⭐ (4/5)
This is a well-implemented PR that correctly adds FFI support for ListView arrays following established patterns. The main blocker is clarification on the
Before Merge:
Approved Pending: Resolution of the variadic flag change question.
Great work on this implementation! The code is clean, follows existing patterns well, and includes good test coverage. 🎉
augment review |
value:incorrect-but-reasonable; category:bug; feedback:The Claude AI reviewer is correct that this change is not related to the topic of this Pull Request. The PR author made the change because it is a bug according to the Arrow specification (https://arrow.apache.org/docs/format/Columnar.html#variadic-buffers) |
8822: To review by AI