feat(rust/sedona-schema): Add Int8, UInt64, Int64 band data types, don't panic in BandMetadataRef#589
Conversation
… align ordinals with GDAL
e53444b to
a9fd5ec
Compare
There was a problem hiding this comment.
Pull request overview
This PR extends the BandDataType enum to support three additional data types (Int8, UInt64, Int64) and aligns all enum discriminants with GDAL's GDALDataType ordinals for non-complex types.
Changes:
- Renumbered existing
BandDataTypevariants from 0-6 to 1-7 and addedInt8(14),UInt64(12),Int64(13) - Derived
HashandCopytraits onBandDataTypeandStorageTypefor better ergonomics - Updated test data generators and test cases to handle all 10 band data types
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| rust/sedona-schema/src/raster.rs | Updated BandDataType and StorageType enum definitions with new variants and trait derivations |
| rust/sedona-raster/src/array.rs | Updated match arms in BandMetadataRefImpl::data_type() to decode new ordinals |
| rust/sedona-raster/src/builder.rs | Extended test coverage with new data type test cases |
| rust/sedona-testing/src/rasters.rs | Refactored generate_random_band_data using macro and added support for new data types |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
paleolimbot
left a comment
There was a problem hiding this comment.
Thank you!
It's great to align these with GDAL's model but probably better for the Rust enum / internal serialization to be separated from the integer storage.
rust/sedona-raster/src/array.rs
Outdated
| _ => panic!( | ||
| "Unknown band data type: {}", |
There was a problem hiding this comment.
Should this return an error or is this validated before we get here? (i.e., do we risk a panic for a corrupted raster array?)
There was a problem hiding this comment.
Sure. I found that storage_type also need to be fixed, I have changed both storage_type() and data_type() to return Result<...>.
| // Test all BandDataType variants | ||
| let test_cases = vec![ | ||
| (BandDataType::UInt8, vec![1u8, 2u8, 3u8, 4u8]), | ||
| (BandDataType::Int8, vec![255u8, 254u8, 253u8, 252u8]), // -1, -2, -3, -4 as i8 |
There was a problem hiding this comment.
If possible this may be a good time to add test for a corrupted band identifier to make sure it doesn't panic
There was a problem hiding this comment.
Added test_invalid_band_metadata_returns_err for testing this
rust/sedona-schema/src/raster.rs
Outdated
| /// Ordinals match GDALDataType for real-valued pixel types only. | ||
| /// GDT_Unknown (0) and complex types (CInt16=8, CInt32=9, CFloat32=10, CFloat64=11) | ||
| /// are intentionally omitted. |
There was a problem hiding this comment.
You could handle these with an Other(usize) variant (whose internal representation is opaque bytes).
There was a problem hiding this comment.
We don't need Other variant for now since we decided to not enforce the values of BandDataType to match GDAL. I believe that it is a better design.
rust/sedona-schema/src/raster.rs
Outdated
| #[repr(u16)] | ||
| #[derive(Clone, Debug, PartialEq, Eq)] | ||
| #[derive(Clone, Debug, PartialEq, Eq, Hash, Copy)] | ||
| pub enum BandDataType { |
There was a problem hiding this comment.
This is probably more flexible without the #[repr(u16)] since at some point we may need a variant to handle other types of data (e.g., generic fixed size). It's great to align our representation to GDAL's but it is probably better scoped as try_to_gdal_band_data_type() -> u16 and try_from_gdal_band_data_type(gdal_type: u16).
There was a problem hiding this comment.
The values here were happened to be the same as GDAL band data types. We'll have functions for converting between then rather than implicitly relying on it. https://github.com/Kontinuation/sedona-db/blob/0e73e17a79332cf7e58b9881084d86dd3f3531c1/rust/sedona-raster-gdal/src/gdal_common.rs#L50-L88
I decided to simply define the values without GDAL awareness just as before.
Summary
BandDataTypevariants:Int8,UInt64,Int64HashandCopyon bothBandDataTypeandStorageTypeenums for ergonomic useBandMetadataRefImpl::data_type()to decode the new ordinalsgenerate_random_band_data,get_nodata_value_for_type) to handle all 10 band data types