Making sd_format show geospatial values in arrays and structs properly #1
Pull Request Overview
This PR revamps the sd_format function to properly handle geospatial values within complex nested data structures (arrays and structs) by recursively formatting spatial columns while leaving non-spatial columns unchanged.
- Refactored implementation to use a single kernel that handles all data types rather than separate kernels for default and geometry types
- Added recursive formatting support for nested structures (structs, lists, and list views)
- Enhanced type system to properly transform geospatial types to UTF8 while preserving other data types
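The recursion described in the bullets above can be sketched with a toy data model. This is only an illustration under stated assumptions: `ToyType`, `ToyValue`, `formatted_type`, and `format_value` are hypothetical names invented here, and the sketch uses plain Rust enums rather than the Arrow arrays and Sedona types that the actual implementation operates on.

```rust
/// Toy stand-in for a column type (hypothetical; not the Arrow/Sedona type system).
#[derive(Debug, Clone, PartialEq)]
enum ToyType {
    Int64,
    Utf8,
    Geometry,
    List(Box<ToyType>),
    Struct(Vec<(String, ToyType)>),
}

/// Toy stand-in for a single value of one of the types above.
#[derive(Debug, Clone, PartialEq)]
enum ToyValue {
    Null,
    Int(i64),
    Text(String),
    Point(f64, f64),
    List(Vec<ToyValue>),
    Struct(Vec<(String, ToyValue)>),
}

/// Output type: geospatial leaves become Utf8, everything else is preserved.
fn formatted_type(t: &ToyType) -> ToyType {
    match t {
        ToyType::Geometry => ToyType::Utf8,
        // Recurse into containers so nested geometries are transformed too.
        ToyType::List(item) => ToyType::List(Box::new(formatted_type(item))),
        ToyType::Struct(fields) => ToyType::Struct(
            fields
                .iter()
                .map(|(name, ty)| (name.clone(), formatted_type(ty)))
                .collect(),
        ),
        // Non-spatial leaf types pass through unchanged.
        other => other.clone(),
    }
}

/// Output value: geospatial leaves are rendered as WKT-like text.
fn format_value(v: &ToyValue) -> ToyValue {
    match v {
        ToyValue::Point(x, y) => ToyValue::Text(format!("POINT({} {})", x, y)),
        ToyValue::List(items) => ToyValue::List(items.iter().map(format_value).collect()),
        ToyValue::Struct(fields) => ToyValue::Struct(
            fields
                .iter()
                .map(|(name, val)| (name.clone(), format_value(val)))
                .collect(),
        ),
        // Nulls and non-spatial values are left untouched.
        other => other.clone(),
    }
}
```

In this sketch a single function handles every type: containers only recurse, and the geometry leaf is the one place where formatting happens, which mirrors the single-kernel structure described in the first bullet.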
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| rust/sedona-functions/src/sd_format.rs | Complete rewrite of sd_format implementation with recursive type handling and comprehensive test coverage |
| rust/sedona-expr/src/scalar_udf.rs | Added is_any() matcher to support accepting any argument type |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

    /// Helper function to verify that actual WKT values match expected values,
    /// handling the normalization of comma spacing in WKT output
    fn assert_wkt_values_match(actual_array: &StringArray, expected_values: &[Option<&str>]) {
        for (i, expected) in expected_values.iter().enumerate() {
            match expected {
                Some(expected_value) => {
                    let actual_value = actual_array.value(i);
                    // Note: WKT output may not have spaces after commas
                    let normalized_expected = expected_value.replace(", ", ",");
                    assert_eq!(actual_value, normalized_expected);
                }
                None => assert!(actual_array.is_null(i)),
            }
        }
    }

Are there cases where we're generating WKT whose result we can't guess in advance?
No. sd_format works with concrete ColumnarValues, so we always know the expected result in advance. We normalize the WKT before comparison only to make the tests more robust.
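As a small illustration of the comma-spacing normalization discussed in this thread, here is a std-only sketch that works on plain strings. The function names `normalize_comma_spacing` and `wkt_eq_ignoring_comma_space` are hypothetical and not part of the PR; unlike the test helper under review, this variant normalizes both sides, which is an assumption about what "robust" could mean here.

```rust
/// Remove the optional space after each comma so that
/// "LINESTRING(0 0, 1 1)" and "LINESTRING(0 0,1 1)" normalize to the same text.
fn normalize_comma_spacing(wkt: &str) -> String {
    wkt.replace(", ", ",")
}

/// Compare two WKT strings while ignoring comma spacing.
/// Both sides are normalized, so the check holds whichever side has the spaces.
fn wkt_eq_ignoring_comma_space(actual: &str, expected: &str) -> bool {
    normalize_comma_spacing(actual) == normalize_comma_spacing(expected)
}
```

This only smooths over a single space after a comma; other formatting differences, such as a space before the opening parenthesis or differing numeric precision, would still count as a mismatch.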
This patch revamps the implementation of sd_format to work on any input type, formatting spatial columns recursively while leaving non-spatial columns untouched.

Before applying this patch:

After applying this patch: