
feat(vortex-geo): native Point extension type and GeoDistance scalar function#8372

Open
HarukiMoriarty wants to merge 9 commits into develop from nemo/geo-point

Conversation

@HarukiMoriarty


Summary

This PR adds a native point type to vortex-geo. Points are by far the most common geometry in analytical datasets, and a columnar representation makes their coordinates directly accessible without parsing WKB.

It also adds a scalar function, GeoDistance: point-to-point distance with PostGIS ST_Distance semantics (planar/Euclidean, with results in CRS units).

API Changes

Adds to vortex-geo, all registered through vortex_geo::initialize:

  • Extension type Point (vortex.geo.point): a location stored as Struct<x, y, z?, m?> of non-nullable f64, where z? is an optional elevation and m? an optional measure.
  • Coordinate: the internal value a point scalar unpacks to.
  • Scalar function GeoDistance (vortex.geo.distance): per-row distance between two equal-length point columns; either or both operands may be constant, in which case the query point is decoded once and broadcast.
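The operand shapes described above can be illustrated with a small, self-contained sketch. It is not the vortex-geo implementation: `Xy`, `Operand`, `planar_distance`, and `geo_distance` are hypothetical names, plain slices stand in for Vortex arrays, and it assumes only x and y take part in the planar distance.

```rust
/// Hypothetical stand-in for a decoded point (not the vortex-geo `Coordinate` type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Xy {
    pub x: f64,
    pub y: f64,
}

/// An operand is either a full column of points or a single constant point.
pub enum Operand<'a> {
    Column(&'a [Xy]),
    Constant(Xy),
}

impl Operand<'_> {
    /// A column must match the output length; a constant fits any length.
    fn check_len(&self, len: usize) -> Result<(), String> {
        match self {
            Operand::Column(c) if c.len() != len => {
                Err(format!("expected {len} points, got {}", c.len()))
            }
            _ => Ok(()),
        }
    }

    /// Row `i` of a column, or the same constant for every row (the broadcast).
    fn get(&self, i: usize) -> Xy {
        match self {
            Operand::Column(c) => c[i],
            Operand::Constant(p) => *p,
        }
    }
}

/// Planar (Euclidean) distance, in the units of the coordinate reference system.
pub fn planar_distance(a: Xy, b: Xy) -> f64 {
    (a.x - b.x).hypot(a.y - b.y)
}

/// Per-row distance for any mix of column and constant operands.
pub fn geo_distance(
    lhs: &Operand<'_>,
    rhs: &Operand<'_>,
    len: usize,
) -> Result<Vec<f64>, String> {
    lhs.check_len(len)?;
    rhs.check_len(len)?;
    Ok((0..len)
        .map(|i| planar_distance(lhs.get(i), rhs.get(i)))
        .collect())
}
```

The constant operand is held as a single value and reused for every row, so it is never expanded into a column of identical points.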

Testing

Unit tests cover dtype validation for every GeoArrow dimension (and rejection of invalid storage), round-tripping a point column through scalar execution back to the original coordinates, WKT display for all four dimensions, and distance over all operand shapes: column-to-constant (either side), column-to-column, and constant-to-constant.
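For readers unfamiliar with how WKT distinguishes the four dimensions, here is a rough sketch of a `Display` impl. The `Coord` enum is hypothetical and the exact formatting used by this PR may differ; the sketch only follows the standard WKT tags (`POINT`, `POINT Z`, `POINT M`, `POINT ZM`).

```rust
use std::fmt;

/// Hypothetical coordinate value, one variant per GeoArrow dimension.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Coord {
    Xy(f64, f64),
    Xyz(f64, f64, f64),
    Xym(f64, f64, f64),
    Xyzm(f64, f64, f64, f64),
}

impl fmt::Display for Coord {
    /// Writes the point as WKT, tagging the dimension after the `POINT` keyword.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match *self {
            Coord::Xy(x, y) => write!(f, "POINT ({x} {y})"),
            Coord::Xyz(x, y, z) => write!(f, "POINT Z ({x} {y} {z})"),
            Coord::Xym(x, y, m) => write!(f, "POINT M ({x} {y} {m})"),
            Coord::Xyzm(x, y, z, m) => write!(f, "POINT ZM ({x} {y} {z} {m})"),
        }
    }
}
```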


Supersedes #8342 (same change, moved from my fork to an in-repo branch).

Adds a GeoArrow-style `Point` extension type (Struct<x,y,[z],[m]>, dimension-ready)
and the planar `GeoDistance` scalar function between two point columns.

Signed-off-by: Nemo Yu <zyu379@wisc.edu>
… point

GeoDistance computes the planar distance from each point in a column to a
single constant query point (e.g. `ST_Distance(column, point)`). The second
operand must be a constant: it is decoded once and broadcast over the column
rather than materialized to one identical row per output element. Column-to-
column distance is unsupported and errors.

`try_new_array` now infers the output length from the point column instead of
taking it as an explicit parameter.

Signed-off-by: Nemo Yu <zhenghong@spiraldb.com>
…field types

Signed-off-by: Nemo Yu <zyu379@wisc.edu>
…s on construction

Signed-off-by: Nemo Yu <zyu379@wisc.edu>
@codspeed-hq

codspeed-hq Bot commented Jun 11, 2026


Merging this PR will degrade performance by 12.33%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚡ 3 improved benchmarks
❌ 7 regressed benchmarks
✅ 1526 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|------|-----------|------|------|------------|
| Simulation | chunked_bool_canonical_into[(1000, 10)] | 20.5 µs | 35.5 µs | -42.34% |
| Simulation | decompress_rd[f64, (10000, 0.0)] | 111.3 µs | 138.2 µs | -19.46% |
| Simulation | decompress_rd[f64, (10000, 0.1)] | 111.3 µs | 138 µs | -19.35% |
| Simulation | decompress_rd[f64, (10000, 0.01)] | 111 µs | 137.6 µs | -19.35% |
| Simulation | chunked_varbinview_canonical_into[(1000, 10)] | 161.7 µs | 197.9 µs | -18.3% |
| Simulation | chunked_varbinview_into_canonical[(1000, 10)] | 176.8 µs | 213.1 µs | -17% |
| Simulation | decompress_rd[f32, (10000, 0.1)] | 80.7 µs | 89.7 µs | -10.1% |
| Simulation | decompress_rd[f64, (100000, 0.0)] | 980.4 µs | 845.4 µs | +15.97% |
| Simulation | bitwise_not_vortex_buffer_mut[128] | 244.4 ns | 215.3 ns | +13.55% |
| Simulation | bitwise_not_vortex_buffer_mut[1024] | 304.7 ns | 275.6 ns | +10.58% |

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing nemo/geo-point (e92be4b) with develop (9383c35)


Comment thread vortex-geo/src/extension/point.rs Outdated
Comment thread vortex-geo/src/extension/coordinate.rs Outdated
let DType::Struct(fields, _) = dtype else {
    vortex_bail!("coordinate storage must be a Struct, was {dtype}");
};
let names: Vec<&str> = fields.names().iter().map(|n| n.as_ref()).collect();

Contributor


why alloc?

Author


Removed. The names are now staged in a stack buffer inside from_field_names so the slice-pattern match still works, and coordinate_dimension zips names with fields directly instead of collecting.
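As a rough illustration of the approach described in this reply (not the code in the PR), field names can be staged in a fixed-size stack array and then matched with a slice pattern, which avoids the `Vec` allocation. The `Dimension` enum, the error type, and this signature for `from_field_names` are assumptions made for the sketch.

```rust
/// Hypothetical dimension tag, one variant per GeoArrow coordinate layout.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Dimension {
    Xy,
    Xyz,
    Xym,
    Xyzm,
}

/// Derives the dimension from field names without heap allocation.
pub fn from_field_names<'a>(
    names: impl IntoIterator<Item = &'a str>,
) -> Result<Dimension, String> {
    // Stage at most four names in a stack buffer so a slice pattern can match them.
    let mut buf: [&'a str; 4] = [""; 4];
    let mut n = 0;
    for name in names {
        if n == buf.len() {
            return Err("a point has at most 4 coordinate fields".to_string());
        }
        buf[n] = name;
        n += 1;
    }
    // Only the first `n` slots were filled; match on exactly that prefix.
    match &buf[..n] {
        ["x", "y"] => Ok(Dimension::Xy),
        ["x", "y", "z"] => Ok(Dimension::Xyz),
        ["x", "y", "m"] => Ok(Dimension::Xym),
        ["x", "y", "z", "m"] => Ok(Dimension::Xyzm),
        other => Err(format!("unsupported coordinate fields: {other:?}")),
    }
}
```

Because the buffer is bounded by the maximum number of coordinate fields, an over-long field list is rejected before it can overflow the array.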

vortex_ensure!(
    matches!(
        field,
        DType::Primitive(PType::F64, Nullability::NonNullable)

Contributor


I thought two of the fields were nullable?

Author


z/m are optional fields (per dimension), not nullable ones — the GeoArrow spec requires coordinate fields to be non-nullable, with "only the outer level allowed to have nulls". So a point can be missing entirely, but a present point can't have a null ordinate.

Ref: https://geoarrow.org/format.html
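To make the optional-versus-nullable distinction concrete, here is a toy model of the rule. `ToyDType` and `validate_point_storage` are hypothetical and far simpler than Vortex's `DType`; the sketch only shows that the outer struct may be nullable while every coordinate field that is present must be a non-nullable f64.

```rust
/// Hypothetical, simplified dtype model (not Vortex's `DType`).
#[derive(Clone, Debug, PartialEq)]
pub enum ToyDType {
    F64 { nullable: bool },
    Utf8 { nullable: bool },
    Struct { fields: Vec<(String, ToyDType)>, nullable: bool },
}

/// Accepts a struct of non-nullable f64 fields; the struct itself may be nullable.
pub fn validate_point_storage(dtype: &ToyDType) -> Result<(), String> {
    // Outer nullability is deliberately ignored: a whole point may be missing.
    let ToyDType::Struct { fields, .. } = dtype else {
        return Err("point storage must be a Struct".to_string());
    };
    for (name, field) in fields {
        // A present point cannot carry a null ordinate.
        if !matches!(field, ToyDType::F64 { nullable: false }) {
            return Err(format!("field {name} must be non-nullable f64"));
        }
    }
    Ok(())
}
```

Whether z or m exists at all is decided by which fields the struct declares, so "optional" here is a property of the column's dimension, not of individual rows.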

Contributor


Oh, maybe you should write it as Struct<x, y, ?z, ?m> instead. I forgot that ? can also mean possibly null, not just optional. Or you could write Struct<x, y, {z}, {m}>.

HarukiMoriarty and others added 2 commits June 11, 2026 13:21
Co-authored-by: Joe Isaacs <joe.isaacs@live.co.uk>
Signed-off-by: Nemo Yu <83347615+HarukiMoriarty@users.noreply.github.com>
Signed-off-by: Nemo Yu <zyu379@wisc.edu>

Labels

changelog/feature A new feature


3 participants