# Proposal: SpatialBench for Presto Geospatial Benchmarking

## Overview
This proposal introduces Apache Sedona SpatialBench as a standardized geospatial benchmark for Presto Native (CPU and GPU), following the same integration pattern that velox-testing uses for TPC-H and TPC-DS.
SpatialBench fills a critical gap: while TPC-H and TPC-DS thoroughly exercise relational operators (joins, aggregations, sorting), they contain zero geospatial operations. As NVIDIA invests in GPU-accelerating Velox's spatial functions (`ST_Distance`, `ST_Contains`, `ST_Intersects`, etc.), we need a reproducible benchmark to measure progress and prevent regressions.
## Why SpatialBench

### The Problem
Standard database benchmarks (TPC-H, TPC-DS) do not test:
- Spatial predicate evaluation (point-in-polygon, intersection)
- Distance-based filtering and joins
- Geometry construction and serialization (WKB/WKT)
- Spatial aggregations (convex hull, union)
- Mixed spatial + relational workloads (spatial joins with GROUP BY)
Without a spatial benchmark, we cannot quantify the impact of GPU acceleration on geospatial queries, nor compare Presto Native GPU against CPU or Java workers for spatial workloads.
### Why SpatialBench Specifically

| Criteria | SpatialBench | Alternative (ad-hoc queries) |
|---|---|---|
| Reproducible | Deterministic data generator with scale factors | No guarantee |
| Standardized | 12 queries covering all major spatial operations | Cherry-picked |
| Scalable | SF1 (~500 MB) to SF1000 (~500 GB) | Fixed datasets |
| Community | Apache-licensed, multi-engine (DuckDB, Spark, Sedona) | Single-use |
| Realistic | Transportation/urban mobility star schema | Synthetic points |
| Unbiased | No engine-specific optimizations baked in | Tuned to one engine |
## Integration with velox-testing

### Architecture
SpatialBench integrates with velox-testing following the established TPC-H pattern:
```
velox-testing/
├── presto/
│   ├── pbench/benchmarks/
│   │   ├── tpch/                        # Existing
│   │   │   ├── queries/
│   │   │   ├── duckdb_queries/
│   │   │   └── sf100.json
│   │   └── spatialbench/                # New
│   │       ├── queries/
│   │       │   └── q01.sql ... q12.sql
│   │       └── sf1.json, sf10.json
│   └── scripts/
│       ├── setup_benchmark_tables.sh    # Extended for spatialbench
│       └── run_benchmark.sh             # Already supports -b flag
├── benchmarks/
│   └── spatialbench/
│       ├── scripts/
│       │   ├── generate_data.sh         # Wraps spatialbench-cli
│       │   ├── setup_tables.sh          # Hive external tables
│       │   └── run_benchmark.sh         # Standalone runner
│       ├── queries/
│       │   └── q01.sql ... q12.sql      # Presto-syntax queries
│       └── README.md
└── benchmark_data_tools/
    └── generate_table_schemas.py        # Extended for spatialbench
```
### Data Pipeline

```
┌─────────────────┐      ┌──────────────────┐      ┌─────────────────────┐
│ spatialbench-cli │────▶│ Parquet files    │────▶│ Hive external       │
│ (Rust, SF1-1000) │      │ /datasets/       │      │ tables in Presto    │
│                  │      │ spatialbench/    │      │ (file metastore)    │
└─────────────────┘      │ sf{N}/           │      └─────────────────────┘
                         │ ├── trip/        │                │
                         │ ├── building/    │                ▼
                         │ ├── zone/        │      ┌─────────────────────┐
                         │ ├── customer/    │      │ Benchmark runner    │
                         │ ├── driver/      │      │ (Q1-Q12, timing,    │
                         │ └── vehicle/     │      │ CSV results)        │
                         └──────────────────┘      └─────────────────────┘
```
### Workflow (mirrors TPC-H)

| Step | TPC-H | SpatialBench |
|---|---|---|
| 1. Generate data | `generate_data_files.py --benchmark-type tpch --scale-factor 100` | `generate_data.sh -s "1 10 100"` |
| 2. Start Presto | `start_native_gpu_presto.sh` | Same |
| 3. Create tables | `setup_benchmark_tables.sh -b tpch -s tpch_sf100 -d sf100` | `setup_benchmark_tables.sh -b spatialbench -s spatialbench_sf1 -d sf1` |
| 4. Run benchmark | `run_benchmark.sh -b tpch -s tpch_sf100` | `run_benchmark.sh -b spatialbench -s spatialbench_sf1` |
| 5. Collect results | CSV + profiling | Same |
## Dataset
SpatialBench uses a transportation/urban mobility star schema:
| Table | Description | Key Spatial Columns | SF1 Rows | SF1 Size |
|---|---|---|---|---|
| trip | Taxi/rideshare trips (fact) | `t_pickuploc` (WKB Point), `t_dropoffloc` (WKB Point) | ~5M | ~350 MB |
| building | Building footprints (dimension) | `b_boundary` (WKB Polygon) | ~20K | ~2 MB |
| zone | Administrative zones (dimension) | `z_boundary` (WKB Polygon) | ~160K | ~80 MB |
| customer | Customer dimension | — | ~150K | ~5 MB |
| driver | Driver dimension | — | ~10K | ~400 KB |
| vehicle | Vehicle dimension | — | ~10K | ~400 KB |
Geometry columns are stored as WKB (Well-Known Binary) in `VARBINARY` Parquet columns, matching the input format of Presto's `ST_GeomFromBinary()`.
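For reference, the WKB point layout these `VARBINARY` columns carry can be sketched with nothing but the standard library. The helper names below are illustrative, not part of SpatialBench or Presto:

```python
import struct

def wkb_point(x: float, y: float) -> bytes:
    """Encode a 2-D point as little-endian WKB: byte-order flag (1 =
    little-endian), geometry type (1 = Point), then x and y as doubles."""
    return struct.pack("<BIdd", 1, 1, x, y)

def parse_wkb_point(buf: bytes):
    """Decode a little-endian WKB point back to an (x, y) tuple."""
    order, gtype, x, y = struct.unpack("<BIdd", buf)
    assert order == 1 and gtype == 1, "expected little-endian WKB Point"
    return (x, y)

# A pickup location as it would sit in a t_pickuploc cell (lon/lat order)
pickup = wkb_point(-73.9857, 40.7484)
assert len(pickup) == 21  # 1 + 4 + 8 + 8 bytes
assert parse_wkb_point(pickup) == (-73.9857, 40.7484)
```

Polygon columns such as `z_boundary` follow the same envelope with geometry type 3 and a ring of such coordinate pairs.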
## Queries
The 12 queries are designed to test distinct spatial operations:
### Tier 1: GPU-Acceleratable (Priority)

| Query | Operation | Spatial Functions | GPU Path |
|---|---|---|---|
| Q1 | Distance filter + sort | `ST_Distance`, `ST_X`, `ST_Y` | `great_circle_distance` on GPU; coordinate extraction planned |
| Q3 | Distance filter + aggregation | `ST_Distance` + `GROUP BY` | Same as Q1 |
| Q7 | Detour detection | `ST_Distance` point-to-point | Direct GPU acceleration |
| Q8 | Building proximity join | `ST_Distance` join | GPU distance computation |
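As a reference point for what these queries compute, here is a plain-Python haversine sketch of a great-circle distance, applied as a Q1-style filter. The exact formula and Earth-radius constant used by the GPU kernel are assumptions here:

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; the kernel's constant may differ

def great_circle_distance(lon1, lat1, lon2, lat2):
    """Haversine great-circle distance in kilometers between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Q1-style predicate: keep pickups within 2 km of a reference point
ref = (-73.9857, 40.7484)
pickups = [(-73.9857, 40.7484), (-74.05, 40.70), (-73.99, 40.75)]
nearby = [p for p in pickups if great_circle_distance(*ref, *p) < 2.0]
```

On the GPU this per-row arithmetic is what gets fused into a single kernel over the coordinate columns, so the filter never materializes intermediate distance vectors.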
### Tier 2: Requires Point-in-Polygon GPU Support

| Query | Operation | Spatial Functions | GPU Path |
|---|---|---|---|
| Q2 | Point-in-polygon filter | `ST_Intersects` | cuSpatial or custom kernel |
| Q4 | Spatial join + aggregation | `ST_Within` | Same |
| Q6 | Zone statistics | `ST_Intersects`, `ST_Within` | Same |
| Q10 | Zone LEFT JOIN | `ST_Within` | Same |
| Q11 | Cross-zone count | `ST_Within` (double join) | Same |
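The per-point work these queries would push to a cuSpatial or custom kernel is the classic even-odd ray-casting test. A minimal CPU sketch, illustrative only and not the kernel's actual code:

```python
def point_in_polygon(x, y, ring):
    """Even-odd ray-casting test: does (x, y) fall inside the polygon ring?
    `ring` is a list of (x, y) vertices; the closing edge is implicit."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Count crossings of a horizontal ray extending right from (x, y)
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Q2-style filter: pickups against a square zone boundary (toy coordinates)
zone = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
assert point_in_polygon(2.0, 2.0, zone) is True
assert point_in_polygon(5.0, 2.0, zone) is False
```

A GPU implementation evaluates this edge loop for many points in parallel, typically after a bounding-box prefilter prunes candidate point/zone pairs.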
### Tier 3: Complex Geometry Operations (CPU Fallback)

| Query | Operation | Spatial Functions | GPU Path |
|---|---|---|---|
| Q5 | Convex hull area | `ST_ConvexHull`, `ST_Collect`, `ST_Area` | GEOS CPU fallback |
| Q9 | Building IoU | `ST_Intersection`, `ST_Area` | GEOS CPU fallback |
| Q12 | KNN join | `ST_Distance` ranking | Distance on GPU, ranking on CPU |
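Q12's split, distance evaluation feeding a top-k ranking, can be sketched on toy data. The `knn_join` helper and the sample IDs below are hypothetical:

```python
import heapq
import math

def knn_join(left, right, k):
    """For each point in `left`, return the IDs of the k nearest points in
    `right`, ranked by Euclidean distance. Mirrors the Q12 split: distance
    evaluation (GPU in the real pipeline) feeds a top-k ranking (CPU)."""
    result = {}
    for lid, (lx, ly) in left.items():
        # Distance evaluation step (batched on the GPU in practice)
        scored = ((math.hypot(lx - rx, ly - ry), rid)
                  for rid, (rx, ry) in right.items())
        # Top-k ranking step: a bounded heap avoids a full sort
        result[lid] = [rid for _, rid in heapq.nsmallest(k, scored)]
    return result

trips = {"t1": (0.0, 0.0)}
buildings = {"b1": (1.0, 0.0), "b2": (3.0, 0.0), "b3": (0.5, 0.0)}
assert knn_join(trips, buildings, k=2) == {"t1": ["b3", "b1"]}
```

Because only the ranking is left on the CPU, the expensive per-pair distance arithmetic still benefits from GPU acceleration even before a native top-k kernel exists.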
## Scale Factors

| SF | Trip Rows | Total Data Size | Target Environment |
|---|---|---|---|
| 1 | ~5M | ~500 MB | Development, CI |
| 10 | ~50M | ~5 GB | Single GPU testing |
| 100 | ~500M | ~50 GB | Multi-GPU benchmarking |
| 1000 | ~5B | ~500 GB | Large-scale evaluation |
## GPU Acceleration Roadmap

### Phase 1: Foundation (Current)

- `great_circle_distance` via cuDF AST (fused single-kernel)
- `ST_X`, `ST_Y` coordinate extraction
- `ST_Point` construction
- `ST_Distance` for Point-Point (Euclidean on GPU)

### Phase 2: Point-in-Polygon

- `ST_Contains` / `ST_Within` / `ST_Intersects` for Point-in-Polygon

### Phase 3: Complex Operations

- `ST_Area`, `ST_Length` for simple geometries
- `ST_Envelope`, `ST_Buffer`
- `ST_Union`, `ST_Intersection`, `ST_ConvexHull`
## Measurement

For each phase, SpatialBench provides a clear metric:

`Speedup = T(Java Presto, SF100) / T(Native GPU Presto, SF100)`
Per-query breakdown shows exactly which spatial operations benefit from GPU acceleration and which remain CPU-bound, guiding investment in the next phase.
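Applied per query, the metric is a simple ratio of wall-clock timings. The numbers below are illustrative placeholders, not measured results:

```python
def per_query_speedup(java_ms, gpu_ms):
    """Speedup = T(Java Presto) / T(Native GPU Presto), computed per query
    from wall-clock timings in milliseconds."""
    return {q: java_ms[q] / gpu_ms[q] for q in java_ms}

# Hypothetical timings only; real numbers come from the CSV results
java = {"q01": 4200.0, "q07": 3900.0, "q05": 9000.0}
gpu = {"q01": 600.0, "q07": 650.0, "q05": 7500.0}

speedups = per_query_speedup(java, gpu)
assert speedups["q01"] == 7.0  # Tier 1 distance query: large GPU win
assert speedups["q05"] == 1.2  # Tier 3 convex hull: CPU fallback, near flat
```

Ratios clustered near 1.0 flag the queries still dominated by CPU fallback paths, which is exactly the signal used to prioritize the next roadmap phase.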
## Implementation Status

| Component | Status | Location |
|---|---|---|
| Data generator script | Done | `benchmarks/spatialbench/scripts/generate_data.sh` |
| SQL queries (Q1-Q12) | Done | `benchmarks/spatialbench/queries/` |
| Table setup script | Done | `benchmarks/spatialbench/scripts/setup_tables.sh` |
| Benchmark runner | Done | `benchmarks/spatialbench/scripts/run_benchmark.sh` |
| pbench integration | Planned | `presto/pbench/benchmarks/spatialbench/` |
| `setup_benchmark_tables.sh` extension | Planned | `-b spatialbench` support |
| CI integration | Planned | GitHub Actions with SF1 |
## References