feat(python/sedonadb): Implement parameter binding#575

Merged
paleolimbot merged 9 commits into apache:main from paleolimbot:parameterized-queries
Feb 10, 2026

@paleolimbot paleolimbot commented Feb 4, 2026

This PR provides the ability to bind arbitrary Python objects as parameters in a query (e.g. SELECT ST_Envelope($1)). As discussed in the ticket, most of the work here is actually around the logic to convert arbitrary Python objects, where the heuristic is:

  • Have a special-case list (e.g., this handles Shapely objects and GeoDataFrame/GeoSeries)
  • Try obj.__arrow_c_array__() protocol
  • Try pyarrow.array([obj])

In general, dataframe-ish objects with exactly one value, and anything convertible to an Arrow array of length one, will work (this is vaguely the same logic that lets you put a subquery anywhere a scalar is expected). It also works nicely for geometry because GeoPandas objects preserve their CRS but shapely objects don't (so geo_df.geometry[0] is lossy but geo_df.geometry is not).
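
As a rough illustration of the three-step heuristic above, here is a minimal sketch. The names `to_literal` and `SPECIAL_CASES` and the tagged-tuple return values are hypothetical and exist only to show which branch was taken; `fractions.Fraction` stands in for the Shapely/GeoPandas types that the real special-case list handles. It is not the code from this PR.

```python
from fractions import Fraction


def qualified_name(obj):
    """Return 'module.ClassName' for obj without importing its module."""
    return f"{type(obj).__module__}.{type(obj).__name__}"


# Hypothetical special-case table keyed by qualified class name.
# Fraction is only a stand-in for types such as Shapely geometries.
SPECIAL_CASES = {
    "fractions.Fraction": lambda obj: ("special", float(obj)),
}


def to_literal(obj):
    """Sketch of the lookup order; returns (branch_name, value)."""
    # 1. Special-case list, matched on the qualified class name.
    handler = SPECIAL_CASES.get(qualified_name(obj))
    if handler is not None:
        return handler(obj)

    # 2. Objects that implement the Arrow PyCapsule array protocol.
    if hasattr(obj, "__arrow_c_array__"):
        return ("arrow_c_array", obj)

    # 3. Fall back to pyarrow, wrapping the object as a length-one array.
    import pyarrow as pa  # imported lazily so steps 1 and 2 do not need it

    return ("pyarrow", pa.array([obj]))
```

Importing pyarrow only in the last branch mirrors the general idea of keeping optional dependencies out of module-level imports.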

As a side effect, this also kick-starts our expression API with a single expression type, Literal.

Closes #111.

import pandas as pd
import sedona.db

sd = sedona.db.connect()

sd.sql("SELECT $1 + 1 as two", params=(1, )).show()
#> ┌───────┐
#> │  two  │
#> │ int64 │
#> ╞═══════╡
#> │     2 │
#> └───────┘

sd.sql("SELECT $foofy + 1 as two", params={"foofy": 1}).show()
#> ┌───────┐
#> │  two  │
#> │ int64 │
#> ╞═══════╡
#> │     2 │
#> └───────┘

Geometry objects can also be bound here:

import sedona.db
import shapely

sd = sedona.db.connect()

sd.sql("SELECT ST_Buffer($1, 0.5)").with_params(
    shapely.box(1, 2, 3, 4)
).to_pandas().plot()

@paleolimbot paleolimbot requested a review from Copilot February 5, 2026 19:33
Contributor

Copilot AI left a comment

Pull request overview

This PR implements parameter binding for SedonaDB, enabling users to bind arbitrary Python objects as query parameters using placeholders like $1 or $my_param. The implementation includes a new Literal expression type and conversion logic that handles various Python objects including Shapely geometries, GeoPandas objects, and Arrow-compatible types.

Changes:

  • Added parameter binding support to SQL queries via with_params() method and params argument to sql()
  • Implemented Literal expression type with conversion logic for Python objects to Arrow arrays
  • Added Rust-side support for importing Arrow scalars and binding parameters to DataFrames

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| python/sedonadb/tests/test_dataframe.py | Tests for parameter binding with positional and named parameters |
| python/sedonadb/tests/expr/test_literal.py | Comprehensive tests for literal expression conversion from various Python types |
| python/sedonadb/src/import_from.rs | Added import_arrow_scalar function to convert Arrow arrays to scalar values with metadata |
| python/sedonadb/src/dataframe.rs | Implemented with_params method to bind positional and named parameters to DataFrames |
| python/sedonadb/python/sedonadb/expr/literal.py | Core literal expression implementation with conversion logic for Python objects |
| python/sedonadb/python/sedonadb/expr/__init__.py | Module initialization for expression API |
| python/sedonadb/python/sedonadb/dataframe.py | Added with_params() method to DataFrame class |
| python/sedonadb/python/sedonadb/context.py | Added params argument to sql() method for parameter binding |

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@paleolimbot paleolimbot marked this pull request as ready for review February 5, 2026 19:37
return f"{type(obj).__module__}.{type(obj).__name__}"


SPECIAL_CASED_LITERALS = {
Member

Would this work for subclasses if we match on class names?

Also, LinearRing is missing.

Member Author

Good catch on LinearRing!

I use the class name approach (which doesn't catch subclasses) because in order to do isinstance(obj, shapely.Geometry) we need to import shapely, which is something I'd rather not do at the module level if it can be avoided. If something comes up where there's no alternative we could require certain dependencies for using parameterized queries.
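
To make the trade-off concrete, here is a minimal sketch of why an exact class-name lookup misses subclasses, along with one possible way to catch them without importing the library: walking the object's MRO and comparing qualified names. The helper names and the `HANDLERS` table are hypothetical, `collections.OrderedDict` is a stand-in for a Shapely class, and the MRO walk is an illustration of an alternative rather than what this PR does.

```python
from collections import OrderedDict

# Hypothetical handler table keyed by qualified class name.
HANDLERS = {
    "collections.OrderedDict": lambda obj: "ordered-dict handler",
}


def find_handler_exact(obj):
    """Exact match on the object's own class name (misses subclasses)."""
    cls = type(obj)
    return HANDLERS.get(f"{cls.__module__}.{cls.__name__}")


def find_handler_mro(obj):
    """Match on any class in the MRO, still without importing the library."""
    for cls in type(obj).__mro__:
        handler = HANDLERS.get(f"{cls.__module__}.{cls.__name__}")
        if handler is not None:
            return handler
    return None


class MyDict(OrderedDict):
    """A user-defined subclass, analogous to subclassing a geometry type."""


# The base class is found either way; the subclass only via the MRO walk.
base_exact = find_handler_exact(OrderedDict())
sub_exact = find_handler_exact(MyDict())
sub_mro = find_handler_mro(MyDict())
```

Here `sub_exact` is `None` while `sub_mro` resolves to the `OrderedDict` handler, at the cost of one dictionary lookup per class in the MRO.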

if len(obj) != 1:
    raise ValueError("Can't create SedonaDB literal from Series with length != 1")

if obj.dtype.name == "geometry":
Member

Should we also check that the CRS is defined here?

Member

I'm not sure whether dtype.name == "geometry" implies that the series is a GeoPandas GeoSeries, or whether we can assume that crs is defined in this case. Besides that, the PR LGTM.

Member Author

This one is because of a fun corner case in Pandas land: geo_df.iloc[0] is a Series with dtype geometry, not a GeoSeries (hence checking obj.array.crs instead of obj.crs).

@paleolimbot paleolimbot merged commit 09b084b into apache:main Feb 10, 2026
5 checks passed
@paleolimbot paleolimbot deleted the parameterized-queries branch February 10, 2026 16:21

Development

Successfully merging this pull request may close these issues.

feat: add parameterized queries

4 participants