feat(rust/sedona): Add SedonaFairSpillPool memory pool and CLI memory limit support #599

Merged
Kontinuation merged 3 commits into apache:main from
Kontinuation:feat/sedona-fair-spill-pool
Feb 13, 2026

Conversation

@Kontinuation
Member

Summary

  • Add SedonaFairSpillPool, a new memory pool that reserves a configurable fraction of total memory for unspillable consumers, preventing spillable operators from exhausting all available memory (addresses datafusion#17334 in the Sedona context)
  • Add --memory-limit, --mem-pool-type, and --unspillable-reserve-ratio CLI arguments to sedona-cli for configuring memory pool behavior
  • Refactor SedonaContext::new_local_interactive() to expose new_local_interactive_with_runtime_env(), allowing callers to inject a custom RuntimeEnv with pre-configured memory pools

Motivation

When running out-of-core spatial joins, spillable operators (e.g., SpatialJoinExec) could consume all available memory, causing unspillable operators (e.g., RepartitionExec's merge consumer) to fail with OOM errors. The SedonaFairSpillPool mitigates this by reserving a configurable portion (default 20%) of the memory pool exclusively for unspillable allocations.
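The reservation idea described above can be sketched with a toy model. This is not the SedonaFairSpillPool implementation (which plugs into DataFusion's memory pool interface and is not reproduced here); the type `ToyReservePool` and its methods are hypothetical names chosen for illustration, and the sketch assumes the simplest possible policy: spillable consumers are capped at the unreserved part of the pool, while unspillable consumers may use any free memory.

```rust
/// Toy model of a memory pool that holds back part of its capacity for
/// unspillable consumers. Hypothetical illustration only; the real
/// SedonaFairSpillPool may divide memory differently.
#[derive(Debug)]
pub struct ToyReservePool {
    total: usize,
    /// Bytes that spillable consumers may never occupy.
    unspillable_reserve: usize,
    spillable_used: usize,
    unspillable_used: usize,
}

impl ToyReservePool {
    /// `reserve_ratio` is the fraction of `total` kept for unspillable consumers.
    pub fn new(total: usize, reserve_ratio: f64) -> Self {
        assert!(
            (0.0..=1.0).contains(&reserve_ratio),
            "reserve ratio must be within 0.0..=1.0"
        );
        // Clamp to `total` so float rounding can never exceed the pool size.
        let unspillable_reserve = ((total as f64 * reserve_ratio) as usize).min(total);
        Self { total, unspillable_reserve, spillable_used: 0, unspillable_used: 0 }
    }

    fn free(&self) -> usize {
        self.total - self.spillable_used - self.unspillable_used
    }

    /// Spillable consumers are limited to `total - reserve` in aggregate,
    /// and never to more than what is currently free.
    pub fn try_grow_spillable(&mut self, bytes: usize) -> Result<(), String> {
        let cap = self.total - self.unspillable_reserve;
        let headroom = cap.saturating_sub(self.spillable_used).min(self.free());
        if bytes > headroom {
            // A real operator would react to this error by spilling to disk.
            return Err(format!(
                "spillable request of {bytes} B exceeds headroom of {headroom} B"
            ));
        }
        self.spillable_used += bytes;
        Ok(())
    }

    /// Unspillable consumers may take any free memory, including the reserve.
    pub fn try_grow_unspillable(&mut self, bytes: usize) -> Result<(), String> {
        let free = self.free();
        if bytes > free {
            return Err(format!(
                "unspillable request of {bytes} B exceeds free memory of {free} B"
            ));
        }
        self.unspillable_used += bytes;
        Ok(())
    }
}
```

With a 100-byte pool and a ratio of 0.2, spillable consumers in this sketch can take at most 80 bytes, so a later 20-byte unspillable request still succeeds instead of failing with an out-of-memory error.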

… limit support

Add a new SedonaFairSpillPool that reserves a configurable fraction of memory
for unspillable consumers, preventing spillable operators from exhausting all
available memory. This addresses the issue where operators like RepartitionExec
(unspillable) could fail when spatial join operators (spillable) consumed all
memory.

Also add --memory-limit, --mem-pool-type, and --unspillable-reserve-ratio CLI
options to sedona-cli, and refactor SedonaContext to accept a custom RuntimeEnv
for memory pool configuration.
@Kontinuation Kontinuation requested a review from Copilot February 11, 2026 16:55
@Kontinuation Kontinuation marked this pull request as draft February 11, 2026 16:55
Contributor

Copilot AI left a comment

Pull request overview

Adds configurable memory-pool behavior to Sedona (including a new fair spill pool with reserved capacity for unspillable consumers) and wires this through sedona-cli flags by allowing SedonaContext to be created with an injected RuntimeEnv.

Changes:

  • Introduces SedonaFairSpillPool with an “unspillable reserve” mechanism plus unit tests.
  • Adds --memory-limit, --mem-pool-type, and --unspillable-reserve-ratio to sedona-cli, including parsing for human-readable sizes.
  • Refactors SedonaContext construction to support injecting a pre-configured RuntimeEnv.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| sedona-cli/src/pool_type.rs | Adds a PoolType enum to select greedy vs fair pool from CLI. |
| sedona-cli/src/main.rs | Adds CLI flags, builds a RuntimeEnv with selected memory pool, and adds size parsing helpers. |
| sedona-cli/src/lib.rs | Exposes the new pool_type module. |
| rust/sedona/src/memory_pool.rs | Implements SedonaFairSpillPool with reserved unspillable capacity and tests. |
| rust/sedona/src/lib.rs | Exposes the new memory_pool module publicly. |
| rust/sedona/src/context.rs | Adds new_local_interactive_with_runtime_env to allow custom runtime env injection. |

- Validate unspillable_reserve_ratio is within 0.0..=1.0 via clap value_parser
- Deduplicate NonZeroUsize::new(10) into a shared local variable
- Remove negative sign from size parsing regex for clearer error messages
- Fix grammar: 'resulting a reservation' -> 'resulting in a reservation'
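The range check in the first item above might look roughly like the following. This is a sketch only: the function name `parse_reserve_ratio` and its error messages are hypothetical and are not taken from the PR. A plain function of this shape (`&str` in, `Result` out) is the kind of thing clap accepts as a `value_parser`, but the exact wiring in sedona-cli is not shown here.

```rust
/// Parse a reserve ratio and reject anything outside 0.0..=1.0.
/// Hypothetical helper; the name and messages are illustrative only.
pub fn parse_reserve_ratio(s: &str) -> Result<f64, String> {
    let ratio: f64 = s
        .trim()
        .parse()
        .map_err(|e| format!("invalid ratio '{s}': {e}"))?;
    // `contains` is false for NaN and infinities, so those are rejected too.
    if !(0.0..=1.0).contains(&ratio) {
        return Err(format!("ratio '{s}' must be within 0.0..=1.0"));
    }
    Ok(ratio)
}
```

Validating at argument-parsing time means an out-of-range value is reported as a CLI usage error rather than surfacing later when the pool is constructed.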
@Kontinuation Kontinuation marked this pull request as ready for review February 11, 2026 18:05
Member

@paleolimbot paleolimbot left a comment

Thank you!

Comment on lines +262 to +263
fn parse_size_string(size: &str, label: &str) -> Result<usize, String> {
static BYTE_SUFFIXES: LazyLock<HashMap<&'static str, ByteUnit>> = LazyLock::new(|| {
Member

Can you copy DataFusion's test for this?

https://github.com/apache/datafusion/blob/ecf3b502cfd8c5baeaabb97737605ef66549c753/datafusion-cli/src/main.rs#L464-L510

This could also go in rust/sedona (we'll need it in R and Python, too?)

Member Author

Added the test, and augmented it with cases for decimal numbers and for numbers that are too large.
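For readers without the diff at hand, the cases discussed in this thread (plain byte counts, unit suffixes, decimal values, negative input, and values too large to fit) can be illustrated with a standalone sketch. It is not the PR's `parse_size_string`, which uses a regex and a suffix table; the name `parse_size_sketch`, the accepted suffix set, and the error messages are assumptions made for this example.

```rust
/// Parse strings such as "512", "2k", "1.5g", or "0.5m" into a byte count.
/// Hypothetical sketch; suffixes are assumed to be binary (k = 1024 bytes).
pub fn parse_size_sketch(input: &str) -> Result<usize, String> {
    let lower = input.trim().to_ascii_lowercase();
    // The number ends at the first character that is neither a digit nor '.'.
    let split = lower
        .find(|c: char| !(c.is_ascii_digit() || c == '.'))
        .unwrap_or(lower.len());
    let (number, suffix) = lower.split_at(split);

    let multiplier: u64 = match suffix {
        "" | "b" => 1,
        "k" | "kb" => 1 << 10,
        "m" | "mb" => 1 << 20,
        "g" | "gb" => 1 << 30,
        "t" | "tb" => 1 << 40,
        // A leading '-' also lands here, because the numeric part is then empty.
        other => return Err(format!("unknown size suffix '{other}' in '{input}'")),
    };

    let value: f64 = number
        .parse()
        .map_err(|_| format!("invalid number '{number}' in '{input}'"))?;

    let bytes = value * multiplier as f64;
    if !bytes.is_finite() || bytes >= usize::MAX as f64 {
        return Err(format!("size '{input}' is too large"));
    }
    // Fractional bytes are truncated.
    Ok(bytes as usize)
}
```

Going through `f64` keeps the decimal handling short at the cost of exactness for very large values; an implementation that needs exact results near the top of the range would want integer arithmetic instead.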

Comment on lines +18 to +24
use std::{
    fmt::{self, Display, Formatter},
    str::FromStr,
};

#[derive(PartialEq, Debug, Clone)]
pub enum PoolType {
Member

It seems like we will also need this in R and Python...should it go in rust/sedona?

Member Author

Moved this to rust/sedona.
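The excerpt under review shows only the imports and the opening of the enum. A plausible completion, given those imports and the "greedy vs fair" description in the file summary, is sketched below. The variant names, the accepted strings, and the error message are assumptions, not the PR's actual code.

```rust
use std::{
    fmt::{self, Display, Formatter},
    str::FromStr,
};

/// Which memory pool to build. Variant names are assumed from the
/// "greedy vs fair" wording in the review summary.
#[derive(PartialEq, Debug, Clone)]
pub enum PoolType {
    Greedy,
    Fair,
}

impl FromStr for PoolType {
    type Err = String;

    /// Accepts "greedy" or "fair", ignoring ASCII case (an assumption).
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "greedy" => Ok(PoolType::Greedy),
            "fair" => Ok(PoolType::Fair),
            _ => Err(format!("invalid memory pool type '{s}'")),
        }
    }
}

impl Display for PoolType {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        match self {
            PoolType::Greedy => write!(f, "greedy"),
            PoolType::Fair => write!(f, "fair"),
        }
    }
}
```

Because the type depends only on the standard library, keeping it in rust/sedona lets the CLI and any R or Python bindings share one string-to-variant mapping.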

Contributor

@2010YOUY01 2010YOUY01 left a comment

This looks like a good idea. If it works well, we could use it in DataFusion upstream. Looking forward to seeing the results of using it in spilling queries!

One potential follow-up: we could set configurations through SQL SET ... statements like https://datafusion.apache.org/user-guide/configs.html#runtime-configuration-settings

- Move PoolType from sedona-cli to rust/sedona for R/Python reuse
- Fix parse_size_string to handle decimal values (e.g. 1.5g, 0.5m)
- Add comprehensive tests for parse_size_string
- Fix SedonaFairSpillPool doc comment formatting
@Kontinuation Kontinuation force-pushed the feat/sedona-fair-spill-pool branch from 5f07a36 to 754e996 Compare February 12, 2026 13:01
Member

@paleolimbot paleolimbot left a comment

Thank you!

@Kontinuation Kontinuation merged commit b4f4e9f into apache:main Feb 13, 2026
17 checks passed
@paleolimbot paleolimbot added this to the 0.3.0 milestone Feb 19, 2026

4 participants