EeshanBembi
Contributor

Summary

Add SQL-compliant overflow checking for arithmetic operations by default. Previously,
DataFusion allowed numeric overflow to wrap silently, which differs from SQL standard
behavior and other databases like PostgreSQL, Trino, and Snowflake.

Changes

  • Add fail_on_overflow configuration option that defaults to true for SQL-standard
    behavior
  • Update BinaryExpr to use checked arithmetic operations when overflow checking is enabled
  • Modify binary() function signature to accept ExecutionProps for configuration access
  • Fix all call sites throughout the codebase to pass the ExecutionProps parameter
  • Add helper functions for test code to provide default ExecutionProps
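To illustrate the idea behind the second bullet, here is a minimal sketch of how a single flag can select between checked and wrapping arithmetic. It uses only the Rust standard library; `OverflowError` and `mul_i64` are hypothetical names for illustration and are not DataFusion's actual types or the code in this PR.

```rust
/// Hypothetical error type standing in for an arithmetic overflow error.
#[derive(Debug, PartialEq)]
struct OverflowError(String);

/// Multiply two i64 values.
/// With `fail_on_overflow == true` an overflow becomes an error;
/// otherwise the result wraps around in two's complement.
fn mul_i64(lhs: i64, rhs: i64, fail_on_overflow: bool) -> Result<i64, OverflowError> {
    if fail_on_overflow {
        // `checked_mul` returns None when the product does not fit in i64.
        lhs.checked_mul(rhs).ok_or_else(|| {
            OverflowError(format!("Overflow happened on: {} * {}", lhs, rhs))
        })
    } else {
        // `wrapping_mul` keeps only the low 64 bits of the product.
        Ok(lhs.wrapping_mul(rhs))
    }
}
```

Calling `mul_i64(10_000_000_000, 10_000_000_000, true)` yields an error, while passing `false` yields the wrapped value shown in the Behavior section.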

Behavior

Before:

SELECT 10000000000 * 10000000000;
-- Returns: 7766279631452241920 (wrapped overflow)

After (default):

SELECT 10000000000 * 10000000000;
-- Error: Arithmetic overflow: Overflow happened on: 10000000000 * 10000000000

After (with overflow disabled):

SET datafusion.execution.fail_on_overflow = false;
SELECT 10000000000 * 10000000000;
-- Returns: 7766279631452241920 (wrapped overflow)
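For readers wondering where 7766279631452241920 comes from: the exact product is 10^20, which does not fit in a signed 64-bit integer, and wrapping keeps only the low 64 bits (that is, 10^20 modulo 2^64). The sketch below reproduces that number with the standard library alone, assuming the literals are evaluated as 64-bit integers; the helper names are hypothetical.

```rust
/// Compute the exact product in 128 bits, where 10^20 still fits.
fn exact_product(a: i64, b: i64) -> i128 {
    (a as i128) * (b as i128)
}

/// Reduce an exact result to what a wrapping 64-bit operation would store.
/// An `as i64` cast truncates to the low 64 bits (two's complement).
fn wrapped_i64(exact: i128) -> i64 {
    exact as i64
}
```

Here `wrapped_i64(exact_product(10_000_000_000, 10_000_000_000))` agrees with `10_000_000_000i64.wrapping_mul(10_000_000_000)`, and `checked_mul` on the same operands returns `None`.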

Testing

  • All existing overflow tests pass
  • New behavior verified with the example from the issue
  • Configuration option properly toggles behavior
  • Maintains backward compatibility through configuration

Fixes #17539

@github-actions github-actions bot added the physical-expr (Changes to the physical-expr crates), core (Core DataFusion crate), and common (Related to common crate) labels on Sep 13, 2025
@Jefffrey
Contributor

PR body claims this has been tested but it doesn't look like it builds successfully in the first place?

        .as_ref()
        .map(|cfg| cfg.execution.fail_on_overflow)
        .unwrap_or(true);
    Ok(Arc::new(BinaryExpr::new(lhs, op, rhs).with_fail_on_overflow(fail_on_overflow)))
Member

Let's deprecate BinaryExpr::new and make sure all usages within DataFusion codebase are updated correctly.
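As a rough sketch of what such a deprecation could look like, Rust's built-in `#[deprecated]` attribute makes the compiler warn at every remaining call site of the old constructor, which helps find usages that still need updating. `BinaryNode` and `new_with_overflow_check` are hypothetical stand-ins, not DataFusion's real API, and the default chosen inside the old constructor is an assumption.

```rust
/// Hypothetical stand-in for a binary expression node (not the real BinaryExpr).
#[derive(Debug, PartialEq)]
pub struct BinaryNode {
    pub op: char,
    pub fail_on_overflow: bool,
}

impl BinaryNode {
    /// Old constructor: still compiles, but each call site gets a warning.
    /// Assumption: it falls back to failing on overflow.
    #[deprecated(note = "use `BinaryNode::new_with_overflow_check` to make the overflow policy explicit")]
    pub fn new(op: char) -> Self {
        Self { op, fail_on_overflow: true }
    }

    /// Replacement constructor that forces callers to state the overflow policy.
    pub fn new_with_overflow_check(op: char, fail_on_overflow: bool) -> Self {
        Self { op, fail_on_overflow }
    }
}
```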

}

/// Helper function for tests that creates a binary expression with default ExecutionProps
fn binary_test(
Member

The name binary_test is suitable for boolean-returning binary operators, e.g. < or =.
It's not suitable for others, e.g. + or /.

Suggested change
- fn binary_test(
+ fn binary_expr(

use datafusion_expr::Operator;
use datafusion_physical_expr_common::physical_expr::fmt_sql;

/// Helper function for tests that creates a binary expression with default ExecutionProps
fn binary_test(
Member

Suggested change
- fn binary_test(
+ fn binary_expr(

use datafusion_expr::Operator;

/// Helper function for tests that provides default ExecutionProps for binary function calls
fn binary_test(
Member

Suggested change
- fn binary_test(
+ fn binary_expr(

    rhs: Arc<dyn PhysicalExpr>,
    schema: &Schema,
) -> Result<Arc<dyn PhysicalExpr>> {
    let execution_props = ExecutionProps::new();
Member

Make sure that the default behavior configured by ExecutionProps::new() is to fail on overflow.
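One way to pin this down is a small regression check on the default value. The sketch below shows the shape of such a check against a hypothetical `Props` struct; it does not use the real ExecutionProps, whose fields and defaults are not shown in this conversation.

```rust
/// Hypothetical stand-in for execution properties (not the real ExecutionProps).
#[derive(Debug, Clone)]
struct Props {
    fail_on_overflow: bool,
}

impl Default for Props {
    /// Assumption for this sketch: the default is the SQL-standard behavior.
    fn default() -> Self {
        Self { fail_on_overflow: true }
    }
}

/// Add two i32 values under the given properties; None signals an overflow error.
fn add_i32(a: i32, b: i32, props: &Props) -> Option<i32> {
    if props.fail_on_overflow {
        a.checked_add(b)
    } else {
        Some(a.wrapping_add(b))
    }
}
```

A test can then assert both the flag itself and the observable behavior, for example that `add_i32(i32::MAX, 1, &Props::default())` is `None`.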

@andygrove andygrove added the api change (Changes the API exposed to users of the crate) label on Sep 15, 2025
@andygrove
Member

@EeshanBembi, I am curious about the performance impact of enabling overflow checking by default. Could you add criterion benchmarks?
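Until proper criterion benchmarks exist, a rough first impression can be had with a standard-library timing loop. The sketch below is a stand-in rather than a criterion benchmark: it times scalar checked versus wrapping arithmetic over a slice, which only approximates what vectorized kernels would do. All function names are hypothetical.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Sum of x * 3 using checked arithmetic; None on any overflow.
fn sum_products_checked(data: &[i64]) -> Option<i64> {
    data.iter()
        .try_fold(0i64, |acc, &x| acc.checked_add(x.checked_mul(3)?))
}

/// Same computation with wrapping arithmetic; never fails.
fn sum_products_wrapping(data: &[i64]) -> i64 {
    data.iter()
        .fold(0i64, |acc, &x| acc.wrapping_add(x.wrapping_mul(3)))
}

/// Run `f` repeatedly and return the total elapsed wall-clock time.
/// `black_box` discourages the optimizer from discarding the result.
fn time_it<T>(iterations: u32, mut f: impl FnMut() -> T) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        black_box(f());
    }
    start.elapsed()
}
```

Comparing `time_it(n, || sum_products_checked(&data))` with `time_it(n, || sum_products_wrapping(&data))` in a release build gives a coarse signal; criterion would add warm-up, statistical analysis, and regression tracking on top of this.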

Development

Successfully merging this pull request may close these issues.

Numeric overflow should result in query error
4 participants