19926: fix: increase ROUND decimal precision to prevent overflow truncation #204

Open
martin-augment wants to merge 4 commits into main from pr-19926-2026-01-23-07-54-04

@martin-augment
Owner

19926: To review by AI

@gemini-code-assist

Summary of Changes

Hello @martin-augment, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves an issue in the ROUND function where decimal values could experience truncation due to overflow. The solution involves intelligently adjusting the output scale of decimal numbers based on the rounding precision specified. This change ensures that rounding operations on decimal types maintain data integrity and behave consistently with industry-standard implementations, improving the robustness of DataFusion's mathematical functions.

Highlights

  • Decimal Precision Fix for ROUND function: The ROUND function for decimal types has been updated to prevent overflow truncation by dynamically adjusting the output scale. This aligns its behavior with other database systems like Spark and DuckDB.
  • Dynamic Return Type Calculation: The return_type method in ScalarUDFImpl was replaced with return_field_from_args. This allows the ROUND function to determine the output decimal scale based on the decimal_places argument, specifically reducing the scale when decimal_places is a constant scalar.
  • Enhanced round_decimal Function: The core round_decimal helper function now accepts an output_scale parameter, enabling precise control over the resulting decimal's scale and ensuring correct handling of integer precision reclamation.
  • Updated SQL Logic Tests: SQL logic tests (.slt files) have been modified and expanded to validate the new behavior, including scenarios where scale reduction prevents overflow and handles carry-overs correctly.
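
To make the overflow concrete, here is a minimal sketch of rounding with separate input and output scales. It is not the PR's `round_decimal`: the names `round_decimal_sketch` and `fits_precision` are hypothetical, the value type is fixed to `i128`, and rounding is assumed to be half away from zero. Very large negative `decimal_places` simply yield `None` here rather than 0.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for the PR's `round_decimal`, specialised to `i128`.
/// `value` is the unscaled integer (999.9 at scale 1 is stored as 9999).
fn round_decimal_sketch(
    value: i128,
    input_scale: i8,
    output_scale: i8,
    decimal_places: i32,
) -> Option<i128> {
    // Number of trailing digits that rounding removes.
    let drop = i32::from(input_scale).checked_sub(decimal_places)?;
    let (rounded, rounded_scale) = if drop <= 0 {
        // Already coarser than requested: nothing to round.
        (value, i32::from(input_scale))
    } else {
        let factor = 10i128.checked_pow(u32::try_from(drop).ok()?)?;
        let mut quotient = value / factor;
        // Round half away from zero; `factor` is even, so `factor / 2` is exact.
        if (value % factor).abs() >= factor / 2 {
            quotient = quotient.checked_add(value.signum())?;
        }
        (quotient, decimal_places)
    };
    // Re-express the rounded value at the requested output scale.
    let up = i32::from(output_scale).checked_sub(rounded_scale)?;
    let up = u32::try_from(up).ok()?;
    rounded.checked_mul(10i128.checked_pow(up)?)
}

/// True when `value` has at most `precision` decimal digits.
fn fits_precision(value: i128, precision: u32) -> bool {
    match 10u128.checked_pow(precision) {
        Some(limit) => value.unsigned_abs() < limit,
        None => true,
    }
}
```

With a `decimal(4,1)` input of 999.9, keeping scale 1 produces the unscaled value 10000, which needs five digits. Reducing the output scale to 0 produces 1000, which still fits in four.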

@coderabbitai

coderabbitai bot commented Jan 23, 2026

Walkthrough

The changes refactor the round function implementation to compute return field types using a new return_field_from_args method instead of the existing return_type approach. The round_decimal function signature was modified to accept separate input_scale and output_scale parameters for more precise decimal scaling control. The round_columnar function now accepts a return_type parameter to influence decimal and float rounding operations. Test expectations were updated to reflect adjusted scales resulting from the rounding operations, and new test cases were added to cover carry-over and max-precision edge cases. Across three files, 169 lines were added and 46 lines were removed.

@claude

claude bot commented Jan 23, 2026

Code Review: ROUND Decimal Precision Fix

This PR addresses overflow issues in the ROUND function for decimal types by reducing the output scale to reclaim precision for the integer part. Here's my analysis:


✅ Positive Aspects

  1. Correct Problem Identification: The PR correctly addresses the overflow issue when rounding decimals at max precision by reducing the scale, which reclaims decimal digits of precision for the integer part.

  2. Comprehensive Coverage: All decimal types (Decimal32, Decimal64, Decimal128, Decimal256) are handled consistently.

  3. Proper Scale Handling: The logic correctly handles:

    • Positive decimal_places: reduces scale to min(current_scale, decimal_places)
    • Negative decimal_places: reduces scale to 0
    • Non-scalar decimal_places: preserves original scale (good defensive programming)
  4. Test Coverage: Good test cases added including:

    • Carry-over scenarios (999.9 → 1000)
    • Max precision edge case (38-digit decimal)
    • Scale reduction verification

🔍 Code Quality Issues

1. Duplicated Code (round.rs:146-177)

The scale calculation logic is duplicated across all 4 decimal types:

if dp_is_scalar {
    let new_scale = (*scale).min(decimal_places.max(0) as i8);
    Decimal32(*precision, new_scale)
} else {
    Decimal32(*precision, *scale)
}

Recommendation: Extract this into a helper function:

fn calculate_decimal_scale(
    precision: u8,
    scale: i8,
    decimal_places: i32,
    dp_is_scalar: bool,
    constructor: impl Fn(u8, i8) -> DataType
) -> DataType {
    if dp_is_scalar {
        let new_scale = scale.min(decimal_places.max(0) as i8);
        constructor(precision, new_scale)
    } else {
        constructor(precision, scale)
    }
}

2. Duplicated Scale Calculation (round.rs:231-258)

Same scale calculation is repeated in the scalar fast path. Should share the same logic with return_field_from_args.

3. Complex Conditional Logic (round.rs:461-486)

The output_scale == computed_scale branching is hard to follow. The comment says "scale reduction" but the else branch also handles the original behavior.

Recommendation: Add more detailed comments explaining:

  • Why we need two different code paths
  • What each path represents (scale reduction vs. scale preservation)
  • Examples of when each path is taken

🐛 Potential Bugs

1. Int64 to i32 Cast May Overflow (round.rs:133)

Some(ScalarValue::Int64(Some(v))) => (*v as i32, true),

Issue: If v > i32::MAX or v < i32::MIN, this will silently wrap/truncate.

Recommendation: Use checked conversion:

Some(ScalarValue::Int64(Some(v))) => {
    if let Ok(v_i32) = i32::try_from(*v) {
        (v_i32, true)
    } else {
        (0, false) // or return an error
    }
}
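
For the "return an error" alternative mentioned in the comment above, a minimal standalone sketch might look like the following. The function name and the `String` error type are assumptions for illustration; the real code would presumably return a `DataFusionError`.

```rust
use std::convert::TryFrom;

/// Hypothetical helper: narrow an Int64 `decimal_places` to i32,
/// reporting out-of-range input instead of wrapping.
fn decimal_places_from_i64(v: i64) -> Result<i32, String> {
    i32::try_from(v)
        .map_err(|_| format!("decimal_places {} is out of range for i32", v))
}
```

By contrast, a plain `as i32` cast keeps only the low 32 bits, so an out-of-range value silently becomes an unrelated number.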

2. Inconsistent Null Handling (round.rs:134)

When decimal_places is null, the code treats it as a non-scalar and keeps the original scale. This seems correct but might be surprising behavior.

Question: Should round(123.456, NULL) return an error or NULL instead?

3. Missing Validation (round.rs:148, 156, 164, 172)

let new_scale = (*scale).min(decimal_places.max(0) as i8);

Issue: If decimal_places > i8::MAX (127), the cast will truncate. While unlikely in practice, this could cause unexpected behavior.

Recommendation: Add validation:

if decimal_places > i8::MAX as i32 {
    return exec_err!("decimal_places {} exceeds maximum scale {}", decimal_places, i8::MAX);
}

⚡ Performance Considerations

  1. Tuple Pattern Matching (round.rs:307): The switch from value_array.data_type() to (value_array.data_type(), return_type) adds slight overhead, but it's negligible and necessary for correctness. ✅

  2. Extra Parameter Passing (round.rs:268-273): Adding return_type parameter is clean and doesn't impact performance. ✅


🔒 Security Concerns

No security issues identified. The overflow checks are properly implemented using checked arithmetic operations. ✅


📋 Test Coverage Concerns

Missing Test Cases:

  1. Column-based decimal_places: No test for when decimal_places is an array/column rather than a scalar constant
  2. Negative decimal_places edge cases: Only basic -1 case tested
  3. Scale preservation case: No test verifying that non-scalar dp preserves scale
  4. Int64 decimal_places: No test for round(decimal, bigint_column)
  5. Error cases: No test for extreme decimal_places values

Recommendation: Add sqllogictest cases:

-- Test column-based decimal_places
SELECT round(cast(col1 as decimal(10,3)), col2) FROM t;

-- Test negative decimal_places boundary
SELECT round(cast('999.99' as decimal(5,2)), -3);

-- Test very large decimal_places
SELECT round(cast('123.456' as decimal(10,5)), 1000);

📝 Documentation

  1. return_type deprecation (round.rs:188-190): Good use of internal_err! to guide future developers. ✅

  2. Missing changelog: Consider adding a note about the breaking change in output scale behavior.

  3. Function comments: The round_decimal function signature changed significantly but the doc comments weren't updated to explain the new output_scale parameter.


🎯 Summary

Overall Assessment: This is a solid fix that addresses a real overflow issue. The core logic is sound and the test coverage demonstrates the key scenarios work correctly.

Priority Issues to Address:

  1. 🔴 High: Fix Int64 to i32 cast (potential silent overflow)
  2. 🟡 Medium: Reduce code duplication in scale calculation
  3. 🟡 Medium: Add validation for decimal_places > i8::MAX
  4. 🟢 Low: Add test coverage for column-based decimal_places
  5. 🟢 Low: Improve comments in complex conditional logic

Recommendation: Request changes to address the Int64 cast issue and code duplication before merging.


Great work identifying and fixing this overflow issue! The approach of reducing scale to reclaim precision is the correct solution. 🎉

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request fixes an overflow issue in the round function for decimal types by adjusting the output scale. The changes are mainly in datafusion/functions/src/math/round.rs and seem logically correct. The updated tests in .slt files properly cover the new behavior and add more coverage, which is great. My main feedback is regarding code duplication in several places, which could be refactored to improve maintainability. I've left specific suggestions on how to address this.

Comment on lines +146 to +177
Decimal32(precision, scale) => {
    if dp_is_scalar {
        let new_scale = (*scale).min(decimal_places.max(0) as i8);
        Decimal32(*precision, new_scale)
    } else {
        Decimal32(*precision, *scale)
    }
}
Decimal64(precision, scale) => {
    if dp_is_scalar {
        let new_scale = (*scale).min(decimal_places.max(0) as i8);
        Decimal64(*precision, new_scale)
    } else {
        Decimal64(*precision, *scale)
    }
}
Decimal128(precision, scale) => {
    if dp_is_scalar {
        let new_scale = (*scale).min(decimal_places.max(0) as i8);
        Decimal128(*precision, new_scale)
    } else {
        Decimal128(*precision, *scale)
    }
}
Decimal256(precision, scale) => {
    if dp_is_scalar {
        let new_scale = (*scale).min(decimal_places.max(0) as i8);
        Decimal256(*precision, new_scale)
    } else {
        Decimal256(*precision, *scale)
    }
}

medium

There's significant code duplication in these match arms for handling different decimal types. This can make future maintenance harder. Consider refactoring this logic to reduce repetition, for example by using a macro.

Here's a suggestion using a macro:

macro_rules! adjust_decimal_type {
    ($DecimalTy:ident, $precision:expr, $scale:expr, $dp_is_scalar:expr, $decimal_places:expr) => {
        if $dp_is_scalar {
            let new_scale = (*$scale).min($decimal_places.max(0) as i8);
            DataType::$DecimalTy(*$precision, new_scale)
        } else {
            DataType::$DecimalTy(*$precision, *$scale)
        }
    };
}

// Then in the match statement:
let return_type = match input_type {
    Float32 => Float32,
    Decimal32(precision, scale) => adjust_decimal_type!(Decimal32, precision, scale, dp_is_scalar, decimal_places),
    Decimal64(precision, scale) => adjust_decimal_type!(Decimal64, precision, scale, dp_is_scalar, decimal_places),
    Decimal128(precision, scale) => adjust_decimal_type!(Decimal128, precision, scale, dp_is_scalar, decimal_places),
    Decimal256(precision, scale) => adjust_decimal_type!(Decimal256, precision, scale, dp_is_scalar, decimal_places),
    _ => Float64,
};

This would make the code more concise and easier to maintain.

Owner Author

value:useful; category:bug; feedback:The Gemini AI reviewer is correct! Using a macro or a helper function would reduce the amount of duplicated code. Prevents maintaining duplicated code

Comment on lines 231 to 259
                 ScalarValue::Decimal128(Some(v), precision, scale) => {
-                    let rounded = round_decimal(*v, *scale, dp)?;
+                    // Reduce scale to reclaim integer precision
+                    let new_scale = (*scale).min(dp.max(0) as i8);
+                    let rounded = round_decimal(*v, *scale, new_scale, dp)?;
                     let scalar =
-                        ScalarValue::Decimal128(Some(rounded), *precision, *scale);
+                        ScalarValue::Decimal128(Some(rounded), *precision, new_scale);
                     Ok(ColumnarValue::Scalar(scalar))
                 }
                 ScalarValue::Decimal256(Some(v), precision, scale) => {
-                    let rounded = round_decimal(*v, *scale, dp)?;
+                    let new_scale = (*scale).min(dp.max(0) as i8);
+                    let rounded = round_decimal(*v, *scale, new_scale, dp)?;
                     let scalar =
-                        ScalarValue::Decimal256(Some(rounded), *precision, *scale);
+                        ScalarValue::Decimal256(Some(rounded), *precision, new_scale);
                     Ok(ColumnarValue::Scalar(scalar))
                 }
                 ScalarValue::Decimal64(Some(v), precision, scale) => {
-                    let rounded = round_decimal(*v, *scale, dp)?;
+                    let new_scale = (*scale).min(dp.max(0) as i8);
+                    let rounded = round_decimal(*v, *scale, new_scale, dp)?;
                     let scalar =
-                        ScalarValue::Decimal64(Some(rounded), *precision, *scale);
+                        ScalarValue::Decimal64(Some(rounded), *precision, new_scale);
                     Ok(ColumnarValue::Scalar(scalar))
                 }
                 ScalarValue::Decimal32(Some(v), precision, scale) => {
-                    let rounded = round_decimal(*v, *scale, dp)?;
+                    let new_scale = (*scale).min(dp.max(0) as i8);
+                    let rounded = round_decimal(*v, *scale, new_scale, dp)?;
                     let scalar =
-                        ScalarValue::Decimal32(Some(rounded), *precision, *scale);
+                        ScalarValue::Decimal32(Some(rounded), *precision, new_scale);
                     Ok(ColumnarValue::Scalar(scalar))
                 }

medium

Similar to another comment, there is a lot of duplicated logic for handling different decimal types here. This could be refactored to improve maintainability, for instance, by using a macro.

Here's a possible implementation:

macro_rules! round_decimal_scalar {
    ($v:expr, $precision:expr, $scale:expr, $dp:expr, $DecimalScalar:ident) => {{
        let new_scale = (*$scale).min($dp.max(0) as i8);
        let rounded = round_decimal(*$v, *$scale, new_scale, $dp)?;
        let scalar = ScalarValue::$DecimalScalar(Some(rounded), *$precision, new_scale);
        Ok(ColumnarValue::Scalar(scalar))
    }};
}

// Then in the match statement:
match value_scalar {
    // ...
    ScalarValue::Decimal128(Some(v), precision, scale) => {
        round_decimal_scalar!(v, precision, scale, dp, Decimal128)
    }
    ScalarValue::Decimal256(Some(v), precision, scale) => {
        round_decimal_scalar!(v, precision, scale, dp, Decimal256)
    }
    // ... and so on for other decimal types
}

Owner Author

value:useful; category:bug; feedback:The Gemini AI reviewer is correct! Using a macro or a helper function would reduce the amount of duplicated code. Prevents maintaining duplicated code

@augmentcode

augmentcode bot commented Jan 23, 2026

🤖 Augment PR Summary

Summary: Updates the round scalar UDF to avoid decimal overflow/truncation by reducing decimal scale when rounding with constant decimal_places.

Changes:

  • Implements return_field_from_args for round and computes a scale-adjusted decimal return type when decimal_places is a constant.
  • Updates scalar fast-path rounding for Decimal32/64/128/256 to emit values with the reduced output scale.
  • Plumbs the computed return_type into the columnar path so array rounding can build arrays with the adjusted scale.
  • Extends round_decimal to optionally keep original-scale behavior or return reduced-scale results (including negative decimal_places).
  • Updates existing SQLLogicTest expectations for the new decimal output types.
  • Adds SQLLogicTest coverage for carry-over rounding and rounding at max precision where overflow previously occurred.

Technical Notes: Scale is only reduced when decimal_places is a constant; if it varies per row, the original input scale is preserved.

@augmentcode augmentcode bot left a comment

Review completed. 2 suggestions posted.

-            | dt @ Decimal64(_, _) => dt,
+            Decimal32(precision, scale) => {
+                if dp_is_scalar {
+                    let new_scale = (*scale).min(decimal_places.max(0) as i8);

decimal_places.max(0) as i8 can wrap for large values (e.g. 1000 -> negative), potentially producing an invalid negative decimal scale; it seems safer to clamp/compare in i32 before converting to i8.

Other Locations
  • datafusion/functions/src/math/round.rs:156
  • datafusion/functions/src/math/round.rs:164
  • datafusion/functions/src/math/round.rs:172
  • datafusion/functions/src/math/round.rs:233
  • datafusion/functions/src/math/round.rs:240
  • datafusion/functions/src/math/round.rs:247
  • datafusion/functions/src/math/round.rs:254

Owner Author

value:useful; category:bug; feedback:The Augment AI reviewer is correct! The decimal_places is an input provided by the user, so it should not be trusted blindly. Every conversion to a smaller numeric type should be done in a checked way and too big values should be reported as errors instead of using the overflown value.

Ok(Arc::new(Field::new(
    self.name(),
    return_type,
    input_field.is_nullable(),

The returned field nullability only mirrors the first argument (input_field.is_nullable()), but round will also produce nulls when decimal_places is null/nullable; consider including nullability from all args.
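
One way to express "nullable if any argument is nullable" is sketched below. `ArgField` is a hypothetical stand-in so the example stays self-contained; the real code works with Arrow `Field` values and their `is_nullable()` method.

```rust
/// Hypothetical stand-in for an argument field.
struct ArgField {
    nullable: bool,
}

/// The result can be NULL whenever any input (the value or
/// `decimal_places`) can be NULL.
fn result_is_nullable(args: &[ArgField]) -> bool {
    args.iter().any(|f| f.nullable)
}
```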

Owner Author

value:useful; category:bug; feedback:The Augment AI reviewer is correct! decimal_places=NULL should lead to a NULL result too. This is how other DBs also implement it - Postgres, DuckDB, Apache Spark. Prevents surprising behavior from Apache DataFusion that is different than other similar products.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 3 potential issues.

if args.scalar_arguments.len() > 1 {
    match args.scalar_arguments[1] {
        Some(ScalarValue::Int32(Some(v))) => (*v, true),
        Some(ScalarValue::Int64(Some(v))) => (*v as i32, true),

Unsafe Int64 truncation in decimal_places extraction

Medium Severity

The scalar_arguments[1] extraction casts Int64 values to i32 with an unchecked as i32 cast, which silently truncates values outside the i32 range. For example, passing 5000000000i64 would be truncated to 705032704, causing an incorrect scale calculation in return_field_from_args. This should use .try_into() so that overflow is handled explicitly.

Owner Author

value:useful; category:bug; feedback:The Bugbot AI reviewer is correct! The decimal_places is an input provided by the user, so it should not be trusted blindly. Every conversion to a smaller numeric type should be done in a checked way and too big values should be reported as errors instead of using the overflown value.

    decimal_places: Option<ArrayRef>,
) -> Result<ArrayRef, DataFusionError> {
    let number_rows = value.len();
    let return_type = value.data_type().clone();

Test helper incorrectly uses input type as return

Medium Severity

The round_arrays test helper uses value.data_type().clone() as the return type, which is incorrect for decimal inputs where the scale should be reduced to match decimal_places. This works for float tests but would produce wrong results for decimal array tests. The helper should compute the correct return type by reducing the scale when decimal_places is a scalar constant.

Owner Author

value:good-to-have; category:bug; feedback:The Bugbot AI reviewer is correct! The return type can differ from the input type for decimal inputs whose scale is reduced to make room for a carry-over. Any decimal test that uses this helper will hit the issue. Prevents a possible test failure if such a test is introduced.

-            | dt @ Decimal64(_, _) => dt,
+            Decimal32(precision, scale) => {
+                if dp_is_scalar {
+                    let new_scale = (*scale).min(decimal_places.max(0) as i8);

Integer overflow in decimal_places cast to i8

High Severity

The code casts decimal_places.max(0) as i8 without bounds checking, causing integer wrapping when decimal_places > 127. For example, round(value, 200) would compute new_scale = scale.min(200 as i8) = scale.min(-56), producing a negative scale and creating invalid decimal types. This leads to incorrect results or runtime errors. The cast appears at lines 148, 156, 164, 172, 233, 240, 247, and 254.
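
The wrap described above is easy to reproduce in isolation. In this sketch, `naive_new_scale` copies the expression from the PR, while `clamped_new_scale` compares in `i32` before narrowing, in the spirit of the reviewers' suggestions. Both function names are hypothetical.

```rust
/// The expression from the PR: the cast happens before the comparison,
/// so values above 127 wrap to a negative number first.
fn naive_new_scale(scale: i8, decimal_places: i32) -> i8 {
    scale.min(decimal_places.max(0) as i8)
}

/// Compare in i32 first. The result never exceeds `scale`,
/// so the final narrowing cast cannot wrap.
fn clamped_new_scale(scale: i8, decimal_places: i32) -> i8 {
    i32::from(scale).min(decimal_places.max(0)) as i8
}
```

For in-range inputs the two agree; they only diverge once `decimal_places` exceeds `i8::MAX`. Whether such input should be clamped or rejected with an error is a separate design decision.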

Owner Author

value:useful; category:bug; feedback:The Bugbot AI reviewer is correct! The decimal_places is an input provided by the user, so it should not be trusted blindly. Every conversion to a smaller numeric type should be done in a checked way and too big values should be reported as errors instead of using the overflown value.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@datafusion/functions/src/math/round.rs`:
- Around line 140-177: The decimal branch logic for Decimal32/64/128/256
computes new_scale by casting decimal_places to i8 which can wrap; before
casting clamp decimal_places to the range [-128, 127] or better to the existing
*scale bounds so it cannot overflow (e.g., let clamped =
decimal_places.max(i32::from(-*scale)).min(i32::from(*scale)) or clamp to
i8::MIN..=i8::MAX) and then cast clamped as i8; update the new_scale computation
in the match arms for Decimal32, Decimal64, Decimal128 and Decimal256 and apply
the same clamp for the other similar block referenced in the comment (lines
~232-258).

Comment on lines +140 to +177
+        // Calculate return type based on input type
+        // For decimals: reduce scale to decimal_places (reclaims precision for integer part)
+        // This matches Spark/DuckDB behavior where ROUND adjusts the scale
+        // BUT only if dp is a constant - otherwise keep original scale
         let return_type = match input_type {
             Float32 => Float32,
-            dt @ Decimal128(_, _)
-            | dt @ Decimal256(_, _)
-            | dt @ Decimal32(_, _)
-            | dt @ Decimal64(_, _) => dt,
+            Decimal32(precision, scale) => {
+                if dp_is_scalar {
+                    let new_scale = (*scale).min(decimal_places.max(0) as i8);
+                    Decimal32(*precision, new_scale)
+                } else {
+                    Decimal32(*precision, *scale)
+                }
+            }
+            Decimal64(precision, scale) => {
+                if dp_is_scalar {
+                    let new_scale = (*scale).min(decimal_places.max(0) as i8);
+                    Decimal64(*precision, new_scale)
+                } else {
+                    Decimal64(*precision, *scale)
+                }
+            }
+            Decimal128(precision, scale) => {
+                if dp_is_scalar {
+                    let new_scale = (*scale).min(decimal_places.max(0) as i8);
+                    Decimal128(*precision, new_scale)
+                } else {
+                    Decimal128(*precision, *scale)
+                }
+            }
+            Decimal256(precision, scale) => {
+                if dp_is_scalar {
+                    let new_scale = (*scale).min(decimal_places.max(0) as i8);
+                    Decimal256(*precision, new_scale)
+                } else {
+                    Decimal256(*precision, *scale)
+                }
+            }

⚠️ Potential issue | 🟠 Major

Clamp decimal_places before casting to i8 to avoid scale wraparound.

Casting large positive decimal_places directly to i8 can wrap (e.g., 1000 → -24), yielding incorrect output scales/metadata. Use an i32 clamp against the existing scale and then cast.

🛠️ Proposed fix (apply to all decimal variants shown)
@@
-        let return_type = match input_type {
+        let dp_clamped = decimal_places.max(0);
+        let return_type = match input_type {
@@
-            Decimal32(precision, scale) => {
+            Decimal32(precision, scale) => {
                 if dp_is_scalar {
-                    let new_scale = (*scale).min(decimal_places.max(0) as i8);
+                    let new_scale = (*scale as i32).min(dp_clamped) as i8;
                     Decimal32(*precision, new_scale)
                 } else {
                     Decimal32(*precision, *scale)
                 }
             }
@@
-            Decimal64(precision, scale) => {
+            Decimal64(precision, scale) => {
                 if dp_is_scalar {
-                    let new_scale = (*scale).min(decimal_places.max(0) as i8);
+                    let new_scale = (*scale as i32).min(dp_clamped) as i8;
                     Decimal64(*precision, new_scale)
                 } else {
                     Decimal64(*precision, *scale)
                 }
             }
@@
-            Decimal128(precision, scale) => {
+            Decimal128(precision, scale) => {
                 if dp_is_scalar {
-                    let new_scale = (*scale).min(decimal_places.max(0) as i8);
+                    let new_scale = (*scale as i32).min(dp_clamped) as i8;
                     Decimal128(*precision, new_scale)
                 } else {
                     Decimal128(*precision, *scale)
                 }
             }
@@
-            Decimal256(precision, scale) => {
+            Decimal256(precision, scale) => {
                 if dp_is_scalar {
-                    let new_scale = (*scale).min(decimal_places.max(0) as i8);
+                    let new_scale = (*scale as i32).min(dp_clamped) as i8;
                     Decimal256(*precision, new_scale)
                 } else {
                     Decimal256(*precision, *scale)
                 }
             }
@@
-            let dp = if let ScalarValue::Int32(Some(dp)) = dp_scalar {
+            let dp = if let ScalarValue::Int32(Some(dp)) = dp_scalar {
                 *dp
             } else {
@@
-                ScalarValue::Decimal128(Some(v), precision, scale) => {
+                ScalarValue::Decimal128(Some(v), precision, scale) => {
                     // Reduce scale to reclaim integer precision
-                    let new_scale = (*scale).min(dp.max(0) as i8);
+                    let new_scale = (*scale as i32).min(dp.max(0)) as i8;
                     let rounded = round_decimal(*v, *scale, new_scale, dp)?;
                     let scalar =
                         ScalarValue::Decimal128(Some(rounded), *precision, new_scale);
                     Ok(ColumnarValue::Scalar(scalar))
                 }
@@
-                ScalarValue::Decimal256(Some(v), precision, scale) => {
-                    let new_scale = (*scale).min(dp.max(0) as i8);
+                ScalarValue::Decimal256(Some(v), precision, scale) => {
+                    let new_scale = (*scale as i32).min(dp.max(0)) as i8;
                     let rounded = round_decimal(*v, *scale, new_scale, dp)?;
                     let scalar =
                         ScalarValue::Decimal256(Some(rounded), *precision, new_scale);
                     Ok(ColumnarValue::Scalar(scalar))
                 }
@@
-                ScalarValue::Decimal64(Some(v), precision, scale) => {
-                    let new_scale = (*scale).min(dp.max(0) as i8);
+                ScalarValue::Decimal64(Some(v), precision, scale) => {
+                    let new_scale = (*scale as i32).min(dp.max(0)) as i8;
                     let rounded = round_decimal(*v, *scale, new_scale, dp)?;
                     let scalar =
                         ScalarValue::Decimal64(Some(rounded), *precision, new_scale);
                     Ok(ColumnarValue::Scalar(scalar))
                 }
@@
-                ScalarValue::Decimal32(Some(v), precision, scale) => {
-                    let new_scale = (*scale).min(dp.max(0) as i8);
+                ScalarValue::Decimal32(Some(v), precision, scale) => {
+                    let new_scale = (*scale as i32).min(dp.max(0)) as i8;
                     let rounded = round_decimal(*v, *scale, new_scale, dp)?;
                     let scalar =
                         ScalarValue::Decimal32(Some(rounded), *precision, new_scale);
                     Ok(ColumnarValue::Scalar(scalar))
                 }

Also applies to: 232-258

Owner Author

@martin-augment martin-augment Jan 23, 2026

value:useful; category:bug; feedback:The CodeRabbit AI reviewer is correct! The decimal_places is an input provided by the user, so it should not be trusted blindly. Every conversion to a smaller numeric type should be done in a checked way and too big values should be reported as errors instead of using the overflown value.

@martin-augment
Owner Author

1. Int64 to i32 Cast May Overflow (round.rs:133)

Some(ScalarValue::Int64(Some(v))) => (*v as i32, true),

Issue: If v > i32::MAX or v < i32::MIN, this will silently wrap/truncate.

Recommendation: Use checked conversion:

Some(ScalarValue::Int64(Some(v))) => {
    if let Ok(v_i32) = i32::try_from(*v) {
        (v_i32, true)
    } else {
        (0, false) // or return an error
    }
}

value:useful; category:bug; feedback:The Claude AI reviewer is correct! The decimal_places is an input provided by the user, so it should not be trusted blindly. Every conversion to a smaller numeric type should be done in a checked way and too big values should be reported as errors instead of using the overflown value.

@martin-augment
Owner Author

2. Inconsistent Null Handling (round.rs:134)

When decimal_places is null, the code treats it as a non-scalar and keeps the original scale. This seems correct but might be surprising behavior.

Question: Should round(123.456, NULL) return an error or NULL instead?

value:useful; category:bug; feedback:The Claude AI reviewer is correct! decimal_places=NULL should lead to a NULL result too. This is how other DBs also implement it - Postgres, DuckDB, Apache Spark. Prevents surprising behavior from Apache DataFusion that is different than other similar products.
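
The NULL-in, NULL-out behaviour described here can be sketched with `Option`. This is a simplified float-only illustration under assumed names, not DataFusion's implementation: `None` stands in for SQL NULL, and `round_nullable` is hypothetical.

```rust
/// Hypothetical sketch: if either the value or `decimal_places` is NULL
/// (`None`), the result is NULL.
fn round_nullable(value: Option<f64>, decimal_places: Option<i32>) -> Option<f64> {
    // `?` returns `None` as soon as either input is missing.
    let (v, dp) = (value?, decimal_places?);
    let factor = 10f64.powi(dp);
    Some((v * factor).round() / factor)
}
```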
