19369: Feat: Allow pow with negative & non-integer exponent on decimals#130
19369: Feat: Allow pow with negative & non-integer exponent on decimals#130martin-augment wants to merge 1 commit intomainfrom
Conversation
WalkthroughThe changes modify the decimal power function implementation to support negative and non-integer exponents. The ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Pull Request Review: Allow pow with negative & non-integer exponent on decimalsSummaryThis PR extends the 🔴 Critical Issues1. Type Conversion Security Risk (power.rs:178-183)let base_f64 = format!("{base:?}")
.parse::<f64>()Issue: Using Impact: Could cause runtime errors or incorrect results for certain decimal values. Recommendation: Use direct numeric conversion instead: // For i128/i64/i32 types, convert directly
let base_f64 = base.as_i128() as f64 / scale_factor;
// Or use ToPrimitive trait if available2. Precision Loss Not Documented (power.rs:171-203)Issue: The fallback function converts decimals to f64, performs the calculation, and converts back. This can lead to:
Impact: Users may get unexpected results when using negative/fractional exponents with high-precision decimals. Recommendation:
3. Overflow Check Incomplete (power.rs:196)if result_rounded.abs() > i128::MAX as f64 {Issue: This check has a subtle bug. When converting Recommendation: // Use a safer threshold that accounts for f64 precision
const I128_MAX_AS_F64: f64 = 1.7014118346046923e38; // Slightly under i128::MAX
if result_rounded.abs() >= I128_MAX_AS_F64 {
|
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
datafusion/functions/src/math/power.rs(3 hunks)datafusion/sqllogictest/test_files/decimal.slt(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Cursor Bugbot
- GitHub Check: Analyze (rust)
🔇 Additional comments (8)
datafusion/functions/src/math/power.rs (4)
119-121: LGTM! Clean delegation for negative exponents.The logic correctly identifies that negative exponents require float-based computation since they produce fractional results that the integer path cannot handle.
158-169: LGTM! Well-structured fast-path and error handling.The function appropriately routes small non-negative integer exponents to the fast integer path, handles non-finite exponents with a clear error, and delegates other cases to the new fallback implementation.
488-520: LGTM! Comprehensive test coverage for the fallback function.The tests effectively cover the key scenarios:
- Negative exponents producing fractional results
- Non-integer exponents (square/cube roots)
- Negative bases with integer exponents
- Verification that positive integer exponents use the fast path
- Error handling for non-finite exponents (NaN, Infinity)
205-260: Decimal256Type overflow handling is adequate.The chunked accumulation approach using
mul_checked()on lines 230-247 properly detects overflow for all decimal types, includingDecimal256Type. While the billion-based chunking could theoretically be optimized with a larger chunk size for wider types, this is not necessary given the existing overflow checks are sufficient.datafusion/sqllogictest/test_files/decimal.slt (4)
957-967: LGTM! Negative exponent tests validate the new functionality.The tests verify that negative exponents now work via the float fallback path:
- Line 961:
2^-5 = 0.03125correctly rounds to0with scale 0- Line 966:
4^-1 = 0.25is precisely represented with scale 5The tests confirm both precision behavior and type preservation (returning Decimal128).
987-991: LGTM! Non-integer exponent test validates the fallback path.The test correctly expects
2.5^4.2 ≈ 46.87to round to46.9with scale 1. This confirms that non-integer exponents now work via the float fallback.
993-997: LGTM! Error messages updated consistently with implementation.The error messages now describe non-finite exponents as "Cannot use non-finite exp: NaN"/"inf", matching the implementation in
pow_decimal_float(line 164 of power.rs).
999-1005: LGTM! Large exponent tests validate path routing and error handling.The tests correctly distinguish between:
- Line 1000: Float exponent
5000000000.1uses fallback path → infinity → overflow error- Line 1004: Integer exponent
5000000000exceedsu32::MAX→ integer path → unsupported errorThe comments clearly explain the different code paths taken.
| let base_f64 = format!("{base:?}") | ||
| .parse::<f64>() | ||
| .map(|v| v / scale_factor) | ||
| .map_err(|_| { | ||
| ArrowError::ComputeError(format!("Cannot convert base {base:?} to f64")) | ||
| })?; |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# Find and read the power.rs file
find . -name "power.rs" -path "*/datafusion/*" -type f | head -5Repository: martin-augment/datafusion
Length of output: 109
🏁 Script executed:
cat -n ./datafusion/functions/src/math/power.rs | sed -n '165,210p'Repository: martin-augment/datafusion
Length of output: 1744
🏁 Script executed:
# Search for ArrowNativeTypeOp trait definition to understand available methods
rg -n "trait ArrowNativeTypeOp" --type=rust -A 20 | head -80Repository: martin-augment/datafusion
Length of output: 51
🏁 Script executed:
# Search for ArrowNativeTypeOp import or trait definition more broadly
rg "ArrowNativeTypeOp" --type=rust -B 3 -A 10 | head -100Repository: martin-augment/datafusion
Length of output: 8812
🏁 Script executed:
# Check for i32, i64, i128 Debug output to understand what the format!("{base:?}") produces
# Also look for as_i128 or similar conversion methods in the codebase
rg "as_i128|to_i128|into.*i128" --type=rust datafusion/functions/src/math/ -B 2 -A 2 | head -50Repository: martin-augment/datafusion
Length of output: 1261
🏁 Script executed:
# Look at the utils module to see what conversion functions are available
rg "decimal128_to_i128|decimal.*to_i128" --type=rust -B 5 -A 10 | head -80Repository: martin-augment/datafusion
Length of output: 5235
🏁 Script executed:
# Look at the full pow_decimal_float_fallback function and surrounding code to understand the context better
cat -n ./datafusion/functions/src/math/power.rs | sed -n '140,230p'Repository: martin-augment/datafusion
Length of output: 3541
🏁 Script executed:
# Check what types are passed to pow_decimal_float_fallback by looking at where it's called
rg "pow_decimal_float" --type=rust -B 5 -A 2 datafusion/functions/src/math/power.rs | head -100Repository: martin-augment/datafusion
Length of output: 3423
🏁 Script executed:
# Check i256 type and how it's handled in Arrow
rg "i256" --type=rust -B 2 -A 5 datafusion/functions/src/math/power.rs | head -50Repository: martin-augment/datafusion
Length of output: 51
🏁 Script executed:
# Search for i256 in the codebase to understand how it's used
rg "i256" --type=rust -B 2 -A 2 | head -80Repository: martin-augment/datafusion
Length of output: 6872
🏁 Script executed:
# Check how i256 is converted to/from other types
rg "from_i128|to_i128" --type=rust -B 2 -A 2 | grep -A 3 -B 3 "i256" | head -60Repository: martin-augment/datafusion
Length of output: 5403
🏁 Script executed:
# Let's check what debug output actually looks like for these types by examining test cases
cat -n ./datafusion/functions/src/math/power.rs | sed -n '240,300p'Repository: martin-augment/datafusion
Length of output: 2272
🏁 Script executed:
# Test what Debug output looks like for i128 and i32
cat > /tmp/test_debug.rs << 'EOF'
fn main() {
let vals: &[i128] = &[123, 456, -789, 0, i128::MAX];
for v in vals {
println!("Debug: {v:?}, Parsed: {}", format!("{v:?}").parse::<f64>().unwrap_or(f64::NAN));
}
}
EOF
rustc /tmp/test_debug.rs -o /tmp/test_debug 2>/dev/null && /tmp/test_debug || echo "Rust not available"Repository: martin-augment/datafusion
Length of output: 87
🏁 Script executed:
# Look at the complete decimal_from_i128 function and understand the flow
cat -n ./datafusion/functions/src/math/power.rs | sed -n '195,260p'Repository: martin-augment/datafusion
Length of output: 2415
🏁 Script executed:
# Final check: Look at what types are supported and how i256 is used
cat -n ./datafusion/functions/src/math/power.rs | sed -n '1,50p'Repository: martin-augment/datafusion
Length of output: 2406
Critical: Overflow check uses hardcoded i128::MAX for all decimal types.
Line 196 checks result_rounded.abs() > i128::MAX as f64 regardless of whether the target type T is i32 (Decimal32), i64 (Decimal64), i128 (Decimal128), or i256 (Decimal256). This allows overflow for smaller types and is too restrictive for Decimal256.
For example:
- Decimal32 accepts values up to i128::MAX but can only store up to i32::MAX (~2.1 billion)
- Decimal64 accepts values up to i128::MAX but can only store up to i64::MAX (~9.2 quintillion)
- Decimal256 cannot store values larger than i128::MAX even though it should support larger values
The overflow check should be type-aware. The downstream decimal_from_i128::<T>() function (lines 205-259) already performs proper overflow validation using mul_checked() and add_checked() for the specific type T, so the early check at line 196 is both redundant and incorrect.
Consider removing the premature overflow check at line 196 and relying on the type-specific overflow handling in decimal_from_i128::<T>().
There was a problem hiding this comment.
value:useful; category:bug; feedback:The CodeRabbit AI reviewer is correct! The implementation supports values up to i128::MAX and this makes Decimal256 as powerful as Decimal128. This should be either improved or documented as a known limitation.
| if result_rounded.abs() > i128::MAX as f64 { | ||
| return Err(ArrowError::ArithmeticOverflow(format!( | ||
| "Unsupported exp value: {exp}" | ||
| "Result {result_rounded} is too large for the target decimal type" | ||
| ))); | ||
| } |
There was a problem hiding this comment.
Major: Overflow check doesn't account for target decimal type size.
The overflow check compares against i128::MAX, but the target type T varies:
Decimal32Typeusesi32(max ~2.1 billion)Decimal64Typeusesi64(max ~9.2 quintillion)Decimal128Typeusesi128Decimal256Typeusesi256
This means:
- For Decimal32/64, values that pass this check could still overflow when converted to
T - For Decimal256, this check is too restrictive and rejects valid values
Apply this pattern to check against the actual type bounds:
- if result_rounded.abs() > i128::MAX as f64 {
+ // Check against the actual bounds of type T
+ // This requires knowing T's range; consider adding a trait method
+ // or using type-specific validation
+ let result_i128 = result_rounded as i128;
+
+ // Try the conversion and let decimal_from_i128 handle overflow
+ decimal_from_i128::<T>(result_i128)Alternatively, rely on decimal_from_i128 to perform the overflow checking, and remove this preliminary check since it will be caught anyway.
Verify the native type sizes and bounds:
#!/bin/bash
# Find the native type definitions for decimal types
ast-grep --pattern 'type Native = $_;' | rg -C2 'Decimal(32|64|128|256)'There was a problem hiding this comment.
value:useful; category:bug; feedback:The CodeRabbit AI reviewer is correct! The implementation supports values up to i128::MAX and this makes Decimal256 as powerful as Decimal128. This should be either improved or documented as a known limitation.
🤖 Augment PR SummarySummary: Extends decimal support for Changes:
Technical Notes: The fallback errors on non-finite exponents and on non-finite results (e.g., huge exponents producing infinity), and preserves the input decimal scale in outputs where possible. 🤖 Was this summary useful? React with 👍 or 👎 |
| where | ||
| T: From<i32> + ArrowNativeTypeOp, | ||
| { | ||
| let scale_factor = 10f64.powi(scale as i32); |
There was a problem hiding this comment.
pow_decimal_int still treats negative scale as not supported, but pow_decimal_float_fallback will currently proceed with negative scale (since it uses powi(scale as i32)). That can make power succeed for some negative-scale decimals where other exponent paths still error, so it’s worth confirming the intended consistency.
🤖 Was this useful? React with 👍 or 👎
There was a problem hiding this comment.
value:useful; category:bug; feedback:The Augment AI reviewer is correct! The Pull Request description and new unit/SLT tests talk only about negative exponent support but not about negative scale. A check for negative scale should be either added or new unit tests should be added.
| T: From<i32> + ArrowNativeTypeOp, | ||
| { | ||
| let scale_factor = 10f64.powi(scale as i32); | ||
| let base_f64 = format!("{base:?}") |
There was a problem hiding this comment.
Converting base to f64 via format!("{base:?}").parse() is potentially brittle for non-primitive decimal backing types (e.g., Decimal256), since Debug formatting isn’t guaranteed to be a plain numeric string. If Debug ever changes or includes type wrappers, this fallback path would start failing at runtime.
🤖 Was this useful? React with 👍 or 👎
There was a problem hiding this comment.
value:annoying; category:bug; feedback:The Augment AI reviewer is not correct! arrow-rs project provides the arrow-buffer crate that defines the i256 type. It's Debug and Display implementations are the same! And it also provides FromStr trait implementation that is used by str::parse::() - https://github.com/apache/arrow-rs/blob/6b290d192e2372f4bcbf7da91b4bd85dac43e4d5/arrow-buffer/src/bigint/mod.rs#L67-L79. There are unit tests confirming this - https://github.com/apache/arrow-rs/blob/6b290d192e2372f4bcbf7da91b4bd85dac43e4d5/arrow-buffer/src/bigint/mod.rs#L1332-L1381
| if exp < 0f64 || exp >= u32::MAX as f64 { | ||
|
|
||
| let result_scaled = result_f64 * scale_factor; | ||
| let result_rounded = result_scaled.round(); |
There was a problem hiding this comment.
The fallback rescales with round(), which can produce results that differ materially from the existing integer-exponent path’s truncation behavior (e.g., scale 0 cases like 2^-1 would round to 1). It may be worth clarifying/aligning the intended rounding semantics for decimal power when using the float fallback.
🤖 Was this useful? React with 👍 or 👎
There was a problem hiding this comment.
value:good-to-have; category:bug; feedback:The Augment AI reviewer is correct! The rounding loses precision and it should not be applied.
| .map(|v| v / scale_factor) | ||
| .map_err(|_| { | ||
| ArrowError::ComputeError(format!("Cannot convert base {base:?} to f64")) | ||
| })?; |
There was a problem hiding this comment.
Bug: Decimal256 fallback fails due to Debug format parsing
The pow_decimal_float_fallback function uses format!("{base:?}") (Debug format) to convert the base value to a string for parsing as f64. For primitive types like i32, i64, and i128, Debug outputs a parseable number. However, for i256 (used by Decimal256), the derived Debug format outputs a struct representation like i256 { low: X, high: Y } which cannot be parsed as f64. This causes Decimal256 with negative or non-integer exponents to always fail with an unhelpful "Cannot convert base" error, even though the PR intends to add support for these cases. The similar log_decimal256 function in log.rs handles this correctly by using to_i128() conversion instead.
There was a problem hiding this comment.
value:annoying; category:bug; feedback:The Bugbot AI reviewer is not correct! arrow-rs project provides the arrow-buffer crate that defines the i256 type. It's Debug and Display implementations are the same! And it also provides FromStr trait implementation that is used by str::parse::() - https://github.com/apache/arrow-rs/blob/6b290d192e2372f4bcbf7da91b4bd85dac43e4d5/arrow-buffer/src/bigint/mod.rs#L67-L79. There are unit tests confirming this - https://github.com/apache/arrow-rs/blob/6b290d192e2372f4bcbf7da91b4bd85dac43e4d5/arrow-buffer/src/bigint/mod.rs#L1332-L1381
value:annoying; category:bug; feedback:The Claude AI reviewer is not correct! arrow-rs project provides the arrow-buffer crate that defines the i256 type. It's Debug and Display implementations are the same! And it also provides FromStr trait implementation that is used by str::parse::() - https://github.com/apache/arrow-rs/blob/6b290d192e2372f4bcbf7da91b4bd85dac43e4d5/arrow-buffer/src/bigint/mod.rs#L67-L79. There are unit tests confirming this - https://github.com/apache/arrow-rs/blob/6b290d192e2372f4bcbf7da91b4bd85dac43e4d5/arrow-buffer/src/bigint/mod.rs#L1332-L1381 |
value:useful; category:bug; feedback:The Claude AI reviewer is correct! The implementation supports values up to i128::MAX and this makes Decimal256 as powerful as Decimal128. This should be either improved or documented as a known limitation. |
19369: To review by AI