-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Feat: Allow pow with negative & non-integer exponent on decimals #19369
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@Jefffrey could you review this ! |
1ccabc0 to
029e2bd
Compare
| decimal_from_i128::<T>(result_rounded as i128) | ||
| } | ||
|
|
||
| fn decimal_from_i128<T>(value: i128) -> Result<T, ArrowError> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intentionally not supporting Decimal256 (i.e. i256) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
practically the f64 computation already loses precision past ~10^15 significant digits , isn't it ??
That's why I thought it isn't needed..
If you say then I can add that too
like add a Decimal256 path.
029e2bd to
d0193a1
Compare
Jefffrey
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do wonder, with how many different paths we're introducing here to account for all edge cases, how other systems handle doing power for numeric/decimals 🤔
Do they just cast to float anyway or they have similar complexity of checks
| if value >= i32::MIN as i128 && value <= i32::MAX as i128 { | ||
| return Ok(T::from(value as i32)); | ||
| } | ||
|
|
||
| let is_negative = value < 0; | ||
| let abs_value = value.unsigned_abs(); | ||
|
|
||
| let billion = 1_000_000_000u128; | ||
| let mut result = T::from(0); | ||
| let mut multiplier = T::from(1); | ||
| let billion_t = T::from(1_000_000_000); | ||
|
|
||
| let mut remaining = abs_value; | ||
| while remaining > 0 { | ||
| let chunk = (remaining % billion) as i32; | ||
| remaining /= billion; | ||
|
|
||
| let chunk_value = T::from(chunk).mul_checked(multiplier).map_err(|_| { | ||
| ArrowError::ArithmeticOverflow(format!( | ||
| "Overflow while converting {value} to decimal type" | ||
| )) | ||
| })?; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really not quite sure what is happening here 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
converting an i128 result back to the generic type T (which could be i32, i64, i128, or i256). Since we can't directly convert i128 → T for all types, so we split the i128 into chunks of 10^9 (a billion), which fits in i32. Then we reconstruct T by: chunk[n] * (10^9)^n + chunk[n-1] * (10^9)^(n-1) + and so on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer a simplified implementation that has no need for this manual conversion from i128 back to i32/i64
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In PostgreSQL they cast decimal to float64, compute pow(), cast back. They accept the precision loss. Also in MySQL same approach - converts DECIMAL to DOUBLE for POW(). |
4dda5d8 to
9f79025
Compare
|
Hey @Jefffrey could you review this !! |
9f79025 to
d1dde6d
Compare
Jefffrey
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seeing all this logic introduced, I'm beginning to question whether there is actual benefit to having a native log implementation 🤔
Perhaps we should just revert to casting it to float and accept the accuracy loss
Thoughts @theirix ?
| "Cannot make unscale factor for {scale} and {exp}" | ||
| )) | ||
| }) | ||
| <T as From<i32>>::from(10) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks a little ugly, I recommend checking the existing traits to see if they have what we already need.
See:
Fair enough, the logic becomes more convoluted. The original idea was to introduce common decimal operations. Scale-preserving operations like abs, round, gcd, etc., are easy to implement and support. Some other operations with a natural mapping to decimals (like log10, pow10) adjust scales and do not have a natural analogue in the arrow buffer, leading to more complex logic. These operations are typical for data analytics, and applications could benefit from them. So ten-based operations can be calculated precisely, while for the rest and for more complicated operations, of course, it is fine to lose precision using a native float implementation. First, we should reuse the arrow's foundational primitives as much as possible. If there is an Second, I believe more logic should be isolated in |
That makes sense. I guess what we could also do to alleviate this complexity (and ensure less performance impact) would be:
This can be done in followup PRs of course but at least sets a roadmap for us. |
Sounds like a plan. The routing should be better made based on the type signature, rather than at eval time.
|
d1dde6d to
5015797
Compare
…pow with array exponents
Which issue does this PR close?
Closes #19348
Rationale for this change
Previously, pow() on decimal types would error for negative exponents and non-integer exponents with messages like:
What changes are included in this PR?
Added pow_decimal_float_fallback function that:
Are these changes tested?
Yes:
Unit tests for pow_decimal_float_fallback covering negative exponents, fractional exponents, cube roots
Updated SQL logic tests in decimal.slt
Are there any user-facing changes?
Yes. The following queries now work instead of returning errors: