fix: decimal conversion loses value on lower precision #6836
tustvold merged 3 commits into apache:main
Conversation
@andygrove @viirya @alamb please take a look.
andygrove
left a comment
LGTM. The logic now matches the logic in `cast_floating_point_to_decimal128`. Thanks @himadripal
There is a regression in:

```rust
let array = vec![Some(i128::MAX), Some(i128::MIN)];
let input_decimal_array = create_decimal_array(array, 10, 3).unwrap();
```

I would expect this to fail, so we probably need to add the same validation there.
On second thought, it seems it was an intentional design decision not to validate this on array creation. Instead, an …
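To illustrate why the array in the snippet above is out of range: a `Decimal128(10, 3)` value can only hold 10 significant digits, while `i128::MAX` has 39. A minimal standalone sketch of that fits-in-precision check (not the arrow-rs API; `fits_precision` is a hypothetical helper):

```rust
// A value fits Decimal128(precision, _scale) iff |value| < 10^precision.
// Hypothetical helper for illustration, not part of arrow-rs.
fn fits_precision(value: i128, precision: u32) -> bool {
    // 10^precision fits in i128 for all Decimal128 precisions (<= 38).
    let max = 10_i128.pow(precision);
    value > -max && value < max
}

fn main() {
    // i128::MAX has 39 digits, far beyond what precision 10 allows.
    assert!(!fits_precision(i128::MAX, 10));
    // The largest 10-digit value is fine.
    assert!(fits_precision(9_999_999_999, 10));
    println!("ok");
}
```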
tustvold
left a comment
Have we run any benchmarks (I'm not sure any actually exist) to confirm this doesn't significantly regress performance?
It seems unfortunate to always perform overflow checks when, in many cases, it should be possible to prove that precision overflow can't occur and so need not be checked for.
I'll create a separate PR (probably tomorrow) to add some criterion benchmarks.
Changed the test to pass.
Can anyone please let the build run? The workflows are waiting for approval.
I created a simple benchmark for decimal casting in #6850. Unsurprisingly, validating that the results are correct is slower than not validating them. (benchmark table: before vs. after this PR) We currently have the `CastOptions` config option. So, yes, it is a performance regression, but the previous behavior was incorrect; this PR now makes this work as advertised. @tustvold Is there a use case we need to support for faster casts without validating results, per the `CastOptions`?
No, the cast should be checked; apologies if that wasn't clear. My concern was that the PR as originally formulated blindly performed the checked conversion regardless of the input type, even when the cast was increasing the precision. Given that the whole purpose of tracking precision is to avoid overflow checks, that seemed a little off. I'll try to find some time to take another look at this PR; from a quick scan, this looks to have been addressed.
tustvold
left a comment
This is still overly pessimistic, for example, if I increase both the precision and scale by the same amount I shouldn't need to perform any checks.
However, the kernel is already fallible (`try_unary`/`unary_opt` vs `unary`), so it already won't vectorize properly; it isn't as though we're regressing a highly optimised kernel here.
I've therefore instead opted to file #6877 and if people care they can action it.
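The "overly pessimistic" observation above can be made precise: a `Decimal(p1, s1)` to `Decimal(p2, s2)` cast can never overflow as long as the target keeps at least as many integer digits as the source. A minimal sketch of that rule (`cast_is_infallible` is a hypothetical helper, not the arrow-rs API, and assumes plain rescale-by-power-of-ten semantics):

```rust
// Rescaling multiplies or divides by 10^(s2 - s1), so the result has at
// most p1 + (s2 - s1) digits. It is guaranteed to fit in p2 digits iff
// p2 - s2 >= p1 - s1, i.e. the integer-digit capacity does not shrink.
fn cast_is_infallible(p1: u8, s1: i8, p2: u8, s2: i8) -> bool {
    (p2 as i16 - s2 as i16) >= (p1 as i16 - s1 as i16)
}

fn main() {
    // Increasing precision and scale by the same amount: no check needed.
    assert!(cast_is_infallible(10, 3, 12, 5));
    // Lowering precision at the same scale can overflow, so it must be checked.
    assert!(!cast_is_infallible(10, 3, 8, 3));
    println!("ok");
}
```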
* decimal conversion loses value on lower precision, throws error now on overflow.
* fix review comments and fix formatting.
* for simple case of equal scale and bigger precision, no conversion needed. revert whitespace changes formatting check

Co-authored-by: Himadri Pal <[email protected]>
Co-authored-by: himadripal <[email protected]>
Which issue does this PR close?
Closes #6833
Rationale for this change
Casting Decimal128 to Decimal128 with a smaller precision produces incorrect results in some cases.
What changes are included in this PR?
It adds decimal validation after the conversion, to check whether the converted result fits into the specified precision and scale.
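The shape of the fix can be sketched in a few lines (this is a hypothetical simplification, not the actual arrow-rs kernel: `cast_decimal128` is an invented helper, it uses truncating division where the real kernel rounds, and ignores `CastOptions`):

```rust
// Rescale a Decimal128 value from scale s1 to (p2, s2), then validate
// that the result fits the target precision instead of silently
// producing a truncated, incorrect value.
fn cast_decimal128(value: i128, s1: i8, p2: u8, s2: i8) -> Result<i128, String> {
    let delta = s2 as i32 - s1 as i32;
    let rescaled = if delta >= 0 {
        // Upscaling can overflow i128 itself, hence checked_mul.
        value.checked_mul(10_i128.pow(delta as u32))
    } else {
        // Downscaling: truncating division for brevity (the real kernel rounds).
        Some(value / 10_i128.pow((-delta) as u32))
    }
    .ok_or_else(|| "overflow while rescaling".to_string())?;

    // The post-conversion validation this PR adds: |result| < 10^p2.
    let max = 10_i128.pow(p2 as u32);
    if rescaled <= -max || rescaled >= max {
        return Err(format!("{rescaled} does not fit precision {p2}"));
    }
    Ok(rescaled)
}

fn main() {
    // 123.456 at scale 3 -> precision 4, scale 1: 123.4, stored as 1234.
    assert_eq!(cast_decimal128(123_456, 3, 4, 1), Ok(1234));
    // 123.456 -> precision 2, scale 1 would need 4 digits: now an error.
    assert!(cast_decimal128(123_456, 3, 2, 1).is_err());
    println!("ok");
}
```

Previously the overflowing case fell through silently; after this change it surfaces as an error.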
Are there any user-facing changes?