support cast decimal to signed numeric#1073
Conversation
…d decimal to signed numeric type support decimal to unsigned numeric
d17180f to
070bb40
Compare
|
@alamb PTAL |
Codecov Report
@@ Coverage Diff @@
## master #1073 +/- ##
==========================================
+ Coverage 82.27% 82.31% +0.03%
==========================================
Files 168 168
Lines 49281 49376 +95
==========================================
+ Hits 40547 40643 +96
+ Misses 8734 8733 -1
Continue to review full report at Codecov.
|
alamb
left a comment
There was a problem hiding this comment.
Thanks @liukun4515 -- looks good
I also recommend we add some Decimal cases to get_all_types in https://github.com/liukun4515/arrow-rs/blob/support_decimal_to_signed_numeric/arrow/src/compute/kernels/cast.rs#L4350
Perhaps as a follow on PR
| // check the overflow | ||
| // For example: Decimal(128,10,0) as i8 | ||
| // 128 is out of range i8 | ||
| if v <= max_bound && v >= min_bound { |
| let div: i128 = 10_i128.pow(*$SCALE as u32); | ||
| let min_bound = ($NATIVE_TYPE::MIN) as i128; | ||
| let max_bound = ($NATIVE_TYPE::MAX) as i128; | ||
| for i in 0..array.len() { |
There was a problem hiding this comment.
FWIW as a performance optimization in the future, we might be able to avoid the use of a builder and create the arrays directly
So something like
let new_array: Int8Array = array.iter()
.map(|v| v.map(|v| {
let v = v / div;
if v <= max_bound && v >= min_bound {
...
}
}).collect()?;
Or something
There was a problem hiding this comment.
add follow-up issue to track this enhancement.
#1083
| } | ||
|
|
||
| // TODO remove this function if the decimal array has the creator function | ||
| fn create_decimal_array( |
arrow/src/compute/kernels/cast.rs
Outdated
| use num::traits::Pow; | ||
|
|
||
| macro_rules! generate_cast_test_case { | ||
| ($INPUT_ARRAY: expr, $INPUT_ARRAY_TYPE: expr, $OUTPUT_TYPE_ARRAY: ident, $OUTPUT_TYPE: expr, $OUTPUT_VALUES: expr) => { |
There was a problem hiding this comment.
I think you could avoid having to pass in $INPUT_ARRAY_TYPE by using the data_type() function
like
let input_array_type = $INPUT_ARRAY.data_type()| vec![Some(1_i64), Some(2_i64), Some(3_i64), None, Some(5_i64)] | ||
| ); | ||
|
|
||
| // overflow test: out of range of max i8 |
| Float32Array, | ||
| &DataType::Float32, | ||
| vec![ | ||
| Some(1.25_f32), |
| let div: i128 = 10_i128.pow(*$SCALE as u32); | ||
| let min_bound = ($NATIVE_TYPE::MIN) as i128; | ||
| let max_bound = ($NATIVE_TYPE::MAX) as i128; | ||
| for i in 0..array.len() { |
There was a problem hiding this comment.
In general I think the use of Builders is not as fast as constructing arrays using FromIter as the bounds are checked on each element
In the following case, if we could structure the code in the following way
let output_array: $OUTPUT_ARRAY_TYPE = array
.iter()
.map(|v| {
v.map(|v| {
// scale the value v
})
}).collect()It may be significantly faster.
Perhaps a good follow on PR
There was a problem hiding this comment.
Thanks.
I will do this in the follow-up PR.
Maybe I need also to add some micro benchmark to get the result of performance improvement.
|
It appears this PR has a clippy error as well now |
| | Dictionary(_, _), | ||
| Null, | ||
| ) => true, | ||
| (Decimal(_, _), _) => false, |
There was a problem hiding this comment.
All data types expect above signed numeric data type, the result will be false with decimal.
| None, | ||
| Some(525), | ||
| Some(112345678), | ||
| Some(112345679), |
There was a problem hiding this comment.
this case is used to test the float32 excessive precision
| Some(325), | ||
| None, | ||
| Some(525), | ||
| Some(112345678901234568), |
There was a problem hiding this comment.
This case is used to test float64 excessive precision
* add cast test macro function; refactor other type to decimal type; add decimal to signed numeric type support decimal to unsigned numeric * address the comments and fix the clippy
* add cast test macro function; refactor other type to decimal type; add decimal to signed numeric type support decimal to unsigned numeric * address the comments and fix the clippy Co-authored-by: Kun Liu <[email protected]>
Which issue does this PR close?
part of #1043
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?