feat: Support Timestamp +/- Interval types#4038
Conversation
There was a problem hiding this comment.
PR looks neat! The PR performs arithmetic with respect to the UTC epoch, and it lacks timezone support on output. You can utilize with_timezone_opt() on output to support timezone information.
Maybe you can check the discussion under apache/datafusion#5764.
arrow-array/src/types.rs
Outdated
| .ok_or_else(|| ArrowError::ComputeError("Timestamp out of range".to_string())) | ||
| } | ||
|
|
||
| /// Adds the given IntervalMonthDayNanoType to an arrow TimestaTimestampMillisecondTypempSecondType |
There was a problem hiding this comment.
| /// Adds the given IntervalMonthDayNanoType to an arrow TimestaTimestampMillisecondTypempSecondType | |
| /// Adds the given IntervalMonthDayNanoType to an arrow TimestampMillisecondType |
|
LGTM! |
alamb
left a comment
There was a problem hiding this comment.
Thank you @Weijun-H -- this looks epic.
I think this PR needs a little more test coverage (ensure that all fields are non zero for some tests) and there is an outstanding comment by @metesynnada but otherwise this looks good to me
cc @tustvold in case you would like to weigh in
arrow-arith/src/arithmetic.rs
Outdated
There was a problem hiding this comment.
One thing I noticed while reviewing this code is that it will only support Timestamp + Interval not Interval + Timestamp - given this seems to be the case for Date + Interval as well, I don't think we have to fix it as part of this PR, but I will file a follow on ticket to try and improve the situation.
arrow-arith/src/arithmetic.rs
Outdated
There was a problem hiding this comment.
It would be good if at least one of these intervals had a year field set to makes sure it survive the roundtrip
Like
Some(IntervalYearMonthType::make_value(1, 2)),
There was a problem hiding this comment.
Likewise in all the tests below, it would be nice if more than one field was non zero so that we can be sure the field values are passed through to the corresponding operation
arrow-arith/src/arithmetic.rs
Outdated
There was a problem hiding this comment.
FYI you can write this more concisely like
| let result = result | |
| .as_any() | |
| .downcast_ref::<TimestampSecondArray>() | |
| .unwrap(); | |
| let result = result.as_primitive::<TimestampSecondType>(); |
arrow-array/src/types.rs
Outdated
There was a problem hiding this comment.
I tried to come up with a better formulation of this signature to avoid the boiler plate repetition (by finding some way to make the timestamp types have a templated to_naive function but I couldn't come up with anything better.
|
Also, when testing locally I got this error I think because this PR uses APIs that were introduced in chrono I needed to make this change (which I think we should do in the PR): diff --git a/arrow-array/Cargo.toml b/arrow-array/Cargo.toml
index 1b417bb0e..634a0aa64 100644
--- a/arrow-array/Cargo.toml
+++ b/arrow-array/Cargo.toml
@@ -44,7 +44,7 @@ ahash = { version = "0.8", default-features = false, features = ["runtime-rng"]
arrow-buffer = { workspace = true }
arrow-schema = { workspace = true }
arrow-data = { workspace = true }
-chrono = { version = "0.4.23", default-features = false, features = ["clock"] }
+chrono = { version = "0.4.24", default-features = false, features = ["clock"] }
chrono-tz = { version = "0.8", optional = true }
num = { version = "0.4", default-features = false, features = ["std"] }
half = { version = "2.1", default-features = false, features = ["num-traits"] } |
d2b2714 to
e513934
Compare
alamb
left a comment
There was a problem hiding this comment.
Thank you @Weijun-H -- this PR looks good to me. Thank you to @metesynnada for the review.
@tustvold @waitingkuo or @avantgardnerio would anyone else like to review this PR prior to merge?
arrow-arith/src/arithmetic.rs
Outdated
| // timestamp second + interval year month | ||
| let a = TimestampSecondArray::from(vec![1, 2, 3, 4, 5]); | ||
| let b = IntervalYearMonthArray::from(vec![ | ||
| Some(IntervalYearMonthType::make_value(1, 1)), |
There was a problem hiding this comment.
The reason I was suggesting tests that had different non zero fields (e.g. years and months in this example)
e.g.
Some(IntervalYearMonthType::make_value(1, 2)),Would be to catch errors such as when the year and month fields were mixed up in the code. For example, with
Some(IntervalYearMonthType::make_value(1, 1)),If the code made a mistake with which field , these tests would still pass.
There was a problem hiding this comment.
While I still think the tests can be improved, I also think they are good enough to merge
|
Here is a PR to bump the chrono version: #4093 |
Which issue does this PR close?
Closes #3963
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?