ARROW-9961: [Rust][DataFusion] Make to_timestamp function parses timestamp without timezone offset as local#8161
ARROW-9961: [Rust][DataFusion] Make to_timestamp function parses timestamp without timezone offset as local#8161alamb wants to merge 1 commit intoapache:masterfrom
Conversation
…stamp without timezone offset as local
|
FYI @jhorstmann / @andygrove |
|
@alamb I think you copy and pasted the same result twice in the PR description? |
jorgecarleitao
left a comment
There was a problem hiding this comment.
LGTM.
Thanks a lot for taking this, @alamb; datetime conversions are always nasty.
@andygrove -- I don't think so -- the timestamps are (very) subtlety different. Maybe this screenshot helps to show it better
|
|
I guess one choice has to be made, whether to default to local timezone or utc, and it's always possible to set the local timezone to utc to avoid these issues. My only concern is that this could cause tests to fail if they use local timestamps and are then run on machines with different time zones. This can probably only be solved with careful review, and by hopefully having the ci run with different time zones than your local machines. One (not directly related) issue I noticed while trying this out, is that the local patterns seem to require the millisecond part, while for utc timestamps with "Z" they are optional: |
|
Thank you @jhorstmann
Yes -- I agree. I have been bitten by this kind of issue in the past. The other fun one was when the local timezone offset changes due to daylight savings time. The other work around is to use a timestamp with an explicit timezone offset (eg ending in
This is a good find and I will fix that in a subsequent PR. |
…actional seconds As pointed out on @jhorstmann: #8161 (comment) > One (not directly related) issue I noticed while trying this out, is that the local patterns seem to require the millisecond part, while for utc timestamps with "Z" they are optional: ``` > select to_timestamp('2020-09-12T10:30:00') from test limit 1; ArrowError(ExternalError(General("Error parsing \'2020-09-12T10:30:00\' as timestamp"))) > select to_timestamp('2020-09-12T10:30:00Z') from test limit 1; +-------------------------------------------+ | totimestamp(Utf8("2020-09-12T10:30:00Z")) | +-------------------------------------------+ | 1599906600000000000 | +-------------------------------------------+ ``` This PR allows parsing local timestamp patterns without the fractional seconds and tests for same Closes #8179 from alamb/alamb/ARROW-9986-fractional-secs Authored-by: alamb <[email protected]> Signed-off-by: Andy Grove <[email protected]>

The TO_TIMESTAMP function added in #8142 supports parsing timestamps without a specified timezone, such as
2020-09-08T13:42:29.190855Such timestamps are supposed to be interpreted as in the local timezone, but instead are interpreted as UTC.
This PR corrects that logical error
Here is a query before this PR showing the problem on my machine (in
America/New_Yorktimezone) - note the values of the timestamp are the same for both theZand nonZtimestamps:Here is a query after this PR: