Date time64 extended range #9404
alexey-milovidov merged 76 commits into ClickHouse:master from Enmk:DateTime64_extended_range
base/common/DateLUTImpl.h
static constexpr const unsigned seconds_in_day = 86400; ?
src/Functions/DateTimeTransforms.h
Do we only change tabs here?
The good news is that most DateTime functions are not slowed down.
Parsing does not work correctly:
Fixed.
This is due to the limitations of DateTime / Date. Won't be fixed in this PR.
Performance is now almost break-even.
@KrishnaPG I also have an idea: we can make a LUT of 16777216-second intervals (about 194 days each) and assume that there is at most one time transition (daylight saving time or a global change) during each interval. In each LUT cell we would record the offset from the beginning of the interval to the beginning of the year, the offset to the transition and the amount of the transition (if any), the year number, and a flag for whether it is a leap year. Then everything is calculated with simple arithmetic and a few branches. Unfortunately, it looks less efficient than our current LUT by days (but I did not try it). For example, "round time to midnight" runs at about 300 000 000 iterations per second on a single CPU core, and I'm not going to lose this performance.
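A minimal sketch of the interval-LUT idea, covering only the transition part (the year offset, year number and leap-year flag are left out). The names `IntervalCell`, `kNoTransition` and `to_local_seconds` are hypothetical and do not come from this PR; the sketch also assumes non-negative timestamps.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical cell layout; field names are illustrative only.
struct IntervalCell
{
    std::int32_t utc_offset;         // seconds east of UTC at the start of the interval
    std::uint32_t transition_at;     // seconds from interval start to the transition
    std::int32_t transition_amount;  // seconds added to the offset at the transition
};

constexpr unsigned kIntervalBits = 24;                 // 2^24 = 16777216 seconds
constexpr std::uint32_t kNoTransition = 0xFFFFFFFFu;   // never reached inside an interval

// One table lookup, one branch, one addition.
inline std::int64_t to_local_seconds(const std::vector<IntervalCell> & lut, std::int64_t t)
{
    assert(t >= 0);
    const std::size_t index = static_cast<std::size_t>(t >> kIntervalBits);
    assert(index < lut.size());

    const IntervalCell & cell = lut[index];
    const std::uint32_t within =
        static_cast<std::uint32_t>(t & ((std::int64_t{1} << kIntervalBits) - 1));

    std::int64_t offset = cell.utc_offset;
    if (within >= cell.transition_at)  // at most one transition per interval is assumed
        offset += cell.transition_amount;
    return t + offset;
}
```

Whether this beats a per-day LUT is exactly the open question raised in the comment; the sketch only shows the shape of the lookup.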
Another idea is to place a pointer to a virtual table in the LUT cells. Basically, the idea is to make LUTCell a polymorphic class and allocate it in the LUT with placement new. This would make simple cases very cheap, at the cost of one indirect function call.
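A rough sketch of what polymorphic cells constructed with placement new could look like. Everything here (`PlainDay`, `TransitionDay`, `CellTable`, `localSecondOfDay`) is a hypothetical illustration rather than code from this PR, and the side array of base pointers is a simplification to keep the example free of pointer-casting subtleties.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

struct LUTCell
{
    std::int64_t day_start;  // UTC second at which the local day begins
    explicit LUTCell(std::int64_t start) : day_start(start) {}
    virtual std::int64_t localSecondOfDay(std::int64_t t) const = 0;
    virtual ~LUTCell() = default;
};

// Simple case: no transition during the day, so a single subtraction is enough.
struct PlainDay final : LUTCell
{
    using LUTCell::LUTCell;
    std::int64_t localSecondOfDay(std::int64_t t) const override { return t - day_start; }
};

// Rare case: the day contains one transition of `amount` seconds.
struct TransitionDay final : LUTCell
{
    std::int64_t transition_at;
    std::int64_t amount;
    TransitionDay(std::int64_t start, std::int64_t at, std::int64_t amt)
        : LUTCell(start), transition_at(at), amount(amt) {}
    std::int64_t localSecondOfDay(std::int64_t t) const override
    {
        return t - day_start + (t >= transition_at ? amount : 0);
    }
};

constexpr std::size_t kSlotSize = std::max(sizeof(PlainDay), sizeof(TransitionDay));
constexpr std::size_t kSlotAlign = std::max(alignof(PlainDay), alignof(TransitionDay));
static_assert(kSlotSize % kSlotAlign == 0, "slots must stay aligned");

template <std::size_t N>
class CellTable
{
public:
    // Constructs a cell of the requested dynamic type inside slot i.
    // Each slot is expected to be filled at most once.
    template <typename Cell, typename... Args>
    void emplace(std::size_t i, Args... args)
    {
        assert(i < N && cells[i] == nullptr);
        cells[i] = ::new (storage + i * kSlotSize) Cell(args...);
    }

    const LUTCell & operator[](std::size_t i) const { return *cells[i]; }

    ~CellTable()
    {
        for (LUTCell * cell : cells)
            if (cell)
                cell->~LUTCell();
    }

private:
    alignas(kSlotAlign) unsigned char storage[N * kSlotSize];
    LUTCell * cells[N] = {};
};
```

Each lookup then costs one indirect call, and the common `PlainDay` path stays branch-free.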
Clickhouse DateTime64 extended range ClickHouse/ClickHouse#9404
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Extended range of DateTime64 to properly support dates from year 1925 to 2283. Improved support of DateTime around zero date (1970-01-01)....
Detailed description / Documentation draft:
The year 1925 is the starting point because most time zones switched to saner (mostly 15-minute-based) offsets during 1924 or earlier, which significantly simplifies the implementation.
2283 is chosen to simplify the arithmetic for sanitizing LUT index access: there are fewer than 0x1ffff days from 1925.
As a collateral benefit, Date now correctly supports dates up to 2149 (instead of 2106).
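The day counts behind these limits can be checked with a short sketch. It uses the well-known civil-date-to-day-number algorithm (proleptic Gregorian calendar); `kLutEpoch`, `kLutMask` and `to_lut_index` are hypothetical names for illustration, and the masking shown is only an assumption about what "sanitizing LUT index access" could look like.

```cpp
#include <cstdint>

// Days since 1970-01-01 for a proleptic Gregorian date.
constexpr std::int64_t days_from_civil(std::int64_t y, unsigned m, unsigned d)
{
    y -= m <= 2;  // treat January and February as months of the previous year
    const std::int64_t era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);   // year of era, [0, 399]
    const unsigned mp = m > 2 ? m - 3 : m + 9;                   // March = 0
    const unsigned doy = (153 * mp + 2) / 5 + d - 1;             // day of (shifted) year
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;  // day of era
    return era * 146097 + static_cast<std::int64_t>(doe) - 719468;
}

constexpr std::int64_t kLutEpoch = days_from_civil(1925, 1, 1);  // hypothetical name
constexpr std::uint32_t kLutMask = 0x1FFFF;                      // hypothetical name

// A single AND keeps any day number inside a 0x20000-entry table (assumed scheme).
constexpr std::uint32_t to_lut_index(std::int64_t day_number)
{
    return static_cast<std::uint32_t>(day_number - kLutEpoch) & kLutMask;
}

// Every day from 1925 up to the start of 2283 fits below the mask.
static_assert(days_from_civil(2283, 1, 1) - kLutEpoch < kLutMask, "range fits the mask");

// A 16-bit day counter starting at 1970-01-01 ends in 2149.
static_assert(days_from_civil(2149, 1, 1) <= 65535, "2149 is reachable");
static_assert(days_from_civil(2150, 1, 1) > 65535, "2150 is not");
```

Out-of-range day numbers still map to some valid slot under this scheme, so the access is memory-safe even though the result is not meaningful.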
Progress:
closes #7316