Skip to content

Saturate date/datetime to zero.#29953

Merged
alexey-milovidov merged 2 commits intoClickHouse:masterfrom
amosbird:saturatetime
Jan 3, 2022
Merged

Saturate date/datetime to zero.#29953
alexey-milovidov merged 2 commits intoClickHouse:masterfrom
amosbird:saturatetime

Conversation

@amosbird
Copy link
Copy Markdown
Collaborator

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Now date time conversion functions that generates time before 1970-01-01 00:00:00 will be saturated to zero instead of overflow.

#2327
#3470
#12377
#17354
#18511
#22928
#25284
#26137
#27150
#29835

Detailed description / Documentation draft:
.

@robot-clickhouse robot-clickhouse added the pr-improvement Pull request with some product improvements label Oct 10, 2021
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is strange.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a comment says that 1970-01-01 is used as the date part: https://github.com/ClickHouse/ClickHouse/blob/c6e02a534f97720471e67c5d9cfddbc0fef62b1f/src/Functions/DateTimeTransforms.h#L239

However the old behavior apparently uses 1970-01-02 which most likely to avoid overflow. After this PR it's ok.

Copy link
Copy Markdown
Member

@alexey-milovidov alexey-milovidov Oct 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend not changing toTime as it has more chance to break some use cases.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I'll change the comment instead.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check for western time zones, like America/Los_Angeles.

@alexey-milovidov alexey-milovidov self-assigned this Oct 10, 2021
@qoega
Copy link
Copy Markdown
Member

qoega commented Oct 10, 2021

Is it pr-backward-incompatible?

@alexey-milovidov
Copy link
Copy Markdown
Member

Overflow has implementation defined behaviour in ClickHouse, and this behaviour is changed.
You can say that the change is backward compatible, although some users may rely on the specific behaviour on overflow.

@amosbird amosbird force-pushed the saturatetime branch 2 times, most recently from b413a93 to 9e0d0ae Compare October 11, 2021 09:29
@alexey-milovidov
Copy link
Copy Markdown
Member

Perf test is showing a small slowdown.

@amosbird
Copy link
Copy Markdown
Collaborator Author

amosbird commented Oct 14, 2021

Perf test is showing a small slowdown.

Maybe we can build another lookup table in DateTimeLUT for this purpose. Not sure if it's appropriate.

@alexey-milovidov
Copy link
Copy Markdown
Member

Maybe we can optimize something to compensate this slowdown...

@amosbird amosbird force-pushed the saturatetime branch 3 times, most recently from 027a2a7 to 38b1e2f Compare November 1, 2021 10:35
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alesapin Could you explain why we choose 10000 month for TTL? It leads to implementation defined date overflow.

@alexey-milovidov
Copy link
Copy Markdown
Member

There is one warning from gcc.

@amosbird
Copy link
Copy Markdown
Collaborator Author

There is one warning from gcc.

Fixed.

@alexey-milovidov
Copy link
Copy Markdown
Member

@amosbird
Copy link
Copy Markdown
Collaborator Author

7% performance decrease: https://clickhouse-test-reports.s3.yandex.net/29953/903df9df6b44d9b31e2992c513633713da0bfadc/performance_comparison/report.html#fail1

Maybe it's ok to merge?

It's interesting that my local clickhouse benchmark comparison shows the opposite result:

tests/performance/parallel_index.xml:


This PR, queries 22, QPS: 0.896, RPS: 939902.487, MiB/s: 7.171, result RPS: 0.896, result MiB/s: 0.000.
Upstream, queries 24, QPS: 0.732, RPS: 712294.468, MiB/s: 5.434, result RPS: 0.732, result MiB/s: 0.000.

0.000%          0.973 sec.      1.205 sec.
10.000%         1.025 sec.      1.245 sec.
20.000%         1.030 sec.      1.288 sec.
30.000%         1.053 sec.      1.297 sec.
40.000%         1.093 sec.      1.351 sec.
50.000%         1.097 sec.      1.362 sec.
60.000%         1.110 sec.      1.386 sec.
70.000%         1.175 sec.      1.405 sec.
80.000%         1.197 sec.      1.429 sec.
90.000%         1.237 sec.      1.503 sec.
95.000%         1.265 sec.      1.522 sec.
99.000%         1.284 sec.      1.560 sec.
99.900%         1.284 sec.      1.560 sec.
99.990%         1.284 sec.      1.560 sec.

Difference at 99.5% confidence: mean difference is 0.25035227, but confidence interval is 0.08870240

@alexey-milovidov
Copy link
Copy Markdown
Member

The changes in performance look insignificant. Ok to merge.

@alexey-milovidov alexey-milovidov merged commit cb19d6f into ClickHouse:master Jan 3, 2022
@alexey-milovidov
Copy link
Copy Markdown
Member

@amosbird Please close the issues that were fixed.

@amosbird
Copy link
Copy Markdown
Collaborator Author

amosbird commented Jan 4, 2022

Only one issue left #25284 .

select toStartOfQuarter(toDateTime(0, 'America/Los_Angeles'));
select toStartOfYear(toDateTime(0, 'America/Los_Angeles'));
select toTime(toDateTime(0, 'America/Los_Angeles'), 'America/Los_Angeles');
select toStartOfMinute(toDateTime(0, 'America/Los_Angeles'));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@amosbird Any reason that you decided not to saturate toStartOfMinute and friends?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be.

┌─toStartOfMinute(toDateTime(0, 'America/Los_Angeles'))─┐
│                                   1969-12-31 16:00:00 │
└───────────────────────────────────────────────────────┘

Copy link
Copy Markdown
Member

@azat azat Feb 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it does not always work.

toStartOfHour

Does not work for partial hours timezones, i.e. America/Paramaribo

┌─toStartOfHour(toDateTime(0, 'America/Paramaribo'))─┐
│                                2106-02-07 02:58:16 │
└────────────────────────────────────────────────────┘

toStartOfMinute

Does not work for partial minutes timezones, Africa/Monrovia - which is very special timezone, #27080 for the reference.

$ TZ=Africa/Monrovia date -d@1
Wed Dec 31 11:15:31 PM MMT 1969
                 ^^
┌─toStartOfMinute(toDateTime(0, 'Africa/Monrovia'))─┐
│                               2106-02-07 06:27:46 │
└───────────────────────────────────────────────────┘

And this breaks monotonicity of the function.

It can be fixed to avoid breaking monotonicity, but this will make the result of functions incorrect and also breaks backward compatibility for negative timestamps.

Also note, that it had been changed already in 21.4 select toDateTime(1, 'Africa/Monrovia') dt, toStartOfMinute(dt):

azat added a commit to azat/ClickHouse that referenced this pull request Aug 10, 2022
… flakiness

It is possible for toStartOfMinute() to give different result for 0 and
59, for partial timezones (timezone that does not starts from 00:00,
like Africa/Monrovia).

Before ClickHouse#36656 it fails for another reason, because of overflows [1], but
now it fails because it simply return different minutes.

  [1]: ClickHouse#29953 (comment)

Simply pin the UTC there.

Fixes: ClickHouse#37786
Signed-off-by: Azat Khuzhin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-improvement Pull request with some product improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants