Skip to content

Improve compatibility with non-whole-minute timezone offsets#27080

Merged
alexey-milovidov merged 8 commits intoClickHouse:masterfrom
Algunenano:timezone_00719
Aug 11, 2021
Merged

Improve compatibility with non-whole-minute timezone offsets#27080
alexey-milovidov merged 8 commits intoClickHouse:masterfrom
Algunenano:timezone_00719

Conversation

@Algunenano
Copy link
Copy Markdown
Member

@Algunenano Algunenano commented Aug 2, 2021

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Improve compatibility with non-whole-minute timezone offsets

Detailed description / Documentation draft:
DateLUT was assuming timezone offsets are whole minutes but that's not true, for example Liberia used to be -0:43:08 and later became -0:44:30 (it currently is 00:00, but that's less exciting).

Closes #27079
It appeared as part of the CI (in #26978)

@robot-clickhouse robot-clickhouse added pr-bugfix Pull request with bugfix, not backported by default pr-improvement Pull request with some product improvements and removed pr-bugfix Pull request with bugfix, not backported by default labels Aug 2, 2021
@Algunenano
Copy link
Copy Markdown
Member Author

2 things in CI:

Let's see if a new commit fixing (2) also shows the address issues or it has been something temporary.

@amosbird
Copy link
Copy Markdown
Collaborator

amosbird commented Aug 3, 2021

for example Liberia used to be -0:43:08

That's interesting...

@Algunenano
Copy link
Copy Markdown
Member Author

Rebased with master. No new changes (it was clean already)

@alexey-milovidov
Copy link
Copy Markdown
Member

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How the opposite is possible if we are dividing the number of seconds after midnight (always positive)?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was probably thinking about negative values of t (with Datetime64), which will lead to negative time and negative res; but that doesn't see to be supported anyway.

@Algunenano
Copy link
Copy Markdown
Member Author

We have 2x performance degradation of toSecond function or somewhat similar:

It might be because Asia/Kolkata is at UTC+5:30 and that won't use the fast path (offset_is_whole_number_of_hours_during_epoch). I'll see if there is something better to do in this situation,

@alexey-milovidov
Copy link
Copy Markdown
Member

I'm not sure about this change.
We did not support non whole 15 minutes offsets - this is deliberate choice.

... but if we can avoid any performance degradation.

@alexey-milovidov alexey-milovidov self-assigned this Aug 7, 2021
@alexey-milovidov alexey-milovidov added the st-discussion When the implementation aspects are not clear or when the PR is on hold due to questions. label Aug 7, 2021
@Algunenano
Copy link
Copy Markdown
Member Author

I think the performance degradation can be solved in a similar way as the whole hour thing, by adding another flag like offset_is_whole_number_of_minutes_during_epoch. It's not great, but it should limit the performance impact to the offsets were the responses were wrong. I'll give it a try (likely on Monday).

@alexey-milovidov
Copy link
Copy Markdown
Member

Ok, sounds good.

@alexey-milovidov alexey-milovidov removed the st-discussion When the implementation aspects are not clear or when the PR is on hold due to questions. label Aug 8, 2021
@Algunenano
Copy link
Copy Markdown
Member Author

Rebased and applied the improvements from the comments.

Performance of the affected query:

  • Initially (before changes): QPS: 12.465, RPS: 623806343.190, MiB/s: 4759.265, result RPS: 12.465
  • After initial changes: QPS: 3.516, RPS: 175981641.031, MiB/s: 1342.633, result RPS: 3.516
  • Now: QPS: 12.487, RPS: 624930109.263, MiB/s: 4767.838, result RPS: 12.487

Looking at how things work internally (DateTimeTransforms.h), it looks like the fast path (common offsets) of some of the functions (toSecond, toMinute) could be vectorized using the same architecture as other functions (accepting an array instead of a single element), but that's something for a different time.

@alexey-milovidov
Copy link
Copy Markdown
Member

alexey-milovidov commented Aug 9, 2021

Ok.

Theoretically it can also be auto-vectorized, because the function is inlined. But it depends if compiler will be able to move "unlikely" branch check out of the loop.

@Algunenano
Copy link
Copy Markdown
Member Author

I added an invalid optimization (ok for minutes, not for hours). I'm looking into it and will try to unify calls/code if possible. Sadly, changing DateLutImpl.h requires recompiling ~1300 files so it takes time to verify and benchmark

Instead of mixing multiple rounding, use toStartOfMinuteInterval for all
minute related calculations
@alexey-milovidov
Copy link
Copy Markdown
Member

GithubException: 500 null

Looks like GitHub API is down.

@alexey-milovidov
Copy link
Copy Markdown
Member

@Mergifyio update

@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented Aug 11, 2021

Command update: success

Branch has been successfully updated

@alexey-milovidov
Copy link
Copy Markdown
Member

@Mergifyio update

@mergify
Copy link
Copy Markdown
Contributor

mergify bot commented Aug 11, 2021

Command update: success

Branch has been successfully updated

Copy link
Copy Markdown
Member

@alexey-milovidov alexey-milovidov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, everything is alright :)

@alexey-milovidov alexey-milovidov merged commit 1239a91 into ClickHouse:master Aug 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-improvement Pull request with some product improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Incorrect responses with timezone offsets that are not whole hours

4 participants