Skip to content

[WIP] Optimization of GROUP BY with respect to table sorting key.#9113

Merged
CurtizJ merged 40 commits intoClickHouse:masterfrom
dimarub2000:group_by_in_order_optimization
Jun 6, 2020
Merged

[WIP] Optimization of GROUP BY with respect to table sorting key.#9113
CurtizJ merged 40 commits intoClickHouse:masterfrom
dimarub2000:group_by_in_order_optimization

Conversation

@dimarub2000
Copy link
Copy Markdown
Contributor

@dimarub2000 dimarub2000 commented Feb 14, 2020

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Optimization of GROUP BY with respect to table sorting key.

Detailed description / Documentation draft:

If a table is sorted by some key, we can effectively read data in order. If GROUP BY expression contains at least prefix of sorting key or injective functions it can be performed more efficiently: in-between result of aggreagtion can be finalized and sent to client when a new key is read from table.

@azat
Copy link
Copy Markdown
Member

azat commented Mar 29, 2020

What is the status of this one? Does it postponed? (I have some stuff that will require this)

@alexey-milovidov
Copy link
Copy Markdown
Member

The task is active, worst case estimate is May 2020, but @dimarub2000 should target for April.

@dimarub2000 dimarub2000 force-pushed the group_by_in_order_optimization branch from 001f119 to 6cee50a Compare March 30, 2020 17:00
@blinkov blinkov added the pr-performance Pull request with some performance improvements label Apr 1, 2020
@blinkov blinkov requested review from 4ertus2 and CurtizJ April 2, 2020 05:32
@dimarub2000 dimarub2000 changed the title [WIP] Optimization of GROUP BY with respect to table sorting key. (No optimization done yet) [WIP] Optimization of GROUP BY with respect to table sorting key. Apr 18, 2020
@CurtizJ CurtizJ self-assigned this Apr 28, 2020
@dimarub2000 dimarub2000 marked this pull request as ready for review May 12, 2020 14:54
@dimarub2000
Copy link
Copy Markdown
Contributor Author

It's OK that performance tests have failed, master doesn't have my setting yet

@akuzm
Copy link
Copy Markdown
Contributor

akuzm commented May 31, 2020

It's OK that performance tests have failed, master doesn't have my setting yet

By the way, if you want to run a performance comparison with master in CI, you can add just a setting that does nothing as a separate PR, and then compare to that. Or maybe I should just make it a warning...

@dimarub2000
Copy link
Copy Markdown
Contributor Author

It's OK that performance tests have failed, master doesn't have my setting yet

By the way, if you want to run a performance comparison with master in CI, you can add just a setting that does nothing as a separate PR, and then compare to that. Or maybe I should just make it a warning...

Ок, good to know, btw results can be seen in test and ya.make commit

@CurtizJ
Copy link
Copy Markdown
Member

CurtizJ commented Jun 4, 2020

Waiting for some additional tests, because now optimize_aggregation_in_order is disabled by default.

@alexey-milovidov
Copy link
Copy Markdown
Member

Performance tests Ok.

@alexey-milovidov
Copy link
Copy Markdown
Member

00746_sql_fuzzy - fixed in master.

@CurtizJ CurtizJ merged commit 5c42408 into ClickHouse:master Jun 6, 2020
@dimarub2000 dimarub2000 deleted the group_by_in_order_optimization branch June 6, 2020 17:42
azat added a commit to azat/ClickHouse that referenced this pull request May 9, 2022
row_begin was wrong, and before this patch aggregator processing
{row_end, row_end} range, in other words, zero range.

Fixes: ClickHouse#9113 (cc @dimarub2000)
v2: add static_cast to fix UBSan
Signed-off-by: Azat Khuzhin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-performance Pull request with some performance improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants