
Optimize single NOT NULL count() aggregation #82104

Merged
alexey-milovidov merged 1 commit into ClickHouse:master from amosbird:aggregate-simple-count
Jun 19, 2025

@amosbird
Collaborator

Changelog category (leave one):

  • Performance Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

When the aggregation query contains only a single COUNT() function on a NOT NULL column, the aggregation logic is fully inlined during hash table probing. This avoids allocating and maintaining any aggregation state, significantly reducing memory usage and CPU overhead. This partially addresses #81982.

This optimization improves performance for queries like:

SELECT COUNT(*) FROM table GROUP BY key

SELECT COUNT(col) FROM table GROUP BY key  -- when `col` is NOT NULL
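
The difference between the two paths can be sketched as follows. This is a minimal illustration in Python, not ClickHouse's C++ implementation: `CountState`, `count_generic`, and `count_inlined` are hypothetical names, and a plain `dict` stands in for the aggregation hash table. The generic path allocates a state object for each new key and updates it through a method call; the inlined path keeps the counter directly in the slot found while probing.

```python
class CountState:
    """Hypothetical per-key aggregation state, allocated once per distinct key."""
    __slots__ = ("count",)

    def __init__(self):
        self.count = 0

    def add(self):
        self.count += 1


def count_generic(keys):
    """Generic path: probe the table, allocate a state on first sight, then update it."""
    table = {}
    for key in keys:
        state = table.get(key)
        if state is None:
            state = CountState()  # separate allocation for every new key
            table[key] = state
        state.add()
    return {key: state.count for key, state in table.items()}


def count_inlined(keys):
    """Inlined path: the mapped value is the counter itself, bumped during the probe."""
    table = {}
    for key in keys:
        table[key] = table.get(key, 0) + 1  # no state object to allocate or maintain
    return table


keys = ["a", "b", "a", "c", "a", "b"]
assert count_generic(keys) == count_inlined(keys)
```

Both functions return the same per-key counts; only the bookkeeping per row differs.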

ClickBench Q15

HEAD:

Queries executed: 104.

localhost:9000, queries: 104, QPS: 4.881, RPS: 488124765.213, MiB/s: 3724.096, result RPS: 48.814, result MiB/s: 0.001.

0%              0.173 sec.
10%             0.184 sec.
20%             0.191 sec.
30%             0.193 sec.
40%             0.197 sec.
50%             0.200 sec.
60%             0.205 sec.
70%             0.210 sec.
80%             0.216 sec.
90%             0.220 sec.
95%             0.223 sec.
99%             0.228 sec.
99.9%           0.233 sec.
99.99%          0.233 sec.

This PR:

Queries executed: 106.

localhost:9000, queries: 106, QPS: 6.301, RPS: 630035513.823, MiB/s: 4806.790, result RPS: 63.005, result MiB/s: 0.001.

0%              0.136 sec.
10%             0.143 sec.
20%             0.145 sec.
30%             0.150 sec.
40%             0.153 sec.
50%             0.156 sec.
60%             0.158 sec.
70%             0.159 sec.
80%             0.162 sec.
90%             0.166 sec.
95%             0.173 sec.
99%             0.179 sec.
99.9%           0.195 sec.
99.99%          0.195 sec.
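
To read the two runs side by side, the relative change can be computed from the figures reported above. The snippet below is only a convenience calculation over those numbers (QPS, median, and 99th percentile); the helper name `relative_change` is an illustrative choice.

```python
# Figures copied from the two benchmark runs above.
head = {"qps": 4.881, "p50": 0.200, "p99": 0.228}
this_pr = {"qps": 6.301, "p50": 0.156, "p99": 0.179}


def relative_change(before, after):
    """Fractional change from `before` to `after` (positive means an increase)."""
    return (after - before) / before


qps_gain = relative_change(head["qps"], this_pr["qps"])
p50_drop = -relative_change(head["p50"], this_pr["p50"])
p99_drop = -relative_change(head["p99"], this_pr["p99"])

print(f"QPS: +{qps_gain:.1%}, median latency: -{p50_drop:.1%}, p99 latency: -{p99_drop:.1%}")
```

On these numbers, throughput rises by roughly 29% and median latency falls by roughly 22%.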

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@clickhouse-gh
Contributor

clickhouse-gh bot commented Jun 18, 2025

Workflow [PR], commit [b56c215]

Summary:

clickhouse-gh bot added the pr-performance label (Pull request with some performance improvements) Jun 18, 2025
amosbird force-pushed the aggregate-simple-count branch 5 times, most recently from aaca1c7 to a967a23, June 19, 2025 07:34
The aggregation logic is inlined: each row is aggregated immediately
during hash table probing, so there is no need to allocate and maintain
a full aggregation state.
amosbird force-pushed the aggregate-simple-count branch from a967a23 to b56c215, June 19, 2025 09:54
@amosbird
Collaborator Author

[Attached images: copied-2025-06-19-21_28_00_728, copied-2025-06-19-21_28_58_756, copied-2025-06-19-21_28_28_882]

alexey-milovidov self-assigned this Jun 19, 2025
alexey-milovidov added this pull request to the merge queue Jun 19, 2025
Merged via the queue into ClickHouse:master with commit f8deb30 Jun 19, 2025
237 of 238 checks passed
alexey-milovidov deleted the aggregate-simple-count branch June 19, 2025 20:09
robot-clickhouse added the pr-synced-to-cloud label (The PR is synced to the cloud repo) Jun 19, 2025
@incfly
Contributor

incfly commented Aug 18, 2025

Hi, I am thinking of implementing a similar optimization for Sum, as @rschu1ze mentioned it would benefit ClickBench as well. Any thoughts?

