Skip to content

RFC: Fix "Chunk should have AggregatedChunkInfo in GroupingAggregatedTransform"#33637

Merged
CurtizJ merged 3 commits intoClickHouse:masterfrom
azat:fix-optimize_aggregation_in_order
Jan 22, 2022
Merged

RFC: Fix "Chunk should have AggregatedChunkInfo in GroupingAggregatedTransform"#33637
CurtizJ merged 3 commits intoClickHouse:masterfrom
azat:fix-optimize_aggregation_in_order

Conversation

@azat
Copy link
Copy Markdown
Member

@azat azat commented Jan 14, 2022

Changelog category (leave one):

  • Bug Fix (user-visible misbehaviour in official stable or prestable release)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix Chunk should have AggregatedChunkInfo in GroupingAggregatedTransform (in case of optimize_aggregation_in_order=1)

Detailed description / Documentation draft:
In case of optimize_aggregation_in_order=1 there will be ChunkInfoWithAllocatedBytes.
Also note, that this is the case only when MergeTree has only one part, since in this case there will be no local FinishAggregatingInOrderTransform.

Details

one part

EXPLAIN PIPELINE
SELECT count()
FROM remote('127.{1,2}', currentDatabase(), test_data)
GROUP BY key
SETTINGS optimize_aggregation_in_order = 1, max_threads = 1

┌─explain──────────────────────────────────┐
│ (Expression)                             │
│ ExpressionTransform                      │
│   (MergingAggregated)                    │
│   MergingAggregatedBucketTransform       │
│     GroupingAggregatedTransform 21    │
│       (SettingQuotaAndLimits)            │
│         (Union)                          │
│           (Aggregating)                  │
│           FinalizeAggregatedTransform    │
│             AggregatingInOrderTransform  │
│               (Expression)               │
│               ExpressionTransform        │
│                 (SettingQuotaAndLimits)  │
│                   (ReadFromMergeTree)    │
│                   MergeTreeInOrder 01 │
│           (ReadFromRemote)               │
└──────────────────────────────────────────┘

multiple parts

EXPLAIN PIPELINE
SELECT count()
FROM remote('127.{1,2}', currentDatabase(), test_data)
GROUP BY key
SETTINGS optimize_aggregation_in_order = 1, max_threads = 1

┌─explain─────────────────────────────────────────────┐
│ (Expression)                                        │
│ ExpressionTransform                                 │
│   (MergingAggregated)                               │
│   MergingAggregatedBucketTransform                  │
│     GroupingAggregatedTransform 21               │
│       (SettingQuotaAndLimits)                       │
│         (Union)                                     │
│           (Aggregating)                             │
│           MergingAggregatedBucketTransform          │
│             FinishAggregatingInOrderTransform 21 │
│               AggregatingInOrderTransform × 2       │
│                 (Expression)                        │
│                 ExpressionTransform × 2             │
│                   (SettingQuotaAndLimits)           │
│                     (ReadFromMergeTree)             │
│                     MergeTreeInOrder × 2 01      │
│           (ReadFromRemote)                          │
└─────────────────────────────────────────────────────┘

Cc: @CurtizJ
Cc: @KochetovNicolai

Fixes: #30692
Fixes: #31172
Fixes: #30266

azat added 3 commits January 14, 2022 17:43
Note, that this is not complete fix, see the next two patches.

Signed-off-by: Azat Khuzhin <[email protected]>
This is just a micro optimization and it should not affect anything,
real fixes are in separate patches (previous and next).

Signed-off-by: Azat Khuzhin <[email protected]>
…form"

In case of optimize_aggregation_in_order there will be
ChunkInfoWithAllocatedBytes.

Signed-off-by: Azat Khuzhin <[email protected]>
@robot-clickhouse robot-clickhouse added the pr-bugfix Pull request with bugfix, not backported by default label Jan 14, 2022
@azat azat marked this pull request as ready for review January 14, 2022 15:26
@azat
Copy link
Copy Markdown
Member Author

azat commented Jan 21, 2022

@KochetovNicolai or @CurtizJ maybe you can take a look?

@CurtizJ CurtizJ self-assigned this Jan 21, 2022
Copy link
Copy Markdown
Member

@CurtizJ CurtizJ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@CurtizJ
Copy link
Copy Markdown
Member

CurtizJ commented Jan 22, 2022

Test failures are unrelated to PR changes.

@CurtizJ CurtizJ merged commit 6861ada into ClickHouse:master Jan 22, 2022
@azat azat deleted the fix-optimize_aggregation_in_order branch January 24, 2022 10:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-bugfix Pull request with bugfix, not backported by default

Projects

None yet

3 participants