Skip to content

Aggregation-in-order from query plan#43592

Merged
KochetovNicolai merged 10 commits intomasterfrom
aggregating-in-order-from-query-plan
Dec 6, 2022
Merged

Aggregation-in-order from query plan#43592
KochetovNicolai merged 10 commits intomasterfrom
aggregating-in-order-from-query-plan

Conversation

@KochetovNicolai
Copy link
Copy Markdown
Member

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Implement aggregation-in-order optimization on top of query plan. It is enabled by default (but works only together with optimize_aggregation_in_order, which is disabled by default). Set query_plan_aggregation_in_order = 0 to use previous AST-based version.

@robot-ch-test-poll robot-ch-test-poll added the pr-improvement Pull request with some product improvements label Nov 23, 2022
@KochetovNicolai KochetovNicolai marked this pull request as ready for review November 28, 2022 15:54
@nickitat nickitat self-assigned this Nov 28, 2022
bool group_by_use_nulls;

InputOrderInfoPtr group_by_info;
SortDescription sort_description_for_merging;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need a comment explaining the purpose of these two. maybe just a link to comment under optimisation

if (enforce_aggregation_in_order)
reading->enforceAggregationInOrder();
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be good also to call updateInputStream on parent steps to propagate sorting

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea.
Don't want to do it is this pr.

/// We would like to have:
/// (a, b, c) - a sort description for reading from table (it's into input_order)
/// (a, c) - a sort description for merging (an input of AggregatingInOrderTransfrom is sorted by this GROUP BY keys)
/// (a, c, d) - a group by soer description (an input of FinishAggregatingInOrderTransform is sorted by all GROUP BY keys)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// (a, c, d) - a group by soer description (an input of FinishAggregatingInOrderTransform is sorted by all GROUP BY keys)
/// (a, c, d) - a group by sort description (an input of FinishAggregatingInOrderTransform is sorted by all GROUP BY keys)

MatchedTrees::Matches matches;
FixedColumns fixed_key_columns;

/// For every column in PK find any macth from GROUP BY key.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// For every column in PK find any macth from GROUP BY key.
/// For every column in PK find any match from GROUP BY key.

Comment on lines +722 to +723
//std::cerr << "------- matching " << static_cast<const void *>(node) << " " << node->result_name
// << " to " << static_cast<const void *>(match.node) << " " << (match.node ? match.node->result_name : "") << std::endl;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
//std::cerr << "------- matching " << static_cast<const void *>(node) << " " << node->result_name
// << " to " << static_cast<const void *>(match.node) << " " << (match.node ? match.node->result_name : "") << std::endl;

ErrorCodes::LOGICAL_ERROR,
"Cannot enforce aggregation in order for ReadFromRemote step up to stage {}",
QueryProcessingStage::toString(stage));

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
context.setSetting("enable_memory_bound_merging_of_aggregation_results", true);

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather separate enforceAggregationInOrder and enforceSorting calls.

@KochetovNicolai
Copy link
Copy Markdown
Member Author

Checked query from perftest and don't see any difference:

--- New
localhost:9000, queries 114, QPS: 0.363, RPS: 36284788.405, MiB/s: 276.831, result RPS: 0.000, result MiB/s: 0.000.

0.000%		2.556 sec.	
10.000%		2.629 sec.	
20.000%		2.673 sec.	
30.000%		2.692 sec.	
40.000%		2.721 sec.	
50.000%		2.745 sec.	
60.000%		2.765 sec.	
70.000%		2.783 sec.	
80.000%		2.811 sec.	
90.000%		2.889 sec.	
95.000%		2.956 sec.	
99.000%		3.108 sec.	
99.900%		3.125 sec.	
99.990%		3.125 sec.	

--- Old
localhost:9000, queries 242, QPS: 0.367, RPS: 36722549.794, MiB/s: 280.171, result RPS: 0.000, result MiB/s: 0.000.

0.000%		2.530 sec.	
10.000%		2.614 sec.	
20.000%		2.648 sec.	
30.000%		2.668 sec.	
40.000%		2.688 sec.	
50.000%		2.710 sec.	
60.000%		2.735 sec.	
70.000%		2.751 sec.	
80.000%		2.778 sec.	
90.000%		2.818 sec.	
95.000%		2.883 sec.	
99.000%		3.128 sec.	
99.900%		3.203 sec.	
99.990%		3.203 sec.

@KochetovNicolai KochetovNicolai merged commit 52c1355 into master Dec 6, 2022
@KochetovNicolai KochetovNicolai deleted the aggregating-in-order-from-query-plan branch December 6, 2022 16:11
@KochetovNicolai
Copy link
Copy Markdown
Member Author

Merged cause only comments were changed since last checks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-improvement Pull request with some product improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants