Add operator GROUPING SETS by MaxTheHuman · Pull Request #24172 · ClickHouse/ClickHouse

MaxTheHuman · 2021-05-16T19:52:21Z

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

Not for changelog (changelog entry is not required)

l1t1 · 2021-05-17T04:52:22Z

nice job

akuzm · 2021-05-24T17:18:04Z

The tests are missing, did you forget git add?


KochetovNicolai · 2021-05-31T11:50:53Z

src/Processors/QueryPlan/AggregatingStep.cpp

+                break;
+
+            size_t first_column_to_add = (i == 0 ? 1 : 0);
+            auto adding_column_action = ActionsDAG::makeAddingColumnActions(pipeline.getHeader().getByPosition(params.keys_vector[first_column_to_add][0]));


That's suspicious.
What if we have two sets with the same prefix?
For (a, b), (a, c) we add a const column a for both, replacing the normal column with a default one.
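
To make the concern concrete, here is a minimal standalone sketch (not the PR's code) of the rule one would expect: for each grouping set, only the keys absent from that set get a constant default column. `KeySet` and `missingKeysPerSet` are hypothetical names, and plain strings stand in for ClickHouse column positions.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical alias: one grouping set is an ordered list of key names.
using KeySet = std::vector<std::string>;

// For every grouping set, list the keys (taken from the union of all sets,
// in first-seen order) that the set does NOT contain. Only these keys should
// be replaced by default-valued constant columns.
std::vector<KeySet> missingKeysPerSet(const std::vector<KeySet> & grouping_sets)
{
    KeySet all_keys;
    std::set<std::string> seen;
    for (const auto & keys : grouping_sets)
        for (const auto & key : keys)
            if (seen.insert(key).second)
                all_keys.push_back(key);

    std::vector<KeySet> result;
    for (const auto & keys : grouping_sets)
    {
        const std::set<std::string> present(keys.begin(), keys.end());
        KeySet missing;
        for (const auto & key : all_keys)
            if (present.count(key) == 0)
                missing.push_back(key);
        result.push_back(missing);
    }
    return result;
}
```

For the sets (a, b), (a, c) this yields {c} for the first set and {b} for the second, so the shared prefix a is never overwritten.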

KochetovNicolai · 2021-05-31T11:53:49Z

src/Processors/QueryPlan/AggregatingStep.cpp

+            expression_transforms.push_back(std::make_shared<ExpressionTransform>(aggregating_transforms[i]->getOutputs().front().getHeader(), actions));
+
+        }
+        pipeline.addParallelTransforms(std::move(aggregating_transforms));


The same could be done with addSimpleTransform, because each aggregating transform has one input port and one output port.
But maybe this method will be useful later, when we remove pipeline.resize(1);

KochetovNicolai

So far so good.

Also, we need to remove all debug output.


taylor12805 · 2021-07-08T08:26:01Z

I tried to reproduce the test case from this website: https://oracle-base.com/articles/misc/rollup-cube-grouping-functions-and-grouping-sets

CREATE TABLE default.dimension_tab (fact_1_id Int32, fact_2_id Int32, fact_3_id Int32, fact_4_id Int32, sales_value Decimal(9, 2)) ENGINE = MergeTree ORDER BY tuple() SETTINGS index_granularity = 8192

INSERT INTO default.dimension_tab SELECT rand32(number) % 2 + 1 AS fact_1_id, rand32(number) % 5 + 1 AS fact_2_id, rand32(number) % 10 + 1 AS fact_3_id, rand32(number) % 10 + 1 AS fact_4_id, rand32(number) % 100 AS sales_value FROM system.numbers limit 1000

SELECT fact_1_id, fact_2_id, fact_3_id, SUM(sales_value) AS sales_value FROM default.dimension_tab GROUP BY GROUPING SETS((fact_1_id, fact_2_id), (fact_1_id, fact_3_id)) ORDER BY fact_1_id, fact_2_id, fact_3_id

In Oracle, the result is

FACT_1_ID FACT_2_ID FACT_3_ID SALES_VALUE GROUPING_ID

     1          1                4363.55           1
     1          2                4794.76           1
     1          3                4718.25           1
     1          4                5387.45           1
     1          5                5027.34           1
     1                     1      2737.4           2
     1                     2     1854.29           2
     1                     3     2090.96           2
     1                     4     2605.17           2
     1                     5     2590.93           2
     1                     6      2506.9           2
     1                     7     1839.85           2
     1                     8     2953.04           2
     1                     9     2778.75           2
     1                    10     2334.06           2
     2          1                5652.84           1
     2          2                4583.02           1
     2          3                5555.77           1
     2          4                5936.67           1
     2          5                4508.74           1
     2                     1     3512.69           2
     2                     2     2847.94           2
     2                     3      2972.5           2
     2                     4     2534.06           2
     2                     5     3115.99           2
     2                     6     2775.85           2
     2                     7     2208.19           2
     2                     8     2358.55           2
     2                     9     1884.11           2
     2                    10     2027.16           2

The result in this implementation:

FACT_1_ID FACT_2_ID FACT_3_ID SALES_VALUE

     0          0          1        4820
     0          0          2        5222
     0          0          3        5130
     0          0          4        5426
     0          0          5        4782
     0          0          6        4115
     0          0          7        5752
     0          0          8        5385
     0          0          9        4772
     0          0         10        5533
     0          1          0        8935
     0          2          0       10974
     0          3          0       10515
     0          4          0       10198
     0          5          0       10315
     1          0          0       25256
     2          0          0       25681

The result of this implementation is quite different from the Oracle result, so I suppose there is an issue with this implementation. Correct me if I am wrong.
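
For reference, the semantics visible in the Oracle output can be sketched in a few lines of standalone C++ (an illustration on in-memory rows, not ClickHouse code): each grouping set aggregates the input independently, and key columns outside the set come back as NULL, modeled here with `std::optional`. `Row`, `OutKey`, and `groupingSets` are hypothetical names.

```cpp
#include <array>
#include <map>
#include <optional>
#include <utility>
#include <vector>

struct Row
{
    std::array<int, 3> facts; // fact_1_id, fact_2_id, fact_3_id
    long sales_value;
};

// One output key: a column that is not part of the current grouping set
// stays NULL (std::nullopt).
using OutKey = std::array<std::optional<int>, 3>;

// Emulates SUM(sales_value) ... GROUP BY GROUPING SETS(sets...).
// Each inner vector holds the indices (0..2) of the key columns in that set.
// Returns one map of (key -> sum) per grouping set.
std::vector<std::map<OutKey, long>> groupingSets(
    const std::vector<Row> & rows, const std::vector<std::vector<int>> & sets)
{
    std::vector<std::map<OutKey, long>> result;
    for (const auto & set : sets)
    {
        std::map<OutKey, long> sums;
        for (const auto & row : rows)
        {
            OutKey key{}; // every column starts as NULL
            for (int idx : set)
                key[idx] = row.facts[idx];
            sums[key] += row.sales_value;
        }
        result.push_back(std::move(sums));
    }
    return result;
}
```

Called with the sets {0, 1} and {0, 2}, which correspond to (fact_1_id, fact_2_id) and (fact_1_id, fact_3_id), the sketch keeps the real fact_1_id value in every output row, as the Oracle result does.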

taylor12805 · 2021-07-23T11:07:58Z

@KochetovNicolai Hi, I'm trying to fix the issue with grouping sets that contain multiple elements. I wonder whether we should support elements without parentheses inside grouping sets, like GROUPING SETS(A, (A, B), (A, C)), or require every element to be parenthesized, like GROUPING SETS((A), (A, B), (A, C)).

KochetovNicolai · 2021-07-26T09:01:06Z

@taylor12805 Hi!
I think we need to support all the cases. A query with group by grouping sets (x, (x), (y, z), ()) works in Postgres, so I think our syntax should also support it.

> I'm trying to fix grouping sets multiple elements related issue

Do you need some help with it? I think I may also finish this PR, but I would really appreciate any assistance from you.
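
As a rough illustration of accepting all of these forms, the standalone sketch below normalizes the text inside GROUPING SETS(...) into a list of key lists. It is not the ClickHouse parser: `KeyList` and `parseGroupingSets` are hypothetical names, and it assumes that a bare identifier means a one-element set, that `()` means the empty set, and that keys are plain identifiers rather than expressions.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

using KeyList = std::vector<std::string>;

// Parses the text between the outer parentheses of GROUPING SETS(...),
// e.g. "x, (x), (y, z), ()". Throws std::invalid_argument on malformed input.
std::vector<KeyList> parseGroupingSets(const std::string & text)
{
    std::vector<KeyList> sets;
    std::size_t pos = 0;

    auto skipSpaces = [&]
    {
        while (pos < text.size() && std::isspace(static_cast<unsigned char>(text[pos])))
            ++pos;
    };
    auto readIdentifier = [&]
    {
        const std::size_t start = pos;
        while (pos < text.size()
               && (std::isalnum(static_cast<unsigned char>(text[pos])) || text[pos] == '_'))
            ++pos;
        if (start == pos)
            throw std::invalid_argument("identifier expected at position " + std::to_string(pos));
        return text.substr(start, pos - start);
    };

    while (true)
    {
        skipSpaces();
        KeyList keys;
        if (pos < text.size() && text[pos] == '(')
        {
            ++pos;
            skipSpaces();
            // "()" leaves the key list empty; otherwise read "a, b, ...".
            if (pos < text.size() && text[pos] != ')')
            {
                while (true)
                {
                    keys.push_back(readIdentifier());
                    skipSpaces();
                    if (pos < text.size() && text[pos] == ',')
                    {
                        ++pos;
                        skipSpaces();
                        continue;
                    }
                    break;
                }
            }
            if (pos >= text.size() || text[pos] != ')')
                throw std::invalid_argument("')' expected at position " + std::to_string(pos));
            ++pos;
        }
        else
        {
            // A bare identifier is treated as a one-element set.
            keys.push_back(readIdentifier());
        }
        sets.push_back(keys);

        skipSpaces();
        if (pos == text.size())
            return sets;
        if (text[pos] != ',')
            throw std::invalid_argument("',' expected at position " + std::to_string(pos));
        ++pos;
    }
}
```

Under these assumptions, `A, (A, B), (A, C)` and `(A), (A, B), (A, C)` normalize to the same three key lists, and `x, (x), (y, z), ()` yields four sets, the last of them empty.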

taylor12805 · 2021-07-26T11:36:04Z

> @taylor12805 Hi!
> I think we need to support all the cases. Query with group by grouping sets (x, (x), (y, z), ()) works in Postgres, so I think our syntax should also support it.
>
> > I'm trying to fix grouping sets multiple elements related issue
>
> Do you need some help with it? I think I may also finish this pr, but I would really appreciate any assistance from you.

@KochetovNicolai Hi!
I have just fixed the issue of multiple grouping sets not working correctly in AggregatingStep. There is still some work remaining in the TreeRewriter part. How is yours going? Let me know if there is anywhere I can give a hand.

sensatyaki · 2021-08-19T03:04:29Z

What code changes have been made to handle the empty set, written as (), at the parsing level? For example, ((A), ()).

CLAassistant · 2021-09-28T10:45:31Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ KochetovNicolai
❌ MaxTheHuman
You have signed the CLA already but the status is still pending? Let us recheck it.

novikd · 2021-12-17T19:19:26Z

GROUPING SETS are added in #26869

MaxTheHuman added 13 commits February 9, 2021 21:40

init commit with parsing and BAD realisation

1f32690

rm unnessesary data committed by mistake

088a2ae

remove another unnessesary files

e6332b2

revert changes made to cube transform

f1f48f1

fix typo

e637bf2

erase blank line to restore initial state

c944d19

fixes

a5290e0

fix typos

3f79dfb

feat grouping-sets: initial changes

50a7e64

development

939be1e

grouping sets development

1fc3f45

grouping sets development

0dd912e

grouping sets development

8dfce15

robot-clickhouse added the pr-not-for-changelog This PR should not be mentioned in the changelog label May 16, 2021

alexey-milovidov added the can be tested label May 16, 2021

grouping sets cleanup

c83c95d

MaxTheHuman and others added 2 commits May 17, 2021 15:03

grouping sets: fix 'Port already connected' error

786e7f5

Merge branch 'master' into grouping-sets-dev

c6b2917

This was referenced May 19, 2021

ROLLUP, CUBE and GROUPING SETS #322

Closed

GROUPING SETS initial commit with suboptimal implementation #20274

Closed

GROUPING aggregate function #19426

Closed

MaxTheHuman added 3 commits May 21, 2021 02:35

grouping sets dev: fix errors, something works

e1cccd6

grouping sets: make simple aggregation with grouping sets to work

d5c9e72

grouping sets: fix

dfcbd5a

KochetovNicolai self-assigned this May 25, 2021

MaxTheHuman added 3 commits May 27, 2021 22:11

Merge branch 'master' of github.com:ClickHouse/ClickHouse into grouping-sets-dev

ea1bb41

grouping sets: add tests, fix bug

049fb24

Merge branch 'master' of github.com:ClickHouse/ClickHouse into grouping-sets-dev

47e5768

KochetovNicolai reviewed May 31, 2021

View reviewed changes

MaxTheHuman and others added 5 commits June 3, 2021 02:00

Merge branch 'master' of github.com:ClickHouse/ClickHouse into grouping-sets-dev

699f114

Merge branch 'master' of github.com:ClickHouse/ClickHouse into grouping-sets-dev

8b6bd2c

grouping-sets: rearrange result columns so that resize is possible after grouping sets

ac330cd

make grouping sets work with total

2626d38

Merge branch 'master' into MaxTheHuman-grouping-sets-dev

956100b

taylor12805 mentioned this pull request Jul 28, 2021

Grouping sets dev #26869

Merged

alexey-milovidov removed the can be tested label Dec 9, 2021

novikd closed this Dec 17, 2021

MaxTheHuman wants to merge 27 commits into ClickHouse:master from MaxTheHuman:grouping-sets-dev


10 participants
