Skip to content

[RFC] Fix data skipping indexes for columns with additional actions during merge#11162

Merged
alexey-milovidov merged 4 commits intoClickHouse:masterfrom
azat:data-skip-index-merging-params-fix
May 31, 2020
Merged

[RFC] Fix data skipping indexes for columns with additional actions during merge#11162
alexey-milovidov merged 4 commits intoClickHouse:masterfrom
azat:data-skip-index-merging-params-fix

Conversation

@azat
Copy link
Copy Markdown
Member

@azat azat commented May 23, 2020

Changelog category (leave one):

  • Bug Fix

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
If data skipping index is dependent on columns that are going to be modified during background merge (for SummingMergeTree, AggregatingMergeTree as well as for TTL GROUP BY), it was calculated incorrectly. This issue is fixed by moving index calculation after merge so the index is calculated on merged data.

Detailed description / Documentation draft:
For example for AggregatingMergeTree if the column in the data skipping
index is under some aggregate function, then data skipping index will be
incorrect.

Fix this by calculating data skipping indexes after aggregation has been
done.

P.S. looks like all versions affected

@azat azat changed the title RFC] Fix data skipping indexes for columns with additional actions during merge [RFC] Fix data skipping indexes for columns with additional actions during merge May 23, 2020
@blinkov blinkov added the pr-bugfix Pull request with bugfix, not backported by default label May 23, 2020
azat added 3 commits May 24, 2020 16:12
…merge

For example for AggregatingMergeTree if the column in the data skipping
index is under some aggregate function, then data skipping index will be
incorrect.

Fix this by calculating data skipping indexes after aggregation has been
done.
@azat
Copy link
Copy Markdown
Member Author

azat commented May 24, 2020

Rebased, previous HEAD - 7d196d08e4af015664675beac6d2f6714db673a5 (contains tests)

@azat azat force-pushed the data-skip-index-merging-params-fix branch from 7d196d0 to c36708c Compare May 24, 2020 13:13
@abyss7 abyss7 self-assigned this May 26, 2020
@alexey-milovidov
Copy link
Copy Markdown
Member

@abyss7 No work done on review, unassigned.

@alexey-milovidov alexey-milovidov self-assigned this May 31, 2020
@alexey-milovidov
Copy link
Copy Markdown
Member

Ok.

I've edited the changelog entry to make it more comprehensible (so the user can understand the scope of this issue):

From:

Fix data skipping indexes for columns with additional actions during merge

To:

If data skipping index is dependent on columns that are going to be modified during background merge (for SummingMergeTree, AggregatingMergeTree as well as for TTL GROUP BY), it was calculated incorrectly. This issue is fixed by moving index calculation after merge.

@alexey-milovidov alexey-milovidov merged commit 8accde7 into ClickHouse:master May 31, 2020
alexey-milovidov added a commit that referenced this pull request May 31, 2020
@azat azat deleted the data-skip-index-merging-params-fix branch May 31, 2020 18:13
@azat
Copy link
Copy Markdown
Member Author

azat commented May 31, 2020

I've edited the changelog entry to make it more comprehensible (so the user can understand the scope of this issue):

I always forget making changelog entry "user friendly", thanks!

KochetovNicolai pushed a commit that referenced this pull request Jun 8, 2020
[RFC] Fix data skipping indexes for columns with additional actions during merge

(cherry picked from commit 8accde7)
KochetovNicolai pushed a commit that referenced this pull request Jun 8, 2020
[RFC] Fix data skipping indexes for columns with additional actions during merge

(cherry picked from commit 8accde7)
KochetovNicolai pushed a commit that referenced this pull request Jun 8, 2020
[RFC] Fix data skipping indexes for columns with additional actions during merge

(cherry picked from commit 8accde7)
KochetovNicolai added a commit that referenced this pull request Jun 8, 2020
KochetovNicolai added a commit that referenced this pull request Jun 9, 2020
@KochetovNicolai
Copy link
Copy Markdown
Member

@alexey-milovidov
Copy link
Copy Markdown
Member

@KochetovNicolai Should we also revert from 20.4 and master?

@KochetovNicolai
Copy link
Copy Markdown
Member

As I see from tests, it works for 20.4 and master.
Probably, some refactoring wasn't backported for other branches.

@azat
Copy link
Copy Markdown
Member Author

azat commented Jun 11, 2020

This PR should not be backported without #11259

KochetovNicolai added a commit that referenced this pull request Jun 11, 2020
@KochetovNicolai
Copy link
Copy Markdown
Member

I've decided to not backport this pr at all.

alesapin added a commit that referenced this pull request Jun 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-bugfix Pull request with bugfix, not backported by default pr-no-backport

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants