
Fix max granules size in MergeTreeDataWriter #19123

Merged
alesapin merged 2 commits into master from additional_check_in_writer
Jan 16, 2021
Conversation

@alesapin
Member

@alesapin alesapin commented Jan 15, 2021

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • Bug Fix

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix a bug in the MergeTree data writer that could produce marks larger than the fixed granularity size. Fixes #18913.
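The invariant the bug violated can be sketched with a tiny model, assuming the default `index_granularity` of 8192 rows (a minimal illustration, not ClickHouse internals):

```python
# Minimal model of fixed index granularity: a block of rows is split
# into granules (marks), and no granule may exceed the granularity.
FIXED_GRANULARITY = 8192  # default index_granularity for MergeTree tables

def split_into_granules(num_rows, granularity=FIXED_GRANULARITY):
    granules = []
    while num_rows > 0:
        take = min(num_rows, granularity)
        granules.append(take)
        num_rows -= take
    return granules

granules = split_into_granules(20000)
assert granules == [8192, 8192, 3616]
assert all(g <= FIXED_GRANULARITY for g in granules)
```

The bug meant this upper bound could be exceeded when a new block was appended onto a partially filled last mark.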

@robot-clickhouse robot-clickhouse added the pr-not-for-changelog This PR should not be mentioned in the changelog label Jan 15, 2021
@alesapin
Member Author

alesapin commented Jan 15, 2021

Ok, I know how to fix it now, but the error is caused by the code complexity... It's sad.

@robot-clickhouse robot-clickhouse added pr-bugfix Pull request with bugfix, not backported by default and removed pr-not-for-changelog This PR should not be mentioned in the changelog labels Jan 15, 2021

client.query("SYSTEM STOP MERGES t")

# These block sizes are taken from the real table which reproduces the error
Member Author


Bug description here.
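The reproduction pattern above (stop merges, then write blocks of specific sizes one after another) can be simulated with a small Python model. The block sizes below are made up for illustration, not the ones from the real table; the assertion encodes the property the writer must preserve:

```python
# Illustrative simulation: with merges stopped, each incoming block is
# written onto the previous part's last (possibly incomplete) mark.
# No mark may ever grow beyond the fixed granularity.
GRANULARITY = 8192

def write_blocks(block_sizes, granularity=GRANULARITY):
    marks = []           # rows per mark, across all written blocks
    last_incomplete = 0  # rows in the last, not-yet-full mark
    for rows in block_sizes:
        if last_incomplete:
            # Top up the previous mark, but only to the granularity cap.
            take = min(rows, granularity - last_incomplete)
            marks[-1] += take
            rows -= take
            last_incomplete = 0
        while rows > 0:
            take = min(rows, granularity)
            marks.append(take)
            rows -= take
        if marks and marks[-1] < granularity:
            last_incomplete = marks[-1]
    return marks

marks = write_blocks([10000, 5000, 9000])
assert marks == [8192, 8192, 7616]
assert all(m <= GRANULARITY for m in marks)
```

Without the cap in the top-up step, the last mark could absorb more rows than the fixed granularity allows, which is exactly the "incorrect mark" the debug tests caught.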

@alesapin alesapin changed the title from "Additional check for huge granules in MergeTreeDataWriter" to "Fix max granules size in MergeTreeDataWriter" Jan 15, 2021
void MergeTreeDataPartWriterWide::adjustLastMarkIfNeedAndFlushToDisk(size_t new_rows_in_last_mark)
{
/// We don't want to split already written granules to smaller
if (rows_written_in_last_mark > new_rows_in_last_mark)
Member Author


Additional self-check. We have quite a lot of them, but I think it's better than possible data corruption.

-    else /// We still can write some rows from new block into previous granule.
-        adjustLastMarkIfNeedAndFlushToDisk(index_granularity_for_block - rows_written_in_last_mark);
+    else /// We still can write some rows from new block into previous granule. So the granule size will be block granularity size.
+        adjustLastMarkIfNeedAndFlushToDisk(index_granularity_for_block);
Member Author


The whole fix is here.
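To make the one-line change concrete, here is a hypothetical Python model of the corrected call site (the names mirror the C++ identifiers, but the real logic lives in `MergeTreeDataPartWriterWide`; this is a sketch, not the implementation):

```python
# Model of choosing the target total size for the last mark before
# writing the next block's rows into it.
def adjust_last_mark_target(rows_written_in_last_mark, index_granularity_for_block):
    # Mirrors the C++ guard: already written granules are never split
    # into smaller ones, so an overfull last mark is left as-is.
    if rows_written_in_last_mark > index_granularity_for_block:
        return rows_written_in_last_mark
    # Before the fix the call site passed the *remaining capacity*
    # (index_granularity_for_block - rows_written_in_last_mark).
    # After the fix the target is the block granularity itself, so the
    # last granule is sized to exactly the block's granularity.
    return index_granularity_for_block

assert adjust_last_mark_target(100, 8192) == 8192
assert adjust_last_mark_target(9000, 8192) == 9000
```

The key point is that the argument is a *target total* for the last mark, not an increment, so passing the difference rather than the block granularity produced the wrong target.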

@alesapin
Copy link
Copy Markdown
Member Author

I think it's better to merge it to avoid strange failures in debug tests.

@alesapin alesapin merged commit 67fd381 into master Jan 16, 2021
@alesapin alesapin deleted the additional_check_in_writer branch January 16, 2021 07:17
alexey-milovidov added a commit that referenced this pull request Jan 16, 2021
Backport #19123 to 21.1: Fix max granules size in MergeTreeDataWriter

Labels

pr-bugfix Pull request with bugfix, not backported by default

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Incorrect mark is found while running debug tests

2 participants