Fix SIGSEGV for external GROUP BY and overflow row#23962
Merged
alexey-milovidov merged 2 commits intoClickHouse:masterfrom May 10, 2021
Merged
Fix SIGSEGV for external GROUP BY and overflow row#23962alexey-milovidov merged 2 commits intoClickHouse:masterfrom
alexey-milovidov merged 2 commits intoClickHouse:masterfrom
Conversation
Overflow row is used for GROUP BY if all of the above is true: - WITH TOTALS is requested - max_rows_to_group_by > 0 - group_by_overflow_mode = any - totals_mode != after_having_exclusive And in case of overflow row and external GROUP BY, once the temporary file dumps to disk it resets without_key data variant to nullptr, so any subsequent dump to disk will cause SIGSEGV. Fix this, by recreating without_key data variant after dumping to disk, instead of reseting to nullptr. And also add sanity check (LOGICAL_ERROR) to make error more deterministic in case of such error. Found with fuzzer [1]. [1]: https://clickhouse-test-reports.s3.yandex.net/23929/e7027e052998540ee660d186727e20f9555b729d/fuzzer_ubsan/report.html#fail1
alexey-milovidov
approved these changes
May 8, 2021
Member
|
Is total row calculated correctly if data was flushed to disk? |
Change aggregation function to uniqCombined() to check this.
Member
Author
Yes (I've changed the aggregate function in test to uniqCombined) |
Member
Author
Unrelated
Broken in upstream/master |
This was referenced May 10, 2021
robot-clickhouse
pushed a commit
that referenced
this pull request
May 10, 2021
robot-clickhouse
pushed a commit
that referenced
this pull request
May 10, 2021
robot-clickhouse
pushed a commit
that referenced
this pull request
May 10, 2021
kitaisreal
added a commit
that referenced
this pull request
May 11, 2021
Backport #23962 to 21.3: Fix SIGSEGV for external GROUP BY and overflow row
kitaisreal
added a commit
that referenced
this pull request
May 11, 2021
Backport #23962 to 21.5: Fix SIGSEGV for external GROUP BY and overflow row
kitaisreal
added a commit
that referenced
this pull request
May 11, 2021
Backport #23962 to 21.4: Fix SIGSEGV for external GROUP BY and overflow row
robot-clickhouse
pushed a commit
that referenced
this pull request
May 28, 2021
alexey-milovidov
added a commit
that referenced
this pull request
Jun 12, 2021
Backport #23962 to 20.8: Fix SIGSEGV for external GROUP BY and overflow row
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix SIGSEGV for external GROUP BY and overflow row (i.e. queries like
SELECT FROM GROUP BY WITH TOTALS SETTINGS max_bytes_before_external_group_by>0, max_rows_to_group_by>0, group_by_overflow_mode='any', totals_mode='before_having')Detailed description / Documentation draft:
Overflow row is used for GROUP BY if all of the above is true:
And in case of overflow row and external GROUP BY, once the temporary
file dumps to disk it resets without_key data variant to nullptr, so any
subsequent dump to disk will cause SIGSEGV.
Fix this, by recreating without_key data variant after dumping to disk,
instead of reseting to nullptr.
And also add sanity check (LOGICAL_ERROR) to make error more
deterministic in case of such error.
Found with fuzzer 1.