Cherry pick #99200 to 26.3: Implement automatic buckets in Map data type MergeTree serialization #100754
Merged
robot-ch-test-poll merged 33 commits into backport/26.3/99200 from Mar 27, 2026
…s. Enable new serialization by default for testing
…in MergeTree writer, split dynamic structure and dynamic subcolumns logic, split taking dynamic structure and statistics logic
When `optimize_functions_to_subcolumns = 1` and a query has `GROUP BY m` on a Map column, an expression like `m['key']` in the HAVING clause was being rewritten to a subcolumn reference `m.key_<key>`. This subcolumn was not part of GROUP BY, so the analyzer raised an exception.

Root cause: the `always_optimize_identifiers` set (now renamed to `identifiers_with_filter_optimization`) was populated for any occurrence of `arrayElement(m, key)` regardless of which clause it appeared in. This bypassed the "full column also read" guard even for HAVING.

Fix: add clause-context tracking via `in_where_prewhere_stack`. The `needChildVisit` callback sets the top of the stack to `true` only when descending into the WHERE or PREWHERE child of a QueryNode. The `identifiers_with_filter_optimization` set is now populated only when the occurrence is inside WHERE or PREWHERE. HAVING and other clauses no longer trigger the exception.

Renamed for clarity:
- `transformers_always_optimize_with_full_column` → `transformers_optimize_in_filter_with_full_column`
- `always_optimize_identifiers` → `identifiers_with_filter_optimization`

Added regression test: `04040_map_subcolumn_having_bug`.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
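The clause-context tracking described in the commit message above can be illustrated with a small standalone sketch. This is not the ClickHouse implementation (which is C++ and works on the analyzer's query tree); it is a simplified Python model written under assumptions. The `Node` class, the `array_element` helper, and `collect_filter_identifiers` are hypothetical names introduced only for illustration. Only `in_where_prewhere_stack` and `identifiers_with_filter_optimization` are taken from the commit message.

```python
from dataclasses import dataclass, field

# Hypothetical, simplified query tree; not ClickHouse's real node classes.
@dataclass
class Node:
    kind: str                                     # "query", "function", "column" or "constant"
    name: str = ""
    args: list = field(default_factory=list)      # arguments of a function node
    clauses: dict = field(default_factory=dict)   # clause name -> child, for a query node

FILTER_CLAUSES = {"where", "prewhere"}

def array_element(column, key):
    """Build the tree for arrayElement(column, key), i.e. column['key']."""
    return Node("function", "arrayElement",
                args=[Node("column", column), Node("constant", key)])

def collect_filter_identifiers(root):
    """Collect columns used as arrayElement(col, key) inside WHERE or PREWHERE only."""
    identifiers_with_filter_optimization = set()
    in_where_prewhere_stack = [False]  # one entry per enclosing query node

    def visit(node):
        if node.kind == "query":
            # Each query node gets its own frame, so a subquery does not
            # inherit the clause context of the query that contains it.
            in_where_prewhere_stack.append(False)
            for clause, child in node.clauses.items():
                # Set the top of the stack according to the clause being entered.
                in_where_prewhere_stack[-1] = clause in FILTER_CLAUSES
                visit(child)
            in_where_prewhere_stack.pop()
        elif node.kind == "function":
            is_map_lookup = (node.name == "arrayElement"
                             and node.args
                             and node.args[0].kind == "column")
            if is_map_lookup and in_where_prewhere_stack[-1]:
                identifiers_with_filter_optimization.add(node.args[0].name)
            for arg in node.args:
                visit(arg)

    visit(root)
    return identifiers_with_filter_optimization
```

In this model, a query whose only `m['key']` occurrence is in HAVING yields an empty set, while the same expression in WHERE yields a set containing `m`, which mirrors the behavior the fix describes.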
…a upper bound limit for number of buckets
Implement automatic buckets in Map data type MergeTree serialization
…e into cherrypick/26.3/99200
Original pull-request #99200
Do not merge this PR manually
This pull-request is the first step of automated backporting.
It contains changes similar to calling `git cherry-pick` locally.
If you intend to continue backporting the changes, then resolve all conflicts, if any.
Otherwise, if you do not want to backport them, then just close this pull-request.
The check results do not matter at this step; you can safely ignore them.
Troubleshooting
If the conflicts were resolved in a wrong way
If this cherry-pick PR is completely broken by an incorrect conflict resolution, and you want to recreate it:
- Delete the `pr-cherrypick` label from the PR.
You also need to check the Original pull-request for the `pr-backports-created` label, and delete it if it is present there.
The PR source
The PR is created in the CI job