Cherry pick #99200 to 26.3: Implement automatic buckets in Map data type MergeTree serialization#100754

Merged
robot-ch-test-poll merged 33 commits into backport/26.3/99200 from cherrypick/26.3/99200
Mar 27, 2026
@robot-clickhouse-ci-1
Contributor

Original pull-request #99200

Do not merge this PR manually

This pull-request is a first step of an automated backporting.
It contains changes similar to calling git cherry-pick locally.
If you intend to continue backporting the changes, then resolve all conflicts if any.
Otherwise, if you do not want to backport them, then just close this pull-request.

The check results do not matter at this step; you can safely ignore them.

Troubleshooting

If the conflicts were resolved in a wrong way

If this cherry-pick PR has been broken by an incorrect conflict resolution and you want to recreate it:

  • delete the pr-cherrypick label from the PR
  • delete this branch from the repository

You also need to check the original pull request for the pr-backports-created label and delete it if it is present there.

The PR source

The PR is created in the CI job

Avogar and others added 30 commits March 10, 2026 14:34
…s. Enable new serialization by default for testing
…in MergeTree writer, split dynamic structure and dynamic subcolumns logic, split taking dynamic structure and statistics logic
When `optimize_functions_to_subcolumns = 1` and a query has `GROUP BY m`
on a Map column, an expression like `m['key']` in the HAVING clause was
being rewritten to a subcolumn reference `m.key_<key>`. This subcolumn was
not part of GROUP BY, so the analyzer raised an exception.

Root cause: the `always_optimize_identifiers` set (now renamed to
`identifiers_with_filter_optimization`) was populated for any occurrence
of `arrayElement(m, key)` regardless of which clause it appeared in.
This bypassed the "full column also read" guard even for HAVING.

Fix: add clause-context tracking via `in_where_prewhere_stack`. The
`needChildVisit` callback sets the top of the stack to `true` only when
descending into the WHERE or PREWHERE child of a QueryNode. The
`identifiers_with_filter_optimization` set is now populated only when the
occurrence is inside WHERE or PREWHERE. HAVING and other clauses no longer
trigger the exception.

Renamed for clarity:
- `transformers_always_optimize_with_full_column` →
  `transformers_optimize_in_filter_with_full_column`
- `always_optimize_identifiers` →
  `identifiers_with_filter_optimization`

Added regression test: `04040_map_subcolumn_having_bug`.
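
The clause-context tracking described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual ClickHouse analyzer pass: the `MapElement`, `Func`, and `Query` classes and the `collect_filter_identifiers` function are hypothetical names invented for illustration, and only `in_where_prewhere_stack` and `identifiers_with_filter_optimization` are taken from the description. The visitor pushes a flag when it enters a query, sets it to true only while walking WHERE or PREWHERE, and records a Map column only when the flag on top of the stack is true.

```python
from dataclasses import dataclass, field

# Clauses in which the filter optimization is allowed (per the description above).
FILTER_CLAUSES = ("WHERE", "PREWHERE")


@dataclass
class MapElement:
    """Hypothetical stand-in for arrayElement(m, key), i.e. m['key']."""
    column: str
    key: str


@dataclass
class Func:
    """Hypothetical stand-in for any other function call."""
    name: str
    args: list = field(default_factory=list)


@dataclass
class Query:
    """Hypothetical query node: clause name -> list of expressions."""
    clauses: dict = field(default_factory=dict)


def collect_filter_identifiers(root):
    """Return Map columns whose m['key'] occurs inside WHERE or PREWHERE."""
    identifiers_with_filter_optimization = set()
    in_where_prewhere_stack = []

    def visit(node):
        if isinstance(node, Query):
            # A new query scope starts outside any filter clause.
            in_where_prewhere_stack.append(False)
            for clause, expressions in node.clauses.items():
                # The flag is true only while descending into WHERE/PREWHERE.
                in_where_prewhere_stack[-1] = clause in FILTER_CLAUSES
                for expression in expressions:
                    visit(expression)
            in_where_prewhere_stack.pop()
        elif isinstance(node, Func):
            for arg in node.args:
                visit(arg)
        elif isinstance(node, MapElement):
            if in_where_prewhere_stack and in_where_prewhere_stack[-1]:
                identifiers_with_filter_optimization.add(node.column)
        # Anything else (plain column names, literals) is ignored in this toy model.

    visit(root)
    return identifiers_with_filter_optimization
```

In this model, a query shaped like `GROUP BY m HAVING m['key'] = 1` yields an empty set, so the HAVING expression would not be singled out for the subcolumn rewrite, while the same expression in WHERE yields `{"m"}`. Using a stack rather than a single flag keeps a subquery's HAVING from inheriting the filter context of an enclosing WHERE.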

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Implement automatic buckets in Map data type MergeTree serialization
@robot-clickhouse-ci-1 added the labels `pr-cherrypick` (Cherry-pick of merge-commit before backporting. Do not use manually - automated use only!) and `do not test` (disable testing on pull request) on Mar 25, 2026
Labels

- `do not test`: disable testing on pull request
- `pr-cherrypick`: Cherry-pick of merge-commit before backporting. Do not use manually - automated use only!

5 participants