[#67770]: Parallel replicas support for Merge tables by matt-metivier · Pull Request #95128 · ClickHouse/ClickHouse

matt-metivier · 2026-01-26T03:48:14Z

Changes

Added parallel_replicas_allow_merge_tables setting (default true) to allow Merge tables to use parallel replicas.
Updated findParallelReplicasQuery to recognize StorageMerge and TABLE_FUNCTION nodes as valid candidates for parallel execution.
Updated PlannerJoinTree and InterpreterSelectQuery to respect the new setting and enable parallel replicas for Merge tables.
Added stateless test 03725_parallel_replicas_merge_table to verify behavior.

Closes #67770

Changelog category (leave one):

New Feature

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Added support for parallel replicas with Merge tables.

clickhouse-gh · 2026-01-26T03:51:04Z

Workflow [PR], commit [2b38544]

Summary: ❌

job_name	test_name	status	info
Fast test		failure
	Build ClickHouse	failure
Build (arm_tidy)		failure
	Build ClickHouse	failure	cidb
Build (amd_debug)		dropped
Build (amd_asan)		dropped
Build (amd_tsan)		dropped
Build (amd_msan)		dropped
Build (amd_ubsan)		dropped
Build (amd_binary)		dropped
Build (arm_asan)		dropped
Build (arm_binary)		dropped

…rge() table function

Enable parallel replicas for StorageMerge and merge() table function. When enabled via parallel_replicas_allow_merge_tables setting, queries to Merge tables or merge() function distribute across replicas.

Key changes:
- Add parallel_replicas_allow_merge_tables setting (disabled by default)
- Allow TABLE_FUNCTION nodes with StorageMerge in parallel replicas eligibility check
- Use empty StorageID for table functions to skip table existence check on replicas
- Disable parallel replicas coordination for child tables on followers to prevent duplicate announcement errors from the coordinator

Limitation: Only works when Merge table/function maps to a single underlying MergeTree table per replica. Multiple matched tables cause coordinator errors.

matt-metivier · 2026-01-30T15:29:25Z

@devcrafter I'm planning to spend this weekend working on this, hopefully, but if you have suggestions, please let me know.

devcrafter · 2026-02-02T18:27:48Z

@matt-metivier Here are my thoughts, with some explanations that may help.

Parallel replicas is a mechanism to parallelize query execution for a replicated MergeTree table across cluster nodes. Reading from the table is parallelized among nodes, and the data read is processed to some mergeable stage on each node. The parallelization works by having different nodes read different ranges of parts.
Reading from a table is coordinated by the parallel replicas coordinator on the initiator node.
The table snapshot (which table parts are considered for reading during query execution) is taken from the first announcement request from a replica: with parallel_replicas_local_plan = true, that is the local replica; otherwise, it is the first replica we get an announcement from.
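To make the description above concrete, here is a minimal toy model in Python. It is not ClickHouse code: `ToyCoordinator`, `MarkRange`, the part names, and the fixed `step` of 4 marks are all hypothetical, and ignoring later announcements is a simplification. It only illustrates two ideas from the comment: the first announcement fixes the snapshot, and each range of a part is handed to exactly one replica.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, List, Optional


@dataclass(frozen=True)
class MarkRange:
    """A half-open range of marks [begin, end) inside one data part."""
    part: str
    begin: int
    end: int


@dataclass
class ToyCoordinator:
    """Hypothetical stand-in for the coordinator on the initiator node."""
    snapshot: Optional[List[str]] = None
    pending: Deque[MarkRange] = field(default_factory=deque)

    def announce(self, replica: str, parts: Dict[str, int], step: int = 4) -> None:
        # Simplification: only the first announcement defines the snapshot.
        if self.snapshot is not None:
            return
        self.snapshot = sorted(parts)
        for part in self.snapshot:
            marks = parts[part]
            for begin in range(0, marks, step):
                self.pending.append(MarkRange(part, begin, min(begin + step, marks)))

    def next_range(self, replica: str) -> Optional[MarkRange]:
        # Every range is handed out once, so replicas never read the same data.
        return self.pending.popleft() if self.pending else None


coordinator = ToyCoordinator()
coordinator.announce("replica-1", {"all_1_1_0": 8, "all_2_2_0": 6})
coordinator.announce("replica-2", {"all_3_3_0": 100})  # snapshot already taken

replicas = ["replica-1", "replica-2"]
assignments = {name: [] for name in replicas}
turn = 0
while True:
    replica = replicas[turn % len(replicas)]
    task = coordinator.next_range(replica)
    if task is None:
        break
    assignments[replica].append(task)
    turn += 1
```

In this sketch the replicas simply take turns asking for work; the point is only that the set of parts is fixed once and the ranges are disjoint across replicas.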

I can imagine the following:
Since a Merge table or the merge() table function works like a UNION of the tables that match a particular regular expression, we can treat such a query as a UNION of queries over the corresponding tables.
On the initiator, we can decide which tables participate in the query. For each MergeTree table, we'll need to build an execution plan with parallel replicas and UNION them (together with the plans for non-MT tables). So each MT table will have a separate parallel replicas coordinator. It should be very similar to writing a UNION query where the tables are mentioned explicitly.
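The UNION-per-table idea can be sketched as a small planning step. This is an illustration only, not the PR's implementation: `Branch`, `plan_merge_as_union`, the catalog contents, and the "engine name ends with MergeTree" check are assumptions made for the example, as is the use of a partial regex match on the table name.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Branch:
    """One UNION branch; own_coordinator marks branches read with parallel replicas."""
    table: str
    own_coordinator: bool


def plan_merge_as_union(tables: Dict[str, str], pattern: str) -> List[Branch]:
    """Expand a Merge-style regex into one branch per matching table.

    `tables` maps table name to engine name (hypothetical catalog format).
    """
    regex = re.compile(pattern)
    branches = []
    for name in sorted(tables):
        if regex.search(name) is None:
            continue
        # Assumption: MergeTree-family engines are recognized by their name suffix.
        is_merge_tree = tables[name].endswith("MergeTree")
        branches.append(Branch(name, own_coordinator=is_merge_tree))
    return branches


catalog = {
    "events_2025": "ReplicatedMergeTree",
    "events_2026": "ReplicatedMergeTree",
    "events_buffer": "Memory",
    "users": "ReplicatedMergeTree",
}
branches = plan_merge_as_union(catalog, r"^events_")
union_sql = " UNION ALL ".join(f"SELECT * FROM {b.table}" for b in branches)
```

Here the two MergeTree branches would each get an independent coordinator, while `events_buffer` is read through an ordinary plan and `users` does not participate because it does not match the pattern.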

Please let me know if you have other considerations or questions.

@KochetovNicolai Please comment if you have something to add

KochetovNicolai · 2026-02-03T11:19:32Z

I did not read the current implementation, but I find it difficult to implement parallel replicas for Merge tables with the current infrastructure. I think the best we can do now, as @devcrafter mentioned, is to treat StorageMerge as a UNION and enable parallel replicas for each branch independently (each branch will have its own connection, coordinator, tasks, etc.). This way, we could parallelize only the reading. Changes in findParallelReplicasQuery are likely not needed (or I don't understand why they would be).

Offloading other steps, for example aggregation, will require changes in both planner and protocol (to support multiplexing between many storages).

alexey-milovidov added the can be tested Allows running workflows for external contributors label Jan 26, 2026

matt-metivier force-pushed the 67770-parallel-replicas-merge-tables branch from 99f79be to 6866414 Compare January 26, 2026 03:53

matt-metivier mentioned this pull request Jan 26, 2026

[#67770]: parallel replicas merge tables #95117

Closed

matt-metivier force-pushed the 67770-parallel-replicas-merge-tables branch from 6866414 to 1532efd Compare January 26, 2026 03:58

clickhouse-gh bot added the pr-feature Pull request with new product feature label Jan 26, 2026

matt-metivier force-pushed the 67770-parallel-replicas-merge-tables branch 9 times, most recently from b98b363 to 979a616 Compare January 27, 2026 01:09

matt-metivier force-pushed the 67770-parallel-replicas-merge-tables branch from 979a616 to 406eccc Compare January 27, 2026 01:41

devcrafter self-assigned this Jan 29, 2026

matt-metivier marked this pull request as draft January 30, 2026 14:24

matt-metivier added 2 commits February 7, 2026 17:47

[ClickHouse#67770]: Rewrite parallel replicas for Merge tables using …

2bdf803

…UNION-per-table approach

Merge remote-tracking branch 'upstream/master' into 67770-parallel-re…

2b38544

…plicas-merge-tables

# Conflicts:
#	src/Core/Settings.cpp
#	src/Planner/PlannerJoinTree.cpp

alexey-milovidov mentioned this pull request Feb 9, 2026

Parallel replicas are not implemented for Merge tables #67770

Open

matt-metivier wants to merge 3 commits into ClickHouse:master from matt-metivier:67770-parallel-replicas-merge-tables

4 participants