Unify the way how all the modes for parallel replicas are enabled and used #63521

@nikitamikhaylov

Use case

There are several modes of parallel replicas and a large number of settings that control them. To make the feature usable, we need to do the following:

  • All the modes should work directly on top of MergeTree tables, without an additional Distributed table on top.
  • Introduce a setting parallel_replicas_mode of Enum type with the values read_tasks, key_hash, key_range and sample_offset. Later on we will introduce another mode, auto.
    • This is a somewhat backward-incompatible change: for the sample_offset mode you will need to set use_parallel_replicas=true, parallel_replicas_mode='sample_offset', max_parallel_replicas=X instead of just max_parallel_replicas=X. I think this is OK, because this mode is not popular among our users and we can document it well.
    • The same applies to the other modes, but all of them are considered experimental.
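
To make the change for the sample_offset mode concrete, here is a rough sketch of how a query might look before and after. The table name `hits` and the replica count 3 are placeholders, not taken from this issue. The settings use_parallel_replicas and parallel_replicas_mode are the ones proposed above, so the second query illustrates the proposal and is not expected to run on a release that does not implement it.

```sql
-- Current behaviour as described above: setting max_parallel_replicas
-- alone is enough to enable the sample-offset mode.
-- `hits` is a hypothetical MergeTree table.
SELECT count()
FROM hits
SETTINGS max_parallel_replicas = 3;

-- Proposed behaviour (sketch only): the feature is switched on explicitly
-- and the mode is chosen through the new Enum setting.
SELECT count()
FROM hits
SETTINGS
    use_parallel_replicas = true,
    parallel_replicas_mode = 'sample_offset',
    max_parallel_replicas = 3;
```

Under this proposal, switching to another mode would presumably only require changing the value of parallel_replicas_mode (for example to 'read_tasks'), with the rest of the query unchanged.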

Additional context

Take a look at the comment in this PR: #63151
