Extend load_balancing first_or_random to first_2th_or_random #11565

@qixiaogang


Use case
The first_or_random algorithm (https://clickhouse.tech/docs/en/operations/settings/settings/#load_balancing-first_or_random) works well with 1 replica per AZ, but not with 2 replicas per AZ.
Suppose there are two AZs (A and B), with 1 shard and 2 replicas of it in each AZ. The 4 hosts are:

A_shard1_replicas1
A_shard1_replicas2
B_shard1_replicas1
B_shard1_replicas2

Configure load_balancing = first_or_random
In AZ A, remote_servers.xml is

<yandex>
    <remote_servers>
        <cross-az>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>A_shard1_replicas1</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>A_shard1_replicas2</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>B_shard1_replicas1</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>B_shard1_replicas2</host>
                    <port>9000</port>
                </replica>
            </shard>
        </cross-az>
    </remote_servers>
</yandex>

In AZ B, remote_servers.xml is

<yandex>
    <remote_servers>
        <cross-az>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>B_shard1_replicas1</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>B_shard1_replicas2</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>A_shard1_replicas1</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>A_shard1_replicas2</host>
                    <port>9000</port>
                </replica>
            </shard>
        </cross-az>
    </remote_servers>
</yandex>

In AZ A, if A_shard1_replicas1 is unavailable, the first_or_random algorithm chooses randomly among the remaining 3 hosts, two of which are in AZ B. It would be better to choose A_shard1_replicas2, which is in the same AZ.
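
To make the problem concrete, here is a rough Python model of the behaviour described above. It is only a sketch of the documented rule (first replica, else a random one), not ClickHouse's actual implementation; the function name, the `is_available` callback and the `down` set are made up for illustration.

```python
import random

def first_or_random(replicas, is_available, rng=random):
    """Simplified model: take the first configured replica if it is up,
    otherwise pick uniformly among the other available replicas."""
    if not replicas:
        return None
    first, rest = replicas[0], replicas[1:]
    if is_available(first):
        return first
    candidates = [r for r in rest if is_available(r)]
    return rng.choice(candidates) if candidates else None

# Replica order as configured in AZ A's remote_servers.xml.
az_a_order = [
    "A_shard1_replicas1",
    "A_shard1_replicas2",
    "B_shard1_replicas1",
    "B_shard1_replicas2",
]

# Hypothetical outage of the first replica in AZ A.
down = {"A_shard1_replicas1"}
rng = random.Random(0)  # seeded only to make the illustration repeatable
picks = {
    first_or_random(az_a_order, lambda h: h not in down, rng)
    for _ in range(1000)
}
cross_az_picks = {h for h in picks if h.startswith("B_")}
```

In this model `cross_az_picks` ends up non-empty: once the first replica is down, queries can land in AZ B even though A_shard1_replicas2 is still healthy.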

Describe the solution you'd like
In AZ A, we would like a first_2th_or_random load_balancing mode that behaves as follows:

  • If both A_shard1_replicas1 and A_shard1_replicas2 are available, randomly choose one of them.
  • If only one of A_shard1_replicas1 and A_shard1_replicas2 is available, choose the available one.
  • If both A_shard1_replicas1 and A_shard1_replicas2 are unavailable, randomly choose one of B_shard1_replicas1 and B_shard1_replicas2.
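
The three rules above could be sketched roughly as below. This is Python pseudologic to pin down the requested semantics, not a proposal for the actual C++ code; `first_n_or_random`, its `n` parameter and the `is_available` callback are hypothetical names, and `n=2` corresponds to first_2th_or_random.

```python
import random

def first_n_or_random(replicas, is_available, n=2, rng=random):
    """Prefer a random available replica among the first n configured ones;
    fall back to a random available replica among the rest."""
    preferred = [r for r in replicas[:n] if is_available(r)]
    if preferred:
        return rng.choice(preferred)
    fallback = [r for r in replicas[n:] if is_available(r)]
    return rng.choice(fallback) if fallback else None

# Replica order as configured in AZ A's remote_servers.xml.
az_a_order = [
    "A_shard1_replicas1",
    "A_shard1_replicas2",
    "B_shard1_replicas1",
    "B_shard1_replicas2",
]

def pick(down):
    """Pick a replica given a hypothetical set of unavailable hosts."""
    return first_n_or_random(az_a_order, lambda h: h not in down)

both_up = pick(set())                    # one of the two AZ A replicas
one_down = pick({"A_shard1_replicas1"})  # always A_shard1_replicas2
az_a_down = pick({"A_shard1_replicas1", "A_shard1_replicas2"})  # one of AZ B
```

With the replica order from AZ B's remote_servers.xml the same function would prefer the two B hosts, so each AZ stays local as long as at least one of its own replicas is up.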
