Description
Describe the bug
When we create a 10-node, 5-shard cluster in Redis 7.2.7, the cluster nodes file does not always stabilize after cluster creation. Instead, Redis continuously rewrites the file, with the shard-id for one of the shards flip-flopping between two values.
To reproduce
See the attached script, which creates a 10-node cluster. After the cluster is created, monitor the node0/cluster.nodes.conf file. If the problem has been reproduced, the shard-id for one of the shards flips between two values every few seconds.
Something like this:
Starts with shard-id c8dc...
$ cat node0/cluster.nodes.conf | grep -E "44aba9ff402bc0c1c5c30f9b4ed5cbe09257d03a|c8d701e355e383137f6624529f51a69a8c00fba8"
c8d701e355e383137f6624529f51a69a8c00fba8 127.0.0.1:6703@16703,,tls-port=0,shard-id=c8dcc477811f4e4c6b88b0c68031ee1cddfbe764 master - 0 1742243123451 4 connected 9830-13106
44aba9ff402bc0c1c5c30f9b4ed5cbe09257d03a 127.0.0.1:6708@16708,,tls-port=0,shard-id=c8dcc477811f4e4c6b88b0c68031ee1cddfbe764 slave c8d701e355e383137f6624529f51a69a8c00fba8 0 1742243120000 4 connected
Switches to e4f89...
$ cat node0/cluster.nodes.conf | grep -E "44aba9ff402bc0c1c5c30f9b4ed5cbe09257d03a|c8d701e355e383137f6624529f51a69a8c00fba8"
c8d701e355e383137f6624529f51a69a8c00fba8 127.0.0.1:6703@16703,,tls-port=0,shard-id=e4f8949f01637a67c70f747ba1e710f2ccffb844 master - 0 1742243123451 4 connected 9830-13106
44aba9ff402bc0c1c5c30f9b4ed5cbe09257d03a 127.0.0.1:6708@16708,,tls-port=0,shard-id=e4f8949f01637a67c70f747ba1e710f2ccffb844 slave c8d701e355e383137f6624529f51a69a8c00fba8 0 1742243127588 4 connected
Switches back to c8dc...
$ cat node0/cluster.nodes.conf | grep -E "44aba9ff402bc0c1c5c30f9b4ed5cbe09257d03a|c8d701e355e383137f6624529f51a69a8c00fba8"
c8d701e355e383137f6624529f51a69a8c00fba8 127.0.0.1:6703@16703,,tls-port=0,shard-id=c8dcc477811f4e4c6b88b0c68031ee1cddfbe764 master - 0 1742243127000 4 connected 9830-13106
44aba9ff402bc0c1c5c30f9b4ed5cbe09257d03a 127.0.0.1:6708@16708,,tls-port=0,shard-id=c8dcc477811f4e4c6b88b0c68031ee1cddfbe764 slave c8d701e355e383137f6624529f51a69a8c00fba8 0 1742243130000 4 connected
This just keeps repeating.
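To make the flip-flop easier to spot than with repeated grep calls, a small watcher can compare successive snapshots of the file. This is only a sketch: the function names (`parse_shard_ids`, `changed_shard_ids`, `watch`) are hypothetical, and it assumes each node line starts with a 40-character hex node ID and carries a `shard-id=` field in the second column, as in the output above.

```python
import re
import time

NODE_ID_RE = re.compile(r"[0-9a-f]{40}")
SHARD_ID_RE = re.compile(r"shard-id=([0-9a-f]{40})")


def parse_shard_ids(nodes_conf_text):
    """Map node ID -> shard-id for every node line in the file text."""
    shard_ids = {}
    for line in nodes_conf_text.splitlines():
        fields = line.split()
        # Skip blank lines and non-node lines such as the trailing "vars ..." line.
        if len(fields) < 2 or not NODE_ID_RE.fullmatch(fields[0]):
            continue
        match = SHARD_ID_RE.search(fields[1])
        if match:
            shard_ids[fields[0]] = match.group(1)
    return shard_ids


def changed_shard_ids(before, after):
    """Return {node_id: (old, new)} for nodes present in both snapshots
    whose shard-id differs."""
    return {
        node_id: (before[node_id], after[node_id])
        for node_id in before.keys() & after.keys()
        if before[node_id] != after[node_id]
    }


def watch(path, interval_seconds=1.0):
    """Poll the file and print every shard-id change until interrupted."""
    with open(path) as f:
        previous = parse_shard_ids(f.read())
    while True:
        time.sleep(interval_seconds)
        with open(path) as f:
            current = parse_shard_ids(f.read())
        for node_id, (old, new) in changed_shard_ids(previous, current).items():
            print(f"{time.strftime('%H:%M:%S')} {node_id}: {old} -> {new}")
        previous = current
```

Calling `watch("node0/cluster.nodes.conf")` from the directory where the script was run should print one line per affected node each time the shard-id changes. Polling can miss a flip that happens and reverts between two reads, so the output is a lower bound on how often the file changes.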
Expected behavior
Once the cluster is up, the shard-id should not keep changing in the cluster nodes file. The flip-flopping causes excessive updates to the file.
Additional information
Run the script from the directory containing the redis-server and redis-cli binaries. It creates nodeX directories for the 10 nodes, then runs redis-cli to create the cluster and add the replicas to the primaries.