[WIP] replication: handling PINGs as out-of-band data #8440
soloestoy wants to merge 3 commits into redis:unstable
On the master, do not feed PINGs to the replication backlog, so they do not affect the replication offset, but still send PINGs to replicas to keep the connections alive. On the replica, do not increase the offset when receiving PINGs, so the replica keeps the same offset as the master.
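
A minimal sketch of the offset accounting described above, not the PR's actual code. All names (`toy_master`, `toy_replica`, `master_propagate`, `replica_apply`, `TOY_PING`) are hypothetical, and the model only counts bytes; it assumes a PING is recognized by its exact RESP encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model: only the counters are tracked, no real I/O. */
typedef struct {
    long long repl_offset;    /* bytes fed to the replication backlog */
    long long bytes_sent;     /* bytes written to the replica connection */
} toy_master;

typedef struct {
    long long reploff;        /* bytes counted toward the replication offset */
    long long bytes_received; /* bytes read from the master connection */
} toy_replica;

/* A PING encoded in RESP (14 bytes), treated here as out-of-band data. */
const char TOY_PING[] = "*1\r\n$4\r\nPING\r\n";

int is_ping(const char *cmd, size_t len) {
    assert(cmd != NULL);
    return len == sizeof(TOY_PING) - 1 && memcmp(cmd, TOY_PING, len) == 0;
}

/* Master side: every command is sent, but PINGs skip the backlog. */
void master_propagate(toy_master *m, const char *cmd, size_t len) {
    m->bytes_sent += (long long)len;
    if (!is_ping(cmd, len)) m->repl_offset += (long long)len;
}

/* Replica side: every command is read, but PINGs do not advance reploff. */
void replica_apply(toy_replica *r, const char *cmd, size_t len) {
    r->bytes_received += (long long)len;
    if (!is_ping(cmd, len)) r->reploff += (long long)len;
}
```

In this model the two offsets stay equal only while both sides apply the same rule to the same bytes; if one side counted a PING and the other did not, the offsets would diverge by 14 bytes.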
4a2d6e7 to f430a20
|
@soloestoy i don't think this is the right approach. i don't like the fact that the master decides to make sporadic writes directly to the replicas, bypassing the replication buffers, and that the replicas just exclude certain commands from their replication offset, and that these two behaviors are not necessarily in sync (i.e. the master can mistakenly write a ping into the replication buffers and the replica will decide to exclude it on its side, causing a repl-offset mismatch). i think the right approach, which was discussed a few times in the past (don't remember where), is to multiplex several distinct byte streams: imagine that every payload that is transferred on the socket between the master and replicas carries a type field and a length.
one of the main advantages of this approach is that it lets us multiplex the replica output buffers to the replica during the rdb transfer, so that we don't have to buffer them in the master (which is also suffering from CoW). the key difference between this design and what this PR currently has is that this is a major change to the replication protocol (maybe harder to design and implement in a backwards-compatible way), but it's a structured protocol. @ShooterIT FYI (i remember recently discussing that with you)
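
A minimal sketch of the framing idea, where every payload carries a type field and a length. The wire layout (1-byte type, 4-byte big-endian length) and all names are assumptions for illustration only; no such format is defined in this thread or in Redis.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stream types multiplexed over one master-replica socket. */
enum toy_stream_type {
    TOY_STREAM_REPL = 1,    /* replication commands, counted in the offset */
    TOY_STREAM_RDB = 2,     /* RDB payload during a full sync */
    TOY_STREAM_CONTROL = 3  /* keepalives such as PING, not counted */
};

#define TOY_HEADER_LEN 5    /* 1 byte type + 4 bytes big-endian length */

/* Write a frame header into buf; returns the number of bytes written. */
size_t toy_encode_header(uint8_t *buf, uint8_t type, uint32_t len) {
    assert(buf != NULL);
    buf[0] = type;
    buf[1] = (uint8_t)(len >> 24);
    buf[2] = (uint8_t)(len >> 16);
    buf[3] = (uint8_t)(len >> 8);
    buf[4] = (uint8_t)len;
    return TOY_HEADER_LEN;
}

/* Parse a frame header; returns 0 on success, -1 if the header is incomplete. */
int toy_decode_header(const uint8_t *buf, size_t avail,
                      uint8_t *type, uint32_t *len) {
    if (avail < TOY_HEADER_LEN) return -1;
    *type = buf[0];
    *len = ((uint32_t)buf[1] << 24) | ((uint32_t)buf[2] << 16) |
           ((uint32_t)buf[3] << 8) | (uint32_t)buf[4];
    return 0;
}

/* Only payload bytes of the replication stream advance the offset. */
long long toy_account(long long reploff, uint8_t type, uint32_t len) {
    return type == TOY_STREAM_REPL ? reploff + (long long)len : reploff;
}
```

With an explicit type on every frame, the decision about what counts toward the offset is made once, by the sender, and the receiver cannot disagree about it.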
|
@oranagra I agree that multiplexing replication packets is better; the key point is the same: distinguish the data stream from the control stream. This PR is just a demo, and as you can see I marked it as WIP, the PING command flag
|
I agree; I don't think this is a building block for a long-term solution. It only solves the single problem of having meaningless data, like pings, in the replication stream, and doesn't allow us to solve some other interesting problems. We could have the conversation here, but I think an issue to think through the design would be more useful.
|
may be related, ping goes into the replication stream, one test fails |
|
@enjoy-binbin for now, we just need to change the tests to avoid this issue in affected tests. see some tests that change |
|
odd, i did see in the reason and the fix is in #11609 |
|
This feature tries to handle PINGs as out-of-band data in the replication stream. The main idea and implementation are simple:
The master propagates PINGs only to the replica clients' reply buffers and does not feed them to the replication backlog, so the replication offset is not affected.
When a replica receives a PING, it just executes it to keep the connection alive but does not increase the master client's reploff, so the replica keeps the same offset as the master. It also does not proxy the PING to its sub-replicas; it simply removes PINGs from the replication stream.
Since the replica no longer proxies PINGs, it has to send PINGs to its sub-replicas itself.
The master accepts a replica only when it supports this feature, otherwise the replication offset would be wrong, so the replica needs to send replconf capa ping-out-of-band to the master.
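
A minimal sketch of the capability check in the last point, assuming REPLCONF arguments arrive as option/value pairs with the command name already stripped. The function names and the `TOY_CAPA_PING_OOB` bit are hypothetical and are not taken from the PR.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>

#define TOY_CAPA_PING_OOB (1 << 0)  /* hypothetical capability bit */

/* Scan "capa <name>" pairs and return a bitmask of recognized capabilities.
 * Unknown capability names are ignored. */
int toy_parse_replconf_capa(const char **argv, size_t argc) {
    assert(argv != NULL || argc == 0);
    int capa = 0;
    for (size_t i = 0; i + 1 < argc; i += 2) {
        if (strcasecmp(argv[i], "capa") == 0 &&
            strcasecmp(argv[i + 1], "ping-out-of-band") == 0) {
            capa |= TOY_CAPA_PING_OOB;
        }
    }
    return capa;
}

/* Master-side gate: accept only replicas that advertised the capability. */
int toy_master_accepts(int capa) {
    return (capa & TOY_CAPA_PING_OOB) != 0;
}
```

A replica that never advertises the capability would keep counting PING bytes, which is why this sketch rejects it rather than letting the offsets drift.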