[WIP] replication: handling PINGs as out-of-band data by soloestoy · Pull Request #8440 · redis/redis

soloestoy · 2021-02-02T12:44:58Z

This feature handles PINGs as out-of-band data in the replication stream. The main idea and implementation are simple:

The master propagates PINGs only to each replica client's reply buffer and does not feed them into the replication backlog, so the replication offset is not affected.
When a replica receives a PING, it executes it to keep the link alive but does not increase the master client's reploff, so the replica keeps the same offset as the master. It also does not proxy the PING to its sub-replicas; it just removes PINGs from the replication stream.
Since the replica no longer proxies PINGs, it has to send PINGs to its sub-replicas by itself.
The master accepts a replica only when the replica supports this feature, otherwise the replication offset would be wrong, so the replica needs to send replconf capa ping-out-of-band to the master.

On the master, PINGs are not fed into the replication backlog, so they do not affect the replication offset, but they are still sent to replicas to keep the link alive. On the replica, the offset is not increased when a PING is received, so the replica keeps the same offset as the master.
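The offset bookkeeping described above can be illustrated with a small simulation. This is only a sketch of the idea, not the PR's C implementation: the `Master` and `Replica` classes, their method names, and the assumption that each delivered payload is exactly one whole command are all hypothetical simplifications.

```python
# Hypothetical model of the PR's idea: PINGs travel to replicas but never
# touch the backlog or any replication offset.
PING = b"*1\r\n$4\r\nPING\r\n"  # RESP encoding of the PING command


class Replica:
    def __init__(self):
        self.reploff = 0        # offset of the master client
        self.pings_seen = 0     # keep-alive counter only
        self.sub_replicas = []

    def receive(self, payload):
        if payload == PING:
            # Execute to keep the link alive; do not count it, do not proxy it.
            self.pings_seen += 1
            return
        self.reploff += len(payload)
        for sub in self.sub_replicas:
            sub.receive(payload)  # proxy real commands downstream

    def send_ping(self):
        # The replica pings its own sub-replicas, since it no longer proxies.
        for sub in self.sub_replicas:
            sub.receive(PING)


class Master:
    def __init__(self):
        self.repl_offset = 0
        self.backlog = bytearray()
        self.replicas = []

    def propagate(self, payload):
        # Normal write: feeds the backlog and advances the offset.
        self.backlog += payload
        self.repl_offset += len(payload)
        for replica in self.replicas:
            replica.receive(payload)

    def send_ping(self):
        # Out-of-band: reply buffers only, backlog and offset untouched.
        for replica in self.replicas:
            replica.receive(PING)
```

In this model a master, a replica, and a sub-replica all report the same offset after any mix of `propagate` and `send_ping` calls, and a PING never appears in the backlog.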

oranagra · 2021-02-02T13:09:42Z

@soloestoy i don't think this is the right approach.
it reminds me too much of the meaningful-offset feature of the early 6.0 releases.

i don't like the fact that the master decides to make sporadic writes directly to the replicas, bypassing the replication buffers, and that the replicas just exclude certain commands from their replication offset, and that these two behaviors are not necessarily in sync (i.e. the master can mistakenly write a ping into the replication buffers and the replica will decide to exclude it on its side, causing a repl-offset mismatch).
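To make the mismatch concrete, here is a minimal sketch of the failure mode, assuming each payload is one whole RESP command. The function names and the `ping_in_backlog` flag are made up for illustration and do not correspond to anything in the Redis source.

```python
PING = b"*1\r\n$4\r\nPING\r\n"
SET = b"*3\r\n$3\r\nSET\r\n$1\r\nk\r\n$1\r\nv\r\n"


def master_offset(stream, ping_in_backlog):
    """Offset on the master; PINGs count only if they were (mistakenly)
    written through the replication buffers."""
    return sum(len(p) for p in stream if p != PING or ping_in_backlog)


def replica_offset(stream):
    """Offset on the replica, which always excludes PINGs on its side."""
    return sum(len(p) for p in stream if p != PING)


stream = [SET, PING, SET]
# Both sides agree only while the master keeps PINGs out of the buffers.
in_sync = master_offset(stream, ping_in_backlog=False) == replica_offset(stream)
# One PING slipping into the buffers leaves the replica behind by len(PING).
drift = master_offset(stream, ping_in_backlog=True) - replica_offset(stream)
```

Nothing on the wire tells the replica which case it is in, which is why the two independent decisions can drift apart.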

i think the right approach which was discussed a few times in the past (don't remember where) is to use multiplexing of several distinct byte streams:

imagine that every payload that is transferred on the socket between the master and replicas carries a type field and a length.
so when the master writes data to the replica it can flag each payload as one of:

command stream - what we have today (it increments the replication offset and is written into the replication backlog)
chit chat - e.g. pings, replconf, and other configuration exchange
full sync - the rdb content that's generated by bgsave

one of the main advantages of this approach is that it lets us multiplex the replica output buffers to the replica during the rdb transfer, so that we don't have to buffer them in the master (which is also suffering from CoW).
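A type-and-length framing like the one described above could look roughly as follows. This is a sketch under stated assumptions: the 1-byte type plus 4-byte big-endian length header, the numeric type values, and all function names are invented here, since the thread does not define a wire format.

```python
import struct

# Hypothetical frame types for the three streams named in the comment.
COMMAND_STREAM, CHIT_CHAT, FULL_SYNC = 0, 1, 2
HEADER = struct.Struct("!BI")  # assumed header: 1-byte type, 4-byte length


def encode_frame(frame_type, payload):
    """Prefix a payload with its type and length."""
    return HEADER.pack(frame_type, len(payload)) + payload


def decode_frames(buf):
    """Return (complete frames, bytes consumed); a trailing partial frame
    is left in the buffer for the next read."""
    frames, pos = [], 0
    while pos + HEADER.size <= len(buf):
        frame_type, length = HEADER.unpack_from(buf, pos)
        end = pos + HEADER.size + length
        if end > len(buf):
            break  # payload not fully received yet
        frames.append((frame_type, bytes(buf[pos + HEADER.size:end])))
        pos = end
    return frames, pos


def replication_offset(frames):
    """Only command-stream payload bytes advance the replication offset."""
    return sum(len(p) for t, p in frames if t == COMMAND_STREAM)
```

Because the sender labels each payload explicitly, the receiver never has to guess which bytes count toward the offset, and command-stream frames can be interleaved with full-sync frames during the rdb transfer.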

the key difference between this design and what this PR currently has is that this is a major change to the replication protocol (maybe harder to design, and implement in a backwards compatible way), but it's a structured protocol.
the current code relies on the coincidence that pings will be treated the same on both sides (in each by a different trigger).

@ShooterIT FYI (i remember recently discussing that with you)

soloestoy · 2021-02-02T13:23:50Z

@oranagra I agree that multiplexing replication packets is better; the key point is the same: distinguish the data stream from the control stream. This PR is just a demo, and as you can see I marked it as WIP; the PING command flag out-of-replication acts like a type field. As you said, we do need to design a new protocol for replication.

madolson · 2021-02-02T17:27:26Z

I agree; I don't think this is a building block toward a long-term solution. It really only solves the single problem of having meaningless data, like pings, in the replication stream, and doesn't allow us to solve some other interesting problems. We could have the conversation here, but I think an issue to think through the design would be more useful.

enjoy-binbin · 2022-11-10T04:11:16Z

may be related: a ping goes into the replication stream and one test fails

test-freebsd

*** [err]: FLUSHDB / FLUSHALL should replicate in tests/integration/replication.tcl
2022-11-10T00:30:17.0558340Z Expected 'ping' to match 'flushall' (context: type source line 861 file /Users/runner/work/redis/redis/tests/test_helper.tcl cmd {assert_match [lindex $patterns $j] [read_from_replication_stream $s]} proc ::assert_replication_stream level 1)

oranagra · 2022-11-12T17:54:41Z

@enjoy-binbin for now, we just need to change the tests to avoid this issue in affected tests. see some tests that change repl-ping-replica-period

enjoy-binbin · 2022-11-13T15:41:49Z

odd, i did see in attach_to_replication_stream, it will set repl-ping-replica-period to 3600

the reason and the fix is in #11609


soloestoy requested a review from oranagra February 2, 2021 12:44

soloestoy added 3 commits February 2, 2021 20:49

replication: replica can send PINGs to sub-replicas by itself

9cecaa6

replication: master accept replica only when it support ping-out-of-band

f430a20

soloestoy force-pushed the ping-out-of-repl-band branch from 4a2d6e7 to f430a20 Compare February 2, 2021 12:50

oranagra mentioned this pull request Jul 6, 2021

Replication backlog and replicas use one global shared replication buffer #9166

Merged

soloestoy mentioned this pull request Jul 22, 2021

PSYNC2: make partial sync possible after master reboot #8015

Merged

oranagra mentioned this pull request Aug 14, 2022

Cleanup / refactor replication.c #11125

Open

oranagra mentioned this pull request Dec 1, 2022

[BUG] 64gb client-output-buffer-limit isn't enough for maxmemory 96gb. #11558

Closed

naglera mentioned this pull request Jan 2, 2023

Second Channel For RDB #11678

Closed

oranagra mentioned this pull request Feb 8, 2023

Implementing the WAITAOF command (issue #10505) #11713

Merged

soloestoy wants to merge 3 commits into redis:unstable from soloestoy:ping-out-of-repl-band
