Rdb channel for full sync #12109

Closed
naglera wants to merge 74 commits into redis:unstable from naglera:rdb-channel

@naglera (Contributor) commented Apr 27, 2023

In this PR we introduce the main benefit of the rdb channel: continuously streaming the COB (client output buffers) in parallel to the RDB, which keeps the master-side COB small and accelerates the overall sync process. By streaming the replication data to the replica during the full sync, we reduce:

  1. Memory load on the master node.
  2. CPU load on the master's main process.

Latest performance tests: Rdb channel for full sync #12109 (comment)

Updated replica state machine

[image: updated replica state machine diagram]

Closes #11678

naglera added 3 commits April 27, 2023 08:46
In this PR we introduce the main benefit of rdb channel by continuously
streaming the COB in parallel to the RDB and thus keeping the primary
side COB small AND accelerating the overall sync process.
By streaming the replication data to the replica during the full sync,
we reduce
1. Memory load from the primary node.
2. CPU load from the primary main process (will be introduced in later
   PR).
This opens up possibilities to future improvements with better
TLS connection handling and removal of the need to pipeline the RDB
from the child process to the main.

@oranagra (Member) left a comment

Thanks.
i didn't really dive into the details and code, but i do have a few comments about a few things i noticed.

p.s. still not certain about using a second socket rather than multiplexing yet, each has its pros and cons.

naglera and others added 4 commits May 4, 2023 11:00
1. The replica side mmap replication buffer has been replaced with a linked list
replication buffer.
2. The replica buffer limit now depends on the client-output-buffer-limit type replica.
3. Buffering policy changed to not cancel sync when replication buffer is full. We
instead continue to sync without reading from the replication data socket, so when the
replica side reaches replication buffer limits, the primary replication buffer will
take part in the replication data buffering.
Fix memory leak
Added void param to isOngoingRdbChannelSync
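
The buffering policy described in items 1-3 above can be sketched roughly as follows. This is a toy Python model, not the PR's C code: the class name `ReplDataBuffer`, the block size, and the method names are hypothetical. The point it illustrates is that the replica stops reading from the replication data socket when its buffer is full instead of cancelling the sync, so unread data stays on the primary side.

```python
from collections import deque


class ReplDataBuffer:
    """Toy model of a replica-side replication buffer built from linked blocks."""

    def __init__(self, limit_bytes, block_size=16 * 1024):
        # limit_bytes is assumed to come from client-output-buffer-limit replica.
        self.limit_bytes = limit_bytes
        self.block_size = block_size  # hypothetical block size
        self.blocks = deque()         # ordered list of bytearray blocks
        self.used = 0

    def room(self):
        """Bytes that can still be buffered; 0 means: stop reading the socket."""
        return max(self.limit_bytes - self.used, 0)

    def append(self, data):
        """Store data read from the socket; the caller must respect room()."""
        if len(data) > self.room():
            raise ValueError("caller read more than the buffer can hold")
        self.used += len(data)
        while data:
            if not self.blocks or len(self.blocks[-1]) == self.block_size:
                self.blocks.append(bytearray())
            free = self.block_size - len(self.blocks[-1])
            self.blocks[-1] += data[:free]
            data = data[free:]

    def drain(self):
        """Yield buffered blocks in order, e.g. once the snapshot is loaded."""
        while self.blocks:
            block = self.blocks.popleft()
            self.used -= len(block)
            yield bytes(block)
```

A caller would read at most `room()` bytes per socket event and simply skip the read when `room()` is 0, leaving the remaining data to accumulate in the primary's replication buffer.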

@madolson (Contributor) left a comment

Finally circled around to this again. It still feels like there is more code than there should be, and I think there is some duplication between regular and two-channel sync.

naglera added 2 commits May 23, 2023 09:46
Removed adlist method and primary block size.
Use sync write for sending rdb end offset.
Removed replconf connected sub command. The replica will send ack
instead.
Fixed git diff issue.
Refactored replicationAbortSyncTransfer + replicationAbortSyncTransfer.

@naglera (Contributor, Author) commented Jun 5, 2023

Replication buffer memory during bgsave, with rdb-channel on vs. rdb-channel off. The upper graph shows the primary's (in blue) and the replica's replication buffer size during bgsave with repl-rdb-channel off.

[image: Figure_1]

Explanation

  • The rdb channel (lower graph) allows us to transfer most of the incremental data memory load to the replica.
  • After about 15 sec, the primary replication buffer starts growing; this is the point where the replica starts synchronously loading the snapshot into the db.

Test details

  • Initialized the primary database with 3GB.
  • I used redis-benchmark to continuously set a single key with a random value of 8 bytes.
  • The primary's buffer size is measured by mem_total_replication_buffers, and the replica's buffer size by replicas_replication_buffers.

@soloestoy (Contributor) commented

Have you tried using multiplexing to achieve it? The current method of using two channels is a bit too complicated; I think multiplexing would be much simpler and could also solve the problem with PING.

@naglera (Contributor, Author) commented Jul 2, 2023

Hi @soloestoy, I have considered multiplexing. Although multiplexing also allows the replication data and the rdb to be sent simultaneously, the key point is that with another channel the child process can write directly to the replica. That completely eliminates the need for a pipeline to the redis main process. In addition to removing a lot of complex pipeline code, this will increase the responsiveness of the main process during synchronization.
The design is better explained in #11678.

@oranagra (Member) left a comment

Thank you for the PR.
I reviewed the code, without the tests for now and added many comments.

Here are a few top level notes (each also has a thread of its own in the comments, but i want to draw attention to these first):

  • The review took long and it could be that some comments are duplicates.
  • I think that before merging this we better refactor replication.c into two files (master related code and replica related code); i'll discuss this with the core team.
  • I think we should change the state machine / connection sequence to a different way to do capability exchange and fallback to old mechanism without a reconnect.
  • I think we're mixing several different aspects in REPLCONF that should be kept separate.
  • I feel that the two connections should be coupled in some way (redis will be aware that they're a pair). this could help us make sure the end-offset doesn't disappear from the backlog if it is small.
  • i feel the terminology isn't consistent (rdb channel vs. second channel and so on), and that we need to decide on it and sort it out.

@naglera (Contributor, Author) commented Jan 1, 2024

Hi @oranagra, @madolson, sorry for the delay. I'm still working on previous comments. In the meantime I have some exciting results I want to share. I worked on reviving the connset structure and handlers, in order to directly stream online changes from the child process to the replica, without a pipeline to the main process. Here are some of the results.

Data

Explanation

These graphs demonstrate performance improvements during full sync sessions using rdb-channel + streaming rdb directly from the background process to the replica.

First graph: with at most 50 clients and lightweight commands, we saw a 5%-7.5% improvement in write latency during the sync session.
Two graphs below: full sync was tested during heavy read commands on the primary (such as sdiff, sunion on large sets). In that case, the child process writes to the replica without sharing CPU with the loaded main process. As a result, this not only improves client response time, but may also shorten sync time by about 50%. The shorter sync time results in less memory being used to store replication diffs (>60% in some of the tested cases).

Test setup

Both primary and replica in the performance tests ran on the same machine. RDB size in all tests is 3.7 GB. I generated write load using redis-benchmark: `./redis-benchmark -r 100000 -n 6000000 lpush my_list __rand_int__`.


I will soon create a second PR for this change (on top of this PR), to avoid making this PR more complex than it already is.

@oranagra (Member) commented Jan 1, 2024

nice results.
i imagine that we can improve the latency spikes (p99) with the old approach too (split some memory copying to smaller bulks).
and the huge improvement in time is probably due to having redis fully utilizing its core, while another core is completely free (which may not always be the case, either because redis isn't busy, or because all other cores are).
it'll still be an improvement even if these conditions were not true, but probably not that noticeable.

anyway, regardless of this being a 100% improvement or a 10% improvement, it is a good one, and now that it's possible to do (due to the separate channel which i mainly wanted to move the memory to the other end), i'd like to proceed.

but what are the benefits of a second PR?
before we merge this one, the diff will show both of them.
i think we can incrementally review and merge both of them in this one.

@naglera (Contributor, Author) commented Jan 1, 2024

I wanted a second PR just because this change can live without the pipeline removal. Besides making this PR shorter, there is no benefit. Let's continue on this PR then.

naglera added 5 commits March 12, 2024 12:58
…sync connection

This is necessary for the case in which the RDB is loaded before psync
is established. We do that by protecting the RDB client for a short grace
period (5 sec) that will allow the replica main channel to finish the
handshake.

@CLAassistant commented Mar 24, 2024

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
0 out of 2 committers have signed the CLA.

❌ naglera
❌ amitnagl

@naglera naglera closed this Sep 5, 2024
@naglera naglera deleted the rdb-channel branch September 5, 2024 09:56
@tezc tezc mentioned this pull request Jan 8, 2025
tezc added a commit that referenced this pull request Jan 13, 2025
This PR is based on:

#12109
valkey-io/valkey#60

Closes: #11678

**Motivation**

During a full sync, when master is delivering RDB to the replica,
incoming write commands are kept in a replication buffer in order to be
sent to the replica once RDB delivery is completed. If RDB delivery
takes a long time, it might create memory pressure on master. Also, once
a replica connection accumulates replication data which is larger than
output buffer limits, master will kill replica connection. This may
cause a replication failure.

The main benefit of the rdb channel replication is streaming incoming
commands in parallel to the RDB delivery. This approach shifts
replication stream buffering to the replica and reduces load on master.
We do this by opening another connection for RDB delivery. The main
channel on replica will be receiving replication stream while rdb
channel is receiving the RDB.

This feature also helps to reduce master's main process CPU load. By
opening a dedicated connection for the RDB transfer, the bgsave process
has access to the new connection and it will stream RDB directly to the
replicas. Before this change, due to TLS connection restriction, the
bgsave process was writing RDB bytes to a pipe and the main process was
forwarding it to the replica. This is no longer necessary, so the main
process can avoid these expensive socket read/write syscalls. It also
means RDB delivery to replica will be faster as it avoids this step.

In summary, replication will be faster and master's performance during
full syncs will improve.
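
As a back-of-the-envelope illustration of the memory argument above, the following sketch estimates how much replication data piles up during RDB delivery and where it sits. The function names and all numbers are hypothetical; the replica-side limit it uses is the one described under "Some details".

```python
def buffered_bytes(write_rate_bytes_per_sec, transfer_seconds):
    """Replication data produced while the RDB is being delivered."""
    return write_rate_bytes_per_sec * transfer_seconds


def split_buffering(total_bytes, replica_limit_bytes):
    """With an rdb channel: (bytes buffered on replica, bytes left on master).

    The replica accumulates up to its limit; anything beyond that stays
    in the master's replication buffer.
    """
    on_replica = min(total_bytes, replica_limit_bytes)
    return on_replica, total_bytes - on_replica


# Hypothetical workload: 5 MB/s of writes during a 60 second RDB transfer.
total = buffered_bytes(5_000_000, 60)
# Without an rdb channel the master holds all of `total`.
# With it, assuming a 256 MB replica-side limit:
on_replica, on_master = split_buffering(total, 256_000_000)
```

In this made-up case the master holds 44 MB instead of 300 MB, which is the shift in buffering the motivation describes.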


**Implementation steps**

1. When replica connects to the master, it sends 'rdb-channel-repl' as
part of capability exchange to let master know replica supports rdb
channel.
2. When replica lacks sufficient data for PSYNC, master sends
+RDBCHANNELSYNC reply with replica's client id. As the next step, the
replica opens a new connection (rdb-channel) and configures it against
the master with the appropriate capabilities and requirements. It also
sends the given client id back to master over the rdb-channel, so that
master can associate these channels (the initial replica connection is
referred to as the main-channel). Then, replica requests fullsync using
the rdb-channel.
3. Prior to forking, master attaches the replica's main channel to the
replication backlog to deliver replication stream starting at the
snapshot end offset.
4. The master main process sends replication stream via the main
channel, while the bgsave process sends the RDB directly to the replica
via the rdb-channel. Replica accumulates replication stream in a local
buffer, while the RDB is being loaded into the memory.
5. Once the replica completes loading the rdb, it drops the rdb channel
and streams the accumulated replication stream into the db. Sync is
completed.
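
The decision the replica makes in step 2 can be sketched as a small reply classifier. This is an illustrative Python sketch, not the Redis C implementation: the function name and the returned action labels are hypothetical, while the reply prefixes (`+CONTINUE`, `+FULLRESYNC`, `+RDBCHANNELSYNC <client-id>`) are the ones named in this description.

```python
def classify_psync_reply(line):
    """Map a PSYNC reply line to the next action on the replica side.

    Returns (action, client_id); client_id is only set for +RDBCHANNELSYNC.
    """
    parts = line.strip().split()
    if not parts:
        raise ValueError("empty reply")
    kind = parts[0].upper()
    if kind == "+CONTINUE":
        # Partial resync: keep using the main channel only.
        return ("partial_resync", None)
    if kind == "+FULLRESYNC":
        # Classic full sync over a single connection.
        return ("full_sync_single_channel", None)
    if kind == "+RDBCHANNELSYNC":
        if len(parts) != 2:
            raise ValueError("expected a client id after +RDBCHANNELSYNC")
        # Open the rdb-channel and send this id back so master can pair them.
        return ("open_rdb_channel", int(parts[1]))
    raise ValueError(f"unexpected reply: {line!r}")
```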

**Some details**
- Currently, rdbchannel replication is supported only if
`repl-diskless-sync` is enabled on master. Otherwise, replication will
happen over a single connection as before.
- On replica, there is a limit to replication stream buffering. Replica
uses a new config `replica-full-sync-buffer-limit` to limit number of
bytes to accumulate. If it is not set, replica inherits
`client-output-buffer-limit <replica>` hard limit config. If we reach
this limit, replica stops accumulating. This is not a failure scenario
though. Further accumulation will happen on master side. Depending on
the configured limits on master, master may kill the replica connection.
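
How the replica-side limit is resolved can be sketched as below. This is a minimal sketch under assumptions: the function names are hypothetical, and the case where the inherited hard limit is itself 0 is not modeled because this description does not specify it.

```python
def effective_full_sync_buffer_limit(replica_full_sync_buffer_limit,
                                     cob_replica_hard_limit):
    """Pick the byte limit for accumulating the replication stream.

    A value of 0 (not set) means: inherit the hard limit of
    client-output-buffer-limit for the replica class.
    """
    if replica_full_sync_buffer_limit > 0:
        return replica_full_sync_buffer_limit
    return cob_replica_hard_limit


def should_keep_accumulating(buffered_bytes, limit_bytes):
    """Once the limit is hit the replica stops accumulating (not a failure);
    further accumulation happens on the master side."""
    return buffered_bytes < limit_bytes
```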

**API changes in INFO output:**

1. New replica state: `send_bulk_and_stream`. Indicates full sync is
still in progress for this replica. It is receiving replication stream
and rdb in parallel.
```
slave0:ip=127.0.0.1,port=5002,state=send_bulk_and_stream,offset=0,lag=0
```
Replica state changes in steps:
- First, replica sends psync and receives +RDBCHANNELSYNC:
`state=wait_bgsave`
- After replica connects with rdbchannel and delivery starts:
`state=send_bulk_and_stream`
- After full sync: `state=online`

2. On replica side, replication stream buffering metrics:
- replica_full_sync_buffer_size: Currently accumulated replication
stream data in bytes.
- replica_full_sync_buffer_peak: Peak number of bytes that this instance
accumulated in the lifetime of the process.

```
replica_full_sync_buffer_size:20485             
replica_full_sync_buffer_peak:1048560
```
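
For monitoring scripts, the new fields can be read with a plain `key:value` parser. The sketch below is only an illustration: the helper names are hypothetical, and the sample texts are assembled from the example lines shown above rather than captured from a running server.

```python
def parse_info(text):
    """Parse 'key:value' lines of INFO output, skipping '# Section' headers."""
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def parse_replica_entry(value):
    """Split 'ip=...,port=...,state=...' into a dict."""
    return dict(item.split("=", 1) for item in value.split(","))


# Sample as seen on the master side.
MASTER_SAMPLE = """\
# Replication
slave0:ip=127.0.0.1,port=5002,state=send_bulk_and_stream,offset=0,lag=0
"""

# Sample as seen on the replica side.
REPLICA_SAMPLE = """\
replica_full_sync_buffer_size:20485
replica_full_sync_buffer_peak:1048560
"""

replica_state = parse_replica_entry(parse_info(MASTER_SAMPLE)["slave0"])["state"]
buffer_size = int(parse_info(REPLICA_SAMPLE)["replica_full_sync_buffer_size"])
```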

**API changes in CLIENT LIST**

In `client list` output, rdbchannel clients will have 'C' flag in
addition to 'S' replica flag:
```
id=11 addr=127.0.0.1:39108 laddr=127.0.0.1:5001 fd=14 name= age=5 idle=5 flags=SC db=0 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=0 multi-mem=0 rbs=1024 rbp=0 obl=0 oll=0 omem=0 tot-mem=1920 events=r cmd=psync user=default redir=-1 resp=2 lib-name= lib-ver= io-thread=0
```
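
A tool that wants to tell rdbchannel clients apart could check the flags field as sketched here. The helper names are hypothetical, and the sample line is a shortened copy of the example above.

```python
def parse_client_line(line):
    """Parse one CLIENT LIST line of space-separated key=value tokens."""
    fields = {}
    for token in line.split():
        key, _, value = token.partition("=")
        fields[key] = value
    return fields


def is_rdb_channel_client(fields):
    """True when the client carries both the 'S' (replica) and 'C' flags."""
    flags = fields.get("flags", "")
    return "S" in flags and "C" in flags


SAMPLE_LINE = ("id=11 addr=127.0.0.1:39108 laddr=127.0.0.1:5001 fd=14 name= "
               "age=5 idle=5 flags=SC db=0 cmd=psync")
sample_is_rdb_channel = is_rdb_channel_client(parse_client_line(SAMPLE_LINE))
```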

**Config changes:**
- `replica-full-sync-buffer-limit`: Controls how much replication data
replica can accumulate during rdbchannel replication. If it is not set,
or is set to 0, replica will inherit `client-output-buffer-limit
<replica>` hard limit config to limit accumulated data.
- `repl-rdb-channel` config is added as a hidden config. This is mostly
for testing, as we need to support both rdbchannel replication and the
older single connection replication (to keep compatibility with older
versions, and because rdbchannel replication will not be enabled if
repl-diskless-sync is not enabled). It affects both the master (not to
respond to rdb channel requests) and the replica (not to declare
capability).

**Internal API changes:**
Changes that were introduced to Redis replication:
- New replication capability is added to replconf command: `capa
rdb-channel-repl`. Indicates replica is capable of rdb channel
replication. Replica sends it when it connects to master along with
other capabilities.
- If replica needs fullsync, master replies `+RDBCHANNELSYNC
<client-id>` to the replica's PSYNC request.
- When replica opens rdbchannel connection, as part of replconf command,
it sends `rdb-channel 1` to let master know this is rdb channel. Also,
it sends `main-ch-client-id <client-id>` as part of replconf command so
master can associate channels.
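
On the wire these are ordinary commands encoded as RESP arrays of bulk strings. The sketch below shows that encoding for the arguments listed above. It is illustrative only: `encode_command` is a hypothetical helper, and whether `rdb-channel 1` and `main-ch-client-id <client-id>` travel in one REPLCONF or in separate ones is an assumption, not something this description states.

```python
def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = str(arg).encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


# Main channel: declare the capability during the handshake.
capa_cmd = encode_command("REPLCONF", "capa", "rdb-channel-repl")

# RDB channel: identify the connection and pair it with the main channel.
# 42 stands in for the client id received in +RDBCHANNELSYNC (made-up value);
# sending both options in a single REPLCONF is an assumption.
rdb_channel_cmd = encode_command("REPLCONF", "rdb-channel", "1",
                                 "main-ch-client-id", 42)
```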
  
**Testing:**
As rdbchannel replication is enabled by default, we run the whole test
suite with it. However, as we need to support both rdbchannel and single
connection replication, we'll be running some tests twice with
`repl-rdb-channel yes/no` config.

**Replica state diagram**
```
* * Replica state machine *
 *
 * Main channel state
 * ┌───────────────────┐
 * │RECEIVE_PING_REPLY │
 * └────────┬──────────┘
 *          │ +PONG
 * ┌────────▼──────────┐
 * │SEND_HANDSHAKE     │                     RDB channel state
 * └────────┬──────────┘            ┌───────────────────────────────┐
 *          │+OK                ┌───► RDB_CH_SEND_HANDSHAKE         │
 * ┌────────▼──────────┐        │   └──────────────┬────────────────┘
 * │RECEIVE_AUTH_REPLY │        │    REPLCONF main-ch-client-id <clientid>
 * └────────┬──────────┘        │   ┌──────────────▼────────────────┐
 *          │+OK                │   │ RDB_CH_RECEIVE_AUTH_REPLY     │
 * ┌────────▼──────────┐        │   └──────────────┬────────────────┘
 * │RECEIVE_PORT_REPLY │        │                  │ +OK
 * └────────┬──────────┘        │   ┌──────────────▼────────────────┐
 *          │+OK                │   │  RDB_CH_RECEIVE_REPLCONF_REPLY│
 * ┌────────▼──────────┐        │   └──────────────┬────────────────┘
 * │RECEIVE_IP_REPLY   │        │                  │ +OK
 * └────────┬──────────┘        │   ┌──────────────▼────────────────┐
 *          │+OK                │   │ RDB_CH_RECEIVE_FULLRESYNC     │
 * ┌────────▼──────────┐        │   └──────────────┬────────────────┘
 * │RECEIVE_CAPA_REPLY │        │                  │+FULLRESYNC
 * └────────┬──────────┘        │                  │Rdb delivery
 *          │                   │   ┌──────────────▼────────────────┐
 * ┌────────▼──────────┐        │   │ RDB_CH_RDB_LOADING            │
 * │SEND_PSYNC         │        │   └──────────────┬────────────────┘
 * └─┬─────────────────┘        │                  │ Done loading
 *   │PSYNC (use cached-master) │                  │
 * ┌─▼─────────────────┐        │                  │
 * │RECEIVE_PSYNC_REPLY│        │    ┌────────────►│ Replica streams replication
 * └─┬─────────────────┘        │    │             │ buffer into memory
 *   │                          │    │             │
 *   │+RDBCHANNELSYNC client-id │    │             │
 *   ├──────┬───────────────────┘    │             │
 *   │      │ Main channel           │             │
 *   │      │ accumulates repl data  │             │
 *   │   ┌──▼────────────────┐       │     ┌───────▼───────────┐
 *   │   │ REPL_TRANSFER     ├───────┘     │    CONNECTED      │
 *   │   └───────────────────┘             └────▲───▲──────────┘
 *   │                                          │   │
 *   │                                          │   │
 *   │  +FULLRESYNC    ┌───────────────────┐    │   │
 *   ├────────────────► REPL_TRANSFER      ├────┘   │
 *   │                 └───────────────────┘        │
 *   │  +CONTINUE                                   │
 *   └──────────────────────────────────────────────┘
 */
 ```
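
The RDB channel column of the diagram can also be read as a transition table. The sketch below encodes only that column. The state names are taken from the diagram, while the event labels `handshake_sent` and `done_loading` and the `run` helper are hypothetical names for illustration.

```python
# (current state, event) -> next state, following the RDB channel column.
RDB_CHANNEL_TRANSITIONS = {
    ("RDB_CH_SEND_HANDSHAKE", "handshake_sent"): "RDB_CH_RECEIVE_AUTH_REPLY",
    ("RDB_CH_RECEIVE_AUTH_REPLY", "+OK"): "RDB_CH_RECEIVE_REPLCONF_REPLY",
    ("RDB_CH_RECEIVE_REPLCONF_REPLY", "+OK"): "RDB_CH_RECEIVE_FULLRESYNC",
    ("RDB_CH_RECEIVE_FULLRESYNC", "+FULLRESYNC"): "RDB_CH_RDB_LOADING",
    ("RDB_CH_RDB_LOADING", "done_loading"): "CONNECTED",
}


def run(events, state="RDB_CH_SEND_HANDSHAKE"):
    """Feed events through the table and return the final state."""
    for event in events:
        key = (state, event)
        if key not in RDB_CHANNEL_TRANSITIONS:
            raise ValueError(f"no transition from {state} on {event}")
        state = RDB_CHANNEL_TRANSITIONS[key]
    return state


final_state = run(["handshake_sent", "+OK", "+OK", "+FULLRESYNC", "done_loading"])
```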
 -----
 This PR also contains changes and ideas from: 
valkey-io/valkey#837
valkey-io/valkey#1173
valkey-io/valkey#804
valkey-io/valkey#945
valkey-io/valkey#989
---------

Co-authored-by: Yuan Wang
Co-authored-by: debing.sun
Co-authored-by: Moti Cohen
Co-authored-by: naglera
Co-authored-by: Amit Nagler
Co-authored-by: Madelyn Olson
Co-authored-by: Binbin
Co-authored-by: Viktor Söderqvist
Co-authored-by: Ping Xie
Co-authored-by: Ran Shidlansik
Co-authored-by: ranshid
Co-authored-by: xbasel
funny-dog pushed a commit to funny-dog/redis that referenced this pull request Sep 17, 2025