Rdb channel for full sync by naglera · Pull Request #12109 · redis/redis

naglera · 2023-04-27T11:38:57Z

In this PR we introduce the main benefit of the rdb channel: continuously streaming the COB (client output buffers) in parallel with the RDB, which keeps the master-side COB small and accelerates the overall sync process. By streaming the replication data to the replica during the full sync, we reduce:

Memory load on the master node.
CPU load on the master's main process.

Latest performance tests: Rdb channel for full sync #12109 (comment)

Updated replica state machine

Closes #11678

In this PR we introduce the main benefit of the rdb channel: continuously streaming the COB in parallel with the RDB, which keeps the primary-side COB small and accelerates the overall sync process. By streaming the replication data to the replica during the full sync, we reduce 1. Memory load on the primary node. 2. CPU load on the primary's main process (will be introduced in a later PR). This opens up possibilities for future improvements, such as better TLS connection handling and removing the need to pipeline the RDB from the child process to the main process.

oranagra

Thanks.
i didn't really dive into the details and code, but i do have a few comments about a few things i noticed.

p.s. still not certain about using a second socket rather than multiplexing yet, each has its pros and cons.

src/replication.c

redis.conf

src/config.c

src/networking.c

1. The replica side mmap replication buffer has been replaced with a linked list replication buffer. 2. The replica buffer limit now depends on the client-output-buffer-limit type replica. 3. Buffering policy changed to not cancel sync when replication buffer is full. We instead continue to sync without reading from the replication data socket, so when the replica side reaches replication buffer limits, the primary replication buffer will take part in the replication data buffering.
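The buffering policy described in this commit message can be sketched as a small simulation. This is a minimal sketch and not code from the PR: `ReplicaBuffer`, `offer`, `simulate`, and the byte sizes are hypothetical names and values chosen for illustration. Once the replica-side limit is reached, the replica stops reading, the sync is not cancelled, and the remaining bytes stay queued on the primary.

```python
from collections import deque


class ReplicaBuffer:
    """Hypothetical model of the replica-side linked-list replication buffer."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.blocks = deque()  # stands in for the linked list of buffer blocks
        self.used = 0

    def offer(self, chunk):
        """Accept as much of chunk as fits; return the bytes left unread."""
        room = self.limit_bytes - self.used
        accepted, rejected = chunk[:room], chunk[room:]
        if accepted:
            self.blocks.append(accepted)
            self.used += len(accepted)
        return rejected


def simulate(chunks, replica_limit):
    """Return (bytes buffered on the replica, bytes left buffered on the primary)."""
    replica = ReplicaBuffer(replica_limit)
    primary_backlog = 0
    for chunk in chunks:
        if primary_backlog:
            # The replica already stopped reading from the socket, so new
            # data simply accumulates on the primary side.
            primary_backlog += len(chunk)
            continue
        primary_backlog += len(replica.offer(chunk))
    return replica.used, primary_backlog


replica_bytes, primary_bytes = simulate([b"x" * 40, b"y" * 40, b"z" * 40], 100)
print(replica_bytes, primary_bytes)  # 100 20
```

With a 100-byte limit and three 40-byte chunks, the replica holds 100 bytes and the last 20 bytes remain on the primary, which is the "primary replication buffer will take part" behaviour described above.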

Fix memory leak Added void param to isOngoingRdbChannelSync

madolson

Finally circled around to this again. It still feels like there is a lot of code and it feels like there should be less, and I think there is some duplication between regular and two channel sync

src/replication.c

src/rdb.c

src/replication.c

src/server.h

src/replication.c

Removed adlist method and primary block size. Use sync write for sending rdb end offset. Removed replconf connected sub command. The replica will send ack instead. Fixed git diff issue. Refactored replicationAbortSyncTransfer + replicationAbortSyncTransfer.

naglera · 2023-06-05T10:18:35Z

Replication buffer memory during bgsave with rdb-channel on vs rdb-channel off. The upper graph shows primary's (in blue) and replica's replication buffer size during bgsave, with repl-rdb-channel off.

Explanation

The rdb channel (on the lower graph) allows us to transfer most of the incremental data memory load to the replica.
After about 15 seconds, the primary's replication buffer starts growing; this is the point where the replica starts synchronously loading the snapshot into the db.

Test details

Initialized the primary database with 3GB.
I used redis-benchmark to continuously set a single key with a random value of 8 bytes.
The primary's buffer size is measured by mem_total_replication_buffers, and the replica's buffer size by replicas_replication_buffers.
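For anyone repeating the measurement, here is a minimal sketch of how the two fields named above could be pulled out of INFO-style text. The helper names `parse_info` and `replication_buffer_bytes` are hypothetical, and the sample text is made up; it is not output captured from this test.

```python
def parse_info(text):
    """Parse INFO-style 'key:value' lines into a dict, skipping '#' section headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def replication_buffer_bytes(info_text, field):
    """Return the named buffer metric as an int, or None if the field is absent."""
    value = parse_info(info_text).get(field)
    return int(value) if value is not None else None


# Made-up sample text; the numbers are not measurements from the test.
sample = "# Memory\r\nmem_total_replication_buffers:1048576\r\nused_memory:2000000\r\n"
print(replication_buffer_bytes(sample, "mem_total_replication_buffers"))  # 1048576
```

Sampling this value periodically on both nodes during bgsave would give the two curves compared in the graphs.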

src/replication.c

soloestoy · 2023-06-27T11:32:11Z

Have you tried using multiplexing to achieve it? The current method of using two channels is a bit too complicated, I think multiplexing would be much simpler and can also solve the problem with PING.

naglera · 2023-07-02T09:01:42Z

Hi @soloestoy, I have considered multiplexing. Although multiplexing also allows the replication data and the rdb to be sent simultaneously, the key point is that with another channel the child process can write directly to the replica. That completely eliminates the need for a pipeline to the redis main process. In addition to removing a lot of complex pipeline code, this will increase the responsiveness of the main process during synchronization.
The design is better explained in #11678
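The point about the child process writing directly to the replica can be illustrated with a small, Unix-only sketch. It is an illustration of the idea under simplifying assumptions, not the PR's implementation: a `socketpair` stands in for the rdb channel, the parent plays the replica only so the example is self-contained, and `stream_snapshot_from_child` is a hypothetical name.

```python
import os
import socket


def stream_snapshot_from_child(snapshot):
    """Fork a child that writes `snapshot` straight to the peer socket.

    The parent never forwards the payload, so no pipe from the child back
    to the parent is needed.
    """
    primary_end, replica_end = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child: stands in for the bgsave process that owns the rdb channel.
        replica_end.close()
        try:
            primary_end.sendall(snapshot)
            primary_end.close()
        finally:
            os._exit(0)
    # Parent: closes its copy of the sending end and forwards nothing.
    primary_end.close()
    received = bytearray()
    while True:
        data = replica_end.recv(65536)
        if not data:  # EOF once the child has closed its end
            break
        received.extend(data)
    replica_end.close()
    os.waitpid(pid, 0)
    return bytes(received)


print(len(stream_snapshot_from_child(b"REDIS" + bytes(1_000_000))))  # 1000005
```

With multiplexing on a single connection, the parent would have to read the child's output and interleave it with the replication stream, which is the forwarding step this design avoids.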

oranagra

Thank you for the PR.
I reviewed the code, without the tests for now and added many comments.

Here are a few top level notes (each also has a thread of its own in the comments, but i wanna draw the attention to these first):

The review took long and it could be that some comments are duplicates.
I think that before merging this we better refactor replication.c into two files (master related code and replica related code); i'll discuss this with the core team.
I think we should change the state machine / connection sequence to a different way to do capability exchange and fallback to old mechanism without a reconnect.
I think we're mixing several different aspects in REPLCONF that should be kept separate
I feel that the two connections should be coupled in some way (redis will be aware that they're a pair). this could help us make sure the end-offset doesn't disappear from the backlog if it is small.
i feel the terminology isn't consistent (rdb channel, vs second channel and so on), and that we need to decide on it and sort it out.

src/server.h

src/replication.c

src/rdb.c

naglera · 2024-01-01T12:48:04Z

Hi @oranagra, @madolson, sorry for the delay. I'm still working on previous comments. In the meantime I have some exciting results I want to share. I worked on reviving the connset structure and handlers, in order to stream online changes directly from the child process to the replica, without a pipeline to the main process. Here are some of the results.

Data

Explanation

These graphs demonstrate performance improvements during full sync sessions using rdb-channel + streaming rdb directly from the background process to the replica.

First graph: with at most 50 clients and lightweight commands, we saw a 5%-7.5% improvement in write latency during the sync session.
Two graphs below: full sync was tested during heavy read commands on the primary (such as sdiff and sunion on large sets). In that case, the child process writes to the replica without sharing CPU with the loaded main process. As a result, this not only improves client response time, but may also shorten sync time by about 50%. The shorter sync time results in less memory being used to store replication diffs (>60% in some of the tested cases).

Test setup

Both the primary and the replica in the performance tests ran on the same machine. RDB size in all tests is 3.7 GB. I generated write load using redis-benchmark: `./redis-benchmark -r 100000 -n 6000000 lpush my_list __rand_int__`.

I will soon create a second PR for this change (on top of this PR), to avoid making this PR more complex than it already is.

oranagra · 2024-01-01T15:45:15Z

nice results.
i imagine that we can improve the latency spikes (p99) with the old approach too (split some memory copying to smaller bulks).
and the huge improvement in time is probably due to having redis fully utilizing its core, and another core is completely free (which may not always be the case, either because redis isn't busy, or because all other cores are).
it'll still be an improvement even if these conditions were not true, but probably not that noticeable.

anyway, regardless of this being a 100% improvement or a 10% improvement, it is a good one, and now that it's possible to do (due to the separate channel which i mainly wanted to move the memory to the other end), i'd like to proceed.

but what are the benefits of a second PR?
before we merge this one, the diff will show both of them.
i think we can incrementally review and merge both of them in this one.

naglera · 2024-01-01T16:17:15Z

I wanted a second PR just because this change can live without the pipeline removal. Besides making this PR shorter there is no benefit. Let's continue on this PR then.

Stop using stat_repl_processed_bytes to follow streaming progress, instead keep record of the buffer's peak.

…t rdb-channel sync along with master side disk sync

src/server.h

src/dict.c

src/server.c

src/object.c

src/replication.c

…sync connection This is necessary for the case in which the RDB is loaded before psync is established. We do that by protecting the RDB client for a short grace period (5 sec) that allows the replica's main channel to finish the handshake.
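One way to picture the grace period from this commit message is the sketch below. It is an assumption-laden illustration, not the PR's code: `RdbClientGuard`, `on_psync_established`, and `can_free` are hypothetical names, and only the 5 second value comes from the message itself.

```python
GRACE_PERIOD_SECONDS = 5.0  # the "short grace period" from the commit message


class RdbClientGuard:
    """Hypothetical helper deciding when a finished RDB-channel client may be freed."""

    def __init__(self, rdb_done_at, grace=GRACE_PERIOD_SECONDS):
        self.rdb_done_at = rdb_done_at  # timestamp when RDB delivery finished
        self.grace = grace
        self.psync_established = False

    def on_psync_established(self):
        """Called once the replica's main channel has finished its handshake."""
        self.psync_established = True

    def can_free(self, now):
        """Free early if psync is established, otherwise only after the grace period."""
        if self.psync_established:
            return True
        return now - self.rdb_done_at >= self.grace


guard = RdbClientGuard(rdb_done_at=100.0)
print(guard.can_free(now=102.0))  # False: still protected
guard.on_psync_established()
print(guard.can_free(now=102.0))  # True: main channel caught up
```

Timestamps are passed in explicitly so the decision can be checked without waiting on a real clock.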

src/server.h

src/networking.c

src/rio.c

src/replication.c

src/server.h

Co-authored-by: debing.sun <[email protected]>

src/networking.c

CLAassistant · 2024-03-24T23:08:22Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
0 out of 2 committers have signed the CLA.

❌ naglera
❌ amitnagl

This PR is based on:

#12109
valkey-io/valkey#60

Closes: #11678

**Motivation**

During a full sync, when master is delivering RDB to the replica, incoming write commands are kept in a replication buffer in order to be sent to the replica once RDB delivery is completed. If RDB delivery takes a long time, it might create memory pressure on master. Also, once a replica connection accumulates replication data which is larger than output buffer limits, master will kill replica connection. This may cause a replication failure.

The main benefit of the rdb channel replication is streaming incoming commands in parallel to the RDB delivery. This approach shifts replication stream buffering to the replica and reduces load on master. We do this by opening another connection for RDB delivery. The main channel on replica will be receiving replication stream while rdb channel is receiving the RDB.

This feature also helps to reduce master's main process CPU load. By opening a dedicated connection for the RDB transfer, the bgsave process has access to the new connection and it will stream RDB directly to the replicas. Before this change, due to TLS connection restriction, the bgsave process was writing RDB bytes to a pipe and the main process was forwarding it to the replica. This is no longer necessary, so the main process can avoid these expensive socket read/write syscalls. It also means RDB delivery to replica will be faster as it avoids this step.

In summary, replication will be faster and master's performance during full syncs will improve.

**Implementation steps**

1. When replica connects to the master, it sends 'rdb-channel-repl' as part of capability exchange to let master know that replica supports rdb channel.
2. When replica lacks sufficient data for PSYNC, master sends +RDBCHANNELSYNC reply with replica's client id. As the next step, the replica opens a new connection (rdb-channel) and configures it against the master with the appropriate capabilities and requirements. It also sends given client id back to master over rdbchannel, so that master can associate these channels. (initial replica connection will be referred to as main-channel) Then, replica requests fullsync using the RDB channel.
3. Prior to forking, master attaches the replica's main channel to the replication backlog to deliver replication stream starting at the snapshot end offset.
4. The master main process sends replication stream via the main channel, while the bgsave process sends the RDB directly to the replica via the rdb-channel. Replica accumulates replication stream in a local buffer, while the RDB is being loaded into the memory.
5. Once the replica completes loading the rdb, it drops the rdb channel and streams the accumulated replication stream into the db. Sync is completed.

**Some details**

- Currently, rdbchannel replication is supported only if `repl-diskless-sync` is enabled on master. Otherwise, replication will happen over a single connection as before.
- On replica, there is a limit to replication stream buffering. Replica uses a new config `replica-full-sync-buffer-limit` to limit number of bytes to accumulate. If it is not set, replica inherits `client-output-buffer-limit <replica>` hard limit config. If we reach this limit, replica stops accumulating. This is not a failure scenario though. Further accumulation will happen on master side. Depending on the configured limits on master, master may kill the replica connection.

**API changes in INFO output:**

1. New replica state: `send_bulk_and_stream`. Indicates full sync is still in progress for this replica. It is receiving replication stream and rdb in parallel.

   ```
   slave0:ip=127.0.0.1,port=5002,state=send_bulk_and_stream,offset=0,lag=0
   ```

   Replica state changes in steps:
   - First, replica sends psync and receives +RDBCHANNELSYNC: `state=wait_bgsave`
   - After replica connects with rdbchannel and delivery starts: `state=send_bulk_and_stream`
   - After full sync: `state=online`

2. On replica side, replication stream buffering metrics:
   - replica_full_sync_buffer_size: Currently accumulated replication stream data in bytes.
   - replica_full_sync_buffer_peak: Peak number of bytes that this instance accumulated in the lifetime of the process.

   ```
   replica_full_sync_buffer_size:20485
   replica_full_sync_buffer_peak:1048560
   ```

**API changes in CLIENT LIST**

In `client list` output, rdbchannel clients will have 'C' flag in addition to 'S' replica flag:

```
id=11 addr=127.0.0.1:39108 laddr=127.0.0.1:5001 fd=14 name= age=5 idle=5 flags=SC db=0 sub=0 psub=0 ssub=0 multi=-1 watch=0 qbuf=0 qbuf-free=0 argv-mem=0 multi-mem=0 rbs=1024 rbp=0 obl=0 oll=0 omem=0 tot-mem=1920 events=r cmd=psync user=default redir=-1 resp=2 lib-name= lib-ver= io-thread=0
```

**Config changes:**

- `replica-full-sync-buffer-limit`: Controls how much replication data replica can accumulate during rdbchannel replication. If it is not set, or set to 0, replica will inherit `client-output-buffer-limit <replica>` hard limit config to limit accumulated data.
- `repl-rdb-channel` config is added as a hidden config. This is mostly for testing as we need to support both rdbchannel replication and the older single connection replication (to keep compatibility with older versions and rdbchannel replication will not be enabled if repl-diskless-sync is not enabled). It affects both the master (not to respond to rdb channel requests), and the replica (not to declare capability).

**Internal API changes:**

Changes that were introduced to Redis replication:

- New replication capability is added to replconf command: `capa rdb-channel-repl`. Indicates replica is capable of rdb channel replication. Replica sends it when it connects to master along with other capabilities.
- If replica needs fullsync, master replies `+RDBCHANNELSYNC <client-id>` to the replica's PSYNC request.
- When replica opens rdbchannel connection, as part of replconf command, it sends `rdb-channel 1` to let master know this is rdb channel. Also, it sends `main-ch-client-id <client-id>` as part of replconf command so master can associate channels.

**Testing:**

As rdbchannel replication is enabled by default, we run whole test suite with it. Though, as we need to support both rdbchannel and single connection replication, we'll be running some tests twice with `repl-rdb-channel yes/no` config.

**Replica state diagram**

```
 *
 * Replica state machine *
 *
 * Main channel state
 * ┌───────────────────┐
 * │RECEIVE_PING_REPLY │
 * └────────┬──────────┘
 *          │ +PONG
 * ┌────────▼──────────┐
 * │SEND_HANDSHAKE     │                    RDB channel state
 * └────────┬──────────┘            ┌───────────────────────────────┐
 *          │+OK                ┌───► RDB_CH_SEND_HANDSHAKE         │
 * ┌────────▼──────────┐        │   └──────────────┬────────────────┘
 * │RECEIVE_AUTH_REPLY │        │    REPLCONF main-ch-client-id <clientid>
 * └────────┬──────────┘        │   ┌──────────────▼────────────────┐
 *          │+OK                │   │ RDB_CH_RECEIVE_AUTH_REPLY     │
 * ┌────────▼──────────┐        │   └──────────────┬────────────────┘
 * │RECEIVE_PORT_REPLY │        │                  │ +OK
 * └────────┬──────────┘        │   ┌──────────────▼────────────────┐
 *          │+OK                │   │ RDB_CH_RECEIVE_REPLCONF_REPLY │
 * ┌────────▼──────────┐        │   └──────────────┬────────────────┘
 * │RECEIVE_IP_REPLY   │        │                  │ +OK
 * └────────┬──────────┘        │   ┌──────────────▼────────────────┐
 *          │+OK                │   │ RDB_CH_RECEIVE_FULLRESYNC     │
 * ┌────────▼──────────┐        │   └──────────────┬────────────────┘
 * │RECEIVE_CAPA_REPLY │        │                  │+FULLRESYNC
 * └────────┬──────────┘        │                  │Rdb delivery
 *          │                   │   ┌──────────────▼────────────────┐
 * ┌────────▼──────────┐        │   │ RDB_CH_RDB_LOADING            │
 * │SEND_PSYNC         │        │   └──────────────┬────────────────┘
 * └─┬─────────────────┘        │                  │ Done loading
 *   │PSYNC (use cached-master) │                  │
 * ┌─▼─────────────────┐        │                  │
 * │RECEIVE_PSYNC_REPLY│        │    ┌────────────►│ Replica streams replication
 * └─┬─────────────────┘        │    │             │ buffer into memory
 *   │                          │    │             │
 *   │+RDBCHANNELSYNC client-id │    │             │
 *   ├──────┬───────────────────┘    │             │
 *   │      │ Main channel           │             │
 *   │      │ accumulates repl data  │             │
 *   │   ┌──▼────────────────┐       │     ┌───────▼───────────┐
 *   │   │ REPL_TRANSFER     ├───────┘     │ CONNECTED         │
 *   │   └───────────────────┘             └────▲───▲──────────┘
 *   │                                          │   │
 *   │                                          │   │
 *   │  +FULLRESYNC    ┌───────────────────┐    │   │
 *   ├─────────────────► REPL_TRANSFER     ├────┘   │
 *   │                 └───────────────────┘        │
 *   │  +CONTINUE                                   │
 *   └──────────────────────────────────────────────┘
 */
```

-----

This PR also contains changes and ideas from:

valkey-io/valkey#837
valkey-io/valkey#1173
valkey-io/valkey#804
valkey-io/valkey#945
valkey-io/valkey#989

---------

Co-authored-by: Yuan Wang
Co-authored-by: debing.sun
Co-authored-by: Moti Cohen
Co-authored-by: naglera
Co-authored-by: Amit Nagler
Co-authored-by: Madelyn Olson
Co-authored-by: Binbin
Co-authored-by: Viktor Söderqvist
Co-authored-by: Ping Xie
Co-authored-by: Ran Shidlansik
Co-authored-by: ranshid
Co-authored-by: xbasel
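The replica state diagram in the description above can also be read as two transition tables. The sketch below is transcribed by hand from that diagram and is only a reading aid, not the state machine in `replication.c`. The event strings copy the arrow labels, except `"next"` and `"done"`, which are placeholder labels invented here for arrows the diagram leaves unlabeled; `run` is a hypothetical helper.

```python
# (state, event) -> next state, for the main channel.
MAIN_CHANNEL = {
    ("RECEIVE_PING_REPLY", "+PONG"): "SEND_HANDSHAKE",
    ("SEND_HANDSHAKE", "+OK"): "RECEIVE_AUTH_REPLY",
    ("RECEIVE_AUTH_REPLY", "+OK"): "RECEIVE_PORT_REPLY",
    ("RECEIVE_PORT_REPLY", "+OK"): "RECEIVE_IP_REPLY",
    ("RECEIVE_IP_REPLY", "+OK"): "RECEIVE_CAPA_REPLY",
    ("RECEIVE_CAPA_REPLY", "next"): "SEND_PSYNC",  # unlabeled arrow
    ("SEND_PSYNC", "PSYNC"): "RECEIVE_PSYNC_REPLY",
    ("RECEIVE_PSYNC_REPLY", "+RDBCHANNELSYNC"): "REPL_TRANSFER",
    ("RECEIVE_PSYNC_REPLY", "+FULLRESYNC"): "REPL_TRANSFER",
    ("RECEIVE_PSYNC_REPLY", "+CONTINUE"): "CONNECTED",
    ("REPL_TRANSFER", "done"): "CONNECTED",  # unlabeled arrow
}

# (state, event) -> next state, for the RDB channel.
RDB_CHANNEL = {
    ("RDB_CH_SEND_HANDSHAKE", "REPLCONF main-ch-client-id"): "RDB_CH_RECEIVE_AUTH_REPLY",
    ("RDB_CH_RECEIVE_AUTH_REPLY", "+OK"): "RDB_CH_RECEIVE_REPLCONF_REPLY",
    ("RDB_CH_RECEIVE_REPLCONF_REPLY", "+OK"): "RDB_CH_RECEIVE_FULLRESYNC",
    ("RDB_CH_RECEIVE_FULLRESYNC", "+FULLRESYNC"): "RDB_CH_RDB_LOADING",
    ("RDB_CH_RDB_LOADING", "Done loading"): "CONNECTED",
}


def run(table, state, events):
    """Apply events in order and return the final state; reject unknown transitions."""
    for event in events:
        try:
            state = table[(state, event)]
        except KeyError:
            raise ValueError(f"no transition from {state} on {event!r}") from None
    return state


handshake = ["+PONG", "+OK", "+OK", "+OK", "+OK", "next", "PSYNC"]
print(run(MAIN_CHANNEL, "RECEIVE_PING_REPLY", handshake + ["+RDBCHANNELSYNC"]))
# REPL_TRANSFER
```

Walking the main channel through `+RDBCHANNELSYNC` leaves it in `REPL_TRANSFER`, where it accumulates replication data while the RDB channel runs through its own table until loading is done.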

naglera added 3 commits April 27, 2023 08:46

fix CI workflow run comments

b9bddab

oranagra reviewed Apr 30, 2023

src/replication.c (outdated, resolved)

redis.conf (outdated, resolved)

src/config.c (outdated, resolved)

siddharth-69 reviewed May 1, 2023

src/networking.c (outdated, resolved)

naglera and others added 4 commits May 4, 2023 11:00

Merge branch 'unstable' into rdb-channel

9d464e2

Rename repl_data_buf to pending_repl_data

d976a81

Fix memory leak Added void param to isOngoingRdbChannelSync

void instead of empty params

6ef6736

madolson reviewed May 23, 2023

naglera added 2 commits May 23, 2023 09:46

Fix comments

3efab3a

Removed adlist method and primary block size. Use sync write for sending rdb end offset. Removed replconf connected sub command. The replica will send ack instead. Fixed git diff issue. Refactored replicationAbortSyncTransfer + replicationAbortSyncTransfer.

Fix indentation

f0252d8

soloestoy reviewed Jun 27, 2023

src/replication.c (resolved)

src/replication.c (outdated, resolved)

oranagra reviewed Jul 17, 2023

sundb reviewed Jul 19, 2023

src/rdb.c (outdated, resolved)

naglera added 9 commits January 4, 2024 11:25

Remove mentions of second channel

07f24f5

replica buffer will use the same mechanism as other client reply buffers

33bfd16

Decrement pending_repl_data.len during replica buffer streaming.

21c797e

Stop using stat_repl_processed_bytes to follow streaming progress, instead keep record of the buffer's peak.

rename primary_can_sync_using_rdb_channel-> master_supports_rdb_channel

b7c1688

Remove sendReplicationOffsetToReplicas(), since we dont want to permi…

03d35b6

…t rdb-channel sync along with master side disk sync

Move sendCurentOffsetToReplica to replication.c

be1f66f

Fixed comments, rename methods and states

400664d

Use -FULLSYNCNEEDED instead of empty bulk

c8de526

Mostly renaming and comment fixes

bab299e

Rename peer & unpeer => add & remove

da22d0d

oranagra reviewed Mar 12, 2024

naglera added 5 commits March 12, 2024 12:58

remove getLongLongFromObjectOrReply and fix indentations

83a471d

rename REPLCONF identify => set-rdb-conn-id

616455e

Use identity to hash dict integers and store keys as plain text

5c4c824

update replica state machine diagram

75e68e1

oranagra reviewed Mar 13, 2024

src/server.h (outdated, resolved)

src/server.h (outdated, resolved)

src/networking.c (outdated, resolved)

src/networking.c (outdated, resolved)

src/networking.c (outdated, resolved)

sundb reviewed Mar 13, 2024

src/networking.c (outdated, resolved)

src/rio.c (resolved)

src/replication.c (outdated, resolved)

src/server.h (outdated, resolved)

naglera and others added 7 commits March 13, 2024 19:09

Update src/server.h

baa932c

Co-authored-by: debing.sun <[email protected]>

Fix CI workflow run comments

0597fce

Test edge cases of master connection peering

9b320d0

use CLIENT_PROTECTED_RDB_CHANNEL flag instead of CLIENT_PROTECTED

fab702a

renaming and minor comments

8d9a057

Use radix tree for waiting replicas for psync

3ad61d0

debug command for wait_before_rdb_client_free

7511f26

sundb reviewed Mar 14, 2024

src/networking.c (outdated, resolved)

oranagra reviewed Mar 17, 2024

src/networking.c (resolved)

naglera added 4 commits March 18, 2024 15:20

Merge branch 'unstable' into rdb-channel

5c74298

Remove lookupClientByIDGeneric

6ff7b42

merge from unstable fixes

e63daac

Avoid reading from replica's rdb client after it marked as closed asap

dbfd9fe

naglera mentioned this pull request Jul 10, 2024

Dual channel replication valkey-io/valkey#60

Merged

naglera closed this Sep 5, 2024

naglera deleted the rdb-channel branch September 5, 2024 09:56

tezc mentioned this pull request Jan 8, 2025

Rdb channel replication #13732

Merged
