Start AOFRW before streaming repl buffer during fullsync by tezc · Pull Request #13758 · redis/redis

tezc · 2025-01-20T08:18:30Z

During fullsync, before loading the RDB on the replica, we stop the AOF child to prevent a copy-on-write disaster.
Once the RDB is loaded, AOF is started again, which triggers an AOF rewrite. #13732 changed this behavior for rdbchannel replication: currently, we start AOF after the replication buffer is streamed to the db. This PR changes it back so that AOF starts right after the RDB is loaded (before the repl buffer is streamed).

Both approaches have pros and cons. If we start AOF before streaming the repl buffers, we may still face copy-on-write issues, as the repl buffers potentially include a large amount of changes. If we wait until the replication buffer is drained, we delay the start of AOF persistence.
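To make the trade-off concrete, here is a toy model of the two orderings. It is not Redis code: `SyncPlan`, `plan_fullsync`, and the field names are hypothetical, invented only for illustration, and the byte counts are worst-case bounds that assume every buffered byte is applied while the rewrite child is still alive.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SyncPlan:
    """Toy description of one replica-side fullsync ordering (hypothetical)."""
    steps: List[str]
    cow_exposed_bytes: int         # buffered bytes applied while a rewrite child may exist
    bytes_applied_before_aof: int  # buffered bytes applied before AOF is turned back on


def plan_fullsync(start_aof_before_streaming: bool, repl_buffer_bytes: int) -> SyncPlan:
    # Both orderings begin the same way: stop the AOF child, then load the RDB.
    steps = ["stop AOF child", "load RDB"]
    if start_aof_before_streaming:
        # Ordering restored by this PR: AOF comes back early, but the buffered
        # changes are applied while the rewrite child may still be running.
        steps += ["start AOF (triggers rewrite)", "stream replication buffer"]
        return SyncPlan(steps, repl_buffer_bytes, 0)
    # Ordering introduced by #13732: no COW from the buffer, but AOF starts late.
    steps += ["stream replication buffer", "start AOF (triggers rewrite)"]
    return SyncPlan(steps, 0, repl_buffer_bytes)


early = plan_fullsync(True, 64 * 1024 * 1024)
late = plan_fullsync(False, 64 * 1024 * 1024)
```

In this model `early` trades up to 64 MiB of potential copy-on-write exposure for an earlier start of persistence, while `late` applies the whole buffer before AOF is enabled.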

Additional changes are introduced as part of this PR:

- Interface change:
  - Added the `mem_replica_full_sync_buffer` field to the `INFO MEMORY` command reply. During full sync, it shows the total memory consumed by the accumulated replication stream buffer on the replica. Added the same metric to the `MEMORY STATS` command reply as the `replica.fullsync.buffer` field.
- Fixes:
  - Count the replica's repl stream buffer size as part of the 'memory overhead' calculation for fields in the `INFO MEMORY` and `MEMORY STATS` outputs. Before this PR, the repl buffer was not counted as part of the memory overhead calculation, causing misreports for fields like `used_memory_overhead` and `used_memory_dataset` in `INFO MEMORY` and for the `overhead.total` field in the `MEMORY STATS` command reply.
  - Dismiss the replica's replication stream buffer memory in the fork to reduce COW impact during a fork.
  - Fixed a few time-sensitive flaky tests, deleted a no-op statement, and fixed some comments and failure messages in the rdbchannel tests.
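As a rough illustration of the new field and of the accounting fix, the sketch below parses an `INFO MEMORY`-style reply and shows how leaving the buffer out of the overhead inflates the dataset figure. The sample reply and its numbers are made up, and `parse_info` and `dataset_bytes` are hypothetical helpers, not part of any Redis client. The only relation assumed is the documented one: `used_memory_dataset` is `used_memory` minus `used_memory_overhead`.

```python
from typing import Dict


def parse_info(reply: str) -> Dict[str, str]:
    """Parse 'key:value' lines of an INFO reply, skipping '# Section' headers."""
    fields: Dict[str, str] = {}
    for line in reply.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def dataset_bytes(used_memory: int, overhead: int) -> int:
    # Dataset size is whatever remains of used memory after subtracting overhead.
    return used_memory - overhead


# Illustrative reply only; real replies contain many more fields.
SAMPLE_REPLY = (
    "# Memory\r\n"
    "used_memory:1000000\r\n"
    "used_memory_overhead:300000\r\n"
    "used_memory_dataset:700000\r\n"
    "mem_replica_full_sync_buffer:200000\r\n"
)

info = parse_info(SAMPLE_REPLY)
# The field is absent on servers without this change, so default to 0.
buffer_bytes = int(info.get("mem_replica_full_sync_buffer", "0"))
used = int(info["used_memory"])
overhead_with_buffer = int(info["used_memory_overhead"])

# Before the fix the buffer was allocated but not counted as overhead,
# so its bytes were attributed to the dataset instead.
overhead_without_buffer = overhead_with_buffer - buffer_bytes
dataset_before_fix = dataset_bytes(used, overhead_without_buffer)
dataset_after_fix = dataset_bytes(used, overhead_with_buffer)
```

With these made-up numbers, the dataset size would be reported 200000 bytes too high when the buffer is not counted as overhead.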

src/object.c

src/server.h

tests/integration/replication-rdbchannel.tcl

oranagra

LGTM.
Please list the interface changes in the PR description.
@YaacovHazan please comment on how we formally approve them.

YaacovHazan · 2025-02-04T08:04:53Z

@oranagra This is an internal mechanism in Redis, and we don't need to get approval for that. We do need to communicate the change, so we are good to merge it.


tezc added 2 commits January 14, 2025 09:54

Start AOFRW before streaming repl buffer during fullsync

ccac249

Start AOFRW before streaming repl buffer during fullsync

c84d39a

tezc requested a review from oranagra January 20, 2025 08:18

oranagra reviewed Jan 20, 2025

src/object.c (outdated)

src/server.h

tests/integration/replication-rdbchannel.tcl

tezc added 2 commits January 21, 2025 01:57

minor

8e41641

memory stats, mem_replica_full_sync_buffer, test fix

ac37dc8

oranagra approved these changes Jan 28, 2025

tezc added the release-notes (this issue needs to be mentioned in the release notes) and state:needs-doc-pr (requires a PR to the redis-doc repository) labels Jan 28, 2025

tezc merged commit 09f8a2f into redis:unstable Feb 4, 2025
19 checks passed

tezc deleted the revert-aofrw branch February 4, 2025 18:40
