Kill disk-based fork child when all replicas drop and 'save' is not enabled#7819
Kill disk-based fork child when all replicas drop and 'save' is not enabled#7819oranagra merged 2 commits intoredis:unstablefrom
Conversation
|
Hi @ShooterIT, looks like the test produce some false positives. Looking at the test code it seems very timing dependent. Would you consider changing the simple delays with a more accurate check that probes the server state instead? |
|
for the record, failed on CI for MacOS: |
|
it looks like the one that fails is this one: which is odd considering we already know the it's in progress by the previous wait_for_condition. maybe there was a race, and only one slave connected? meanwhile, just to be on the safe side, i'm taking this commit out of 6.0.9 |
|
Copy that @oranagra @yossigo I could reproduce it on oranagra:6.0.9 branch but i didn't reproduce on unstable(i also don't find failed daily action in this test last month ), do we improve sync implementation in next commits? |
|
ohh, now it all makes sense. |
…nabled (redis#7819) When all replicas waiting for a bgsave get disconnected (possibly due to output buffer limit), It may be good to kill the bgsave child. in diskless replication it already happens, but in disk-based, the child may still serve some purpose (for persistence). By killing the child, we prevent it from eating COW memory in vain, and we also allow a new child fork sooner for the next full synchronization or bgsave. We do that only if rdb persistence wasn't enabled in the configuration. Btw, now, rdbRemoveTempFile in killRDBChild won't block server, so we can killRDBChild safely.
#7717
If there is no any other slave waiting dumping RDB finished, the current child process need not continue to dump RDB, then we kill it. So child process won't use more memory, and we also can fork a new child process asap to dump rdb for next full synchronization or bgsave. But we also need to check if users enable 'save' RDB, if enable, we should not remove directly since that means RDB is important for users to keep data safe.
Btw, now,
rdbRemoveTempFileinkillRDBChildwon't block server, so we cankillRDBChildsafely.