Replica keeps serving data during repl-diskless-load=swapdb for better availability #9323
Relates to #4624, resolving it in some setups.
oranagra left a comment:
@eduardobr thank you, I think this PR shows the potential of that concept quite well, and it's nice to see that the code was nearly prepared for this concept, so you were able to apply it nicely without too many changes.
I did a quick review and gave a few comments; the main issue to be resolved is probably around the module API and complex modules with out-of-keyspace data.
A few other things I must note:
- During loading, Redis is not highly responsive; it only processes events for a short period once per ~2MB of RDB data, so the clients working with the replica may get degraded performance, beyond the fact that the CPU is heavily invested in loading.
- I think we wanna have that mode enabled only when a replica was previously in sync with a master (not depending on whether the database was empty or not, though). When a new replica comes up, we'll want to avoid using swapdb, so that we will respond with -LOADING.
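The loading-loop behavior in the first point can be sketched with a small simulation. This is a Python stand-in for the C loading loop, not Redis source; the ~2MB interval matches the comment above, and all names here are illustrative:

```python
# Illustrative stand-in (not Redis source) for the loading loop: while
# reading the RDB, the event loop runs only once per ~2MB of data, which
# is why replica clients can see degraded latency during loading.
PROCESS_EVENTS_INTERVAL_BYTES = 2 * 1024 * 1024

def load_rdb(total_bytes, chunk_size, process_events):
    """Read the RDB in chunks; serve pending client events every ~2MB."""
    since_last_events = 0
    loaded = 0
    windows = 0  # how many brief serving windows the clients got
    while loaded < total_bytes:
        n = min(chunk_size, total_bytes - loaded)
        loaded += n  # (real code would parse this chunk into the db)
        since_last_events += n
        if since_last_events >= PROCESS_EVENTS_INTERVAL_BYTES:
            process_events()  # short window where clients are served
            windows += 1
            since_last_events = 0
    return windows

# A 100MB RDB read in 64KB chunks gives clients only ~50 serving windows.
print(load_rdb(100 * 1024 * 1024, 64 * 1024, lambda: None))  # 50
```

The simulation just makes the point concrete: the larger the RDB, the longer the stretches where the replica cannot serve anything.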
oranagra left a comment:
LGTM with a minor fix.
What's left is to handle the modules issue.
oranagra left a comment:
@eduardobr thank you, your changes seem great!
I gave a few minor comments, mostly about renaming "repl async" to "repl async load", and raised a few questions about ordering.
I think you can proceed with adjusting the tests.
@eduardobr are you gonna finish this? Any major issues? IIRC it's only renaming / doc and some minor changes.
@oranagra I can move on; I was just waiting for some answer on the questions you raised, so I can finish in one sitting.
@oranagra Things got quite complicated due to many conflicting changes:
- In relation to "Delay to discard cached master when full synchronization #9398", at replication.c.
- In relation to "Slot-to-keys using dict entry metadata #9356", it hit this PR in some points and even required moving this to server.h.
Additionally, I raised a question about a missing bit.
For #9356, if you have uncertain things, we can ask @zuiderkwast for suggestions; he is a friendly man!
I'm here to help. 😄 I tried to move everything cluster-related to [...]. Isn't it enough to add [...]?
@zuiderkwast Thank you ;)
If it lived in one of these 2, one of them wouldn't be able to access it. I'm afraid I'm still not familiar enough with the code base as a whole and C best practices to know if that was a decent solution, though. I believe that can also be a separate commit after the storm of changes and rebases done here, if necessary; what do you think? But most importantly, I believe it would be great to have your review on the correctness of this change after I integrated your PR, for example [...] into [...], which, inlining, is [...]. And that's what I follow here after swapping tempDb with the main db. Does it make sense, and is the part removed now unnecessary?
oranagra left a comment:
I re-reviewed the whole thing from scratch; here are my new comments:
Co-authored-by: Oran Agra <[email protected]>
madolson left a comment:
Approving based on documentation, didn't read the implementation details.
@eduardobr merged 🎉
Thank YOU for the patience 😁
@eduardobr I noticed some sporadic timeout errors in the valgrind runs, and they did not reproduce the day before / after, so it might be nothing.
During diskless replication, the check for a broken EOF mark was misplaced and should happen earlier. Now we do not swap the db on this kind of failure; we do proper cleanup and correctly raise module events. This issue existed prior to #9323, but before that, the side effect was not restoring the backup and not raising the correct module events on this failure.
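The corrected flow can be sketched as follows. This is a Python stand-in for the C code, not Redis source: the point is that the EOF-mark check must run before the databases are swapped, so a truncated transfer never replaces the dataset being served. All names are illustrative:

```python
# Illustrative stand-in (not Redis source): verify the diskless-transfer
# EOF mark *before* swapping databases, so a truncated transfer never
# replaces the dataset being served to clients.
def finish_diskless_load(served_db, temp_db, payload, eof_mark):
    """Return (db_to_serve, status); swap only on a valid EOF mark."""
    if not payload.endswith(eof_mark):
        # Broken EOF mark: discard temp_db, keep serving the old data,
        # and (in Redis) raise the "aborted" module event.
        return served_db, "aborted"
    # Transfer complete: only now is it safe to swap the temp db in.
    return temp_db, "completed"

mark = b"$EOF:0123456789"  # hypothetical delimiter; real marks are random
print(finish_diskless_load({"k": "old"}, {"k": "new"}, b"rdb" + mark, mark))
print(finish_diskless_load({"k": "old"}, {"k": "new"}, b"rdb-trunc", mark))
```

The pre-fix behavior corresponds to swapping first and checking later, which is exactly the ordering the commit message describes as misplaced.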
Another timeout in the test under valgrind:
I think the timeout is not a fault of the tests; I hope it'll get solved by #9767.
We saw some tests sporadically time out on valgrind (namely the ones from #9323), so we're increasing the valgrind timeout from 20 minutes to 40 minutes in CI, and fixing an outdated help message.
    "module_fork_in_progress:%d\r\n"
    "module_fork_last_cow_size:%zu\r\n",
    (int)server.loading,
    (int)(server.loading && !server.async_loading),
Somehow I think `!server.async_loading` should not be added here? (a bit confused)
BTW, I think `async_loading` should also have documentation; it seemed to be missing, so I added it in redis/redis-doc#1686.
It is indeed a bit confusing that in Redis both flags are set in this mode, but in INFO it's either one or the other.
However, we concluded that for each audience (users or developers) it makes sense to look at it this way.
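The mapping discussed above can be shown concretely. This is a Python sketch of the C expressions in the diff: internally both flags are set during a swapdb async load, but INFO reports the two fields as mutually exclusive:

```python
# Python sketch of the C expressions in the diff above: internally both
# server.loading and server.async_loading are set during a swapdb async
# load, but INFO reports "loading" and "async_loading" as exclusive.
def info_fields(loading, async_loading):
    return {
        "loading": int(loading and not async_loading),
        "async_loading": int(async_loading),
    }

print(info_fields(loading=True, async_loading=True))    # async (swapdb) load
print(info_fields(loading=True, async_loading=False))   # blocking load
print(info_fields(loading=False, async_loading=False))  # not loading
```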
In #9323, when `repl-diskless-load` is enabled and set to `swapdb`, if the master replication ID hasn't changed, we can load the data-set asynchronously and serve read commands during the full resync. In the `diskless loading short read` test, after loading successfully, we wait for the loading to stop before continuing the for loop. After the introduction of `async_loading`, we also need to check it; otherwise the next loop iteration may start too soon and trigger a timing issue.
…ading (#9878)

## background
Till now, CONFIG SET was blocked during loading. (In the not so distant past, GET was disallowed too.) We recently (not released yet) added an async-loading mode, see #9323, and during that time it'll serve CONFIG SET and any other command. And now we realized (#9770) that some configs and commands are dangerous during async-loading.

## changes
* Allow most CONFIG SET during loading (both async-loading and normal loading)
* Allow CONFIG REWRITE and CONFIG RESETSTAT during loading
* Block a few configs during loading (`appendonly`, `repl-diskless-load`, and `dir`)
* Block a few commands during loading (list below)

## the blocked commands:
* SAVE - obviously we don't wanna start a foreground save during loading 8-)
* BGSAVE - we don't mind scheduling one, but we don't wanna fork now
* BGREWRITEAOF - we don't mind scheduling one, but we don't wanna fork now
* MODULE - we obviously don't wanna unload a module during replication / RDB loading (MODULE HELP and MODULE LIST are not blocked)
* SYNC / PSYNC - we're in the middle of RDB loading from the master, so we must not allow sync requests now
* REPLICAOF / SLAVEOF - we're in the middle of replicating; maybe it makes sense to let the user abort it, but they couldn't do that so far, and I don't wanna take any risk of bugs due to odd state
* CLUSTER - only allow [HELP, SLOTS, NODES, INFO, MYID, LINKS, KEYSLOT, COUNTKEYSINSLOT, GETKEYSINSLOT, RESET, REPLICAS, COUNT_FAILURE_REPORTS]; for others, preserve the status quo

## other fixes
* processEventsWhileBlocked had an issue when being nested. This could happen with a busy script during async loading (new), but also with a busy script during AOF loading (old); this led to a crash in the scenario described in #6988.
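The dispatch-time check described above can be sketched in miniature. Real Redis drives this from per-command flags in the command table; the deny-list below is an illustrative Python simplification that just mirrors the list in the commit message:

```python
# Illustrative simplification (not Redis source): during loading, a small
# set of commands is rejected, while most CONFIG SET invocations are now
# allowed. (The real check also whitelists MODULE HELP / MODULE LIST and
# some CLUSTER subcommands, omitted here for brevity.)
BLOCKED_WHILE_LOADING = {
    "save", "bgsave", "bgrewriteaof", "module",
    "sync", "psync", "replicaof", "slaveof",
}

def dispatch(argv, loading):
    """Return a RESP-style reply for a command issued while (not) loading."""
    name = argv[0].lower()
    if loading and name in BLOCKED_WHILE_LOADING:
        return "-LOADING Redis is loading the dataset in memory"
    return "+OK"

print(dispatch(["SAVE"], loading=True))                             # rejected
print(dispatch(["CONFIG", "SET", "maxmemory", "0"], loading=True))  # allowed
```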
The bug was introduced by #9323 (released in 7.0 RC1). The defines of `REDISMODULE_OPTIONS_HANDLE_IO_ERRORS` and `REDISMODULE_OPTION_NO_IMPLICIT_SIGNAL_MODIFIED` have the same value. This results in skipping `signalModifiedKey()` after `RM_CloseKey()` if the module has set the `REDISMODULE_OPTIONS_HANDLE_REPL_ASYNC_LOAD` option. The implication is missing WATCH and client-side tracking invalidations. Other changes: add `no-implicit-signal-modified` to the options in INFO modules.
Co-authored-by: Oran Agra <[email protected]>
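The hazard behind this fix can be reproduced in miniature. The values below are illustrative, not the actual redismodule.h defines; the point is the bug class: two bit flags sharing one value make a test for either flag match both:

```python
# Miniature reproduction of the bug class (illustrative values, not the
# real redismodule.h defines): two option flags sharing one bit make a
# test for either flag match both.
HANDLE_IO_ERRORS            = 1 << 0
NO_IMPLICIT_SIGNAL_MODIFIED = 1 << 0  # BUG: collides with the flag above

options = HANDLE_IO_ERRORS  # a module only opted into IO-error handling...
# ...yet the collision makes it look like it also opted out of implicit
# signalModifiedKey(), silently breaking WATCH and client-side tracking.
print(bool(options & NO_IMPLICIT_SIGNAL_MODIFIED))  # True, wrongly

# The fix is simply to give every option a distinct bit.
NO_IMPLICIT_SIGNAL_MODIFIED_FIXED = 1 << 2
print(bool(options & NO_IMPLICIT_SIGNAL_MODIFIED_FIXED))  # False, correct
```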
For diskless replication in swapdb mode, considering we already spend replica memory keeping a backup of the current db to restore in case of failure, we can gain the following benefits by instead swapping databases only once we have succeeded in transferring the db from the master:
- Avoid the `-LOADING` response during failed and successful synchronization for cases where the replica is already up and running with data.

General notes:
- `backupDb` becomes `tempDb` for clarity.
- Added `async_loading` to differentiate from the blocking loading.
- [...] `redisDb`, as it's more natural to access it from both server.db and the tempDb that is passed around.

Considerations for review:
Additional notes for Release Notes:
Instead, we have the new `RedisModuleEvent_ReplAsyncLoad` holding 3 sub-events: STARTED, ABORTED and COMPLETED.
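How a module with out-of-keyspace data might react to the three sub-events can be sketched as follows. This is a Python stand-in; the real API is a C callback registered with RedisModule_SubscribeToServerEvent, and the backup/restore strategy shown is just one plausible use:

```python
# Python stand-in for a module reacting to the three sub-events; the real
# API is a C callback, and this backup/restore strategy is illustrative.
SUBEVENT_STARTED, SUBEVENT_ABORTED, SUBEVENT_COMPLETED = 0, 1, 2

class AsyncLoadAwareModule:
    def __init__(self):
        self.global_data = {"counter": 1}  # module-global, out-of-keyspace
        self._backup = None

    def on_repl_async_load(self, subevent):
        if subevent == SUBEVENT_STARTED:
            # Keep a backup of module-global data while the incoming RDB
            # loads into a temporary state.
            self._backup = dict(self.global_data)
            self.global_data = {}
        elif subevent == SUBEVENT_ABORTED:
            self.global_data = self._backup  # load failed: restore backup
            self._backup = None
        elif subevent == SUBEVENT_COMPLETED:
            self._backup = None  # the new data is live: drop the backup

mod = AsyncLoadAwareModule()
mod.on_repl_async_load(SUBEVENT_STARTED)
mod.global_data["counter"] = 42  # values loaded from the incoming RDB
mod.on_repl_async_load(SUBEVENT_ABORTED)
print(mod.global_data)  # {'counter': 1}: the pre-load data survived
```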