Sort out mess around propagation and MULTI/EXEC#9890
Merged
oranagra merged 22 commits intoredis:unstablefrom Dec 22, 2021
Merged
Sort out mess around propagation and MULTI/EXEC#9890oranagra merged 22 commits intoredis:unstablefrom
oranagra merged 22 commits intoredis:unstablefrom
Conversation
guybe7
commented
Dec 3, 2021
Contributor
|
"Sort out mess" 👍 😁 |
Member
|
@guybe7 thanks.. Few requests while i review:
|
Collaborator
Author
|
@oranagra i can't figure it out, i ran it locally (build with SANITIZER=address and ran the tests) and it passed |
Member
|
I've re triggered the tests and got the same outcome. |
oranagra
reviewed
Dec 5, 2021
82d8f81 to
07f8a39
Compare
oranagra
reviewed
Dec 8, 2021
oranagra
reviewed
Dec 9, 2021
oranagra
reviewed
Dec 9, 2021
Member
|
@soloestoy i'll be happy if you can have a look and see if you can spot any issues with the new approach. |
Contributor
OK, I'll check in a day or two, too busy this week... |
bugfixes: 1. CONFIG SET can create evictions, sending notifications which can cause to dirty++ with modules. we need to prevent it from propagating to AOF/replicas 2. we need to set current_client in RM_Call. buggy scenario: - CONFIG SET maxmemory, notifications, module hook calls RM_Call - assertion in lookupKey crashes, because current_client has CONFIG SET, which isn't CMD_WRITE 3. minor: in eviction, call propagateDeletion after notification, like active-expire and all commands (we always send a notification before propagating the command)
MeirShpilraien
pushed a commit
to RedisGears/RedisGears
that referenced
this pull request
Feb 13, 2022
This PR is for future compatability to Redis 7. Till Redis 7, if commands would have executed from background thread, then Redis would not have wrap them with multi exec and the commands would not have executed atomically on replica. Gears solved it by propagating multi exec by itself. Thanks to this PR: redis/redis#9890 It is no longer needed to replicate multi exec, Redis takes care of this for us.
oranagra
added a commit
that referenced
this pull request
Apr 18, 2022
…10573) RM_Yield was missing a call to protectClient to prevent redis from processing future commands of the yielding client. Adding tests that fail without this fix. This would be complicated to solve since nested calls to RM_Call used to replace the current_client variable with the module temp client. It looks like it's no longer necessary to do that, since it was added back in #9890 to solve two issues, both already gone: 1. call to CONFIG SET maxmemory could trigger a module hook calling RM_Call. although this specific issue is gone, arguably other hooks like keyspace notification, can do the same. 2. an assertion in lookupKey that checks the current command of the current client, introduced in #9572 and removed in #10248
guybe7
added a commit
to guybe7/redis
that referenced
this pull request
Apr 20, 2022
If was first added in redis#9890 to solve the problem of CONFIG SET maxmemory causing eviction inside MULTI/EXEC, but that problem is already fixed (CONFIG SET doesn't evict directly, it just schedules a later eviction) Keep that condition may hide bugs (i.e. performEvictions should always expect to have an empty server.also_propagate)
oranagra
pushed a commit
that referenced
this pull request
Apr 24, 2022
If was first added in #9890 to solve the problem of CONFIG SET maxmemory causing eviction inside MULTI/EXEC, but that problem is already fixed (CONFIG SET doesn't evict directly, it just schedules a later eviction) Keep that condition may hide bugs (i.e. performEvictions should always expect to have an empty server.also_propagate)
oranagra
pushed a commit
that referenced
this pull request
Apr 25, 2022
oranagra
added a commit
that referenced
this pull request
Aug 24, 2022
…#11176) Redis 7.0 has #9890 which added an assertion when the propagation queue was not flushed and we got to beforeSleep. But it turns out that when processCommands calls getNodeByQuery and decides to reject the command, it can lead to a key that was lazy expired and is deleted without later flushing the propagation queue. This change prevents lazy expiry from deleting the key at this stage (not as part of a command being processed in `call`)
oranagra
added a commit
that referenced
this pull request
Sep 21, 2022
…#11176) Redis 7.0 has #9890 which added an assertion when the propagation queue was not flushed and we got to beforeSleep. But it turns out that when processCommands calls getNodeByQuery and decides to reject the command, it can lead to a key that was lazy expired and is deleted without later flushing the propagation queue. This change prevents lazy expiry from deleting the key at this stage (not as part of a command being processed in `call`) (cherry picked from commit c789fb0)
This was referenced Feb 8, 2023
oranagra
pushed a commit
that referenced
this pull request
Feb 14, 2023
Starting from Redis 7.0 (#9890) we started wrapping everything a command propagates with MULTI/EXEC. The problem is that both SCAN and RANDOMKEY can lazy-expire arbitrary keys (similar behavior to active-expire), and put DELs in a transaction. Fix: When these commands are called without a parent exec-unit (e.g. not in EVAL or MULTI) we avoid wrapping their DELs in a transaction (for the same reasons active-expire and eviction avoids a transaction) This PR adds a per-command flag that indicates that the command may touch arbitrary keys (not the ones in the arguments), and uses that flag to avoid the MULTI-EXEC. For now, this flag is internal, since we're considering other solutions for the future. Note for cluster mode: if SCAN/RANDOMKEY is inside EVAL/MULTI it can still cause the same situation (as it always did), but it won't cause a CROSSSLOT because replicas and AOF do not perform slot checks. The problem with the above is mainly for 3rd party ecosystem tools that propagate commands from master to master, or feed an AOF file with redis-cli into a master. This PR aims to fix the regression in redis 7.0, and we opened #11792 to try to handle the bigger problem with lazy expire better for another release.
oranagra
pushed a commit
that referenced
this pull request
Feb 28, 2023
Starting from Redis 7.0 (#9890) we started wrapping everything a command propagates with MULTI/EXEC. The problem is that both SCAN and RANDOMKEY can lazy-expire arbitrary keys (similar behavior to active-expire), and put DELs in a transaction. Fix: When these commands are called without a parent exec-unit (e.g. not in EVAL or MULTI) we avoid wrapping their DELs in a transaction (for the same reasons active-expire and eviction avoids a transaction) This PR adds a per-command flag that indicates that the command may touch arbitrary keys (not the ones in the arguments), and uses that flag to avoid the MULTI-EXEC. For now, this flag is internal, since we're considering other solutions for the future. Note for cluster mode: if SCAN/RANDOMKEY is inside EVAL/MULTI it can still cause the same situation (as it always did), but it won't cause a CROSSSLOT because replicas and AOF do not perform slot checks. The problem with the above is mainly for 3rd party ecosystem tools that propagate commands from master to master, or feed an AOF file with redis-cli into a master. This PR aims to fix the regression in redis 7.0, and we opened #11792 to try to handle the bigger problem with lazy expire better for another release. (cherry picked from commit fd82bcc)
madolson
pushed a commit
to madolson/redis
that referenced
this pull request
Apr 19, 2023
…redis#11176) Redis 7.0 has redis#9890 which added an assertion when the propagation queue was not flushed and we got to beforeSleep. But it turns out that when processCommands calls getNodeByQuery and decides to reject the command, it can lead to a key that was lazy expired and is deleted without later flushing the propagation queue. This change prevents lazy expiry from deleting the key at this stage (not as part of a command being processed in `call`)
enjoy-binbin
pushed a commit
to enjoy-binbin/redis
that referenced
this pull request
Jul 31, 2023
…edis#10573) RM_Yield was missing a call to protectClient to prevent redis from processing future commands of the yielding client. Adding tests that fail without this fix. This would be complicated to solve since nested calls to RM_Call used to replace the current_client variable with the module temp client. It looks like it's no longer necessary to do that, since it was added back in redis#9890 to solve two issues, both already gone: 1. call to CONFIG SET maxmemory could trigger a module hook calling RM_Call. although this specific issue is gone, arguably other hooks like keyspace notification, can do the same. 2. an assertion in lookupKey that checks the current command of the current client, introduced in redis#9572 and removed in redis#10248
enjoy-binbin
pushed a commit
to enjoy-binbin/redis
that referenced
this pull request
Jul 31, 2023
If was first added in redis#9890 to solve the problem of CONFIG SET maxmemory causing eviction inside MULTI/EXEC, but that problem is already fixed (CONFIG SET doesn't evict directly, it just schedules a later eviction) Keep that condition may hide bugs (i.e. performEvictions should always expect to have an empty server.also_propagate)
enjoy-binbin
pushed a commit
to enjoy-binbin/redis
that referenced
this pull request
Jul 31, 2023
This case is interesting because it originates from cron, rather than from another command. The idea came from looking at redis#9890 and redis#10573, and I was wondering if RM_Call would work properly when `server.current_client == NULL`
enjoy-binbin
pushed a commit
to enjoy-binbin/redis
that referenced
this pull request
Jul 31, 2023
…redis#11176) Redis 7.0 has redis#9890 which added an assertion when the propagation queue was not flushed and we got to beforeSleep. But it turns out that when processCommands calls getNodeByQuery and decides to reject the command, it can lead to a key that was lazy expired and is deleted without later flushing the propagation queue. This change prevents lazy expiry from deleting the key at this stage (not as part of a command being processed in `call`)
enjoy-binbin
pushed a commit
to enjoy-binbin/redis
that referenced
this pull request
Jul 31, 2023
Starting from Redis 7.0 (redis#9890) we started wrapping everything a command propagates with MULTI/EXEC. The problem is that both SCAN and RANDOMKEY can lazy-expire arbitrary keys (similar behavior to active-expire), and put DELs in a transaction. Fix: When these commands are called without a parent exec-unit (e.g. not in EVAL or MULTI) we avoid wrapping their DELs in a transaction (for the same reasons active-expire and eviction avoids a transaction) This PR adds a per-command flag that indicates that the command may touch arbitrary keys (not the ones in the arguments), and uses that flag to avoid the MULTI-EXEC. For now, this flag is internal, since we're considering other solutions for the future. Note for cluster mode: if SCAN/RANDOMKEY is inside EVAL/MULTI it can still cause the same situation (as it always did), but it won't cause a CROSSSLOT because replicas and AOF do not perform slot checks. The problem with the above is mainly for 3rd party ecosystem tools that propagate commands from master to master, or feed an AOF file with redis-cli into a master. This PR aims to fix the regression in redis 7.0, and we opened redis#11792 to try to handle the bigger problem with lazy expire better for another release.
Merged
show1999
added a commit
to show1999/redis
that referenced
this pull request
Aug 12, 2025
Open
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The mess:
Some parts use alsoPropagate for late propagation, others using an immediate one (propagate()), causing edge cases, ugly/hacky code, and the tendency for bugs
The basic idea is that all commands are propagated via alsoPropagate (i.e. added to a list) and the top-most call() is responsible for going over that list and actually propagating them (and wrapping them in MULTI/EXEC if there's more than one command). This is done in the new function, propagatePendingCommands.
Callers to propagatePendingCommands:
afterCommandafterCommand.afterCommandisn't called. in that case, we have to propagate the deletion explicitly.A "known limitation", which were actually a bug, was fixed because of this commit (see propagate.tcl):
When using a mix of RM_Call with
!and RM_Replicate, the command would propagate out-of-order: first all the commands from RM_Call, and then the ones from RM_ReplicateAnother thing worth mentioning is that if, in the past, a client would issue a MULTI/EXEC with just one write command the server would blindly propagate the MULTI/EXEC too, even though it's redundant. not anymore.
This commit renames propagate() to propagateNow() in order to cause conflicts in pending PRs.
propagatePendingCommands is the only caller of propagateNow, which is now a static, internal helper function.
Optimizations:
Bugfixes:
we need to prevent it from propagating to AOF/replicas
(we always send a notification before propagating the command)