Fail Exec command in case a watched key is expired#9194
Fail Exec command in case a watched key is expired#9194oranagra merged 4 commits intoredis:unstablefrom
Conversation
madolson
left a comment
There was a problem hiding this comment.
Thanks for this change! We should add tests for these cases as well.
|
@perryitay please add a test, it can basically be formed slightly similar to the |
|
@redis/core-team this is a behavior change, but it actually fixes an unpredictable (timing sensitive) behavior, so i think there's no doubt about merging it (people can't really rely on this behavior). |
Co-authored-by: Oran Agra <[email protected]>
|
@zuiderkwast what is there to document here? |
|
Doc: One more note under NOTE here: https://redis.io/topics/transactions#a-hrefcommandswatchwatcha-explained (I see now that the bullet list is messed up too. Probably a missing blank line before the bullet list.) |
|
well, i'm not sure we need to document bug history in redis.io (we have command and argument history, it may be enough). bugs are documented in the release notes.. new code that's written against a fixed version will never see the bug, i also wanted to argue that in #7920 the behavior was more consistent (app can depend on it) than the one fixed here, but actually i'm not sure that's right. anyway, bottom line, i'm ok to document it, although i'm not sure it's necessary, and if we do so, i'd advise to move that note to the bottom into some distinct history section. |
|
Good job, this PR solve most of the problems(keys expired during WATCH to EXEC), but there is still a problem it doesn't solve(keys expired before WATCH but not deleted) IIUC: Time 1: we have a key "foo" is expired but not deleted. Time 2: we WATCH the key "foo". Time 3: execute EXEC and we find "foo" is expired and discard the transaction, but it's not right, cause key "foo" is expired before WATCH. |
|
Maybe we should call |
|
|
||
| /* EXEC with expired watched key is disallowed*/ | ||
| if (isWatchedKeyExpired(c)) { | ||
| c->flags |= (CLIENT_DIRTY_CAS); |
There was a problem hiding this comment.
minor nit: The () is unnecessary and adds no value.
There was a problem hiding this comment.
@soloestoy you're right! Expired but not deleted before WATCH is broken by this PR. I wrote this test case and checked that it passes on unstable but fails in this PR branch:
test {WATCH deletes already expired keys} {
r del x
r debug set-active-expire 0
r set x foo px 1
after 2
r watch x
r multi
r ping
assert_equal {PONG} [r exec]
r debug set-active-expire 1
}It needs to be fixed. This test case can be added to tests/unit/multi.tcl.
|
@soloestoy @zuiderkwast i'm not sure i agree that this is really a problem (although i don't mind fixing it either). let's consider a more realistic examples (with similar timing as the one above): in this case, GET will fail the transaction. The problem this PR fixes is the case INCR will delete the key and create a new (non-volatile) one (which is a violation of the transaction guarantees). In the example without a GET or INCR, if the new code (this PR) will fail the transaction, there's really no harm (not a violation of the transaction guarantees). |
|
I agree with @oranagra that I don't believe it's a real problem. (and I also don't mind fixing it either) I think it can be fixed independently though. |
|
I have different opinions, I believe this is a bug, and I believe if we are sure the expire mechanism can affect transaction then we should handle all the scenarios not only the real expired deletion but also the logic expired time. |
|
@soloestoy after thinking this through more, I suppose I do agree this is a real issue and should be addressed. Watch should call lookup key. I would still be fine merging this than fixing it though, but we should definitely fix it. |
|
Ok. but for clarity, let's handle it in a separate PR (different scenario, and no shared code) |
|
I also agree with @madolson and @soloestoy, |
|
ok. @perryitay will make another PR to fix the other issue. |
There are two issues fixed in this commit: 1. we want to fail the EXEC command in case there is a watched key that's logically expired but not yet deleted by active expire or lazy expire. 2. we saw that currently cache time is update in every `call()` (including nested calls), this time is being also being use for the isKeyExpired comparison, we want to update the cache time only in the first call (execCommand) Co-authored-by: Oran Agra <[email protected]> (cherry picked from commit ac8b1df)
There are two issues fixed in this commit: 1. we want to fail the EXEC command in case there is a watched key that's logically expired but not yet deleted by active expire or lazy expire. 2. we saw that currently cache time is update in every `call()` (including nested calls), this time is being also being use for the isKeyExpired comparison, we want to update the cache time only in the first call (execCommand) Co-authored-by: Oran Agra <[email protected]> (cherry picked from commit ac8b1df)
There are two issues fixed in this commit: 1. we want to fail the EXEC command in case there is a watched key that's logically expired but not yet deleted by active expire or lazy expire. 2. we saw that currently cache time is update in every `call()` (including nested calls), this time is being also being use for the isKeyExpired comparison, we want to update the cache time only in the first call (execCommand) Co-authored-by: Oran Agra <[email protected]>
After a discussion in redis#9068 and redis#9194, we reached an agreement that we should handle all scenarios about expire, not only the real expired deletion but also the logical expired time. This PR aims to fix the issue below: Time 1: we have a key "foo" is expired but not deleted. Time 2: we WATCH the key "foo". Time 3: execute EXEC and we find "foo" is expired and discard the transaction, but it's not right, cause key "foo" is expired before WATCH. To adddress the issue, the WATCH command now calls expireIfNeeded() try to delete the expired keys. But there are two scenarios that expireIfNeeded() cannot work: clients are paused and role is replica. To handle the stale key, we add a flag to record if the watchedKey is stale, and don't flag client as CLIENT_DIRTY_CAS if key is stale when touch the key.
There are two issues fixed in this Pr:
fixing issue [BUG?] EXEC commands succeeds when one of the queued commands uses a WATCHed expired key #9068, we want to fail the EXEC command in case there is a watched key that's logically expired but not yet deleted by active expire or lazy expire.
we saw that currently cache time is update in every call, this time is being also being use for the isKeyExpired comparison, we in case of a nested call, we want to update the cache time only in the first call (execCommand)