[WIP] WATCH command can delete expired keys#9234
[WIP] WATCH command can delete expired keys#9234soloestoy wants to merge 1 commit intoredis:unstablefrom
Conversation
|
There are few things that i'm uncomfortable with: First, removing the may-replicate flag may go unnoticed, nothing will seem to break (no test). So in fact the reason we're setting the flag here is not because this command may replicate (unlike EVAL and PUBLISH), but because we wanna make sure that when it'll be called, it'll never skip deleting an expired key. Maybe some of this concern can be solved by separating p.s. correct me if i'm wrong. regarding WATCH on replicas, this PR doesn't create any new issue with that combination, it's just that the problem it come to fix isn't fixed on replicas, right? |
|
suggested course of action:
this way we know if the key changed logical state between the watch and exec. note that a similar issue this PR aims to fix (wrongly failing EXEC) already existed (before the recent change in #9194) for active expire. i.e. consider the following test: test {WATCH stale keys should not fail EXEC} {
r flushdb
r set x foo px 1
wait_for_condition {[r dbsize] == 0} else { fail ...}
r watch x
r multi
r ping
assert_equal {PONG} [r exec]
} {OK} {needs:debug} |
src/multi.c
Outdated
There was a problem hiding this comment.
If we fix the TODO for the replicas, we don't need the may-replicate flag, right?
@oranagra wrote:
- rename
expireIsNeededwhich is misleading toIsKeyExpired(orisKeyExpiredAndMaybeDeleteIt😄)- WATCH can call that function on the keys, and record the result (one bit) somewhere.
- when the key is deleted either by lazy-expire, active-expire, or the expiration check in EXEC, if that bit is set, we don't flag the transaction as dirty.
(1) The name is indeed misleading. It doesn't expire keys, so indeed isKeyExpiredAndMaybeDeleteIt is a better name. :) The comment above the function does explain that it doesn't always delete an expired key though. I don't mind renaming it but I wouldn't require it either.
(2-3) This would work, but where do we store this bit? It would need to be in the client struct so it doesn't affect other clients' watches.
A different approach is to record the timestamp of the WATCH somewhere, for example replacing the list client->watched_keys with a dict (key => time of watching). Then EXEC can see if the key was expired already before WATCH. Then we don't need to do anything special in WATCH or in replicas.
PS. I'll be hiking for the next two weeks so I won't be able to follow up on this until August.
There was a problem hiding this comment.
finding a place for one bit in a hackish way is always easy (there are at least 2 wasted bits in each pointer).
i also thought about storing the time, but that'll indeed require more memory, and in fact all we need is just one bit.
There was a problem hiding this comment.
Use a bit hack if you want :-) but storing the time doesn't need WATCH to have the may-replicate flag and WATCH doesn't even need to call expireIfNeeded. It's just an idea. You can choose.
There was a problem hiding this comment.
that's true for my bit representing the logical state of the key on watch too.. it no longer cares if expireIfNeeded deletes the key or not, so no need for may-replicate and no issues with replicas.
both achieve the same result, one may require a bit more memory, but maybe we could later find more uses for the time than just the state (bit).
Yes, I know this can work, and actually at first I wanna implement it, but it's a bit complex(not only the If you all think we need a new |
|
i don't see an urgency to fix it. it existed since forever, and the only thing recently changed is just to prevent a race condition (increases the chances the transaction will fail, but the chance existed before), and the outcome of the problem this PR comes to fix is not severe (failing a transaction, something an application is suppose to handle). |
|
ok, I will try to implement the new mechanism. BTW, since this PR is still WIP, I just reopen issue #9068 |
I'm trying to redesign, but meet some problem that I didn't figure it out. A problem is about replication, for example about point 3, it works well on master, but expiration doesn't work on replica, replica depends on master's replication stream to delete stale data (DEL or UNLINK command), so how to handle stale data on replica (especially the replication may delay)? I'd love to have your suggestions @oranagra @zuiderkwast |
After a discussion in redis#9068 and redis#9194, we reached an agreement that we should handle all scenarios about expire, not only the real expired deletion but also the logical expired time. This PR aims to fix the issue below: Time 1: we have a key "foo" is expired but not deleted. Time 2: we WATCH the key "foo". Time 3: execute EXEC and we find "foo" is expired and discard the transaction, but it's not right, cause key "foo" is expired before WATCH. To adddress the issue, the WATCH command now calls expireIfNeeded() try to delete the expired keys. But there are two scenarios that expireIfNeeded() cannot work: clients are paused and role is replica. To handle the stale key, we add a flag to record if the watchedKey is stale, and don't flag client as CLIENT_DIRTY_CAS if key is stale when touch the key.
|
trying to follow, i need some recap:
now when i actually think of case 2 (which wasn't the case for this PR till now in my understanding) in a replica, since GET wouldn't expire the key, it'll just return nil, we do indeed have a problem. I think the right way to solve it is that the DEL command (from a master client) deletes a logically expired key, we avoid invalidating the WATCH (if that cached bit i described in my 1,2,3 was set). so that means: now you'll probably argue that there's a problem with the above suggestion since it'll have a false positive in case the user sent a DEL command to the master before the key expired, and it arrived to the replica after it was logically expired, and the replica will still execute the expiration path. |
This is a revamp of redis#9234. When WATCH is called, a flag is stored if the key is already expired at the time of watch. The expired key is not deleted, only checked. When a key is "touched", if it is deleted and it was already expired when a client watched it, the client is not marked as dirty. The LRU field of the key name's robj (the argument of WATCH) is used for the flag.
b0c4538 to
f797d40
Compare
|
closing in favor of #10256 |
After a discussion in #9068 and #9194, we reached an agreement
that we should handle all scenarios about expire, not only the real
expired deletion but also the logic expired time.
This PR aims to fix the issue below:
Time 1: we have a key "foo" is expired but not deleted.
Time 2: we WATCH the key "foo".
Time 3: execute EXEC and we find "foo" is expired and discard the
transaction, but it's not right, cause key "foo" is expired before
WATCH.
To fix it, the WATCH command now calls expireIfNeeded() to delete
the expired keys, and considering stale data would not be deleted if
clients are paused, WATCH command is now with may-replicate flag.
Notes: expireIfNeeded() cannot work in replicas, but WATCH expired keys
in replicas is rare case, we can fix it in future.
@redis/core-team @zuiderkwast please check.