[WIP] WATCH command can delete expired keys by soloestoy · Pull Request #9234 · redis/redis

soloestoy · 2021-07-14T11:41:26Z

After a discussion in #9068 and #9194, we reached an agreement
that we should handle all scenarios about expire, not only the real
expired deletion but also the logic expired time.

This PR aims to fix the issue below:
Time 1: we have a key "foo" is expired but not deleted.
Time 2: we WATCH the key "foo".
Time 3: execute EXEC and we find "foo" is expired and discard the
transaction, but it's not right, cause key "foo" is expired before
WATCH.

To fix it, the WATCH command now calls expireIfNeeded() to delete
the expired keys, and considering stale data would not be deleted if
clients are paused, WATCH command is now with may-replicate flag.

Notes: expireIfNeeded() cannot work in replicas, but WATCH expired keys
in replicas is rare case, we can fix it in future.

@redis/core-team @zuiderkwast please check.

oranagra · 2021-07-14T18:41:22Z

There are few things that i'm uncomfortable with:

First, removing the may-replicate flag may go unnoticed, nothing will seem to break (no test).
at first, when i was looking at this, for a moment i thought maybe there's a serious bug and all the read commands need that flag, since they all call expireIfNeeded, but i later remembered that expireIsNeeded has an explicit check for write pause.

So in fact the reason we're setting the flag here is not because this command may replicate (unlike EVAL and PUBLISH), but because we wanna make sure that when it'll be called, it'll never skip deleting an expired key.
I.e. this call for expireIfNeeded is different than the others, in the other places we're interested in the return value of the function, and don't care much about the fact that it deletes the key or not.
But here, we don't care about the return value, we only care about the deletion.

Maybe some of this concern can be solved by separating isKeyExpired and expireIfNeeded (one returns a value, and the other returns void).
But as noted in the opening comment (about replicas), it may be problematic to rely on the fact the key is deleted on time, maybe instead we need to create some other mechanism to avoid failing EXEC if the key logical state didn't change between WATCH and EXEC?

p.s. correct me if i'm wrong. regarding WATCH on replicas, this PR doesn't create any new issue with that combination, it's just that the problem it come to fix isn't fixed on replicas, right?

oranagra · 2021-07-15T09:46:24Z

suggested course of action:

rename expireIsNeeded which is misleading to IsKeyExpired (or isKeyExpiredAndMaybeDeleteIt 😄)
WATCH can call that function on the keys, and record the result (one bit) somewhere.
when the key is deleted either by lazy-expire, active-expire, or the expiration check in EXEC, if that bit is set, we don't flag the transaction as dirty.

this way we know if the key changed logical state between the watch and exec.

note that a similar issue this PR aims to fix (wrongly failing EXEC) already existed (before the recent change in #9194) for active expire. i.e. consider the following test:

    test {WATCH stale keys should not fail EXEC} {
        r flushdb
        r set x foo px 1
        wait_for_condition {[r dbsize] == 0} else { fail ...}
        r watch x
        r multi
        r ping
        assert_equal {PONG} [r exec]
    } {OK} {needs:debug}

zuiderkwast · 2021-07-15T11:18:19Z

src/multi.c

If we fix the TODO for the replicas, we don't need the may-replicate flag, right?

@oranagra wrote:

rename expireIsNeeded which is misleading to IsKeyExpired (or isKeyExpiredAndMaybeDeleteIt 😄)

WATCH can call that function on the keys, and record the result (one bit) somewhere.

when the key is deleted either by lazy-expire, active-expire, or the expiration check in EXEC, if that bit is set, we don't flag the transaction as dirty.

(1) The name is indeed misleading. It doesn't expire keys, so indeed isKeyExpiredAndMaybeDeleteIt is a better name. :) The comment above the function does explain that it doesn't always delete an expired key though. I don't mind renaming it but I wouldn't require it either.

(2-3) This would work, but where do we store this bit? It would need to be in the client struct so it doesn't affect other clients' watches.

A different approach is to record the timestamp of the WATCH somewhere, for example replacing the list client->watched_keys with a dict (key => time of watching). Then EXEC can see if the key was expired already before WATCH. Then we don't need to do anything special in WATCH or in replicas.

PS. I'll be hiking for the next two weeks so I won't be able to follow up on this until August.

finding a place for one bit in a hackish way is always easy (there are at least 2 wasted bits in each pointer).
i also thought about storing the time, but that'll indeed require more memory, and in fact all we need is just one bit.

Use a bit hack if you want :-) but storing the time doesn't need WATCH to have the may-replicate flag and WATCH doesn't even need to call expireIfNeeded. It's just an idea. You can choose.

that's true for my bit representing the logical state of the key on watch too.. it no longer cares if expireIfNeeded deletes the key or not, so no need for may-replicate and no issues with replicas.
both achieve the same result, one may require a bit more memory, but maybe we could later find more uses for the time than just the state (bit).

soloestoy · 2021-07-20T02:55:08Z

rename expireIsNeeded which is misleading to IsKeyExpired (or isKeyExpiredAndMaybeDeleteIt 😄)

WATCH can call that function on the keys, and record the result (one bit) somewhere.

when the key is deleted either by lazy-expire, active-expire, or the expiration check in EXEC, if that bit is set, we don't flag the transaction as dirty.

Yes, I know this can work, and actually at first I wanna implement it, but it's a bit complex(not only the EXPIRED flag but also need modify touchWatchedKey function and calls to check if it is a stale deletion) and it's just a corner case, and seems we are near the next release, so I wanna fix it in an easy way.

If you all think we need a new WATCH mechanism, we can just do it.

oranagra · 2021-07-20T05:44:41Z

i don't see an urgency to fix it. it existed since forever, and the only thing recently changed is just to prevent a race condition (increases the chances the transaction will fail, but the chance existed before), and the outcome of the problem this PR comes to fix is not severe (failing a transaction, something an application is suppose to handle).
i rather take our time and solve it properly.

soloestoy · 2021-07-20T07:28:23Z

ok, I will try to implement the new mechanism.

BTW, since this PR is still WIP, I just reopen issue #9068

soloestoy · 2021-09-23T06:24:51Z

rename expireIsNeeded which is misleading to IsKeyExpired (or isKeyExpiredAndMaybeDeleteIt 😄)

WATCH can call that function on the keys, and record the result (one bit) somewhere.

when the key is deleted either by lazy-expire, active-expire, or the expiration check in EXEC, if that bit is set, we don't flag the transaction as dirty.

I'm trying to redesign, but meet some problem that I didn't figure it out.

A problem is about replication, for example about point 3, it works well on master, but expiration doesn't work on replica, replica depends on master's replication stream to delete stale data (DEL or UNLINK command), so how to handle stale data on replica (especially the replication may delay)?

I'd love to have your suggestions @oranagra @zuiderkwast

After a discussion in redis#9068 and redis#9194, we reached an agreement that we should handle all scenarios about expire, not only the real expired deletion but also the logical expired time. This PR aims to fix the issue below: Time 1: we have a key "foo" is expired but not deleted. Time 2: we WATCH the key "foo". Time 3: execute EXEC and we find "foo" is expired and discard the transaction, but it's not right, cause key "foo" is expired before WATCH. To adddress the issue, the WATCH command now calls expireIfNeeded() try to delete the expired keys. But there are two scenarios that expireIfNeeded() cannot work: clients are paused and role is replica. To handle the stale key, we add a flag to record if the watchedKey is stale, and don't flag client as CLIENT_DIRTY_CAS if key is stale when touch the key.

oranagra · 2021-09-23T15:17:21Z

trying to follow, i need some recap:

the proper use case for WATCH is: WATCH x + GET x + MULTI + SET x + EXEC.
but since we're discussing a (read-only) replica, maybe our (a bit odd) scenario is WATCH x + GET x + MULTI + GET y + EXEC
however, the case in 1 doesn't need this PR to be fixed, so this PR is discussing a case with no GET after WATCH, like WATCH x + MULTI + SET x + EXEC.
so for a replica are we discussing WATCH x + MULTI + GET y + EXEC?

now when i actually think of case 2 (which wasn't the case for this PR till now in my understanding) in a replica, since GET wouldn't expire the key, it'll just return nil, we do indeed have a problem.
so @soloestoy is that what you where talking about?
so you want the DEL the master sends, to avoid invalidating the WATCH since the client already viewed it as expired (got nil without deleting the key).

I think the right way to solve it is that the DEL command (from a master client) deletes a logically expired key, we avoid invalidating the WATCH (if that cached bit i described in my 1,2,3 was set).
i.e. in masters when DEL command arrives on a logically expired key, the expireIfNeeded check will delete it before the DEL command does, and that will be reflected in the DEL command response (and keyspace notification).
in the replica, that DEL proceeds and does the actual deletion, that's what we wanna change.

so that means:
delGenericCommand needs needs to look at the return value of the new isKeyExpiredAndMaybeDeleteIt, and if the client is the master client, call the same logic that active/lazy expire is using?
or alternatively it can pass a flag to isKeyExpiredAndMaybeDeleteIt telling it that it's a DEL request (so it should behave differently than how it normally behaves for other commands that are received from the master).

now you'll probably argue that there's a problem with the above suggestion since it'll have a false positive in case the user sent a DEL command to the master before the key expired, and it arrived to the replica after it was logically expired, and the replica will still execute the expiration path.
As far as i can tell, it doesn't matter for WATCH and EXEC. am i right?
it does however matter if we also wanna solve keyspcae notifications and module hooks better, but if we wanna do that, the only way i can see is add a new new command which will let the master send the reason for DEL (i.e propagate a different DEL for expiration, eviction, and anything else)... maybe that's ok too.

This is a revamp of redis#9234. When WATCH is called, a flag is stored if the key is already expired at the time of watch. The expired key is not deleted, only checked. When a key is "touched", if it is deleted and it was already expired when a client watched it, the client is not marked as dirty. The LRU field of the key name's robj (the argument of WATCH) is used for the flag.

oranagra · 2022-02-21T09:31:32Z

closing in favor of #10256

soloestoy requested review from madolson, oranagra, yossigo and zuiderkwast July 14, 2021 11:42

oranagra added release-notes indication that this issue needs to be mentioned in the release notes state:major-decision Requires core team consensus labels Jul 14, 2021

zuiderkwast reviewed Jul 15, 2021

View reviewed changes

soloestoy changed the title ~~WATCH command can delete expired keys~~ [WIP] WATCH command can delete expired keys Jul 20, 2021

oranagra linked an issue Jul 20, 2021 that may be closed by this pull request

[BUG?] EXEC commands succeeds when one of the queued commands uses a WATCHed expired key #9068

Closed

soloestoy added the state-work-needed label Sep 16, 2021

zuiderkwast mentioned this pull request Feb 7, 2022

Delete key doesn't dirty client who watched stale key #10256

Merged

soloestoy force-pushed the watch-touch-expire branch from b0c4538 to f797d40 Compare February 16, 2022 03:51

oranagra closed this Feb 21, 2022

enjoy-binbin mentioned this pull request Jul 2, 2022

[BUG?] EXEC commands succeeds when one of the queued commands uses a WATCHed expired key #9068

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] WATCH command can delete expired keys#9234

[WIP] WATCH command can delete expired keys#9234
soloestoy wants to merge 1 commit intoredis:unstablefrom
soloestoy:watch-touch-expire

soloestoy commented Jul 14, 2021

Uh oh!

oranagra commented Jul 14, 2021

Uh oh!

oranagra commented Jul 15, 2021

Uh oh!

zuiderkwast Jul 15, 2021

Uh oh!

oranagra Jul 15, 2021 •

edited

Loading

Uh oh!

zuiderkwast Jul 15, 2021

Uh oh!

oranagra Jul 15, 2021

Uh oh!

soloestoy commented Jul 20, 2021

Uh oh!

oranagra commented Jul 20, 2021

Uh oh!

soloestoy commented Jul 20, 2021

Uh oh!

soloestoy commented Sep 23, 2021 •

edited

Loading

Uh oh!

oranagra commented Sep 23, 2021

Uh oh!

oranagra commented Feb 21, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

soloestoy commented Jul 14, 2021

Uh oh!

oranagra commented Jul 14, 2021

Uh oh!

oranagra commented Jul 15, 2021

Uh oh!

zuiderkwast Jul 15, 2021

Choose a reason for hiding this comment

Uh oh!

oranagra Jul 15, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

zuiderkwast Jul 15, 2021

Choose a reason for hiding this comment

Uh oh!

oranagra Jul 15, 2021

Choose a reason for hiding this comment

Uh oh!

soloestoy commented Jul 20, 2021

Uh oh!

oranagra commented Jul 20, 2021

Uh oh!

soloestoy commented Jul 20, 2021

Uh oh!

soloestoy commented Sep 23, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

oranagra commented Sep 23, 2021

Uh oh!

oranagra commented Feb 21, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

oranagra Jul 15, 2021 •

edited

Loading

soloestoy commented Sep 23, 2021 •

edited

Loading