Skip to content

[WIP] WATCH command can delete expired keys#9234

Closed
soloestoy wants to merge 1 commit intoredis:unstablefrom
soloestoy:watch-touch-expire
Closed

[WIP] WATCH command can delete expired keys#9234
soloestoy wants to merge 1 commit intoredis:unstablefrom
soloestoy:watch-touch-expire

Conversation

@soloestoy
Copy link
Contributor

After a discussion in #9068 and #9194, we reached an agreement
that we should handle all scenarios about expire, not only the real
expired deletion but also the logic expired time.

This PR aims to fix the issue below:
Time 1: we have a key "foo" is expired but not deleted.
Time 2: we WATCH the key "foo".
Time 3: execute EXEC and we find "foo" is expired and discard the
transaction, but it's not right, cause key "foo" is expired before
WATCH.

To fix it, the WATCH command now calls expireIfNeeded() to delete
the expired keys, and considering stale data would not be deleted if
clients are paused, WATCH command is now with may-replicate flag.

Notes: expireIfNeeded() cannot work in replicas, but WATCH expired keys
in replicas is rare case, we can fix it in future.

@redis/core-team @zuiderkwast please check.

@oranagra oranagra added release-notes indication that this issue needs to be mentioned in the release notes state:major-decision Requires core team consensus labels Jul 14, 2021
@oranagra
Copy link
Member

There are few things that i'm uncomfortable with:

First, removing the may-replicate flag may go unnoticed, nothing will seem to break (no test).
at first, when i was looking at this, for a moment i thought maybe there's a serious bug and all the read commands need that flag, since they all call expireIfNeeded, but i later remembered that expireIsNeeded has an explicit check for write pause.

So in fact the reason we're setting the flag here is not because this command may replicate (unlike EVAL and PUBLISH), but because we wanna make sure that when it'll be called, it'll never skip deleting an expired key.
I.e. this call for expireIfNeeded is different than the others, in the other places we're interested in the return value of the function, and don't care much about the fact that it deletes the key or not.
But here, we don't care about the return value, we only care about the deletion.

Maybe some of this concern can be solved by separating isKeyExpired and expireIfNeeded (one returns a value, and the other returns void).
But as noted in the opening comment (about replicas), it may be problematic to rely on the fact the key is deleted on time, maybe instead we need to create some other mechanism to avoid failing EXEC if the key logical state didn't change between WATCH and EXEC?

p.s. correct me if i'm wrong. regarding WATCH on replicas, this PR doesn't create any new issue with that combination, it's just that the problem it come to fix isn't fixed on replicas, right?

@oranagra
Copy link
Member

suggested course of action:

  1. rename expireIsNeeded which is misleading to IsKeyExpired (or isKeyExpiredAndMaybeDeleteIt 😄)
  2. WATCH can call that function on the keys, and record the result (one bit) somewhere.
  3. when the key is deleted either by lazy-expire, active-expire, or the expiration check in EXEC, if that bit is set, we don't flag the transaction as dirty.

this way we know if the key changed logical state between the watch and exec.

note that a similar issue this PR aims to fix (wrongly failing EXEC) already existed (before the recent change in #9194) for active expire. i.e. consider the following test:

    test {WATCH stale keys should not fail EXEC} {
        r flushdb
        r set x foo px 1
        wait_for_condition {[r dbsize] == 0} else { fail ...}
        r watch x
        r multi
        r ping
        assert_equal {PONG} [r exec]
    } {OK} {needs:debug}

src/multi.c Outdated
Comment on lines 427 to 453
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we fix the TODO for the replicas, we don't need the may-replicate flag, right?

@oranagra wrote:

  1. rename expireIsNeeded which is misleading to IsKeyExpired (or isKeyExpiredAndMaybeDeleteIt 😄)
  2. WATCH can call that function on the keys, and record the result (one bit) somewhere.
  3. when the key is deleted either by lazy-expire, active-expire, or the expiration check in EXEC, if that bit is set, we don't flag the transaction as dirty.

(1) The name is indeed misleading. It doesn't expire keys, so indeed isKeyExpiredAndMaybeDeleteIt is a better name. :) The comment above the function does explain that it doesn't always delete an expired key though. I don't mind renaming it but I wouldn't require it either.

(2-3) This would work, but where do we store this bit? It would need to be in the client struct so it doesn't affect other clients' watches.

A different approach is to record the timestamp of the WATCH somewhere, for example replacing the list client->watched_keys with a dict (key => time of watching). Then EXEC can see if the key was expired already before WATCH. Then we don't need to do anything special in WATCH or in replicas.

PS. I'll be hiking for the next two weeks so I won't be able to follow up on this until August.

Copy link
Member

@oranagra oranagra Jul 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

finding a place for one bit in a hackish way is always easy (there are at least 2 wasted bits in each pointer).
i also thought about storing the time, but that'll indeed require more memory, and in fact all we need is just one bit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a bit hack if you want :-) but storing the time doesn't need WATCH to have the may-replicate flag and WATCH doesn't even need to call expireIfNeeded. It's just an idea. You can choose.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's true for my bit representing the logical state of the key on watch too.. it no longer cares if expireIfNeeded deletes the key or not, so no need for may-replicate and no issues with replicas.
both achieve the same result, one may require a bit more memory, but maybe we could later find more uses for the time than just the state (bit).

@soloestoy
Copy link
Contributor Author

  1. rename expireIsNeeded which is misleading to IsKeyExpired (or isKeyExpiredAndMaybeDeleteIt 😄)
  2. WATCH can call that function on the keys, and record the result (one bit) somewhere.
  3. when the key is deleted either by lazy-expire, active-expire, or the expiration check in EXEC, if that bit is set, we don't flag the transaction as dirty.

Yes, I know this can work, and actually at first I wanna implement it, but it's a bit complex(not only the EXPIRED flag but also need modify touchWatchedKey function and calls to check if it is a stale deletion) and it's just a corner case, and seems we are near the next release, so I wanna fix it in an easy way.

If you all think we need a new WATCH mechanism, we can just do it.

@oranagra
Copy link
Member

i don't see an urgency to fix it. it existed since forever, and the only thing recently changed is just to prevent a race condition (increases the chances the transaction will fail, but the chance existed before), and the outcome of the problem this PR comes to fix is not severe (failing a transaction, something an application is suppose to handle).
i rather take our time and solve it properly.

@soloestoy soloestoy changed the title WATCH command can delete expired keys [WIP] WATCH command can delete expired keys Jul 20, 2021
@soloestoy
Copy link
Contributor Author

ok, I will try to implement the new mechanism.

BTW, since this PR is still WIP, I just reopen issue #9068

@soloestoy
Copy link
Contributor Author

soloestoy commented Sep 23, 2021

  1. rename expireIsNeeded which is misleading to IsKeyExpired (or isKeyExpiredAndMaybeDeleteIt 😄)
  2. WATCH can call that function on the keys, and record the result (one bit) somewhere.
  3. when the key is deleted either by lazy-expire, active-expire, or the expiration check in EXEC, if that bit is set, we don't flag the transaction as dirty.

I'm trying to redesign, but meet some problem that I didn't figure it out.

A problem is about replication, for example about point 3, it works well on master, but expiration doesn't work on replica, replica depends on master's replication stream to delete stale data (DEL or UNLINK command), so how to handle stale data on replica (especially the replication may delay)?

I'd love to have your suggestions @oranagra @zuiderkwast

After a discussion in redis#9068 and redis#9194, we reached an agreement
that we should handle all scenarios about expire, not only the real
expired deletion but also the logical expired time.

This PR aims to fix the issue below:
Time 1: we have a key "foo" is expired but not deleted.
Time 2: we WATCH the key "foo".
Time 3: execute EXEC and we find "foo" is expired and discard the
transaction, but it's not right, cause key "foo" is expired before
WATCH.

To adddress the issue, the WATCH command now calls expireIfNeeded()
try to delete the expired keys.

But there are two scenarios that expireIfNeeded() cannot work:
clients are paused and role is replica. To handle the stale key,
we add a flag to record if the watchedKey is stale, and don't
flag client as CLIENT_DIRTY_CAS if key is stale when touch the key.
@oranagra
Copy link
Member

trying to follow, i need some recap:

  1. the proper use case for WATCH is: WATCH x + GET x + MULTI + SET x + EXEC.
  2. but since we're discussing a (read-only) replica, maybe our (a bit odd) scenario is WATCH x + GET x + MULTI + GET y + EXEC
  3. however, the case in 1 doesn't need this PR to be fixed, so this PR is discussing a case with no GET after WATCH, like WATCH x + MULTI + SET x + EXEC.
  4. so for a replica are we discussing WATCH x + MULTI + GET y + EXEC?

now when i actually think of case 2 (which wasn't the case for this PR till now in my understanding) in a replica, since GET wouldn't expire the key, it'll just return nil, we do indeed have a problem.
so @soloestoy is that what you where talking about?
so you want the DEL the master sends, to avoid invalidating the WATCH since the client already viewed it as expired (got nil without deleting the key).

I think the right way to solve it is that the DEL command (from a master client) deletes a logically expired key, we avoid invalidating the WATCH (if that cached bit i described in my 1,2,3 was set).
i.e. in masters when DEL command arrives on a logically expired key, the expireIfNeeded check will delete it before the DEL command does, and that will be reflected in the DEL command response (and keyspace notification).
in the replica, that DEL proceeds and does the actual deletion, that's what we wanna change.

so that means:
delGenericCommand needs needs to look at the return value of the new isKeyExpiredAndMaybeDeleteIt, and if the client is the master client, call the same logic that active/lazy expire is using?
or alternatively it can pass a flag to isKeyExpiredAndMaybeDeleteIt telling it that it's a DEL request (so it should behave differently than how it normally behaves for other commands that are received from the master).

now you'll probably argue that there's a problem with the above suggestion since it'll have a false positive in case the user sent a DEL command to the master before the key expired, and it arrived to the replica after it was logically expired, and the replica will still execute the expiration path.
As far as i can tell, it doesn't matter for WATCH and EXEC. am i right?
it does however matter if we also wanna solve keyspcae notifications and module hooks better, but if we wanna do that, the only way i can see is add a new new command which will let the master send the reason for DEL (i.e propagate a different DEL for expiration, eviction, and anything else)... maybe that's ok too.

zuiderkwast added a commit to zuiderkwast/redis that referenced this pull request Feb 7, 2022
This is a revamp of redis#9234.

When WATCH is called, a flag is stored if the key is already expired
at the time of watch. The expired key is not deleted, only checked.

When a key is "touched", if it is deleted and it was already expired
when a client watched it, the client is not marked as dirty.

The LRU field of the key name's robj (the argument of WATCH) is used
for the flag.
@oranagra
Copy link
Member

closing in favor of #10256

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-notes indication that this issue needs to be mentioned in the release notes state:major-decision Requires core team consensus state-work-needed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG?] EXEC commands succeeds when one of the queued commands uses a WATCHed expired key

3 participants