Skip to content

Freeze time sampling during command execution, and scripts#10300

Merged
oranagra merged 16 commits intoredis:unstablefrom
enjoy-binbin:script_block_the_time
Oct 9, 2022
Merged

Freeze time sampling during command execution, and scripts#10300
oranagra merged 16 commits intoredis:unstablefrom
enjoy-binbin:script_block_the_time

Conversation

@enjoy-binbin
Copy link
Contributor

@enjoy-binbin enjoy-binbin commented Feb 15, 2022

Freeze time during execution of scripts and all other commands.
This means that a key is either expired or not, and doesn't change
state during a script execution. resolves #10182

This PR try to add a new commandTimeSnapshot function.
The function logic is extracted from keyIsExpired, but the related
calls to fixed_time_expire and mstime() are removed, see below.

In commands, we will avoid calling mstime() multiple times
and just use the one that sampled in call. The background is,
e.g. using PEXPIRE 1 with valgrind sometimes result in the key
being deleted rather than expired. The reason is that both PEXPIRE
command and checkAlreadyExpired call mstime() separately.

There are other more important changes in this PR:

  1. Eliminate fixed_time_expire, it is no longer needed.
    When we want to sample time we should always use a time snapshot.
    We will use in_nested_call instead to update the cached time in call.
  2. Move the call for updateCachedTime from serverCron to afterSleep.
    Now commandTimeSnapshot will always return the sample time, the
    lookupKeyReadWithFlags call in getNodeByQuery will get a outdated
    cached time (because processCommand is out of the call context).
    We put the call to updateCachedTime in aftersleep.
  3. Cache the time each time the module lock Redis.
    Call updateCachedTime in moduleGILAfterLock, affecting RM_ThreadSafeContextLock
    and RM_ThreadSafeContextTryLock

Currently the commandTimeSnapshot change affects the following TTL commands:

  • SET EX / SET PX
  • EXPIRE / PEXPIRE
  • SETEX / PSETEX
  • GETEX EX / GETEX PX
  • TTL / PTTL
  • EXPIRETIME / PEXPIRETIME
  • RESTORE key TTL

And other commands just use the cached mstime (including TIME).

This is considered to be a breaking change since it can break a script that uses a loop to wait for a key to expire.

@enjoy-binbin
Copy link
Contributor Author

Not sure if I did it right, but post it first...

@moticless
Copy link
Collaborator

Maybe I am missing something. Why to touch at several places, maybe miss few places, and create another similar api (mstime_2) which might be confusing, instead of embed the logic inside mstime(). Something like:

mstime_t mstime(void) {
if ((server.script_caller) && (server.script_caller_tid == syscall(__NR_gettid))
  return scriptTimeSnapshot()
else 
 return ustime()/1000;
}

(I didn't verified what is the proper way to gettid())

@enjoy-binbin
Copy link
Contributor Author

modify the mstime, i feel that it will affect/touch other uncertain things (a lot), since mstime is used too much.
my thought is that with a new function, at least we can know what it will be affect

@moticless
Copy link
Collaborator

You are right. Required an alternative indication than script_caller that raised per command call. And yet. Bad idea :)

@oranagra
Copy link
Member

i wouldn't want to change mstime() to possibly returned a cached copy, but i would just to suggest just using server.mstime and maybe completely drop scriptTimeSnapshot() as not needed.

@MeirShpilraien observed today that using PEXPIRE 1 with valgrind sometimes results in the key being deleted rather than expired (different keyspace notification for instance), the reason is that both PEXPIRE command, and checkAlreadyExpired call mstime() separately.
i think no command (or at least most) should call mstime twice, and they should instead all depend on the one sampled in call.

we need to look carefully on the implications (places that go to the command code without passing in call), and other special cases of long running commands (such as MULTI-EXEC, scripts and modules).

@enjoy-binbin
Copy link
Contributor Author

i think no command (or at least most) should call mstime twice, and they should instead all depend on the one sampled in call.

yes, agree with that. how about we just pull this logic as a new function, i suppose it is use in keyIsExpired, and it will be safe to use in the most cases

    /* If we are in the context of a Lua script, we pretend that time is
     * blocked to when the Lua script started. This way a key can expire
     * only the first time it is accessed and not in the middle of the
     * script execution, making propagation to slaves / AOF consistent.
     * See issue #1525 on Github for more information. */
    if (server.script_caller) {
        now = scriptTimeSnapshot();
    }
    /* If we are in the middle of a command execution, we still want to use
     * a reference time that does not change: in that case we just use the
     * cached time, that we update before each call in the call() function.
     * This way we avoid that commands such as RPOPLPUSH or similar, that
     * may re-open the same key multiple times, can invalidate an already
     * open object in a next call, if the next call will see the key expired,
     * while the first did not. */
    else if (server.fixed_time_expire > 0) {
        now = server.mstime;
    }
    /* For the other cases, we want to use the most fresh time we have. */
    else {
        now = mstime();
    }

@oranagra
Copy link
Member

oranagra commented Sep 4, 2022

makes sense.

for blocked / busy scripts that do redis.call, we'll need the first part anyway.
i'm not sure there's a need for the last part (calling mstime() directly) in most cases.
i.e. it makes sense in some generic code that checks expiry. but in commands, we can be sure we're always in call (except for AOF loading).

i wouldn't want to change mstime() to possibly returned a cached copy, but i would just to suggest just using server.mstime and maybe completely drop scriptTimeSnapshot() as not needed.

i now realize i was wrong, that would have done the wrong thing on blocked scripts (once who call processEventsWhileBlocked).

@enjoy-binbin
Copy link
Contributor Author

enjoy-binbin commented Sep 4, 2022

i'm not sure there's a need for the last part (calling mstime() directly) in most cases.
i.e. it makes sense in some generic code that checks expiry. but in commands, we can be sure we're always in call (except for AOF loading).

in most cases, or in the commands, i think we will go the second part, catch server.fixed_time_expire > 0 and use the server.mstime. we do maintenance the fixed_time_expire in call

void call(client *c, int flags) {
    /* Update cache time, in case we have nested calls we want to
     * update only on the first call*/
    if (server.fixed_time_expire++ == 0) {
        updateCachedTimeWithUs(0,call_timer);
    }

    server.in_nested_call++;
    c->cmd->proc(c);
    server.in_nested_call--;

    server.fixed_time_expire--;

@oranagra
Copy link
Member

oranagra commented Sep 4, 2022

yes i know.. i was arguing that maybe there's no need for the if on in_nested_call, and the explicit call to mstime(), and we can just always use either server.mstime or scriptTimeSnapshot().
but anyway, let's proceed with your proposal, and promote all this code to a function, what's left is to find a name for it.
how about commandTimeSnapshot()?

This PR try to block time while executing the scripts.
Currently it affects the following commands:
- SET EX / SET PX
- EXPIRE / PEXPIRE
- SETEX / PSETEX
- GETEX EX / GETEX PX
- TTL / PTTL
- EXPIRETIME / PEXPIRETIME
- RESTORE key TTL

If the above commands are called during script execution,
if you set the expiration time or calculate the expriation
time, the scrit time will be used instead of the current time.
@enjoy-binbin
Copy link
Contributor Author

yes i know.. i was arguing that maybe there's no need for the if on in_nested_call, and the explicit call to mstime(), and we can just always use either server.mstime or scriptTimeSnapshot().

ohh, i see, this makes sense too, anyway i pushed the commit (go with commandTimeSnapshot), we can see if we need to simplify it.

btw, for now i only do this for those commands that involve TTL, do I need to go through all the commands and change the mstime in the command to commandTimeSnapshot?

Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly LGTM.
please also scan for commands that directly access server.mstime and others.

@enjoy-binbin
Copy link
Contributor Author

please also scan for commands that directly access server.mstime and others.

i did scan for commands that access server.mstime before, it was less. btw, what is and others., i don't get it...

(the one in sentinel can be dropped if not needed, i just saw it and could not help it... i guess someday we will drop the sentinel with cluster V2, so it not important)

@oranagra
Copy link
Member

oranagra commented Sep 5, 2022

by others i meant server.unixtime and server.ustime

@oranagra
Copy link
Member

oranagra commented Sep 5, 2022

@moticless can you take a look at the sentinel changes? if we're not sure about them, i'm willing to drop them.

@enjoy-binbin
Copy link
Contributor Author

i checked server.unixtime and server.ustime, it looks ok to me, there is no command to access them directly

@oranagra
Copy link
Member

conceptually approved in a core-team meeting to be part of 7.2.
this is considered to be a breaking change since it can break a script that uses a loop to wait for a key to expire.
needs one final review and it can be merged.

@oranagra oranagra added state:major-decision Requires core team consensus release-notes indication that this issue needs to be mentioned in the release notes breaking-change This change can potentially break existing application labels Sep 22, 2022
@oranagra oranagra changed the title script block the time when setting or calculating the expiration time Freeze time sampling during command execution, and scripts Sep 22, 2022
@oranagra
Copy link
Member

oranagra commented Oct 2, 2022

@enjoy-binbin please have a look at the test failure. seems like a result of f80cb16

@enjoy-binbin
Copy link
Contributor Author

@oranagra yes, i mentioned it in #10300 (comment)

@MeirShpilraien
Copy link

I did not go over the entire conversation, the code looks good to me. I just want to raise one question, do we also want to cache the time and use it on modules that perform background operations. Lets say cache the time each time the module lock Redis (RedisModule_ThreadSafeCtxLock)?

@oranagra
Copy link
Member

oranagra commented Oct 2, 2022

@MeirShpilraien i think you may be right.
i.e. when modules query time directly, they have a choice between cached and non-cached time.
but when they execute commands, they don't have that choice.
and since considering the cached time which commands look at, could change anyway during an execution of a blocked command (by the main thread), i guess we wanna make sure it's fresh.

it's kinda also in the direction of what i proposed here: #10300 (comment)
which is to update the cached time in afterSleep too.

for a moment, i had a concern is about performance overhead, but considering we're locking a mutex (system call), and considering updating time is likely to be a VDSO, and in any way we do it per command in a normal pipelined command workfload, it's overhead must be very low.

@enjoy-binbin i think you can go ahead and implement both changes.
mention them in the top comment, and let's wait for feedback after that.

enjoy-binbin and others added 2 commits October 2, 2022 16:48
in getNodeByQuery, the lookupKey, here it used to use server.mstime(),
and the new code we return server.mstime (we haven't updated the cache time yet),
then produce different expired results:
- the old code incr missing_keys (the key expired)
- the new code incr existing_keys (the key is alive).

Co-authored-by: Oran Agra <[email protected]>
Co-authored-by: Meir Shpilraien (Spielrein) <[email protected]>
Co-authored-by: Oran Agra <[email protected]>
@oranagra oranagra merged commit 35b3fbd into redis:unstable Oct 9, 2022
@enjoy-binbin enjoy-binbin deleted the script_block_the_time branch October 9, 2022 06:02
enjoy-binbin added a commit to enjoy-binbin/redis that referenced this pull request Oct 9, 2022
The old `server.unixtime*1000000` will overflow in 32-bits.
This was introduced in redis#10300 (not released).
oranagra pushed a commit that referenced this pull request Oct 9, 2022
The old `server.unixtime*1000000` will overflow in 32-bits.
This was introduced in #10300 (not released).
oranagra added a commit that referenced this pull request Feb 16, 2023
… for RM_Call (#11770)

* Make it clear that current_client is the root client that was called by
  external connection
* add executing_client which is the client that runs the current command
  (can be a module or a script)
* Remove script_caller that was used for commands that have CLIENT_SCRIPT
  to get the client that called the script. in most cases, that's the current_client,
  and in others (when being called from a module), it could be an intermediate
  client when we actually want the original one used by the external connection.

bugfixes:
* RM_Call with C flag should log ACL errors with the requested user rather than
  the one used by the original client, this also solves a crash when RM_Call is used
  with C flag from a detached thread safe context.
* addACLLogEntry would have logged info about the script_caller, but in case the
  script was issued by a module command we actually want the current_client. the
  exception is when RM_Call is called from a timer event, in which case we don't
  have a current_client.

behavior changes:
* client side tracking for scripts now tracks the keys that are read by the script
  instead of the keys that are declared by the caller for EVAL

other changes:
* Log both current_client and executing_client in the crash log.
* remove prepareLuaClient and resetLuaClient, being dead code that was forgotten.
* remove scriptTimeSnapshot and snapshot_time and instead add cmd_time_snapshot
  that serves all commands and is reset only when execution nesting starts.
* remove code to propagate CLIENT_FORCE_REPL from the executed command
  to the script caller since scripts aren't propagated anyway these days and anyway
  this flag wouldn't have had an effect since CLIENT_PREVENT_PROP is added by scriptResetRun.
* fix a module GIL violation issue in afterSleep that was introduced in #10300 (unreleased)
madolson pushed a commit to madolson/redis that referenced this pull request Apr 19, 2023
Freeze time during execution of scripts and all other commands.
This means that a key is either expired or not, and doesn't change
state during a script execution. resolves redis#10182

This PR try to add a new `commandTimeSnapshot` function.
The function logic is extracted from `keyIsExpired`, but the related
calls to `fixed_time_expire` and `mstime()` are removed, see below.

In commands, we will avoid calling `mstime()` multiple times
and just use the one that sampled in call. The background is,
e.g. using `PEXPIRE 1` with valgrind sometimes result in the key
being deleted rather than expired. The reason is that both `PEXPIRE`
command and `checkAlreadyExpired` call `mstime()` separately.

There are other more important changes in this PR:
1. Eliminate `fixed_time_expire`, it is no longer needed. 
   When we want to sample time we should always use a time snapshot. 
   We will use `in_nested_call` instead to update the cached time in `call`.
2. Move the call for `updateCachedTime` from `serverCron` to `afterSleep`.
    Now `commandTimeSnapshot` will always return the sample time, the
    `lookupKeyReadWithFlags` call in `getNodeByQuery` will get a outdated
    cached time (because `processCommand` is out of the `call` context).
    We put the call to `updateCachedTime` in `aftersleep`.
3. Cache the time each time the module lock Redis.
    Call `updateCachedTime` in `moduleGILAfterLock`, affecting `RM_ThreadSafeContextLock`
    and `RM_ThreadSafeContextTryLock`

Currently the commandTimeSnapshot change affects the following TTL commands:
- SET EX / SET PX
- EXPIRE / PEXPIRE
- SETEX / PSETEX
- GETEX EX / GETEX PX
- TTL / PTTL
- EXPIRETIME / PEXPIRETIME
- RESTORE key TTL

And other commands just use the cached mstime (including TIME).

This is considered to be a breaking change since it can break a script
that uses a loop to wait for a key to expire.
madolson pushed a commit to madolson/redis that referenced this pull request Apr 19, 2023
The old `server.unixtime*1000000` will overflow in 32-bits.
This was introduced in redis#10300 (not released).
enjoy-binbin added a commit to enjoy-binbin/redis that referenced this pull request Jul 31, 2023
Freeze time during execution of scripts and all other commands.
This means that a key is either expired or not, and doesn't change
state during a script execution. resolves redis#10182

This PR try to add a new `commandTimeSnapshot` function.
The function logic is extracted from `keyIsExpired`, but the related
calls to `fixed_time_expire` and `mstime()` are removed, see below.

In commands, we will avoid calling `mstime()` multiple times
and just use the one that sampled in call. The background is,
e.g. using `PEXPIRE 1` with valgrind sometimes result in the key
being deleted rather than expired. The reason is that both `PEXPIRE`
command and `checkAlreadyExpired` call `mstime()` separately.

There are other more important changes in this PR:
1. Eliminate `fixed_time_expire`, it is no longer needed. 
   When we want to sample time we should always use a time snapshot. 
   We will use `in_nested_call` instead to update the cached time in `call`.
2. Move the call for `updateCachedTime` from `serverCron` to `afterSleep`.
    Now `commandTimeSnapshot` will always return the sample time, the
    `lookupKeyReadWithFlags` call in `getNodeByQuery` will get a outdated
    cached time (because `processCommand` is out of the `call` context).
    We put the call to `updateCachedTime` in `aftersleep`.
3. Cache the time each time the module lock Redis.
    Call `updateCachedTime` in `moduleGILAfterLock`, affecting `RM_ThreadSafeContextLock`
    and `RM_ThreadSafeContextTryLock`

Currently the commandTimeSnapshot change affects the following TTL commands:
- SET EX / SET PX
- EXPIRE / PEXPIRE
- SETEX / PSETEX
- GETEX EX / GETEX PX
- TTL / PTTL
- EXPIRETIME / PEXPIRETIME
- RESTORE key TTL

And other commands just use the cached mstime (including TIME).

This is considered to be a breaking change since it can break a script
that uses a loop to wait for a key to expire.
enjoy-binbin added a commit to enjoy-binbin/redis that referenced this pull request Jul 31, 2023
The old `server.unixtime*1000000` will overflow in 32-bits.
This was introduced in redis#10300 (not released).
enjoy-binbin pushed a commit to enjoy-binbin/redis that referenced this pull request Jul 31, 2023
… for RM_Call (redis#11770)

* Make it clear that current_client is the root client that was called by
  external connection
* add executing_client which is the client that runs the current command
  (can be a module or a script)
* Remove script_caller that was used for commands that have CLIENT_SCRIPT
  to get the client that called the script. in most cases, that's the current_client,
  and in others (when being called from a module), it could be an intermediate
  client when we actually want the original one used by the external connection.

bugfixes:
* RM_Call with C flag should log ACL errors with the requested user rather than
  the one used by the original client, this also solves a crash when RM_Call is used
  with C flag from a detached thread safe context.
* addACLLogEntry would have logged info about the script_caller, but in case the
  script was issued by a module command we actually want the current_client. the
  exception is when RM_Call is called from a timer event, in which case we don't
  have a current_client.

behavior changes:
* client side tracking for scripts now tracks the keys that are read by the script
  instead of the keys that are declared by the caller for EVAL

other changes:
* Log both current_client and executing_client in the crash log.
* remove prepareLuaClient and resetLuaClient, being dead code that was forgotten.
* remove scriptTimeSnapshot and snapshot_time and instead add cmd_time_snapshot
  that serves all commands and is reset only when execution nesting starts.
* remove code to propagate CLIENT_FORCE_REPL from the executed command
  to the script caller since scripts aren't propagated anyway these days and anyway
  this flag wouldn't have had an effect since CLIENT_PREVENT_PROP is added by scriptResetRun.
* fix a module GIL violation issue in afterSleep that was introduced in redis#10300 (unreleased)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking-change This change can potentially break existing application release-notes indication that this issue needs to be mentioned in the release notes state:major-decision Requires core team consensus

Projects

Status: Done
Archived in project

Development

Successfully merging this pull request may close these issues.

Should time pass in scripts?

6 participants