Expose script flags to processCommand for better handling by oranagra · Pull Request #10744 · redis/redis

oranagra · 2022-05-18T13:24:38Z

This PR exposes a script's shebang flags (no-writes, allow-oom, allow-stale,
etc.) to processCommand, so that it can handle such scripts better.

The important part is that read-only scripts (not just EVAL_RO
and FCALL_RO, but also ones with no-writes executed by normal EVAL or
FCALL) will now be permitted to run during CLIENT PAUSE WRITE (unlike
before, when only the _RO commands would be processed).
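
A minimal sketch of the pause decision described above, assuming processCommand sees the effective flags of the script rather than the static flags of EVAL or FCALL. All names and bit values here (the `SK_` prefix, `sk_should_postpone`) are hypothetical and are not taken from server.h:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch; not the values used by redis. */
#define SK_CMD_WRITE         (1ULL << 0)
#define SK_CMD_MAY_REPLICATE (1ULL << 1)

typedef enum { SK_PAUSE_NONE, SK_PAUSE_WRITE, SK_PAUSE_ALL } sk_pause_type;

/* Returns true when a command with the given effective flags should be
 * postponed under the given pause state. For a script, the effective flags
 * are assumed to come from its shebang, so a no-writes script carries
 * neither bit and is let through during a write pause. */
bool sk_should_postpone(sk_pause_type pause, uint64_t effective_flags) {
    if (pause == SK_PAUSE_ALL) return true;
    if (pause == SK_PAUSE_WRITE)
        return (effective_flags & (SK_CMD_WRITE | SK_CMD_MAY_REPLICATE)) != 0;
    return false;
}
```

With this shape, a plain EVAL of a no-writes script and an EVAL_RO call end up with the same answer during CLIENT PAUSE WRITE, which is the behavior the paragraph above describes.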

Other than that, some errors like OOM, READONLY, MASTERDOWN are now
handled by processCommand rather than by the command itself. This affects
the error string (and even the error code in some cases), and command stats.

Besides that, the may-replicate commands, PFCOUNT and PUBLISH, will now
be considered write commands in scripts and will be blocked in all
read-only scripts just like other write commands.
They'll also be blocked in EVAL_RO (i.e. even for scripts without the
no-writes shebang flag).
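
As a rough illustration of that rule (a sketch only, with hypothetical names and bit values rather than the actual check in script.c), a command called from inside a script can be treated as a write when it carries either the write or the may-replicate flag:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch; not the values used by redis. */
#define SK_CMD_WRITE         (1ULL << 0)
#define SK_CMD_MAY_REPLICATE (1ULL << 1)

/* Returns true when a command called from inside a script may run.
 * script_is_read_only is assumed to be true when the script declared
 * no-writes, or when it was invoked through one of the _RO commands. */
bool sk_script_call_allowed(bool script_is_read_only, uint64_t called_cmd_flags) {
    bool counts_as_write =
        (called_cmd_flags & (SK_CMD_WRITE | SK_CMD_MAY_REPLICATE)) != 0;
    return !(script_is_read_only && counts_as_write);
}
```

Under this sketch PUBLISH and PFCOUNT (may-replicate) are rejected in a read-only script exactly like SET would be, and are still allowed in a script that may write.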

This commit also hides the may_replicate flag from the COMMAND command
output. This is a breaking change.

Background about may_replicate:
We don't want to expose a no-may-replicate flag or the like to scripts, since we
consider may-replicate an internal concern of redis that we may
some day get rid of.
In fact, the may-replicate flag was initially introduced to flag EVAL (since
we didn't know what it was going to do ahead of execution, before function flags
existed), as well as PUBLISH and PFCOUNT, both because they have side effects
which may some day be fixed differently.

code changes:
The changes in eval.c are mostly code re-ordering:

- evalCalcFunctionName is extracted out of evalGenericCommand
- evalExtractShebangFlags is extracted out of luaCreateFunction
- evalGetCommandFlags is new code
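
For readers unfamiliar with shebang flags, here is a minimal sketch of what extracting them from a script body might look like. It is not the code of evalExtractShebangFlags: the function name, the bit values, the return codes and the handling of unknown flags are all assumptions made for the example, and only the three flags named in this PR are recognized:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bit values for this sketch. */
#define SK_FLAG_NO_WRITES   (1ULL << 0)
#define SK_FLAG_ALLOW_OOM   (1ULL << 1)
#define SK_FLAG_ALLOW_STALE (1ULL << 2)

/* Parses the first line of a script body such as
 *   "#!lua flags=no-writes,allow-stale\n..."
 * Returns 0 if the body has no shebang, 1 if a shebang was parsed
 * (possibly with an empty flag set), and -1 on an unrecognized flag. */
int sk_parse_shebang_flags(const char *body, uint64_t *flags) {
    static const struct { const char *name; uint64_t bit; } known[] = {
        {"no-writes",   SK_FLAG_NO_WRITES},
        {"allow-oom",   SK_FLAG_ALLOW_OOM},
        {"allow-stale", SK_FLAG_ALLOW_STALE},
    };
    static const char key[] = "flags=";
    const size_t key_len = sizeof(key) - 1;

    *flags = 0;
    if (strncmp(body, "#!", 2) != 0) return 0;

    /* Only the first line is searched, so "flags=" in the script body
     * itself is never mistaken for a shebang option. */
    size_t line_len = strcspn(body, "\n");
    const char *p = NULL;
    for (size_t i = 0; i + key_len <= line_len; i++) {
        if (strncmp(body + i, key, key_len) == 0) {
            p = body + i + key_len;
            break;
        }
    }
    if (p == NULL) return 1; /* shebang without a flags= option */

    for (;;) {
        /* A token ends at a comma, whitespace, newline or end of string. */
        size_t tok = strcspn(p, ", \r\n");
        if (tok > 0) {
            int matched = 0;
            for (size_t k = 0; k < sizeof(known) / sizeof(known[0]); k++) {
                if (strlen(known[k].name) == tok &&
                    strncmp(p, known[k].name, tok) == 0) {
                    *flags |= known[k].bit;
                    matched = 1;
                    break;
                }
            }
            if (!matched) return -1;
        }
        if (p[tok] != ',') break; /* last token on the line */
        p += tok + 1;
    }
    return 1;
}
```

Having the extraction in its own function means the flags can be computed before the script is executed, which is what lets processCommand act on them.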

This was discussed in #10364 (comment)

Doc PR: redis/redis-doc#1954


oranagra · 2022-05-18T13:26:48Z

tests/unit/functions.tcl

            } e
            set _ $e
-        } {*Can not run script with write flag on readonly replica*}
+        } {READONLY You can't write against a read only replica.}


used to be ERR Can not run script with write flag on readonly replica

oranagra · 2022-05-18T13:27:25Z

tests/unit/functions.tcl


        catch {[r fcall f1 1 k]} e
-        assert_match {*can not run it when used memory > 'maxmemory'*} $e
+        assert_match {OOM *when used memory > 'maxmemory'*} $e


used to be OOM allow-oom flag is not set on the script, can not run it when used memory > 'maxmemory'

oranagra · 2022-05-18T13:28:00Z

tests/unit/functions.tcl


        catch {[r fcall f1 0]} e
-        assert_match {*'allow-stale' flag is not set on the script*} $e
+        assert_match {MASTERDOWN *} $e


the error code was already MASTERDOWN, but it was the one generated in script.c

oranagra · 2022-05-18T13:28:28Z

tests/unit/functions.tcl


        catch {[r fcall f3 0]} e
-        assert_match {*Can not execute the command on a stale replica*} $e
+        assert_match {ERR *Can not execute the command on a stale replica*} $e


The error message didn't change in this PR (just the test)

oranagra · 2022-05-18T13:28:56Z

tests/unit/scripting.tcl


        # Fail to execute regardless of script content when we use default flags in OOM condition
-        assert_error {OOM allow-oom flag is not set on the script, can not run it when used memory > 'maxmemory'} {
+        assert_error {OOM *} {


now returns a normal OOM error

oranagra · 2022-05-18T13:29:37Z

tests/unit/scripting.tcl

            ] "some value"

-            assert_error {ERR Can not run script with write flag on readonly replica} {
+            assert_error {READONLY You can't write against a read only replica.} {


now returns a normal READONLY error

oranagra · 2022-05-18T13:29:57Z

tests/unit/scripting.tcl

        }

-        assert_error {*'allow-stale' flag is not set on the script*} {
+        assert_error {MASTERDOWN Link with MASTER is down and replica-serve-stale-data is set to 'no'.} {


now returns a normal MASTERDOWN error

src/script.c


MeirShpilraien

👍 LGTM
We should make sure to mention all the breaking changes in the release notes (e.g. the MAY_REPLICATE commands, which will now be blocked in RO mode).
Also, we should check whether any docs need to be updated with respect to those changes.

soloestoy · 2022-05-27T03:53:57Z

Mostly LGTM.

But I still feel uncomfortable that when I use eval_ro with shebang I have to add no-writes flags, I think eval_ro with shebang should run with no-writes and allow-oom automatically, just like what function evalGetCommandFlags does:

int evalGetCommandFlags(client *c, uint64_t *flags) {
    char funcname[43];
    int evalsha = c->cmd->proc == evalShaCommand || c->cmd->proc == evalShaRoCommand;
    int ro_cmd = c->cmd->proc == evalRoCommand || c->cmd->proc == evalShaRoCommand;
    ...
    if (ro_cmd)
        script_flags |= SCRIPT_FLAG_NO_WRITES;
    *flags = scriptFlagsToCmdFlags(script_flags);
    return C_OK;
}
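
For context on the last step of the snippet above, this is one plausible way script flags could be translated into command-level flags. It is a hedged sketch, not the body of scriptFlagsToCmdFlags: the names and bit values are hypothetical, and the real function may treat combinations of flags differently (for example, whether no-writes also implies allow-oom):

```c
#include <stdint.h>

/* Hypothetical script flag bits for this sketch. */
#define SK_SCRIPT_NO_WRITES   (1ULL << 0)
#define SK_SCRIPT_ALLOW_OOM   (1ULL << 1)
#define SK_SCRIPT_ALLOW_STALE (1ULL << 2)

/* Hypothetical command flag bits for this sketch. */
#define SK_CMD_WRITE   (1ULL << 0)
#define SK_CMD_DENYOOM (1ULL << 1)
#define SK_CMD_STALE   (1ULL << 2)

/* Assumed mapping: each script flag is handled independently. A script is
 * treated as a write command unless it declares no-writes, is denied on OOM
 * unless it declares allow-oom, and may run on a stale replica only if it
 * declares allow-stale. */
uint64_t sk_script_flags_to_cmd_flags(uint64_t script_flags) {
    uint64_t cmd_flags = 0;
    if (!(script_flags & SK_SCRIPT_NO_WRITES)) cmd_flags |= SK_CMD_WRITE;
    if (!(script_flags & SK_SCRIPT_ALLOW_OOM)) cmd_flags |= SK_CMD_DENYOOM;
    if (script_flags & SK_SCRIPT_ALLOW_STALE) cmd_flags |= SK_CMD_STALE;
    return cmd_flags;
}
```

The point of such a mapping is that processCommand can reuse its existing checks for write, deny-oom and stale commands without knowing anything about scripts.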

soloestoy · 2022-05-27T06:15:39Z

BTW, now the may-replicate commands are considered write commands in scripts, so we don't need scriptVerifyMayReplicate anymore.

static int scriptVerifyMayReplicate(scriptRunCtx *run_ctx, char **err) {
    if (run_ctx->c->cmd->flags & CMD_MAY_REPLICATE &&
        server.client_pause_type == CLIENT_PAUSE_WRITE) {
        *err = sdsnew("May-replicate commands are not allowed when client pause write.");
        return C_ERR;
    }
    return C_OK;
}


oranagra · 2022-05-27T19:01:36Z

> But I still feel uncomfortable that when I use eval_ro with shebang I have to add no-writes flags, I think eval_ro with shebang should run with no-writes and allow-oom automatically, just like what function evalGetCommandFlags does:

Think of functions (not eval): the developer declares whether the function is read-only, or is allowed to ignore the OOM state, and the user that calls it doesn't necessarily know what it does (similar to a module command).

I remind us that the reason we added EVAL_RO was for the purpose of client side routing.
one of the main purposes of this PR (second paragraph at the top comment) was for EVAL to be just as good as EVAL_RO (for client pause) for scripts that declared proper flags.

    if (ro_cmd)
        script_flags |= SCRIPT_FLAG_NO_WRITES;

i don't know why i added that. it doesn't seem that it's needed for any of my tests, it'll just allow the code to pass processCommand and fail in scriptPrepareForRun

> BTW, now the may-replicate commands are considered write commands in scripts, so we don't need scriptVerifyMayReplicate anymore.

right, i'll delete it.

oranagra · 2022-05-27T20:31:36Z

@MeirShpilraien i did list the breaking changes at the top (i invite you to review it and comment if it's unclear), and i'll mention them in the release notes.
i also prepared a doc PR (listed in the top comment), please review that one as well.

oranagra · 2022-05-27T20:35:17Z

all, while working on this, and testing OOM errors from EVAL that's called by RM_Call, i found another bug.
the code in processCommand that sets server.script_oom was trying to avoid changing it while a busy script yields, but that meant that an EVAL called by RM_Call can get executed with an outdated flag. i'll update the top comment to mention that bug too.

@soloestoy please have a look at 3bc867f

oranagra · 2022-05-29T14:36:49Z

sorry for the mess. moved the script_oom bugfix to #10786, where i fix other related issues and add a new RM_Call flag that needs it.

oranagra · 2022-05-29T14:42:19Z

It's complicated to separate these various fixes since they're all triggered by extra coverage tests, so the tests are all conflicting between these PRs, but i wanna try to have the PRs small with clear topic and description

madolson

Logic all makes sense.

src/server.c

tests/unit/pause.tcl

soloestoy · 2022-05-30T06:54:21Z

> Think of functions (not eval): the developer declares whether the function is read-only, or is allowed to ignore the OOM state, and the user that calls it doesn't necessarily know what it does (similar to a module command).

I know the flags are the script's attributes, not the command's, but eval and eval_ro differ from functions: the caller can supply and modify the script's body, which makes eval and eval_ro more complex. It's OK with me, I don't insist on changing it, but I'm afraid some users may misunderstand it.

Another question I want to discuss: now, when writes are paused, eval_ro and fcall_ro will be blocked if they're used to call a script without the no-writes flag. Do you think it's better to return an error instead of blocking them?

oranagra · 2022-05-30T07:45:49Z

> I know the flags are the script's attributes, not the command's, but eval and eval_ro differ from functions: the caller can supply and modify the script's body, which makes eval and eval_ro more complex. It's OK with me, I don't insist on changing it, but I'm afraid some users may misunderstand it.

are you also suggesting a different default behavior for EVAL_RO vs EVALSHA_RO?
i.e. that since EVAL_RO has the script embedded in the RO command it'll have default of no-writes and for EVALSHA_RO it'll be similar to FCALL_RO (same defaults as normal FCALL)?

i think we should keep these _RO commands the same as the non-RO commands, and just say that they're for client routing and ACL.

> Another question I want to discuss: now, when writes are paused, eval_ro and fcall_ro will be blocked if they're used to call a script without the no-writes flag. Do you think it's better to return an error instead of blocking them?

in theory, your suggestion would make for a better experience (return an error right away, rather than block and return an error when unblocked). but in practice, it's a programming mistake that will be discovered by the programmer when testing the script (in a normal state), long before ever getting to run it in the CLIENT PAUSE state, so it doesn't make any difference, and i'd rather not complicate the redis code for that (whatever is simpler is good in my book)

soloestoy · 2022-05-30T08:45:37Z

> are you also suggesting a different default behavior for EVAL_RO vs EVALSHA_RO?
> i.e. that since EVAL_RO has the script embedded in the RO command it'll have default of no-writes and for EVALSHA_RO it'll be similar to FCALL_RO (same defaults as normal FCALL)?

nope, I don't want to make it too complex, just keep it as it is



This was referenced May 18, 2022

script should not allow may-replicate commands when client pause write #10364

Merged

Add documentation documenting read-only scripts redis/redis-doc#1953

Merged

oranagra requested review from MeirShpilraien, madolson and soloestoy May 22, 2022 08:09

oranagra mentioned this pull request May 22, 2022

script flags update redis/redis-doc#1954

Merged

oranagra added 2 commits May 22, 2022 17:33

Merge remote-tracking branch 'origin/unstable' into function_flags_in…

7f17bba

…_processCommand

post merge adjustments

c595829

oranagra added the 7.0-must-have label May 24, 2022

MeirShpilraien reviewed May 25, 2022


MeirShpilraien approved these changes May 25, 2022


oranagra added the breaking-change This change can potentially break existing application label May 26, 2022

Merge remote-tracking branch 'origin/unstable' into function_flags_in…

0cd4099

…_processCommand

oranagra added 4 commits May 27, 2022 23:17

Fix a bug where scripts executed by modules use a stale OOM state

3bc867f

Bring the protocol fix from #10786 for the sake of tests to pass

aecb72f

remove unneeded code

93c961e

add tests for the checks in scriptPrepareForRun via RM_Call

934e7db

take out the script_oom flag, moved to #10786

2c81b6d

madolson approved these changes May 30, 2022


revert a refactoring change that's gonna be fixed differently in #10786)

dc50bcd

refactory following code review comments - no logical change

9efa4bc

soloestoy approved these changes May 30, 2022


some additional coverage of EVAL_RO in PAUSE WRITE

b4b69bd

yossigo approved these changes Jun 1, 2022


Merge remote-tracking branch 'origin/unstable' into function_flags_in…

6692a55

…_processCommand Conflicts: src/server.h tests/modules/misc.c tests/unit/moduleapi/misc.tcl tests/unit/scripting.tcl

oranagra merged commit df55861 into redis:unstable Jun 1, 2022

oranagra deleted the function_flags_in_processCommand branch June 1, 2022 11:09

oranagra added the release-notes indication that this issue needs to be mentioned in the release notes label Jun 1, 2022

soloestoy mentioned this pull request Jun 7, 2022

remove allow-oom from scripts and add no-deny-oom for scripts #10804

Open

oranagra mentioned this pull request Jun 8, 2022

Release 7.0.1 #10829

Merged

filipecosta90 mentioned this pull request Nov 17, 2022

Reduce eval related overhead introduced in v7.0 by evalCalcFunctionName #11521

Merged
