Skip to content

Expose script flags to processCommand for better handling#10744

Merged
oranagra merged 13 commits intoredis:unstablefrom
oranagra:function_flags_in_processCommand
Jun 1, 2022
Merged

Expose script flags to processCommand for better handling#10744
oranagra merged 13 commits intoredis:unstablefrom
oranagra:function_flags_in_processCommand

Conversation

@oranagra
Copy link
Member

@oranagra oranagra commented May 18, 2022

In order to better handle scripts with shebang flags that can indicate
the script is a read-only, allow-oom, allow-stale etc, or not.

The important part is that read-only scripts (not just EVAL_RO
and FCALL_RO, but also ones with no-writes executed by normal EVAL or
FCALL), will now be permitted to run during CLIENT PAUSE WRITE (unlike
before where only the _RO commands would be processed).

Other than that, some errors like OOM, READONLY, MASTERDOWN are now
handled by processCommand, rather than the command itself affects the
error string (and even error code in some cases), and command stats.

Besides that, now the may-replicate commands, PFCOUNT and PUBLISH, will
be considered write commands in scripts and will be blocked in all
read-only scripts just like other write commands.
They'll also be blocked in EVAL_RO (i.e. even for scripts without the
no-writes shebang flag.

This commit also hides the may_replicate flag from the COMMAND command
output. this is a breaking change.

background about may_replicate:
We don't want to expose a no-may-replicate flag or alike to scripts, since we
consider the may-replicate thing an internal concern of redis, that we may
some day get rid of.
In fact, the may-replicate flag was initially introduced to flag EVAL: since
we didn't know what it's gonna do ahead of execution, before function-flags
existed). PUBLISH and PFCOUNT, both of which because they have side effects
which may some day be fixed differently.

code changes:
The changes in eval.c are mostly code re-ordering:

  • evalCalcFunctionName is extracted out of evalGenericCommand
  • evalExtractShebangFlags is extracted luaCreateFunction
  • evalGetCommandFlags is new code

This was discussed in #10364 (comment)

Doc PR: redis/redis-doc#1954

In order to better handle scripts with shebang flags that can indicate
the script is a read-only, allow-oom, allow-stale etc, or not.

The important part is that read-only scripts (not just EVAL_RO
and FCALL_RO, but also ones with `no-writes` executed by normal EVAL or
FCALL), will now be permitted to run during CLIENT PAUSE WRITE (unlike
before where only the _RO commands would be processed).

Other than that, some errors like OOM, READONLY, MASTERDOWN are now
handled by processCommand, rather than the command itself affects the
error string (and even error code in some cases), and command stats.

Besides that, now the `may-replicate` commands, PFCOUNT and PUBLISH, will
be considered `write` commands in scripts and will be blocked in all
read-only scripts just like other write commands.
They'll also be blocked in EVAL_RO (i.e. even for scripts without the
`no-writes` shebang flag.

This commit also hides the `may_replicate` flag from the COMMAND command
output.

background:
We don't want to expose a no-may-replicate flag or alike to scripts, since we
consider the may-replicate thing an internal concern of redis, that we may
some day get rid of.
In fact, the may-replicate flag was initially introduced to flag EVAL: since
we didn't know what it's gonna do ahead of execution, before function-flags
existed). PUBLISH and PFCOUNT, both of which because they have side effects
which may some day be fixed differently.

code changes:
The changes in eval.c are mostly code re-ordering:
- evalCalcFunctionName is extracted out of evalGenericCommand
- evalExtractShebangFlags is extracted luaCreateFunction
- evalGetCommandFlags is new code
} e
set _ $e
} {*Can not run script with write flag on readonly replica*}
} {READONLY You can't write against a read only replica.}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

used to be ERR Can not run script with write flag on readonly replica


catch {[r fcall f1 1 k]} e
assert_match {*can not run it when used memory > 'maxmemory'*} $e
assert_match {OOM *when used memory > 'maxmemory'*} $e
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

used to be OOM allow-oom flag is not set on the script, can not run it when used memory > 'maxmemory


catch {[r fcall f1 0]} e
assert_match {*'allow-stale' flag is not set on the script*} $e
assert_match {MASTERDOWN *} $e
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was already MASTERDOWN but the one in script.c


catch {[r fcall f3 0]} e
assert_match {*Can not execute the command on a stale replica*} $e
assert_match {ERR *Can not execute the command on a stale replica*} $e
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message didn't change in this PR (just the test)


# Fail to execute regardless of script content when we use default flags in OOM condition
assert_error {OOM allow-oom flag is not set on the script, can not run it when used memory > 'maxmemory'} {
assert_error {OOM *} {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now returns a normal OOM error

] "some value"

assert_error {ERR Can not run script with write flag on readonly replica} {
assert_error {READONLY You can't write against a read only replica.} {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now return a normal READONLY error

}

assert_error {*'allow-stale' flag is not set on the script*} {
assert_error {MASTERDOWN Link with MASTER is down and replica-serve-stale-data is set to 'no'.} {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now returns a normal MASTERDOWN error

Copy link

@MeirShpilraien MeirShpilraien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 LGTM
We should make sure to mentioned all the breaking changes on the release notes (MAY_REPLICATE command which will now be blocked on RO mode..).
Also we should check if there is any docs that need to be updated with respect to those changes.

@oranagra oranagra added the breaking-change This change can potentially break existing application label May 26, 2022
@soloestoy
Copy link
Contributor

Mostly LGTM.

But I still feel uncomfortable that when I use eval_ro with shebang I have to add no-writes flags, I think eval_ro with shebang should run with no-writes and allow-oom automatically, just like what function evalGetCommandFlags does:

int evalGetCommandFlags(client *c, uint64_t *flags) {
    char funcname[43];
    int evalsha = c->cmd->proc == evalShaCommand || c->cmd->proc == evalShaRoCommand;
    int ro_cmd = c->cmd->proc == evalRoCommand || c->cmd->proc == evalShaRoCommand;
    ...
    if (ro_cmd)
        script_flags |= SCRIPT_FLAG_NO_WRITES;
    *flags = scriptFlagsToCmdFlags(script_flags);
    return C_OK;
}

@soloestoy
Copy link
Contributor

BTW, now the may-replicate commands are considered write commands in scripts, so we don't need scriptVerifyMayReplicate anymore.

static int scriptVerifyMayReplicate(scriptRunCtx *run_ctx, char **err) {
    if (run_ctx->c->cmd->flags & CMD_MAY_REPLICATE &&
        server.client_pause_type == CLIENT_PAUSE_WRITE) {
        *err = sdsnew("May-replicate commands are not allowed when client pause write.");
        return C_ERR;
    }
    return C_OK;
}

@oranagra
Copy link
Member Author

But I still feel uncomfortable that when I use eval_ro with shebang I have to add no-writes flags, I think eval_ro with shebang should run with no-writes and allow-oom automatically, just like what function evalGetCommandFlags does:

Think of functions (not eval), the developer declares if the function is read only, or is allowed to ignore OOM state, the user that calls it doesn't necessarily knows what it does (similar to a module command).

I remind us that the reason we added EVAL_RO was for the purpose of client side routing.
one of the main purposes of this PR (second paragraph at the top comment) was for EVAL to be just as good as EVAL_RO (for client pause) for scripts that declared proper flags.

    if (ro_cmd)
        script_flags |= SCRIPT_FLAG_NO_WRITES;

i don't know why i added that. it doesn't seem that it's needed for any of my tests, it'll just allow the code to pass processCommand and fail in scriptPrepareForRun

BTW, now the may-replicate commands are considered write commands in scripts, so we don't need scriptVerifyMayReplicate anymore.

right, i'll delete it.

@oranagra
Copy link
Member Author

@MeirShpilraien i did list the breaking changes at the top (i invite you to review it and comment if it's unclear), and i'll mention them in the release notes.
i also prepared a doc PR (listed in the top comment), please review that one as well.

@oranagra
Copy link
Member Author

all, while working on this, and testing OOM errors from EVAL that's called by RM_Call, i found another bug.
the code in processCommand that sets server.script_oom was trying to avoid changing it while a script a busy script yields, but it meant that EVAL that's called by RM_Call can get executed with an outdated flag, i'll update the top comment to mention that bug too.

@soloestoy please have a look at 3bc867f

@oranagra
Copy link
Member Author

sorry for the mess. moved the script_oom bugfix to #10786 where fix other related issues and add a new RM_Call flag that needs it.

@oranagra
Copy link
Member Author

It's complicated to separate these various fixes since they're all triggered by extra coverage tests, so the tests are all conflicting between these PRs, but i wanna try to have the PRs small with clear topic and description

Copy link
Contributor

@madolson madolson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic all makes sense.

@soloestoy
Copy link
Contributor

soloestoy commented May 30, 2022

Think of functions (not eval), the developer declares if the function is read only, or is allowed to ignore OOM state, the user that calls it doesn't necessarily knows what it does (similar to a module command).

I know the flags are script's attributes not the command's, but eval and eval_ro have a different with functions, they can call and modify the script's body, that makes eval and eval_ro more complex. It's OK to me, I don't insist on changing it, but I'm afraid some users may misunderstand it.

Another question I wanna discuss, now when write paused, eval_ro and fcall_ro will be blocked if using them to call a script without no-writes flag, do you thinks it's better to return an error instead of block them?

@oranagra
Copy link
Member Author

I know the flags are script's attributes not the command's, but eval and eval_ro have a different with functions, they can call and modify the script's body, that makes eval and eval_ro more complex. It's OK to me, I don't insist on changing it, but I'm afraid some users may misunderstand it.

are you also suggesting a different default behavior for EVAL_RO vs EVALSHA_RO?
i.e. that since EVAL_RO has the script embedded in the RO command it'll have default of no-writes and for EVALSHA_RO it'll be similar to FCALL_RO (same defaults as normal FCALL)?

i think we should keep these _RO commands the same as the non-RO commands, and just say that they're for client routing and ACL.

Another question I wanna discuss, now when write paused, eval_ro and fcall_ro will be blocked if using them to call a script without no-writes flag, do you thinks it's better to return an error instead of block them?

in theory, your suggestion would make a better experience (return error right away, rather than block and return error when unblocked). but in practice, it's a programming mistake that will be discovered by the programmer when testing the script (on normal state), long before ever getting to run it in CLIENT PAUSE state, so it doesn't make any difference, and i rather not complicate the redis code for that (whatever simpler is good in my book)

@soloestoy
Copy link
Contributor

are you also suggesting a different default behavior for EVAL_RO vs EVALSHA_RO?
i.e. that since EVAL_RO has the script embedded in the RO command it'll have default of no-writes and for EVALSHA_RO it'll be similar to FCALL_RO (same defaults as normal FCALL)?

nope, I don't want make it too much complex, just keep it

…_processCommand

 Conflicts:
	src/server.h
	tests/modules/misc.c
	tests/unit/moduleapi/misc.tcl
	tests/unit/scripting.tcl
@oranagra oranagra merged commit df55861 into redis:unstable Jun 1, 2022
@oranagra oranagra deleted the function_flags_in_processCommand branch June 1, 2022 11:09
@oranagra oranagra added the release-notes indication that this issue needs to be mentioned in the release notes label Jun 1, 2022
@oranagra oranagra mentioned this pull request Jun 8, 2022
enjoy-binbin pushed a commit to enjoy-binbin/redis that referenced this pull request Jul 31, 2023
The important part is that read-only scripts (not just EVAL_RO
and FCALL_RO, but also ones with `no-writes` executed by normal EVAL or
FCALL), will now be permitted to run during CLIENT PAUSE WRITE (unlike
before where only the _RO commands would be processed).

Other than that, some errors like OOM, READONLY, MASTERDOWN are now
handled by processCommand, rather than the command itself affects the
error string (and even error code in some cases), and command stats.

Besides that, now the `may-replicate` commands, PFCOUNT and PUBLISH, will
be considered `write` commands in scripts and will be blocked in all
read-only scripts just like other write commands.
They'll also be blocked in EVAL_RO (i.e. even for scripts without the
`no-writes` shebang flag.

This commit also hides the `may_replicate` flag from the COMMAND command
output. this is a **breaking change**.

background about may_replicate:
We don't want to expose a no-may-replicate flag or alike to scripts, since we
consider the may-replicate thing an internal concern of redis, that we may
some day get rid of.
In fact, the may-replicate flag was initially introduced to flag EVAL: since
we didn't know what it's gonna do ahead of execution, before function-flags
existed). PUBLISH and PFCOUNT, both of which because they have side effects
which may some day be fixed differently.

code changes:
The changes in eval.c are mostly code re-ordering:
- evalCalcFunctionName is extracted out of evalGenericCommand
- evalExtractShebangFlags is extracted luaCreateFunction
- evalGetCommandFlags is new code
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

breaking-change This change can potentially break existing application release-notes indication that this issue needs to be mentioned in the release notes

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

5 participants