Skip to content

Add new RM_Call flags for script mode, no writes, and error replies.#10372

Merged
oranagra merged 13 commits intoredis:unstablefrom
MeirShpilraien:safe_rm_call
Mar 22, 2022
Merged

Add new RM_Call flags for script mode, no writes, and error replies.#10372
oranagra merged 13 commits intoredis:unstablefrom
MeirShpilraien:safe_rm_call

Conversation

@MeirShpilraien
Copy link

@MeirShpilraien MeirShpilraien commented Mar 2, 2022

The PR extends RM_Call with 3 new capabilities using new flags that are given to RM_Call as part of the fmt argument.
It aims to assist modules that are getting a list of commands to be executed from the user (not hard coded as part of the module logic), think of a module that implements a new scripting language...

  • S - Run the command in a script mode, this means that it will raise an error if a command which are not allowed inside a script (flaged with the deny-script flag) is invoked (like SHUTDOWN). In addition, on script mode, write commands are not allowed if there is not enough good replicas (as configured with min-replicas-to-write) and/or a disk error happened.

  • W - no writes mode, Redis will reject any command that is marked with write flag. Again can be useful to modules that implement a new scripting language and wants to prevent any write commands.

  • E - Return errors as RedisModuleCallReply. Today the errors that happened before the command was invoked (like unknown commands or acl error) return a NULL reply and set errno. This might be missing important information about the failure and it is also impossible to just pass the error to the user using RM_ReplyWithCallReply. This new flag allows you to get a RedisModuleCallReply object with the relevant error message and treat it as if it was an error that was raised by the command invocation.

Tests were added to verify the new code paths.

In addition small refactoring was done to share some code between modules, scripts, and processCommand function:

  1. getAclErrorMessage was added to acl.c to unified to log message extraction from the acl result
  2. checkGoodReplicasStatus was added to replication.c to check the status of good replicas. It is used on scriptVerifyWriteCommandAllow, RM_Call, and processCommand.
  3. writeCommandsGetDiskErrorMessage was added to server.c to get the error message on persistence failure. Again it is used on scriptVerifyWriteCommandAllow, RM_Call, and processCommand.

The PR extends RM_Call with 3 new capabilities using new flags that are given
to RM_Call as part of the `fmt` argument.

* `S` - represents safe mode, Redis will reject dangerous commands like shutdown.
  We define dangerous commands as commands that are marked with `no-script` flag
  and are not allowed inside a script. This flag can be useful to modules that
  implement a new scripting language and the invoked commands are passing as a
  user input.

* `W` - no writes mode, Redis will reject any command that is marked with `write`
  flag. Again can be useful to modules that implement a new scripting language
  and wants to prevent any write commands.

* 'E' - Return errors as RedisModuleCallReply. Today the errors that happened before
  the command was invocation (like unknown commands or acl error) returns a NULL reply
  and set errno. This might be missing important information about the failure and
  it is also impossible to just pass the error to the user using RM_ReplyWithCallReply.
  This new flag allows you to get a RedisModuleCallReply object with the relevant error
  message and treat it as if it was an error that was raised by the command invocation.

  Tests were added to verify the new code paths.
@MeirShpilraien MeirShpilraien requested a review from oranagra March 2, 2022 15:00
@MeirShpilraien MeirShpilraien changed the title Add new RM_Call flags for safe mode, not writes, and error replies. Add new RM_Call flags for safe mode, no writes, and error replies. Mar 2, 2022
@MeirShpilraien
Copy link
Author

@oranagra I could not find where we are testing for cluster errors on RM_Call, do we have such tests?

@oranagra
Copy link
Member

oranagra commented Mar 2, 2022

we probably don't have anything, but you can extend tests/unit/moduleapi/cluster.tcl if you need.

@MeirShpilraien
Copy link
Author

Thanks @oranagra added a test.

Copy link
Collaborator

@sundb sundb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other %S

@MeirShpilraien
Copy link
Author

Thanks @sundb

@oranagra
Copy link
Member

oranagra commented Mar 8, 2022

I'd like to avoid the "dangerous" term in the top comment (and also code and docs if it's there).
this term is already abused (by ACL).
The term "safe" mode is also fuzzy, so instead we should call S a "script mode", which is in fact technically right, sine the no-script is the flag it is based on.

maybe NO_WRITES should be RO_MODE?

Copy link
Member

@oranagra oranagra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reviewed the code.
i think the main issue to change is to switch "safe mode" and "dangerous commands" with "script mode",
then it becomes obvious what this does, and what it should be used for.

the the complementary task, would be to add other checks that are done when scripts call redis commands.

1. Rename safemode -> scriptmode
2. On script mode, added checks for disk errors and good replicas
3. Change errors formats
4. Unified some error message constructions with script code
@MeirShpilraien MeirShpilraien changed the title Add new RM_Call flags for safe mode, no writes, and error replies. Add new RM_Call flags for script mode, no writes, and error replies. Mar 10, 2022
@MeirShpilraien
Copy link
Author

@oranagra Fixed your comments and updated the top comment.

@MeirShpilraien
Copy link
Author

MeirShpilraien commented Mar 10, 2022

maybe NO_WRITES should be RO_MODE?

I thought it makes sense to match it with functions flags, let me know if you want me to change it.

@oranagra
Copy link
Member

@redis/core-team please take a look at the concept and comment.

Meir Shpilraien (Spielrein) and others added 2 commits March 10, 2022 15:28
Copy link
Collaborator

@yossigo yossigo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't review the code (yet) but the concept LGTM.

@oranagra oranagra added 7.0-must-have state:major-decision Requires core team consensus release-notes indication that this issue needs to be mentioned in the release notes labels Mar 21, 2022
@oranagra
Copy link
Member

@MeirShpilraien this was approved in a core-team meeting.
please have a look at the last remaining comment to check if we can resolve that too, and then we'll merge it.

@oranagra oranagra merged commit f3855a0 into redis:unstable Mar 22, 2022
@oranagra oranagra mentioned this pull request Apr 5, 2022
oranagra pushed a commit that referenced this pull request Apr 25, 2022
…ror responses (#10612)

1. Disk error and slave count checks didn't flag the transactions or counted correctly in command stats (regression from #10372  , 7.0 RC3)
2. RM_Call will reply the same way Redis does, in case of non-exisitng command or arity error
3. RM_WrongArtiy will consider the full command name
4. Use lowercase 'u' in "unknonw subcommand" (to align with "unknown command")

Followup work of #10127
oranagra added a commit that referenced this pull request Jun 1, 2022
…M check bug, and new RM_Call checks. (#10786)

* Fix broken protocol when redis can't persist to RDB (general commands, not
  modules), excessive newline. regression of #10372 (7.0 RC3)
* Fix broken protocol when Redis can't persist to AOF (modules and
  scripts), missing newline.
* Fix bug in OOM check of EVAL scripts called from RM_Call.
  set the cached OOM state for scripts before executing module commands too,
  so that it can serve scripts that are executed by modules.
  i.e. in the past EVAL executed by RM_Call could have either falsely
  fail or falsely succeeded because of a wrong cached OOM state flag.
* Fix bugs with RM_Yield:
  1. SHUTDOWN should only accept the NOSAVE mode
  2. Avoid eviction during yield command processing.
  3. Avoid processing master client commands while yielding from another client
* Add new two more checks to RM_Call script mode.
  1. READONLY You can't write against a read only replica
  2. MASTERDOWN Link with MASTER is down and `replica-serve-stale-data` is set to `no`
* Add new RM_Call flag to let redis automatically refuse `deny-oom` commands
  while over the memory limit. 
* Add tests to cover various errors from Scripts, Modules, Modules
  calling scripts, and Modules calling commands in script mode.

Add tests:
* Looks like the MISCONF error was completely uncovered by the tests,
  add tests for it, including from scripts, and modules
* Add tests for NOREPLICAS from scripts
* Add tests for the various errors in module RM_Call, including RM_Call that
  calls EVAL, and RM_call in "eval mode". that includes:
  NOREPLICAS, READONLY, MASTERDOWN, MISCONF
MeirShpilraien pushed a commit to MeirShpilraien/redis that referenced this pull request Dec 1, 2022
Fix redis#7992, allow running blocking commands from within a module using `RM_Call`.

Today, when `RM_Call` is used, the fake client that is used to run command is marked
with `CLIENT_DENY_BLOCKING` flag. This flags tells the command that it is not allowed
to block the client and in case it needs to block, it must fallback to some alternative
(either return error or perform some default behavior). For example, `BLPOP` fallback
to simple `LPOP` if it is not allow to block.

All the commands must respect the `CLIENT_DENY_BLOCKING` flag (including module commands).
When the command invocation finished, Redis asserts that the client was not blocked.

This PR introduces the ability to call blocking command using `RM_Call` by passing a callback
that will be called when the client will get unblocked. The callback can be passed to `RM_Call`
using a new format specifier argument, `K`, follow by the callback and a private data (`void*`).
If the new specified is given to `RM_Call`, Redis knows that its OK to block and in such case it
performs the following:
1. **Not** setting `CLIENT_DENY_BLOCKING` flag on the fake client so the command will know it is
   ok to block the client.
2. When command invocation finishes, Redis checks if the client was blocked:
   1. If not, no changes to the code flow is done and the reply is simply return.
   2. If case the client was blocked, Redis set the callback and the private data on the fake client
      So it will be possible to call it when the client gets unblocked. Then Redis returns `NULL` reply
      (with no `errno` set) indicating the client was blocked.

When clients gets unblock, it eventually reaches `processUnblockedClients` function. This is where we
check if the client is a fake module client and if it is, we call the unblock callback instead of
performing the usual unblock operations.

### Calling blocking commands while running on script mode (`S`)

`RM_Call` script mode (`S`) was introduce on redis#10372. It is used for usecases where the command that was invoked on `RM_Call` comes from a user input and we want to make sure the user will not run dangerous commands like `shutdown`. Some command, such as `BLPOP`, are marked with `NO_SCRIPT` flag, which means they will not be allowed on script mode. Those commands are marked with  `NO_SCRIPT` just because they are blocking commands and not because they are dangerous. Now that we can run blocking commands on RM_Call, there is no real reason not to allow such commands on script mode.

The underline problem is that the `NO_SCRIPT` flag is abused to also mark some of the blocking commands (notice that those commands knows not to block the client if they are not allowed to, and have a fallback logic to such cases. So even if those commands was not makred with `NO_SCRIPT` flag, it would not harm Redis, and today we can already run those commands within multi exec).

In addition, not all blocking commands are marked with `NO_SCRIPT` flag, for example `blmpop` are not marked and can run from within a script.

Those facts shows that there are some ambiguity about the meaning of the `NO_SCRIPT` flag, and its not fully clear where it should be use.

The PR suggest that blocking commands should not be marked with `NO_SCRIPT` flag, those commands should handle `CLIENT_DENY_BLOCKING` flag and only block when it's safe (like they already does today). To achieve that, the PR removes the `NO_SCRIPT` flag from the following commands:
* `blmove`
* `blpop`
* `brpop`
* `brpoplpush`
* `bzpopmax`
* `bzpopmin`
* `wait`

This might be considered a breaking change as now, on scripts, instead of getting `command is not allowed from script` error, the user will get some fallback behavior base on the command implementation. That said, the change matches the behavior of scripts and multi exec with respect to those commands and allow running them on `RM_Call` even when script mode is used.

### TODO

* Accept/Decline the suggest API (using format specifier argument `K`)
* Decide if `processUnblockedClients` is the right place to call the unblock callback
* Decide if the changes to `BLPOP` like commands are acceptable or is it a breaking change.
enjoy-binbin pushed a commit to enjoy-binbin/redis that referenced this pull request Jul 31, 2023
…ror responses (redis#10612)

1. Disk error and slave count checks didn't flag the transactions or counted correctly in command stats (regression from redis#10372  , 7.0 RC3)
2. RM_Call will reply the same way Redis does, in case of non-exisitng command or arity error
3. RM_WrongArtiy will consider the full command name
4. Use lowercase 'u' in "unknonw subcommand" (to align with "unknown command")

Followup work of redis#10127
enjoy-binbin pushed a commit to enjoy-binbin/redis that referenced this pull request Jul 31, 2023
…M check bug, and new RM_Call checks. (redis#10786)

* Fix broken protocol when redis can't persist to RDB (general commands, not
  modules), excessive newline. regression of redis#10372 (7.0 RC3)
* Fix broken protocol when Redis can't persist to AOF (modules and
  scripts), missing newline.
* Fix bug in OOM check of EVAL scripts called from RM_Call.
  set the cached OOM state for scripts before executing module commands too,
  so that it can serve scripts that are executed by modules.
  i.e. in the past EVAL executed by RM_Call could have either falsely
  fail or falsely succeeded because of a wrong cached OOM state flag.
* Fix bugs with RM_Yield:
  1. SHUTDOWN should only accept the NOSAVE mode
  2. Avoid eviction during yield command processing.
  3. Avoid processing master client commands while yielding from another client
* Add new two more checks to RM_Call script mode.
  1. READONLY You can't write against a read only replica
  2. MASTERDOWN Link with MASTER is down and `replica-serve-stale-data` is set to `no`
* Add new RM_Call flag to let redis automatically refuse `deny-oom` commands
  while over the memory limit. 
* Add tests to cover various errors from Scripts, Modules, Modules
  calling scripts, and Modules calling commands in script mode.

Add tests:
* Looks like the MISCONF error was completely uncovered by the tests,
  add tests for it, including from scripts, and modules
* Add tests for NOREPLICAS from scripts
* Add tests for the various errors in module RM_Call, including RM_Call that
  calls EVAL, and RM_call in "eval mode". that includes:
  NOREPLICAS, READONLY, MASTERDOWN, MISCONF
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-notes indication that this issue needs to be mentioned in the release notes state:major-decision Requires core team consensus

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

5 participants