Cleanup around script_caller, fix tracking of scripts and ACL logging for RM_Call by oranagra · Pull Request #11770 · redis/redis

oranagra · 2023-01-31T10:30:46Z

Make it clear that current_client is the root client that was called by external connection
add executing_client which is the client that runs the current command (can be a module or a script)
Remove script_caller that was used for commands that have CLIENT_SCRIPT to get the client that called the script. in most cases, that's the current_client, and in others (when being called from a module), it could be an intermediate client when we actually want the original one used by the external connection.

bugfixes:

RM_Call with C flag should log ACL errors with the requested user rather than the one used by the original client, this also solves a crash when RM_Call is used with C flag from a detached thread safe context.
addACLLogEntry would have logged info about the script_caller, but in case the script was issued by a module command we actually want the current_client. the exception is when RM_Call is called from a timer event, in which case we don't have a current_client.

behavior changes:

client side tracking for scripts now tracks the keys that are read by the script instead of the keys that are declared by the caller for EVAL

other changes:

Log both current_client and executing_client in the crash log.
remove prepareLuaClient and resetLuaClient, being dead code that was forgotten.
remove scriptTimeSnapshot and snapshot_time and instead add cmd_time_snapshot that serves all commands and is reset only when execution nesting starts.
remove code to propagate CLIENT_FORCE_REPL from the executed command to the script caller since scripts aren't propagated anyway these days and anyway this flag wouldn't have had an effect since CLIENT_PREVENT_PROP is added by scriptResetRun.
fix a module GIL violation issue in afterSleep that was introduced in Freeze time sampling during command execution, and scripts #10300 (unreleased)

… for RM_Call * Make it clear that current_client is the root client that was called by external connection * add executing_client which is the client that runs the current command (can be a module or a script) * Remove script_caller that was used for commands that have CLIENT_SCRIPT to get the client that called the script. in most cases, that's the current_client, and in others (when being called from a module), it could be an intermidiate client when we actually want the original one used by the external connectoin. bugfixes: * RM_Call with C flag should log ACL errors with the requested user rather than the one used by the original client, this also solves a crash when RM_Call is used with C flag from a detached thread safe context. * addACLLogEntry would have logged info about the script_caller, but in case the script was issued by a module command we actually want the current_client. the exception is when RM_Call is called from a timer event, in which case we don't have a current_client. behavior changes: * client side tracking for scripts now tracks the keys that are read by the script instead of the keys that are declared by the caller for EVAL other changes: * Log both current_client and executing_client in the crash log. * remove prepareLuaClient and resetLuaClient, being dead code that was forgotten. * remove scriptTimeSnapshot and snapshot_time which apparently aren't needed anymore. commandTimeSnapshot returns the cached server.mstime for all commands nowadays, and it isn't modified even when execuring a command during busy scripts since the execution_nesting protectes it. * remove code to propagate CLIENT_FORCE_REPL from the executed command to the script caller since scripts aren't propagated anyway these days and anyway this flag wouldn't have had an effect since CLIENT_PREVENT_PROP is added by scriptResetRun.

oranagra · 2023-01-31T10:32:44Z

tests/unit/moduleapi/usercall.tcl

+    test {test module check regular redis command with user and acl from blocked background thread} {
+        assert_equal [r set x 5] OK
+
+        r ACL LOG RESET
+        assert_equal [r usercall.reset_user] OK
+        assert_equal [r usercall.add_to_acl "~* &* +@all -set"] OK
+
+        # fails here as testing acl in rm call from a background thread
+        assert_error {*NOPERM User module_user has no permissions*} {r usercall.call_with_user_bg C set x 10}


ACL logging code used to crash here before the fix since REDISMODULE_ARGV_RUN_AS_USER attempted to use the user of the context client instead of the user requested by the context.

oranagra · 2023-01-31T10:33:26Z

tests/unit/moduleapi/usercall.tcl


+        # verify that new log entry added
+        set entry [lindex [r ACL LOG] 0]
+        assert_equal [dict get $entry username] {module_user}


before the fix, this used to return "default" user instead of "module_user"

oranagra · 2023-01-31T10:45:00Z

tests/unit/moduleapi/usercall.tcl

+        assert_equal [dict get $entry context] {lua}
+        assert_equal [dict get $entry object] {set}
+        assert_equal [dict get $entry reason] {command}
+        assert_match {*cmd=usercall.call_with_user_flag*} [dict get $entry client-info]


before the fix it used to return "cmd=evalsha" instead of "cmd=usercall.call_with_user_flag"

oranagra · 2023-01-31T20:11:09Z

@MeirShpilraien @sjpotter FYI

tests/modules/usercall.c

src/aof.c

madolson

I'm fairly confident the original decisions around CSC was just a simple mistake that was overlooked. There is one thought I have, which is that it's possible that EVAL was never supposed to support tracking, and it was just an unintended side effect that sub-commands were triggering the tracking code. Many use cases don't really make sense in the tracking world. For example, if you have a script that does a read/modify/write on a single key, maybe implementing a prepend command, you'll start tracking the key then get an invalidation immediately. Ideally you would just get an invalidation if you had previously looked at the key, since the script is logically doing a write.

I don't feel that strongly either way, I don't think this is really a common use case.

src/server.c

oranagra · 2023-02-04T08:18:45Z

@madolson you may be right about scripts that do read for the purpose of write, but on the other hand, maybe the caller of that script will want to re-issue it when the source changes?
and anyway, we should probably keep triggering CSC tracking on read-only scripts (i.e. simple ones are like MULTI-EXEC possibly doing some some filtering on the output).

what worries me regarding the current change is that there were tests that explicitly check this odd behavior (while it was really simple to fix, i.e. we did already have pointers to both clients when doing this thing).. so i'm paranoid about not being aware of a good reason why it should be the way it was.

madolson

This makes sense to me. I'm not sure if this is the most useful behavior, but it intuitively makes more sense than what we have.

src/server.c

oranagra · 2023-02-08T08:07:21Z

The PR and breaking change for CSC in scripts was approved in a core-team meeting.
waiting for another reviewer to try to catch bugs and then it can be merged.

src/server.c

src/networking.c

guybe7 · 2023-02-09T07:33:42Z

src/networking.c


+    /* For the few commands that are allowed during busy scripts, we rather
+     * provide a fresher time than the one from when the script started (they
+     * still won't get it from the call due to execution_nesting. For commands


WDYM by "they still won't get it from call"? won't get what? fresher time?
maybe "(execution_nesting is >=1 for all commands that run during a busy script, so cmd_time_snapshot won't be updated inside call until the script is done)"

yes, they won't get a fresher time.
i think your explanation is longer and more technical will be harder to read.
we just need to hint that it's still an inaccurate compromise, and people can read the code to understand it better.
i can replace "it" with "fresher time" if that helps.

let's first discuss the solution and later the comment..

src/server.c

… for RM_Call (redis#11770) * Make it clear that current_client is the root client that was called by external connection * add executing_client which is the client that runs the current command (can be a module or a script) * Remove script_caller that was used for commands that have CLIENT_SCRIPT to get the client that called the script. in most cases, that's the current_client, and in others (when being called from a module), it could be an intermediate client when we actually want the original one used by the external connection. bugfixes: * RM_Call with C flag should log ACL errors with the requested user rather than the one used by the original client, this also solves a crash when RM_Call is used with C flag from a detached thread safe context. * addACLLogEntry would have logged info about the script_caller, but in case the script was issued by a module command we actually want the current_client. the exception is when RM_Call is called from a timer event, in which case we don't have a current_client. behavior changes: * client side tracking for scripts now tracks the keys that are read by the script instead of the keys that are declared by the caller for EVAL other changes: * Log both current_client and executing_client in the crash log. * remove prepareLuaClient and resetLuaClient, being dead code that was forgotten. * remove scriptTimeSnapshot and snapshot_time and instead add cmd_time_snapshot that serves all commands and is reset only when execution nesting starts. * remove code to propagate CLIENT_FORCE_REPL from the executed command to the script caller since scripts aren't propagated anyway these days and anyway this flag wouldn't have had an effect since CLIENT_PREVENT_PROP is added by scriptResetRun. * fix a module GIL violation issue in afterSleep that was introduced in redis#10300 (unreleased)

oranagra commented Jan 31, 2023

View reviewed changes

oranagra requested review from guybe7 and madolson January 31, 2023 11:50

oranagra mentioned this pull request Jan 31, 2023

Call trimStringObjectIfNeeded only when applicable, don't call realloc on noop. #11508

Closed

madolson reviewed Feb 1, 2023

View reviewed changes

tests/modules/usercall.c Show resolved Hide resolved

sundb reviewed Feb 1, 2023

View reviewed changes

src/aof.c Show resolved Hide resolved

madolson reviewed Feb 3, 2023

View reviewed changes

src/server.c Outdated Show resolved Hide resolved

code review fixes

d831ab4

madolson approved these changes Feb 6, 2023

View reviewed changes

oranagra added state:major-decision Requires core team consensus release-notes indication that this issue needs to be mentioned in the release notes breaking-change This change can potentially break existing application labels Feb 7, 2023

guybe7 reviewed Feb 7, 2023

View reviewed changes

src/server.c Show resolved Hide resolved

oranagra requested a review from soloestoy February 7, 2023 08:41

guybe7 mentioned this pull request Feb 7, 2023

SCAN/RANDOMKEY and lazy-expire #11788

Merged

introduce server.cmd_time_snapshot

b265769

guybe7 reviewed Feb 8, 2023

View reviewed changes

src/server.c Show resolved Hide resolved

backup and restore cmd_time_snapshot in processEventsWhileBlocked

6761b74

sundb reviewed Feb 9, 2023

View reviewed changes

src/networking.c Outdated Show resolved Hide resolved

typos

aefc381

guybe7 reviewed Feb 9, 2023

View reviewed changes

soloestoy reviewed Feb 15, 2023

View reviewed changes

src/server.c Outdated Show resolved Hide resolved

beforeSleep: don't update cmd_time when WhileBlocked, fix GIL violation

4a5b4d4

oranagra requested a review from soloestoy February 15, 2023 12:18

guybe7 reviewed Feb 15, 2023

View reviewed changes

src/server.c Outdated Show resolved Hide resolved

improve GIL warning comment

b8fe666

soloestoy approved these changes Feb 16, 2023

View reviewed changes

oranagra merged commit 233abbb into redis:unstable Feb 16, 2023

oranagra deleted the script_caller_cleanup branch February 16, 2023 06:07

Conversation

oranagra commented Jan 31, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

oranagra Jan 31, 2023

Choose a reason for hiding this comment

Uh oh!

oranagra Jan 31, 2023

Choose a reason for hiding this comment

Uh oh!

oranagra Jan 31, 2023

Choose a reason for hiding this comment

Uh oh!

oranagra commented Jan 31, 2023

Uh oh!

Uh oh!

Uh oh!

madolson left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

oranagra commented Feb 4, 2023

Uh oh!

madolson left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

oranagra commented Feb 8, 2023

Uh oh!

Uh oh!

Uh oh!

guybe7 Feb 9, 2023

Choose a reason for hiding this comment

Uh oh!

oranagra Feb 9, 2023

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

oranagra commented Jan 31, 2023 •

edited

Loading