Fix CLIENT UNBLOCK crashing modules. by yossigo · Pull Request #9167 · redis/redis

yossigo · 2021-06-29T11:05:05Z

Modules that use background threads with thread safe contexts are likely
to use RM_BlockClient() without a timeout function, because they do not
set up a timeout.

Before this commit, CLIENT UNBLOCK would result in a crash as the
NULL timeout callback is called. Beyond just crashing, this is also
logically wrong as it may throw the module into an unexpected client
state.

This commit makes CLIENT UNBLOCK on such clients behave the same as
any other client that is not in a blocked state and therefore cannot be
unblocked.
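
The rule described above can be modeled in isolation. The following is a minimal, self-contained sketch and not the Redis source: `client_model`, `block_type`, `module_may_timeout`, `client_unblock_allowed`, and `example_timeout` are hypothetical names made up for illustration. It assumes that a client blocked by a built-in command can always be unblocked, and that a module-blocked client can be unblocked only when a timeout callback was registered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the Redis definitions. */
typedef void (*timeout_cb)(void);

enum block_type { BLOCK_NONE, BLOCK_BUILTIN, BLOCK_MODULE };

struct client_model {
    enum block_type btype;  /* BLOCK_NONE means the client is not blocked */
    timeout_cb on_timeout;  /* only meaningful for BLOCK_MODULE */
};

/* Placeholder callback so the examples have a non-NULL pointer to pass. */
void example_timeout(void) {}

/* Assumption: only module-blocked clients without a callback are excluded. */
bool module_may_timeout(const struct client_model *c) {
    assert(c != NULL);
    if (c->btype != BLOCK_MODULE) return true;
    return c->on_timeout != NULL;
}

/* Models the patched condition: blocked AND allowed to time out. */
bool client_unblock_allowed(const struct client_model *c) {
    assert(c != NULL);
    return c->btype != BLOCK_NONE && module_may_timeout(c);
}
```

In this model, a module-blocked client with a NULL callback is treated exactly like a client that is not blocked at all, which is the behavior the commit message describes.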

guybe7 · 2021-06-29T11:58:59Z

src/networking.c

            != C_OK) return;
        struct client *target = lookupClientByID(id);
-        if (target && target->flags & CLIENT_BLOCKED) {
+        if (target && target->flags & CLIENT_BLOCKED && moduleBlockedClientMayTimeout(target)) {


what happens if moduleBlockedClientMayTimeout returns 0 but unblock_error is 1? shouldn't we still reply to the blocked client with -UNBLOCKED?

@guybe7 I think not, the way I see it a module that doesn't set a timeout callback doesn't expect spontaneous unblocking of clients.

ok so you basically disabled CLIENT UNBLOCK for clients that don't expect a spontaneous unblock.. which means the only possibility to unblock the client is in the hands of the module.. but, isn't that the purpose of CLIENT UNBLOCK? shouldn't it do the best it can (even if the blocked client isn't "expecting" to be unblocked) in order to unblock a client?

@guybe7 My POV is that referring to a command as blocking is mostly relevant for Redis built-in commands. Modules can use the blocking API to implement commands that run in background threads, but from a user's perspective this is not really a blocking command.

i think the important distinction is that the client side can expect to get unblocked with error, we care less about that. but we're afraid to throw the module off balance and into a problematic state, right?

so i agree we don't wanna corrupt the state for modules, but what if a certain module wants to support CLIENT unblock, but doesn't want timeout? i.e. what if he didn't define a timeout callback since it has nothing to do there, but will in some way gracefully handle a disconnected client? (they should expect client disconnection anyway, right?)

@yossigo what about Oran's question?

maybe what we're saying is: if you want to be CLIENT UNBLOCKed you must define a timeout_callback. if you don't actually want a timeout, just pass timeout=0

btw what if i pass timeout!=0 without a timeout callback? what happens when it times out?

Exactly that, if you want CLIENT UNBLOCK you need a timeout function - essentially preserving the current behavior. BTW going there is not very comfortable because you can't pass private data to the timeout function - but that was already discussed in the past and is a known shortcoming of the API.

Passing a non-zero timeout and a NULL callback will crash. Ways to address that:

Refuse the blocking call and return a NULL RedisModuleBlockedClient. That could potentially lead to other kinds of crashes because until now a NULL was never returned.

Silently skip the call. This would leave the client without a reply causing command/reply de-sync.

Produce our own custom error reply in this case. I think that makes most sense. WDYT?
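
To make the third option concrete, here is a small self-contained sketch of the decision that would be taken when a module-blocked client's deadline passes. It is an illustration under stated assumptions, not Redis code: `timeout_action`, `timeout_policy`, and `example_timeout` are hypothetical names, and the sketch assumes, as discussed above, that `timeout=0` means the block never expires on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type for illustration. */
typedef void (*timeout_cb)(void);

enum timeout_action {
    ACTION_NONE,          /* timeout == 0: the block never expires by itself */
    ACTION_CALL_MODULE,   /* invoke the module's timeout callback */
    ACTION_DEFAULT_ERROR  /* option 3: the server sends its own error reply */
};

/* Placeholder callback so the examples have a non-NULL pointer to pass. */
void example_timeout(void) {}

/* Decide what happens when the deadline of a module-blocked client passes. */
enum timeout_action timeout_policy(timeout_cb cb, long long timeout_ms) {
    assert(timeout_ms >= 0);
    if (timeout_ms == 0) return ACTION_NONE;
    if (cb != NULL) return ACTION_CALL_MODULE;
    /* Non-zero timeout with a NULL callback: reply with a generic error
     * instead of calling through a NULL pointer. */
    return ACTION_DEFAULT_ERROR;
}
```

The point of the fallback branch is that the client always receives some reply, which avoids both the crash and the command/reply de-sync mentioned for the second option.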

tests/unit/moduleapi/blockonbackground.tcl

* Avoid a potential race condition on slow systems where the blocking of a client may take some time.
* Avoid needlessly waiting.

guybe7 · 2021-06-30T14:11:51Z

tests/modules/blockonbackground.c

+        return RedisModule_ReplyWithError(ctx, "ERR another client already blocked");
+    }
+
+    blocked_client = RedisModule_BlockClient(ctx, Block_RedisCommand, timeout > 0 ? Block_RedisCommand : NULL, NULL, timeout);


did you pass Block_RedisCommand as timeout_callback on purpose?

same goes for the reply callback..

i guess it is on purpose? just for the comparison with NULL

anyway, it should be documented because it looks like a mistake

Yes, the API was architected originally to support that. Same function prototype, and RedisModule_IsBlockedReplyRequest() and RedisModule_IsBlockedTimeoutRequest() to differentiate the flows.
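
The pattern of registering one function for both roles can be sketched without the module API. The following is a self-contained model, not the real implementation: `ctx_model`, `request_kind`, `is_reply_request`, `is_timeout_request`, and `shared_callback` are hypothetical stand-ins, with the two predicates playing the role that `RedisModule_IsBlockedReplyRequest()` and `RedisModule_IsBlockedTimeoutRequest()` play in a real module.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the context handed to a blocked-client callback. */
enum request_kind { REQ_REPLY, REQ_TIMEOUT };

struct ctx_model {
    enum request_kind kind;  /* why the callback is being invoked */
    const char *reply;       /* what the callback decided to send */
};

int is_reply_request(const struct ctx_model *ctx) {
    return ctx->kind == REQ_REPLY;
}

int is_timeout_request(const struct ctx_model *ctx) {
    return ctx->kind == REQ_TIMEOUT;
}

/* One function registered as both reply and timeout callback; it inspects
 * the context to find out which flow it is running in. */
int shared_callback(struct ctx_model *ctx) {
    assert(ctx != NULL);
    if (is_timeout_request(ctx)) {
        ctx->reply = "Timed out";
    } else if (is_reply_request(ctx)) {
        ctx->reply = "OK";
    }
    return 0;
}
```

Because both callbacks share a prototype, passing the same function twice is legal, but as noted in the review it deserves a comment at the call site since it can look like a mistake.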

soloestoy · 2021-07-01T07:25:23Z

It reminds me of #4366, using a default module timeout handler.

yossigo · 2021-07-01T14:13:01Z

@soloestoy I agree, same problem basically. Not sure what would be the better option though - RM_BlockClient() returning a NULL or having a default callback.

sundb · 2021-07-02T05:46:40Z

@yossigo It looks like the memory leak was introduced by this pr.
https://github.com/redis/redis/runs/2967831993?check_suite_focus=true

…ound (#9192) fixes test issue introduced in #9167 1. invalid reads due to accessing non-retained string (passed as unblock context). 2. leaking module blocked client context, see #6922 for info.

Modules that use background threads with thread safe contexts are likely to use RM_BlockClient() without a timeout function, because they do not set up a timeout. Before this commit, `CLIENT UNBLOCK` would result in a crash as the `NULL` timeout callback is called. Beyond just crashing, this is also logically wrong as it may throw the module into an unexpected client state. This commit makes `CLIENT UNBLOCK` on such clients behave the same as any other client that is not in a blocked state and therefore cannot be unblocked. (cherry picked from commit aa139e2)

…ound (#9192) fixes test issue introduced in #9167 1. invalid reads due to accessing non-retained string (passed as unblock context). 2. leaking module blocked client context, see #6922 for info. (cherry picked from commit a8518cc)

yossigo requested review from guybe7 and oranagra June 29, 2021 11:20

guybe7 reviewed Jun 29, 2021

oranagra reviewed Jun 29, 2021

yossigo added 2 commits June 30, 2021 16:52

Update tests.

e7cbb1a

* Avoid a potential race condition on slow systems where the blocking of a client may take some time.
* Avoid needlessly waiting.

Document timeout_callback and CLIENT UNBLOCK.

886bdb6

guybe7 reviewed Jun 30, 2021

Add a comment on callback usage.

71f715c

oranagra previously approved these changes Jun 30, 2021

yossigo dismissed oranagra’s stale review via 71f715c July 1, 2021 13:44

oranagra approved these changes Jul 1, 2021

yossigo merged commit aa139e2 into redis:unstable Jul 1, 2021

yossigo deleted the client-unblock-module-crash branch July 1, 2021 14:11

oranagra mentioned this pull request Jul 4, 2021

fix valgrind issues with recently added test in modules/blockonbackground #9192

Merged

oranagra added the release-notes label (indication that this issue needs to be mentioned in the release notes) Jul 19, 2021

oranagra mentioned this pull request Jul 21, 2021

Release 6.2.5 #9264

Merged

oranagra mentioned this pull request Feb 17, 2022

XREADGROUP: Unblock client if stream is deleted #10306

Merged
