Fix CLIENT UNBLOCK crashing modules.#9167
Conversation
Modules that use background threads with thread safe contexts are likely to use RM_BlockClient() without a timeout function, because they do not set up a timeout. Before this commit, `CLIENT UNBLOCK` would result with a crash as the `NULL` timeout callback is called. Beyond just crashing, this is also logically wrong as it may throw the module into an unexpected client state. This commits makes `CLIENT UNBLOCK` on such clients behave the same as any other client that is not in a blocked state and therefore cannot be unblocked.
| != C_OK) return; | ||
| struct client *target = lookupClientByID(id); | ||
| if (target && target->flags & CLIENT_BLOCKED) { | ||
| if (target && target->flags & CLIENT_BLOCKED && moduleBlockedClientMayTimeout(target)) { |
There was a problem hiding this comment.
what happens if moduleBlockedClientMayTimeout returns 0 but unblock_error is 1? shouldn't we still reply to the blocked client with -UNBLOCKED?
There was a problem hiding this comment.
@guybe7 I think not, the way I see it a module that doesn't set a timeout callback doesn't expect spontaneous unblocking of clients.
There was a problem hiding this comment.
ok so you basically disabled CLIENT UNBLOCK for clients that don't expect a spontaneous unblock.. which means the only possibility to unblock the client is in the hands of the module.. but, isn't that the purpose of CLIENT UNBLOCK? shouldn't it do the best it can (even if the blocked client isn't "expecting" to be unblocked) in order to unblock a client?
There was a problem hiding this comment.
@guybe7 My POV is that referring to a command is blocking is mostly relevant for Redis built-in commands. Modules can use the blocking API to implement commands that run in background threads, and will use the blocking API, but from a user's perspective this is not really a blocking command.
There was a problem hiding this comment.
i think the important distinction is that the client side can expect to get unblocked with error, we care less about that. but we're afraid to throw the module off balance and into a problematic state, right?
so i agree we don't wanna corrupt the state for modules, but what if a certain module wants to support CLIENT unblock, but doesn't want timeout? i.e. what if he didn't define a timeout callback since it has nothing to do there, but will in some way gracefully handle a disconnected client? (they should expect client disconnection anyway, right?)
There was a problem hiding this comment.
@yossigo what about Oran's question?
maybe what we're saying is: if you want to be CLIENT UNBLOCKed you must define a timeout_callback. if you don't actually want a timeout, just pass timeout=0
btw what if i pass timeout!=0 without a timeout callback? what happens it's timedout?
There was a problem hiding this comment.
Exactly that, if you want CLIENT UNBLOCK you need a timeout function - essentially preserving the current behavior. BTW going there is not very comfortable because you can't pass private data to the timeout function - but that was already discussed in the past and is a known shortcoming of the API.
Passing a non-zero timeout and a NULL callback will crash. Ways to address that:
- Refuse the blocking call and return a
NULLRedisModuleBlockedClient. That could potentially lead to other kinds of crashes because until now aNULLwas never returned. - Silently skip the call. This would leave the client without a reply causing command/reply de-sync.
- Produce our own custom error reply in this case. I think that makes most sense. WDUT?
* Avoid a potential race condition on slow systems where the blocking of a client may take some time. * Aoid needlessly waiting.
tests/modules/blockonbackground.c
Outdated
| return RedisModule_ReplyWithError(ctx, "ERR another client already blocked"); | ||
| } | ||
|
|
||
| blocked_client = RedisModule_BlockClient(ctx, Block_RedisCommand, timeout > 0 ? Block_RedisCommand : NULL, NULL, timeout); |
There was a problem hiding this comment.
did you pass Block_RedisCommand as timeout_callback on purpose?
There was a problem hiding this comment.
same goes for the reply callback..
There was a problem hiding this comment.
i guess it is on purpose? just for the comparison with NULL
anyway, it should be documented because it looks like a mistake
There was a problem hiding this comment.
Yes, the API was architected originally to support that. Same function prototype, and RedisModule_IsBlockedReplyRequest() and RedisModule_IsBlockedTimeoutRequest() to differentiate the flows.
|
It reminds me about #4366 , using a default module timeout handler. |
|
@soloestoy I agree, same problem basically. Not sure what would be the better option through - |
|
@yossigo It looks like the memory leak was introduced by this pr. |
Modules that use background threads with thread safe contexts are likely to use RM_BlockClient() without a timeout function, because they do not set up a timeout. Before this commit, `CLIENT UNBLOCK` would result with a crash as the `NULL` timeout callback is called. Beyond just crashing, this is also logically wrong as it may throw the module into an unexpected client state. This commits makes `CLIENT UNBLOCK` on such clients behave the same as any other client that is not in a blocked state and therefore cannot be unblocked. (cherry picked from commit aa139e2)
Modules that use background threads with thread safe contexts are likely to use RM_BlockClient() without a timeout function, because they do not set up a timeout. Before this commit, `CLIENT UNBLOCK` would result with a crash as the `NULL` timeout callback is called. Beyond just crashing, this is also logically wrong as it may throw the module into an unexpected client state. This commits makes `CLIENT UNBLOCK` on such clients behave the same as any other client that is not in a blocked state and therefore cannot be unblocked.
…ound (redis#9192) fixes test issue introduced in redis#9167 1. invalid reads due to accessing non-retained string (passed as unblock context). 2. leaking module blocked client context, see redis#6922 for info.
Modules that use background threads with thread safe contexts are likely
to use
RM_BlockClient()without a timeout function, because they do notset up a timeout.
Before this commit,
CLIENT UNBLOCKwould result with a crash as theNULLtimeout callback is called. Beyond just crashing, this is alsologically wrong as it may throw the module into an unexpected client
state.
This commits makes
CLIENT UNBLOCKon such clients behave the same asany other client that is not in a blocked state and therefore cannot be
unblocked.