Skip to content

Fix CLIENT UNBLOCK crashing modules.#9167

Merged
yossigo merged 4 commits intoredis:unstablefrom
yossigo:client-unblock-module-crash
Jul 1, 2021
Merged

Fix CLIENT UNBLOCK crashing modules.#9167
yossigo merged 4 commits intoredis:unstablefrom
yossigo:client-unblock-module-crash

Conversation

@yossigo
Copy link
Collaborator

@yossigo yossigo commented Jun 29, 2021

Modules that use background threads with thread safe contexts are likely
to use RM_BlockClient() without a timeout function, because they do not
set up a timeout.

Before this commit, CLIENT UNBLOCK would result with a crash as the
NULL timeout callback is called. Beyond just crashing, this is also
logically wrong as it may throw the module into an unexpected client
state.

This commits makes CLIENT UNBLOCK on such clients behave the same as
any other client that is not in a blocked state and therefore cannot be
unblocked.

Modules that use background threads with thread safe contexts are likely
to use RM_BlockClient() without a timeout function, because they do not
set up a timeout.

Before this commit, `CLIENT UNBLOCK` would result with a crash as the
`NULL` timeout callback is called. Beyond just crashing, this is also
logically wrong as it may throw the module into an unexpected client
state.

This commits makes `CLIENT UNBLOCK` on such clients behave the same as
any other client that is not in a blocked state and therefore cannot be
unblocked.
@yossigo yossigo requested review from guybe7 and oranagra June 29, 2021 11:20
!= C_OK) return;
struct client *target = lookupClientByID(id);
if (target && target->flags & CLIENT_BLOCKED) {
if (target && target->flags & CLIENT_BLOCKED && moduleBlockedClientMayTimeout(target)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens if moduleBlockedClientMayTimeout returns 0 but unblock_error is 1? shouldn't we still reply to the blocked client with -UNBLOCKED?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@guybe7 I think not, the way I see it a module that doesn't set a timeout callback doesn't expect spontaneous unblocking of clients.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok so you basically disabled CLIENT UNBLOCK for clients that don't expect a spontaneous unblock.. which means the only possibility to unblock the client is in the hands of the module.. but, isn't that the purpose of CLIENT UNBLOCK? shouldn't it do the best it can (even if the blocked client isn't "expecting" to be unblocked) in order to unblock a client?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@guybe7 My POV is that referring to a command is blocking is mostly relevant for Redis built-in commands. Modules can use the blocking API to implement commands that run in background threads, and will use the blocking API, but from a user's perspective this is not really a blocking command.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the important distinction is that the client side can expect to get unblocked with error, we care less about that. but we're afraid to throw the module off balance and into a problematic state, right?

so i agree we don't wanna corrupt the state for modules, but what if a certain module wants to support CLIENT unblock, but doesn't want timeout? i.e. what if he didn't define a timeout callback since it has nothing to do there, but will in some way gracefully handle a disconnected client? (they should expect client disconnection anyway, right?)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yossigo what about Oran's question?

maybe what we're saying is: if you want to be CLIENT UNBLOCKed you must define a timeout_callback. if you don't actually want a timeout, just pass timeout=0

btw what if i pass timeout!=0 without a timeout callback? what happens it's timedout?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly that, if you want CLIENT UNBLOCK you need a timeout function - essentially preserving the current behavior. BTW going there is not very comfortable because you can't pass private data to the timeout function - but that was already discussed in the past and is a known shortcoming of the API.

Passing a non-zero timeout and a NULL callback will crash. Ways to address that:

  1. Refuse the blocking call and return a NULL RedisModuleBlockedClient. That could potentially lead to other kinds of crashes because until now a NULL was never returned.
  2. Silently skip the call. This would leave the client without a reply causing command/reply de-sync.
  3. Produce our own custom error reply in this case. I think that makes most sense. WDUT?

yossigo added 2 commits June 30, 2021 16:52
* Avoid a potential race condition on slow systems where the blocking of
  a client may take some time.
* Aoid needlessly waiting.
return RedisModule_ReplyWithError(ctx, "ERR another client already blocked");
}

blocked_client = RedisModule_BlockClient(ctx, Block_RedisCommand, timeout > 0 ? Block_RedisCommand : NULL, NULL, timeout);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you pass Block_RedisCommand as timeout_callback on purpose?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same goes for the reply callback..

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess it is on purpose? just for the comparison with NULL

anyway, it should be documented because it looks like a mistake

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the API was architected originally to support that. Same function prototype, and RedisModule_IsBlockedReplyRequest() and RedisModule_IsBlockedTimeoutRequest() to differentiate the flows.

oranagra
oranagra previously approved these changes Jun 30, 2021
@soloestoy
Copy link
Contributor

It reminds me about #4366 , using a default module timeout handler.

@yossigo yossigo merged commit aa139e2 into redis:unstable Jul 1, 2021
@yossigo yossigo deleted the client-unblock-module-crash branch July 1, 2021 14:11
@yossigo
Copy link
Collaborator Author

yossigo commented Jul 1, 2021

@soloestoy I agree, same problem basically. Not sure what would be the better option through - RM_BlockClient() returning a NULL or having a default callback.

@sundb
Copy link
Collaborator

sundb commented Jul 2, 2021

oranagra added a commit that referenced this pull request Jul 4, 2021
…ound (#9192)

fixes test issue introduced in #9167

1. invalid reads due to accessing non-retained string (passed as unblock context).
2. leaking module blocked client context, see #6922 for info.
@oranagra oranagra added the release-notes indication that this issue needs to be mentioned in the release notes label Jul 19, 2021
@oranagra oranagra mentioned this pull request Jul 21, 2021
oranagra pushed a commit that referenced this pull request Jul 21, 2021
Modules that use background threads with thread safe contexts are likely
to use RM_BlockClient() without a timeout function, because they do not
set up a timeout.

Before this commit, `CLIENT UNBLOCK` would result with a crash as the
`NULL` timeout callback is called. Beyond just crashing, this is also
logically wrong as it may throw the module into an unexpected client
state.

This commits makes `CLIENT UNBLOCK` on such clients behave the same as
any other client that is not in a blocked state and therefore cannot be
unblocked.

(cherry picked from commit aa139e2)
oranagra added a commit that referenced this pull request Jul 21, 2021
…ound (#9192)

fixes test issue introduced in #9167

1. invalid reads due to accessing non-retained string (passed as unblock context).
2. leaking module blocked client context, see #6922 for info.

(cherry picked from commit a8518cc)
JackieXie168 pushed a commit to JackieXie168/redis that referenced this pull request Sep 8, 2021
Modules that use background threads with thread safe contexts are likely
to use RM_BlockClient() without a timeout function, because they do not
set up a timeout.

Before this commit, `CLIENT UNBLOCK` would result with a crash as the
`NULL` timeout callback is called. Beyond just crashing, this is also
logically wrong as it may throw the module into an unexpected client
state.

This commits makes `CLIENT UNBLOCK` on such clients behave the same as
any other client that is not in a blocked state and therefore cannot be
unblocked.
JackieXie168 pushed a commit to JackieXie168/redis that referenced this pull request Sep 8, 2021
…ound (redis#9192)

fixes test issue introduced in redis#9167

1. invalid reads due to accessing non-retained string (passed as unblock context).
2. leaking module blocked client context, see redis#6922 for info.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-notes indication that this issue needs to be mentioned in the release notes

Projects

Development

Successfully merging this pull request may close these issues.

5 participants