Don't write to clients with CLIENT_CLOSE_ASAP flag by jeffreylovitz · Pull Request #7694 · redis/redis

jeffreylovitz · 2020-08-21T20:13:41Z

This PR resolves an issue observed on RedisGraph in which prematurely closing clients that are awaiting replies from Redis can cause a server crash on a broken pipe.

I can consistently reproduce this issue locally, but the code to do so is unfortunately non-trivial. The reproduction steps include instantiating several clients, issuing RedisGraph commands which return data, then disconnecting the clients without waiting for responses.

This causes a crash with the stack trace:

Thread 1 "redis-server" received signal SIGPIPE, Broken pipe.
0x00007ffff76252c7 in __libc_write (fd=16, buf=0x555555a74bc8, nbytes=140) at ../sysdeps/unix/sysv/linux/write.c:27
27      ../sysdeps/unix/sysv/linux/write.c: No such file or directory.

#0  0x00007ffff76252c7 in __libc_write (fd=16, buf=0x555555a74bc8, nbytes=140) at ../sysdeps/unix/sysv/linux/write.c:27
#1  0x000055555564dfef in connSocketWrite (conn=0x555555a1ec50, data=0x555555a74bc8, data_len=140) at connection.c:168
#2  0x00005555555a2af7 in connWrite (conn=0x555555a1ec50, data=0x555555a74bc8, data_len=140) at connection.h:140
#3  0x00005555555a630f in writeToClient (c=0x555555a74970, handler_installed=0) at networking.c:1311
#4  0x00005555555a6780 in handleClientsWithPendingWrites () at networking.c:1426
#5  0x00005555555ab64c in handleClientsWithPendingWritesUsingThreads () at networking.c:3063
#6  0x0000555555590ef5 in beforeSleep (eventLoop=0x555555928230) at server.c:2202
#7  0x000055555558a42a in aeProcessEvents (eventLoop=0x555555928230, flags=27) at ae.c:443
#8  0x000055555558a748 in aeMain (eventLoop=0x555555928230) at ae.c:539
#9  0x00005555555996fe in main (argc=2, argv=0x7fffffffd2a8) at server.c:5325

This issue is reproduce on unstable and 6.0.6, this PR resolves it on both.

madolson · 2020-08-22T01:13:11Z

Hey, thanks for trying to fix this, but I'm not sure this is the right fix. We should attempt to write out pending data from a client that has been marked as close asap, since there are certain places we do it intentionally. I'm also wondering what RedisGraph is doing, since the SIGPIPE signal should be ignored. You might want to follow up with them:

ref:

redis/src/server.c

Line 2815 in e61adc0

signal(SIGPIPE, SIG_IGN);

oranagra · 2020-08-23T05:34:33Z

This is indeed odd that you're getting that signal although suppose to be ignored.
@jeffreylovitz is there any chance RedisGraph is doing something that overrides that?
@yossigo maybe you have some other idea why this can happen?

Regarding the fix, there's actually already a PR with a similar fix and a similar title: #7202 (attempting to fix a different issue), it looks like Salvatore was actually ok with this specific change, but not about another one in that other PR.

@madolson i suppose you're referring to clientAcceptHandler, but there we're writing before setting the flag, and anyway, we're writing directly to the socket, not using the write handler.
is there anything i'm missing?

oranagra · 2020-08-23T05:50:31Z

Ohh sorry, i now realize it's not the same change as in the other PR (it's the same as the one Salvatore suggested instead of one of the changes there).

yossigo · 2020-08-23T08:05:06Z

@oranagra no idea, unless it gets overridden elsewhere but it doesn't seem so.

jeffreylovitz · 2020-08-23T13:10:26Z

@oranagra We override the SIGUSR signals and, on some versions, inject our own logs upon receiving the SIGSEGV signal before calling the Redis handler. Other than that, RedisGraph shouldn't interfere at all.

yossigo · 2020-08-23T18:45:04Z

@jeffreylovitz I can confirm creating similar situations where there's a lot of buffered output and a client drops, connSocketWrite() does get an EPIPE response as expected (i.e. SIGPIPE is ignored). Something in RedisGraph must be messing up with the handler...

jeffreylovitz · 2020-08-28T15:25:08Z

@yossigo To get more context, I added an explicit EPIPE signal handler to RedisGraph, and it fails to catch the signal or have any impact on this crash.

madolson · 2020-08-31T21:55:56Z

@oranagra I was actually thinking about ACLs, but thinking about it more this PR looks correct, since we have another functionality to achieve the mechanism I was considering.

oranagra · 2020-09-01T05:39:03Z

@madolson i guess you meant #7202 right?

madolson · 2020-10-16T03:43:43Z

@jeffreylovitz I think your change should now pass the tests since we fixed the underlying issue, would you mind doing an empty push to your branch to kickoff the CI again?

oranagra · 2020-10-16T12:36:02Z

This was actually already resolved in #7202 (has exactly this change and more. I forgot to close this one).
@madolson FYI, there's a button that can rerun a failed CI, but in this case a rebase was required (not an empty push)

**Bug Fixes:** * [#7219](RediSearch/RediSearch#7219) Fix a concurrency issue on Reducer in `FT.AGGREGATE` * [#7255](RediSearch/RediSearch#7255) Fix `BM25STD` underflow wraparound * [#7275](RediSearch/RediSearch#7275) Report used memory as `unsigned long long` to avoid underflows * [#7264](RediSearch/RediSearch#7264) Fix `totalDocsLen` updates * [#6995](RediSearch/RediSearch#6995) Do not fanout `FT.INFO` to replicas * [#7350](RediSearch/RediSearch#7350) Fix `FT.CREATE` failure with LeanVec parameters on non-Intel architectures * [#7694](RediSearch/RediSearch#7694) Use asynchronous jobs for GC in SVS to accelerate execution * [#7384](RediSearch/RediSearch#7384) Fix index load from RDB temporary memory overhead * [#7459](RediSearch/RediSearch#7459) Fix Fork GC potential double-free on error path * [#7458](RediSearch/RediSearch#7458) Fix a GC performence regression * [#7470](RediSearch/RediSearch#7470) Avoid draining worker thread pool from FLUSH callback to avoid deadlocks * [#7554](RediSearch/RediSearch#7554) Handle the case where `SCORE` is sent alone without extra fields (coordinator) * [#7685](RediSearch/RediSearch#7685) Fix cursor logical leak * [#7794](RediSearch/RediSearch#7794) Fix `cmp_strings()` to correctly handle binary data with embedded NULLs in TOLIST reducer in FT.AGGREGATE * [#7873](RediSearch/RediSearch#7873) Handle warnings in empty `FT.AGGREGATE` replies (cluster) * [#7886](RediSearch/RediSearch#7886) Remove non-TEXT fields from the spec keys dictionary * [#7904](RediSearch/RediSearch#7904) Refactor keys dictionary handling * [#7901](RediSearch/RediSearch#7901) Support multiple warnings in reply * [#8083](RediSearch/RediSearch#8083) Fix incorrect FULLTEXT field metric counts * [#8153](RediSearch/RediSearch#8153) Fix configuration registration issues **Improvements:** * [#7154](RediSearch/RediSearch#7154) `FT.AGGREGATE` can return Background Indexing OOM warnings * [#7083](RediSearch/RediSearch#7083) Add the default text scorer as a configuration option * [#7341](RediSearch/RediSearch#7341) Rename `FT.PROFILE` counter fields * [#7436](RediSearch/RediSearch#7436) Enhance `FT.PROFILE` with vector search execution details * [#7435](RediSearch/RediSearch#7435) Ensure full `FT.PROFILE` output on timeout with RETURN policy * [#7534](RediSearch/RediSearch#7534) Reduce the number of worker threads asynchronously to avoid deadlocks during queries * [#7614](RediSearch/RediSearch#7614) Track timeout warnings and errors in INFO * [#7646](RediSearch/RediSearch#7646) Track `maxprefixexpansions` warnings and errors in INFO * [#7577](RediSearch/RediSearch#7577) Track query syntax/argument errors (basis for query error tracking) * [#7737](RediSearch/RediSearch#7737) Add `Internal cursor reads` metric to cluster `FT.PROFILE` output * [#7759](RediSearch/RediSearch#7759) Extend indexing metrics * [#7710](RediSearch/RediSearch#7710) Support `WITHCOUNT` keyword in `FT.AGGREGATE` * [#7957](RediSearch/RediSearch#7957) Persist query warnings across cursor reads * [#8054](RediSearch/RediSearch#8054) Add logging for index-related commands * [#8151](RediSearch/RediSearch#8151) Fix shard total profile time reporting in `FT.PROFILE` * [#8103](RediSearch/RediSearch#8103) Output current thread IndexSpec information on crash

Don't write to clients with CLIENT_CLOSE_ASAP flag

7f49374

oranagra closed this Oct 16, 2020

alonre24 mentioned this pull request Jan 26, 2026

Update RediSearch 8.2.9 #14744

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Don't write to clients with CLIENT_CLOSE_ASAP flag#7694

Don't write to clients with CLIENT_CLOSE_ASAP flag#7694
jeffreylovitz wants to merge 1 commit intoredis:unstablefrom
jeffreylovitz:write-to-closing-clients

jeffreylovitz commented Aug 21, 2020

Uh oh!

madolson commented Aug 22, 2020

Uh oh!

oranagra commented Aug 23, 2020

Uh oh!

oranagra commented Aug 23, 2020 •

edited

Loading

Uh oh!

yossigo commented Aug 23, 2020

Uh oh!

jeffreylovitz commented Aug 23, 2020

Uh oh!

yossigo commented Aug 23, 2020

Uh oh!

jeffreylovitz commented Aug 28, 2020

Uh oh!

madolson commented Aug 31, 2020

Uh oh!

oranagra commented Sep 1, 2020

Uh oh!

madolson commented Oct 16, 2020

Uh oh!

oranagra commented Oct 16, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

jeffreylovitz commented Aug 21, 2020

Uh oh!

madolson commented Aug 22, 2020

Uh oh!

oranagra commented Aug 23, 2020

Uh oh!

oranagra commented Aug 23, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

yossigo commented Aug 23, 2020

Uh oh!

jeffreylovitz commented Aug 23, 2020

Uh oh!

yossigo commented Aug 23, 2020

Uh oh!

jeffreylovitz commented Aug 28, 2020

Uh oh!

madolson commented Aug 31, 2020

Uh oh!

oranagra commented Sep 1, 2020

Uh oh!

madolson commented Oct 16, 2020

Uh oh!

oranagra commented Oct 16, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

oranagra commented Aug 23, 2020 •

edited

Loading