Fix abnormal server termination when http client goes away #20464
Merged
alesapin merged 2 commits into ClickHouse:master on Feb 15, 2021
In [1] stress tests found:
2021.02.12 14:20:58.800988 [ 17728 ] {de3e7894-b401-4f7d-8530-90cd5ab06682} <Debug> executeQuery: (from [::1]:45792, using production parser) (comment: /usr/share/clickhouse-test/queries/0_stateless/01520_client_print_query_id.expect) SELECT * FROM numbers(34599)
2021.02.12 14:20:58.916484 [ 17728 ] {de3e7894-b401-4f7d-8530-90cd5ab06682} <Trace> ContextAccess (default): Access granted: CREATE TEMPORARY TABLE ON *.*
2021.02.12 14:20:59.071980 [ 17728 ] {de3e7894-b401-4f7d-8530-90cd5ab06682} <Trace> InterpreterSelectQuery: FetchColumns -> Complete
2021.02.12 14:21:10.708202 [ 17728 ] {de3e7894-b401-4f7d-8530-90cd5ab06682} <Information> executeQuery: Read 34599 rows, 270.30 KiB in 11.876294055 sec., 2913 rows/sec., 22.76 KiB/sec.
2021.02.12 14:22:10.506261 [ 17728 ] {de3e7894-b401-4f7d-8530-90cd5ab06682} <Debug> DynamicQueryHandler: Done processing query
2021.02.12 14:22:18.238037 [ 375 ] {} <Fatal> BaseDaemon: (version 21.3.1.5996, build id: 8DBCED54529C989F7AD4D991F51410774D55DE6C) (from thread 17728) Terminate called for uncaught exception:
Code: 24, e.displayText() = DB::Exception: Cannot write to ostream at offset 262994, Stack trace (when copying this message, always include the lines below):
0. ./obj-x86_64-linux-gnu/../contrib/libcxx/include/exception:0: Poco::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0x15c976cb in /usr/bin/clickhouse
1. ./obj-x86_64-linux-gnu/../src/Common/Exception.cpp:56: DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0x8c9320e in /usr/bin/clickhouse
2. ./obj-x86_64-linux-gnu/../src/IO/WriteBufferFromOStream.cpp:0: DB::WriteBufferFromOStream::nextImpl() @ 0x8d54da5 in /usr/bin/clickhouse
3. ./obj-x86_64-linux-gnu/../src/IO/BufferBase.h:39: DB::WriteBufferFromOStream::~WriteBufferFromOStream() @ 0x8d551d7 in /usr/bin/clickhouse
4. ./obj-x86_64-linux-gnu/../src/IO/WriteBufferFromOStream.cpp:44: DB::Write
2021.02.12 14:22:18.811071 [ 18134 ] {} <Fatal> BaseDaemon: ########################################
2021.02.12 14:22:18.878935 [ 18134 ] {} <Fatal> BaseDaemon: (version 21.3.1.5996, build id: 8DBCED54529C989F7AD4D991F51410774D55DE6C) (from thread 17728) (query_id: de3e7894-b401-4f7d-8530-90cd5ab06682) Received signal Aborted (6)
2021.02.12 14:22:18.943148 [ 18134 ] {} <Fatal> BaseDaemon:
2021.02.12 14:22:19.007073 [ 18134 ] {} <Fatal> BaseDaemon: Stack trace: 0x7f109932018b 0x7f10992ff859 0x8bb33ae 0x8e301dd 0x17dac8c4 0x17dac7c7 0x8c3fe0b 0x8d552c5 0x8d552ea 0x11a29914 0x11a2a2ca 0x12f96092 0x12f8c65e 0x12f84300 0x15b84110 0x15bc0913 0x15bc103f 0x15d29a12 0x15d27fb0 0x15d267b8 0x8badbad 0x7f10994d5609 0x7f10993fc293
2021.02.12 14:22:19.255998 [ 18134 ] {} <Fatal> BaseDaemon: 5. raise @ 0x4618b in /usr/lib/x86_64-linux-gnu/libc-2.31.so
2021.02.12 14:22:19.270203 [ 18134 ] {} <Fatal> BaseDaemon: 6. abort @ 0x25859 in /usr/lib/x86_64-linux-gnu/libc-2.31.so
2021.02.12 14:22:50.108918 [ 370 ] {} <Fatal> Application: Child process was terminated by signal 6.
[1]: https://clickhouse-test-reports.s3.yandex.net/19580/6aecb62416ece880cbb8ee3a803e14d841388dde/stress_test_(thread).html#fail1
Verified locally by commenting out the out->next() call in
WriteBufferFromHTTPServerResponse::nextImpl(), adding a sleep(1), and
canceling the HTTP request before it finished. The stack trace is as follows:
[ 6351 ] {} <Fatal> BaseDaemon: (version 21.3.1.1, build id: 9B40466BF3D2F5AED78A52A995A4A2FD3116787C) (from thread 6677) Terminate called for uncaught exception:
Code: 24, e.displayText() = DB::Exception: Cannot write to ostream at offset 4, Stack trace (when copying this message, always include the lines below):
0. /src/ch/clickhouse/.cmake/../src/Common/StackTrace.cpp:298: StackTrace::tryCapture() @ 0x30a52a in /src/ch/clickhouse/.cmake/src/libclickhouse_common_iod.so
1. /src/ch/clickhouse/.cmake/../src/Common/StackTrace.cpp:260: StackTrace::StackTrace() @ 0x30a4e5 in /src/ch/clickhouse/.cmake/src/libclickhouse_common_iod.so
2. /src/ch/clickhouse/.cmake/../src/Common/Exception.cpp:53: DB::Exception::Exception(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int, bool) @ 0x2a61ae in /src/ch/clickhouse/.cmake/src/libclickhouse_common_iod.so
3. /src/ch/clickhouse/.cmake/../src/IO/WriteBufferFromOStream.cpp:22: DB::WriteBufferFromOStream::nextImpl() @ 0x3b468a in /src/ch/clickhouse/.cmake/src/libclickhouse_common_iod.so
4. /src/ch/clickhouse/.cmake/../src/IO/WriteBuffer.h:47
<snip>
[ 8966 ] {} <Fatal> BaseDaemon: 7. __cxxabiv1::__terminate(void (*)()) @ 0x1784ca in /src/ch/clickhouse/.cmake/contrib/replxx-cmake/libreplxxd.so
<snip>
[ 8966 ] {} <Fatal> BaseDaemon: 10. /src/ch/clickhouse/.cmake/../src/IO/WriteBufferFromOStream.cpp:0: DB::WriteBufferFromOStream::~WriteBufferFromOStream() @ 0x3b48c1 in /src/ch/clickhouse/.cmake/src/libclickhouse_common_iod.so
[ 8966 ] {} <Fatal> BaseDaemon: 11. /src/ch/clickhouse/.cmake/../src/IO/WriteBufferFromOStream.cpp:44: DB::WriteBufferFromOStream::~WriteBufferFromOStream() @ 0x3b48ec in /src/ch/clickhouse/.cmake/src/libclickhouse_common_iod.so
<snip>
[ 8966 ] {} <Fatal> BaseDaemon: 14. /src/ch/clickhouse/.cmake/../src/IO/WriteBufferFromHTTPServerResponse.cpp:218: DB::WriteBufferFromHTTPServerResponse::~WriteBufferFromHTTPServerResponse() @ 0x3b33cd in /src/ch/clickhouse/.cmake/src/libclickhouse_common_iod.so
<snip>
[ 8966 ] {} <Fatal> BaseDaemon: 22. /src/ch/clickhouse/.cmake/../src/Server/HTTPHandler.h:43: DB::HTTPHandler::Output::~Output() @ 0x260421 in /src/ch/clickhouse/.cmake/src/libclickhouse_serverd.so
[ 8966 ] {} <Fatal> BaseDaemon: 23. /src/ch/clickhouse/.cmake/../src/Server/HTTPHandler.cpp:778: DB::HTTPHandler::handleRequest(Poco::Net::HTTPServerRequest&, Poco::Net::HTTPServerResponse&) @ 0x253fd4 in /src/ch/clickhouse/.cmake/src/libclickhouse_serverd.so
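Both traces show the exception escaping from a destructor (~WriteBufferFromOStream), which is why the server terminates instead of reporting an error: destructors are noexcept by default, so an exception that leaves one calls std::terminate(). The following is a minimal sketch of that failure mode and of one common way to contain it. SketchSink, SketchWriteBuffer and destroyOverBrokenSink are hypothetical names made up for illustration; this is not the ClickHouse code and not necessarily the approach taken in this PR.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an output stream whose peer may disconnect.
struct SketchSink
{
    bool broken = false;
    std::string data;

    void write(const std::string & chunk)
    {
        if (broken)
            throw std::runtime_error("Cannot write to ostream");
        data += chunk;
    }
};

// Hypothetical buffered writer (an assumption, not WriteBufferFromOStream).
class SketchWriteBuffer
{
public:
    SketchWriteBuffer(SketchSink & sink_, bool & flush_failed_)
        : sink(sink_), flush_failed(flush_failed_) {}

    void append(const std::string & chunk) { pending += chunk; }

    // Flush pending bytes; throws if the sink is broken.
    void next()
    {
        if (pending.empty())
            return;
        sink.write(pending);
        pending.clear();
    }

    // An exception escaping from here would call std::terminate(),
    // so the final flush is guarded and the failure is only recorded.
    ~SketchWriteBuffer()
    {
        try
        {
            next();
        }
        catch (...)
        {
            flush_failed = true;
        }
    }

private:
    SketchSink & sink;
    bool & flush_failed;
    std::string pending;
};

// Destroys a buffer with unflushed data after the sink has broken.
// Returns true if the failed flush was recorded instead of propagated.
bool destroyOverBrokenSink()
{
    SketchSink sink;
    bool flush_failed = false;
    {
        SketchWriteBuffer buf(sink, flush_failed);
        buf.append("partial response");
        sink.broken = true; // the client goes away before the final flush
    }
    assert(sink.data.empty()); // nothing reached the broken sink
    return flush_failed;
}
```

Calling destroyOverBrokenSink() returns true and the process keeps running. Removing the try/catch from the destructor would turn the same scenario into a std::terminate() call, which matches the "Terminate called for uncaught exception" lines in the logs above.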
cd6ba0a to 33f54cd
alesapin approved these changes Feb 15, 2021
azat added a commit to azat/ClickHouse that referenced this pull request Feb 19, 2021
Even after ClickHouse#20464 it was still possible, for example [1].
2021.02.19 11:40:21.886191 [ 68373 ] {} <Trace> DynamicQueryHandler: Request URI: /?database=test_ds2d6y&log_comment=/usr/share/clickhouse-test/queries/0_stateless/01302_aggregate_state_exception_memory_leak.sh&enable_http_compression=1&http_zlib_compression_level=1
<snip>
2021.02.19 11:41:35.289940 [ 365 ] {} <Fatal> BaseDaemon: (version 21.3.1.6058, build id: 8D46D65205E2C8B7FE408A0B4EC76CA0483F9E92) (from thread 68373) Terminate called for uncaught exception:
Code: 24, e.displayText() = DB::Exception: Cannot write to ostream at offset 262568, Stack trace (when copying this message, always include the lines below):
0. ./obj-x86_64-linux-gnu/../contrib/libcxx/include/exception:0: Poco::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int) @ 0x15b3c7db in /usr/bin/clickhouse
1. ./obj-x86_64-linux-gnu/../src/Common/Exception.cpp:56: DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0x8aba66e in /usr/bin/clickhouse
2. ./obj-x86_64-linux-gnu/../src/IO/WriteBufferFromOStream.cpp:0: DB::WriteBufferFromOStream::nextImpl() @ 0x8b8c105 in /usr/bin/clickhouse
3. ./obj-x86_64-linux-gnu/../src/IO/BufferBase.h:39: DB::WriteBufferFromOStream::~WriteBufferFromOStream() @ 0x8b8c537 in /usr/bin/clickhouse
4. ./obj-x86_64-linux-gnu/../src/IO/WriteBufferFromOStream.cpp:44: DB::Write
[1]: https://clickhouse-test-reports.s3.yandex.net/16481/5d150cce4778dd14f58dcff67435bdec1efa155b/stress_test_(thread).html#fail1
According to this partial stack trace, it seems that the dtor of WriteBufferFromOStream was called from WriteBufferFromHTTPServerResponse, since the class name starts with DB::Write*.
The problem is that when WriteBufferFromOStream::next() fails for the first time, it resets the position so that the next write is a no-op. However, WriteBufferFromHTTPServerResponse::next() sets the position back to the available buffer, so next() throws again, and this time it can be from the dtor.
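The interaction described in this commit message can be modeled with two toy buffers. This is only a sketch under stated assumptions: InnerSketch and OuterSketch are hypothetical stand-ins for WriteBufferFromOStream and WriteBufferFromHTTPServerResponse, the position is reduced to a byte count, and the real classes are considerably more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical inner buffer: `position` is the number of unflushed bytes.
struct InnerSketch
{
    bool peer_gone = false;
    std::size_t position = 0;
    int failed_flushes = 0;

    void next()
    {
        if (position == 0)
            return; // nothing pending, so this is a no-op
        if (peer_gone)
        {
            ++failed_flushes;
            position = 0; // reset so that a later next() is a no-op
            throw std::runtime_error("Cannot write to ostream");
        }
        position = 0; // flushed successfully
    }
};

// Hypothetical outer buffer that forwards its own position to the inner one.
struct OuterSketch
{
    explicit OuterSketch(InnerSketch & inner_) : inner(inner_) {}

    InnerSketch & inner;
    std::size_t position = 0;

    void next()
    {
        inner.position = position; // undoes the reset done by the inner buffer
        inner.next();
        position = 0; // reached only if the inner flush succeeded
    }
};

// Returns how many times the inner flush failed across two outer flushes.
int countFailedFlushes()
{
    InnerSketch inner;
    OuterSketch outer(inner);
    outer.position = 8;
    inner.peer_gone = true;

    // First failure, thrown from regular code where it can be handled.
    try { outer.next(); } catch (const std::runtime_error &) {}

    // On its own, the inner buffer is now harmless.
    assert(inner.position == 0);
    inner.next(); // no-op, does not throw

    // The outer buffer re-arms the inner one, so the flush fails again.
    // In the real server this second call could come from a destructor.
    try { outer.next(); } catch (const std::runtime_error &) {}

    return inner.failed_flushes;
}
```

In this model countFailedFlushes() returns 2: the inner reset protects only direct calls to the inner buffer, while any flush routed through the outer buffer restores the pending position and throws a second time.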
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix abnormal server termination when http client goes away
Fixes: #19451 (please add the no-backport label, since it has not been included in any release yet)
For details, see commit descriptions