Skip to content

Fix signal-unsafe TSan report in client#34988

Merged
kitaisreal merged 2 commits intoClickHouse:masterfrom
azat:safe-exit
Mar 3, 2022
Merged

Fix signal-unsafe TSan report in client#34988
kitaisreal merged 2 commits intoClickHouse:masterfrom
azat:safe-exit

Conversation

@azat
Copy link
Copy Markdown
Member

@azat azat commented Mar 2, 2022

Changelog category (leave one):

  • Not for changelog (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix signal-unsafe TSan report in client

CI founds 1:

WARNING: ThreadSanitizer: signal-unsafe call inside of a signal (pid=2975)
    0 malloc  (clickhouse+0xab36d8d)
    1 _dl_exception_create_format  (ld-linux-x86-64.so.2+0x18ea8)
    2 DB::interruptSignalHandler(int) obj-x86_64-linux-gnu/../src/Client/ClientBase.cpp:236:9 (clickhouse+0x1a1d603e)
    3 __tsan::CallUserSignalHandler(__tsan::ThreadState*, bool, bool, bool, int, __sanitizer::__sanitizer_siginfo*, void*) crtstuff.c (clickhouse+0xab3ee5f)
    4 unsigned long std::__1::__cxx_atomic_load(std::__1::__cxx_atomic_base_impl const*, std::__1::memory_order) obj-x86_64-linux-gnu/../contrib/libcxx/include/atomic:1006:12 (clickhouse+0x1da6c41d)
    5 std::__1::__atomic_base::load(std::__1::memory_order) const obj-x86_64-linux-gnu/../contrib/libcxx/include/atomic:1615:17 (clickhouse+0x1da6c41d)
    6 cctz::TimeZoneInfo::MakeTime(cctz::detail::civil_time const&) const obj-x86_64-linux-gnu/../contrib/cctz/src/time_zone_info.cc:844:47 (clickhouse+0x1da6c41d)
    7 cctz::time_zone::Impl::MakeTime(cctz::detail::civil_time const&) const obj-x86_64-linux-gnu/../contrib/cctz/src/time_zone_impl.h:57:19 (clickhouse+0x1da64777)
    8 cctz::time_zone::lookup(cctz::detail::civil_time const&) const obj-x86_64-linux-gnu/../contrib/cctz/src/time_zone_lookup.cc:72:27 (clickhouse+0x1da64777)
    9 DateLUTImpl::DateLUTImpl(std::__1::basic_string, std::__1::allocator > const&) obj-x86_64-linux-gnu/../src/Common/DateLUTImpl.cpp:70:63 (clickhouse+0xac8910b)
    10 DateLUT::getImplementation(std::__1::basic_string, std::__1::allocator > const&) const obj-x86_64-linux-gnu/../src/Common/DateLUT.cpp:155:55 (clickhouse+0xac87404)
    11 DateLUT::DateLUT() obj-x86_64-linux-gnu/../src/Common/DateLUT.cpp:145:25 (clickhouse+0xac8697f)
    12 DateLUT::getInstance() obj-x86_64-linux-gnu/../src/Common/DateLUT.cpp:162:20 (clickhouse+0xac87530)
    13 DateLUT::instance(std::__1::basic_string, std::__1::allocator > const&) obj-x86_64-linux-gnu/../src/Common/DateLUT.h:29:33 (clickhouse+0x188481f9)
    14 TimezoneMixin::TimezoneMixin(std::__1::basic_string, std::__1::allocator > const&) obj-x86_64-linux-gnu/../src/DataTypes/TimezoneMixin.h:18:21 (clickhouse+0x188481f9)
    15 DB::DataTypeDateTime::DataTypeDateTime(std::__1::basic_string, std::__1::allocator > const&) obj-x86_64-linux-gnu/../src/DataTypes/DataTypeDateTime.cpp:11:7 (clickhouse+0x18847e55)
    16 DB::(anonymous namespace)::FunctionEmptyArray::getNameImpl() emptyArray.cpp (clickhouse+0x1602abb1)
    17 DB::registerFunctionsEmptyArray(DB::FunctionFactory&)  (clickhouse+0x16023b6f)
    18 DB::registerFunctionsArray(DB::FunctionFactory&)  (clickhouse+0x15de7b99)
    19 DB::registerFunctions()  (clickhouse+0xe4e7bc6)
    20 DB::Client::main(std::__1::vector, std::__1::allocator >, std::__1::allocator, std::__1::allocator > > > const&) obj-x86_64-linux-gnu/../programs/client/Client.cpp:402:5 (clickhouse+0xacb2649)
    21 Poco::Util::Application::run() obj-x86_64-linux-gnu/../contrib/poco/Util/src/Application.cpp:334:8 (clickhouse+0x1d96868a)
    22 mainEntryClickHouseClient(int, char**) obj-x86_64-linux-gnu/../programs/client/Client.cpp:1237:23 (clickhouse+0xacbdd7c)
    23 main obj-x86_64-linux-gnu/../programs/main.cpp:378:12 (clickhouse+0xabae77a)

Fixes: #34923
Cc: @kssenii

@robot-clickhouse robot-clickhouse added the pr-not-for-changelog This PR should not be mentioned in the changelog label Mar 2, 2022
@kitaisreal kitaisreal self-assigned this Mar 2, 2022
@azat
Copy link
Copy Markdown
Member Author

azat commented Mar 2, 2022

FWIW Actually CI should not show anything interesting, since this issue exists only if queries successfully terminated, and this is in a separate PR - #34924

@azat
Copy link
Copy Markdown
Member Author

azat commented Mar 2, 2022

Integration tests (asan, actions) [1/3] — fail: 2, passed: 660, flaky: 0

@azat
Copy link
Copy Markdown
Member Author

azat commented Mar 2, 2022

Stress test (thread, actions) — Sanitizer assert (in stderr.log)

WARNING: ThreadSanitizer: thread leak (pid=496)
  Thread T759 (tid=21905, finished) created by thread T321 at:
    #0 pthread_create <null> (clickhouse+0xab384dd)
    #1 std::__1::__libcpp_thread_create(unsigned long*, void* (*)(void*), void*) obj-x86_64-linux-gnu/../contrib/libcxx/include/__threading_support:509:10 (clickhouse+0x1d3c1620)
    #2 std::__1::thread::thread<std::__1::__bind<void (Hdfs::Internal::RpcClientImpl::*)(), Hdfs::Internal::RpcClientImpl*>, void>(std::__1::__bind<void (Hdfs::Internal::RpcClientImpl::*)(), Hdfs::Internal::RpcClientImpl*>&&) obj-x86_64-linux-gnu/../contrib/libcxx/include/thread:307:16 (clickhouse+0x1d3c1620)
    #3 Hdfs::Internal::RpcClientImpl::getChannel(Hdfs::Internal::RpcAuth const&, Hdfs::Internal::RpcProtocolInfo const&, Hdfs::Internal::RpcServerInfo const&, Hdfs::Internal::RpcConfig const&) obj-x86_64-linux-gnu/../contrib/libhdfs3/src/rpc/RpcClient.cpp:166:13 (clickhouse+0x1d3c0b34)
    #4 Hdfs::Internal::NamenodeImpl::invoke(Hdfs::Internal::RpcCall const&) obj-x86_64-linux-gnu/../contrib/libhdfs3/src/server/NamenodeImpl.cpp:56:35 (clickhouse+0x1d3e2387)
    #5 Hdfs::Internal::NamenodeImpl::getFsStats() obj-x86_64-linux-gnu/../contrib/libhdfs3/src/server/NamenodeImpl.cpp:521:9 (clickhouse+0x1d3e2387)
    #6 Hdfs::Internal::NamenodeProxy::getFsStats() obj-x86_64-linux-gnu/../contrib/libhdfs3/src/server/NamenodeProxy.cpp:416:22 (clickhouse+0x1d3d93e4)
    #7 Hdfs::Internal::FileSystemImpl::getFsStats() obj-x86_64-linux-gnu/../contrib/libhdfs3/src/client/FileSystemImpl.cpp:565:39 (clickhouse+0x1d37ebeb)
    #8 Hdfs::Internal::FileSystemImpl::connect() obj-x86_64-linux-gnu/../contrib/libhdfs3/src/client/FileSystemImpl.cpp:193:5 (clickhouse+0x1d37be97)
    #9 Hdfs::FileSystem::connect(char const*, char const*, char const*) obj-x86_64-linux-gnu/../contrib/libhdfs3/src/client/FileSystem.cpp:276:27 (clickhouse+0x1d374b26)
    #10 Hdfs::FileSystem::connect(char const*) obj-x86_64-linux-gnu/../contrib/libhdfs3/src/client/FileSystem.cpp:209:5 (clickhouse+0x1d374fe4)
    #11 hdfsBuilderConnect obj-x86_64-linux-gnu/../contrib/libhdfs3/src/client/Hdfs.cpp:493:17 (clickhouse+0x1d368338)
    #12 DB::createHDFSFS(hdfsBuilder*) obj-x86_64-linux-gnu/../src/Storages/HDFS/HDFSCommon.cpp:185:18 (clickhouse+0x183c2e42)
    #13 DB::HDFSSource::URISIterator::Impl::Impl() obj-x86_64-linux-gnu/../src/Storages/HDFS/StorageHDFS.cpp:250:24 (clickhouse+0x183bdfef)

SUMMARY: ThreadSanitizer: thread leak (/usr/bin/clickhouse+0xab384dd) in pthread_create

Missing defines.h, rebased.

azat added 2 commits March 2, 2022 22:17
v2: add missing defines.h header
Signed-off-by: Azat Khuzhin <[email protected]>
CI founds [1]:

    WARNING: ThreadSanitizer: signal-unsafe call inside of a signal (pid=2975)
        0 malloc  (clickhouse+0xab36d8d)
        1 _dl_exception_create_format  (ld-linux-x86-64.so.2+0x18ea8)
        2 DB::interruptSignalHandler(int) obj-x86_64-linux-gnu/../src/Client/ClientBase.cpp:236:9 (clickhouse+0x1a1d603e)
        3 __tsan::CallUserSignalHandler(__tsan::ThreadState*, bool, bool, bool, int, __sanitizer::__sanitizer_siginfo*, void*) crtstuff.c (clickhouse+0xab3ee5f)
        4 unsigned long std::__1::__cxx_atomic_load(std::__1::__cxx_atomic_base_impl const*, std::__1::memory_order) obj-x86_64-linux-gnu/../contrib/libcxx/include/atomic:1006:12 (clickhouse+0x1da6c41d)
        5 std::__1::__atomic_base::load(std::__1::memory_order) const obj-x86_64-linux-gnu/../contrib/libcxx/include/atomic:1615:17 (clickhouse+0x1da6c41d)
        6 cctz::TimeZoneInfo::MakeTime(cctz::detail::civil_time const&) const obj-x86_64-linux-gnu/../contrib/cctz/src/time_zone_info.cc:844:47 (clickhouse+0x1da6c41d)
        7 cctz::time_zone::Impl::MakeTime(cctz::detail::civil_time const&) const obj-x86_64-linux-gnu/../contrib/cctz/src/time_zone_impl.h:57:19 (clickhouse+0x1da64777)
        8 cctz::time_zone::lookup(cctz::detail::civil_time const&) const obj-x86_64-linux-gnu/../contrib/cctz/src/time_zone_lookup.cc:72:27 (clickhouse+0x1da64777)
        9 DateLUTImpl::DateLUTImpl(std::__1::basic_string, std::__1::allocator > const&) obj-x86_64-linux-gnu/../src/Common/DateLUTImpl.cpp:70:63 (clickhouse+0xac8910b)
        10 DateLUT::getImplementation(std::__1::basic_string, std::__1::allocator > const&) const obj-x86_64-linux-gnu/../src/Common/DateLUT.cpp:155:55 (clickhouse+0xac87404)
        11 DateLUT::DateLUT() obj-x86_64-linux-gnu/../src/Common/DateLUT.cpp:145:25 (clickhouse+0xac8697f)
        12 DateLUT::getInstance() obj-x86_64-linux-gnu/../src/Common/DateLUT.cpp:162:20 (clickhouse+0xac87530)
        13 DateLUT::instance(std::__1::basic_string, std::__1::allocator > const&) obj-x86_64-linux-gnu/../src/Common/DateLUT.h:29:33 (clickhouse+0x188481f9)
        14 TimezoneMixin::TimezoneMixin(std::__1::basic_string, std::__1::allocator > const&) obj-x86_64-linux-gnu/../src/DataTypes/TimezoneMixin.h:18:21 (clickhouse+0x188481f9)
        15 DB::DataTypeDateTime::DataTypeDateTime(std::__1::basic_string, std::__1::allocator > const&) obj-x86_64-linux-gnu/../src/DataTypes/DataTypeDateTime.cpp:11:7 (clickhouse+0x18847e55)
        16 DB::(anonymous namespace)::FunctionEmptyArray::getNameImpl() emptyArray.cpp (clickhouse+0x1602abb1)
        17 DB::registerFunctionsEmptyArray(DB::FunctionFactory&)  (clickhouse+0x16023b6f)
        18 DB::registerFunctionsArray(DB::FunctionFactory&)  (clickhouse+0x15de7b99)
        19 DB::registerFunctions()  (clickhouse+0xe4e7bc6)
        20 DB::Client::main(std::__1::vector, std::__1::allocator >, std::__1::allocator, std::__1::allocator > > > const&) obj-x86_64-linux-gnu/../programs/client/Client.cpp:402:5 (clickhouse+0xacb2649)
        21 Poco::Util::Application::run() obj-x86_64-linux-gnu/../contrib/poco/Util/src/Application.cpp:334:8 (clickhouse+0x1d96868a)
        22 mainEntryClickHouseClient(int, char**) obj-x86_64-linux-gnu/../programs/client/Client.cpp:1237:23 (clickhouse+0xacbdd7c)
        23 main obj-x86_64-linux-gnu/../programs/main.cpp:378:12 (clickhouse+0xabae77a)

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/34924/66cbb468eb6990b74ef4d08beefbe48d32bf4bd5/stateless_tests__thread__actions__[1/3].html

Fixes: ClickHouse#34923
Signed-off-by: Azat Khuzhin <[email protected]>
@kitaisreal kitaisreal merged commit 7d90afb into ClickHouse:master Mar 3, 2022
@azat azat deleted the safe-exit branch March 5, 2022 07:09
azat added a commit to azat/ClickHouse that referenced this pull request Nov 7, 2022
Because safeExit() does not includes header with defines, it does not
know about THREAD_SANITIZER.

And it also fixes Azure blob storage, actually everything is fine with
the sdk itself, the problem is only in TSan that intercepts _exit() and
report leak, even thoug that tread will be joined later.

Refs: ClickHouse#23056 (ClickHouse#23616)
Fixes: ClickHouse#38474
Closes: ClickHouse#42640
Fixes: ClickHouse#42638
Fixes: ClickHouse#34988
Cc: @alexey-milovidov, @tavplubix

Signed-off-by: Azat Khuzhin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-not-for-changelog This PR should not be mentioned in the changelog

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants