Add support for DWARF-5 (with clang-15) #41063

Closed
azat wants to merge 35 commits into ClickHouse:master from azat:DWARF-5-v2-clang15

azat commented Sep 6, 2022

Changelog category (leave one):

  • Not for changelog (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Add support for DWARF-5 (with clang-15)

Refs: #40772
Refs: #41046

Note: this was submitted separately because I don't want to mix docker images. Yes, if I revert the merge, it should detect the change and rebuild the docker images; however, I may just reset to clang-14, and then it will not detect this.

azat added 8 commits August 29, 2022 20:30
Signed-off-by: Azat Khuzhin <[email protected]>
(cherry picked from commit 444acb9)
I have to do this because I have libraries compiled with DWARF-5 (e.g. glibc).

ClickHouse changes:
- use camel_case
- add NOLINT
- avoid using folly:: (use std:: instead)
- avoid using boost:: (use std:: instead)

Refs: facebook/folly@490b287
Signed-off-by: Azat Khuzhin <[email protected]>
(cherry picked from commit ee5696b)
(cherry picked from commit e03870b)
robot-ch-test-poll added the pr-not-for-changelog and submodule changed labels Sep 6, 2022
azat and others added 20 commits September 8, 2022 11:53
It was released a few hours ago, and I want to check how clang-15
generates DWARF-5

Signed-off-by: Azat Khuzhin <[email protected]>
Build error [1]:

    Sep 06 18:40:53 FAILED: contrib/jemalloc-cmake/CMakeFiles/_jemalloc.dir/__/jemalloc/src/malloc_io.c.o
    Sep 06 18:40:53 /usr/bin/ccache /usr/bin/clang-15 --target=x86_64-linux-musl --sysroot=/build/cmake/linux/../../contrib/sysroot/linux-x86_64-musl -DHAS_RESERVED_IDENTIFIER -DJEMALLOC_NO_PRIVATE_NAMESPACE -DJEMALLOC_PROF=1 -DJEMALLOC_PROF_LIBGCC=1 -DSTD_EXCEPTION_HAS_STACK_TRACE=1 -DUSE_MUSL=1 -D_LIBCPP_ENABLE_THREAD_SAFETY_ANNOTATIONS -D_LIBCPP_HAS_MUSL_LIBC=1 -I../contrib/jemalloc/include -isystem ../contrib/jemalloc-cmake/include -isystem contrib/jemalloc-cmake/include_linux_x86_64_musl/jemalloc/internal -isystem ../contrib/libcxx/include -isystem ../contrib/libcxxabi/include -isystem ../contrib/libunwind/include --gcc-toolchain=/build/cmake/linux/../../contrib/sysroot/linux-x86_64-musl --gcc-toolchain=/build/cmake/linux/../../contrib/sysroot/linux-x86_64-musl -fdiagnostics-color=always -Xclang -fuse-ctor-homing  -gdwarf-aranges -pipe -mssse3 -msse4.1 -msse4.2 -mpclmul -mpopcnt -fasynchronous-unwind-tables -ffile-prefix-map=/build=. -falign-functions=32 -mbranches-within-32B-boundaries  -fdiagnostics-absolute-paths -fexperimental-new-pass-manager -w -O2 -g -DNDEBUG -O3 -g -gdwarf-4  -flto=thin -fwhole-program-vtables -fno-pie   -D OS_LINUX -D_GNU_SOURCE -Werror -std=gnu11 -MD -MT contrib/jemalloc-cmake/CMakeFiles/_jemalloc.dir/__/jemalloc/src/malloc_io.c.o -MF contrib/jemalloc-cmake/CMakeFiles/_jemalloc.dir/__/jemalloc/src/malloc_io.c.o.d -o contrib/jemalloc-cmake/CMakeFiles/_jemalloc.dir/__/jemalloc/src/malloc_io.c.o   -c ../contrib/jemalloc/src/malloc_io.c
    Sep 06 18:40:53 /build/contrib/jemalloc/src/malloc_io.c:100:8: error: incompatible integer to pointer conversion initializing 'char *' with an expression of type 'int' [-Wint-conversion]
    Sep 06 18:40:53         char *b = strerror_r(err, buf, buflen);
    Sep 06 18:40:53               ^   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    Sep 06 18:40:53 1 error generated.

  [1]: https://s3.amazonaws.com/clickhouse-builds/41046/0e9265ad951d40cdce3716fb8a679360b2e0c156/package_release/build_log.log

Signed-off-by: Azat Khuzhin <[email protected]>
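
The build error above is the classic strerror_r mismatch: the code expects the GNU variant, which returns char *, while the musl sysroot presumably declares only the XSI variant, which returns int, and clang-15 treats -Wint-conversion as an error by default. The following is a minimal C++ sketch of code that works with either declaration. It is not the change made in this PR (jemalloc is C, where overloads are unavailable), and pickMessage and describeErrno are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helpers; only strerror_r itself comes from libc.

// Selected when strerror_r is the XSI variant (int return, as in musl):
// the message has been written into buf and the int is a status code.
inline const char * pickMessage(int status, const char * buf)
{
    return status == 0 ? buf : "Unknown error";
}

// Selected when strerror_r is the GNU variant (char * return, as in glibc
// with _GNU_SOURCE): the returned pointer may or may not point into buf.
inline const char * pickMessage(const char * message, const char * /*buf*/)
{
    return message;
}

// Returns the text for an errno value regardless of which variant is declared;
// overload resolution picks the matching helper at compile time.
inline std::string describeErrno(int err)
{
    char buf[256] = "";
    return pickMessage(strerror_r(err, buf, sizeof(buf)), buf);
}
```

Because the choice is made by the declared return type, no configure-time macro has to agree with the libc that is actually used.
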
- ASan - not enough memory
- Darwin/FreeBSD/Tidy - fails

Signed-off-by: Azat Khuzhin <[email protected]>
Signed-off-by: Azat Khuzhin <[email protected]>
Fixes the following use-of-uninitialized-value in llvm [1]:

    ==696==WARNING: MemorySanitizer: use-of-uninitialized-value
        0 0x498141d9 in llvm::ilist_traits<llvm::MachineInstr>::removeNodeFromList(llvm::MachineInstr*) build_docker/../contrib/llvm/llvm/lib/CodeGen/MachineBasicBlock.cpp:155:24
        1 0x498141d9 in llvm::iplist_impl<llvm::simple_ilist<llvm::MachineInstr, llvm::ilist_sentinel_tracking<true>>, llvm::ilist_traits<llvm::MachineInstr>>::remove(llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::MachineInstr, true, true, void>, false, false>&) build_docker/../contrib/llvm/llvm/include/llvm/ADT/ilist.h:253:11
        2 0x498141d9 in llvm::iplist_impl<llvm::simple_ilist<llvm::MachineInstr, llvm::ilist_sentinel_tracking<true>>, llvm::ilist_traits<llvm::MachineInstr>>::erase(llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::MachineInstr, true, true, void>, false, false>) build_docker/../contrib/llvm/llvm/include/llvm/ADT/ilist.h:268:22
        3 0x498141d9 in llvm::iplist_impl<llvm::simple_ilist<llvm::MachineInstr, llvm::ilist_sentinel_tracking<true>>, llvm::ilist_traits<llvm::MachineInstr>>::erase(llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::MachineInstr, true, true, void>, false, false>, llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::MachineInstr, true, true, void>, false, false>) build_docker/../contrib/llvm/llvm/include/llvm/ADT/ilist.h:305:15
        4 0x498141d9 in llvm::iplist_impl<llvm::simple_ilist<llvm::MachineInstr, llvm::ilist_sentinel_tracking<true>>, llvm::ilist_traits<llvm::MachineInstr>>::clear() build_docker/../contrib/llvm/llvm/include/llvm/ADT/ilist.h:309:18
        5 0x498141d9 in llvm::iplist_impl<llvm::simple_ilist<llvm::MachineInstr, llvm::ilist_sentinel_tracking<true>>, llvm::ilist_traits<llvm::MachineInstr>>::~iplist_impl() build_docker/../contrib/llvm/llvm/include/llvm/ADT/ilist.h:210:20
        6 0x498145ff in llvm::MachineBasicBlock::~MachineBasicBlock() build_docker/../contrib/llvm/llvm/lib/CodeGen/MachineBasicBlock.cpp:56:1
        7 0x49969c1f in llvm::MachineFunction::DeleteMachineBasicBlock(llvm::MachineBasicBlock*) build_docker/../contrib/llvm/llvm/lib/CodeGen/MachineFunction.cpp:426:8
        8 0x49969c1f in llvm::ilist_alloc_traits<llvm::MachineBasicBlock>::deleteNode(llvm::MachineBasicBlock*) build_docker/../contrib/llvm/llvm/lib/CodeGen/MachineFunction.cpp:127:21
        9 0x4983f4d6 in llvm::iplist_impl<llvm::simple_ilist<llvm::MachineBasicBlock>, llvm::ilist_traits<llvm::MachineBasicBlock>>::erase(llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::MachineBasicBlock, false, false, void>, false, false>) build_docker/../contrib/llvm/llvm/include/llvm/ADT/ilist.h:268:11
        10 0x4983f4d6 in llvm::iplist_impl<llvm::simple_ilist<llvm::MachineBasicBlock>, llvm::ilist_traits<llvm::MachineBasicBlock>>::erase(llvm::MachineBasicBlock*) build_docker/../contrib/llvm/llvm/include/llvm/ADT/ilist.h:272:39
        11 0x4983f4d6 in llvm::MachineFunction::erase(llvm::MachineBasicBlock*) build_docker/../contrib/llvm/llvm/include/llvm/CodeGen/MachineFunction.h:767:53
        12 0x4983f4d6 in llvm::MachineBasicBlock::eraseFromParent() build_docker/../contrib/llvm/llvm/lib/CodeGen/MachineBasicBlock.cpp:1317:16
        13 0x4a0c9a27 in llvm::TailDuplicator::removeDeadBlock(llvm::MachineBasicBlock*, llvm::function_ref<void (llvm::MachineBasicBlock*)>*) build_docker/../contrib/llvm/llvm/lib/CodeGen/TailDuplicator.cpp:1051:8
        14 0x4a0c1e41 in llvm::TailDuplicator::tailDuplicateAndUpdate(bool, llvm::MachineBasicBlock*, llvm::MachineBasicBlock*, llvm::SmallVectorImpl<llvm::MachineBasicBlock*>*, llvm::function_ref<void (llvm::MachineBasicBlock*)>*, llvm::SmallVectorImpl<llvm::MachineBasicBlock*>*) build_docker/../contrib/llvm/llvm/lib/CodeGen/TailDuplicator.cpp:189:5
        15 0x4a0ca16e in llvm::TailDuplicator::tailDuplicateBlocks() build_docker/../contrib/llvm/llvm/lib/CodeGen/TailDuplicator.cpp:288:19
        16 0x4a0be9f9 in (anonymous namespace)::TailDuplicateBase::runOnMachineFunction(llvm::MachineFunction&) build_docker/../contrib/llvm/llvm/lib/CodeGen/TailDuplication.cpp:98:21
        17 0x499a2777 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) build_docker/../contrib/llvm/llvm/lib/CodeGen/MachineFunctionPass.cpp:72:13
        18 0x4dbba34d in llvm::FPPassManager::runOnFunction(llvm::Function&) build_docker/../contrib/llvm/llvm/lib/IR/LegacyPassManager.cpp:1435:27
        19 0x4dbe3761 in llvm::FPPassManager::runOnModule(llvm::Module&) build_docker/../contrib/llvm/llvm/lib/IR/LegacyPassManager.cpp:1481:16
        20 0x4dbbebbb in (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) build_docker/../contrib/llvm/llvm/lib/IR/LegacyPassManager.cpp:1550:27
        21 0x4dbbebbb in llvm::legacy::PassManagerImpl::run(llvm::Module&) build_docker/../contrib/llvm/llvm/lib/IR/LegacyPassManager.cpp:541:44
        22 0x4dbe454b in llvm::legacy::PassManager::run(llvm::Module&) build_docker/../contrib/llvm/llvm/lib/IR/LegacyPassManager.cpp:1677:14
        23 0x414405df in DB::JITCompiler::compile(llvm::Module&) build_docker/../src/Interpreters/JIT/CHJIT.cpp:78:22
        24 0x4143bb7d in DB::CHJIT::compileModule(std::__1::unique_ptr<llvm::Module, std::__1::default_delete<llvm::Module>>) build_docker/../src/Interpreters/JIT/CHJIT.cpp:378:29
        25 0x4143aded in DB::CHJIT::compileModule(std::__1::function<void (llvm::Module&)>) build_docker/../src/Interpreters/JIT/CHJIT.cpp:359:24
        26 0x4147b25e in DB::compileAggregateFunctions(DB::CHJIT&, std::__1::vector<DB::AggregateFunctionWithOffset, std::__1::allocator<DB::AggregateFunctionWithOffset>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>) build_docker/../src/Interpreters/JIT/compileFunction.cpp:738:32
        27 0x3de0a23a in DB::Aggregator::compileAggregateFunctionsIfNeeded()::$_0::operator()() const build_docker/../src/Interpreters/Aggregator.cpp:661:49
        28 0x3de0a23a in std::__1::pair<std::__1::shared_ptr<DB::CompiledExpressionCacheEntry>, bool> DB::CacheBase<wide::integer<128ul, unsigned int>, DB::CompiledExpressionCacheEntry, UInt128Hash, DB::CompiledFunctionWeightFunction>::getOrSet<DB::Aggregator::compileAggregateFunctionsIfNeeded()::$_0>(wide::integer<128ul, unsigned int> const&, DB::Aggregator::compileAggregateFunctionsIfNeeded()::$_0&&) build_docker/../src/Common/CacheBase.h:125:24
        29 0x3de0a23a in DB::Aggregator::compileAggregateFunctionsIfNeeded() build_docker/../src/Interpreters/Aggregator.cpp:657:70

      Memory was marked as uninitialized
        0 0xb988ded in __sanitizer_dtor_callback (/usr/bin/clickhouse+0xb988ded) (BuildId: c4a880b742797a1f37bc4f5ed869f055cc86486b)
        1 0x498145da in llvm::MachineBasicBlock::~MachineBasicBlock() build_docker/../contrib/llvm/llvm/lib/CodeGen/MachineBasicBlock.cpp:56:1
        2 0x49969c1f in llvm::MachineFunction::DeleteMachineBasicBlock(llvm::MachineBasicBlock*) build_docker/../contrib/llvm/llvm/lib/CodeGen/MachineFunction.cpp:426:8
        3 0x49969c1f in llvm::ilist_alloc_traits<llvm::MachineBasicBlock>::deleteNode(llvm::MachineBasicBlock*) build_docker/../contrib/llvm/llvm/lib/CodeGen/MachineFunction.cpp:127:21
        4 0x4983f4d6 in llvm::iplist_impl<llvm::simple_ilist<llvm::MachineBasicBlock>, llvm::ilist_traits<llvm::MachineBasicBlock>>::erase(llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::MachineBasicBlock, false, false, void>, false, false>) build_docker/../contrib/llvm/llvm/include/llvm/ADT/ilist.h:268:11
        5 0x4983f4d6 in llvm::iplist_impl<llvm::simple_ilist<llvm::MachineBasicBlock>, llvm::ilist_traits<llvm::MachineBasicBlock>>::erase(llvm::MachineBasicBlock*) build_docker/../contrib/llvm/llvm/include/llvm/ADT/ilist.h:272:39
        6 0x4983f4d6 in llvm::MachineFunction::erase(llvm::MachineBasicBlock*) build_docker/../contrib/llvm/llvm/include/llvm/CodeGen/MachineFunction.h:767:53
        7 0x4983f4d6 in llvm::MachineBasicBlock::eraseFromParent() build_docker/../contrib/llvm/llvm/lib/CodeGen/MachineBasicBlock.cpp:1317:16
        8 0x4a0c9a27 in llvm::TailDuplicator::removeDeadBlock(llvm::MachineBasicBlock*, llvm::function_ref<void (llvm::MachineBasicBlock*)>*) build_docker/../contrib/llvm/llvm/lib/CodeGen/TailDuplicator.cpp:1051:8
        9 0x4a0c1e41 in llvm::TailDuplicator::tailDuplicateAndUpdate(bool, llvm::MachineBasicBlock*, llvm::MachineBasicBlock*, llvm::SmallVectorImpl<llvm::MachineBasicBlock*>*, llvm::function_ref<void (llvm::MachineBasicBlock*)>*, llvm::SmallVectorImpl<llvm::MachineBasicBlock*>*) build_docker/../contrib/llvm/llvm/lib/CodeGen/TailDuplicator.cpp:189:5
        10 0x4a0ca16e in llvm::TailDuplicator::tailDuplicateBlocks() build_docker/../contrib/llvm/llvm/lib/CodeGen/TailDuplicator.cpp:288:19
        11 0x4a0be9f9 in (anonymous namespace)::TailDuplicateBase::runOnMachineFunction(llvm::MachineFunction&) build_docker/../contrib/llvm/llvm/lib/CodeGen/TailDuplication.cpp:98:21
        12 0x499a2777 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) build_docker/../contrib/llvm/llvm/lib/CodeGen/MachineFunctionPass.cpp:72:13
        13 0x4dbba34d in llvm::FPPassManager::runOnFunction(llvm::Function&) build_docker/../contrib/llvm/llvm/lib/IR/LegacyPassManager.cpp:1435:27
        14 0x4dbe3761 in llvm::FPPassManager::runOnModule(llvm::Module&) build_docker/../contrib/llvm/llvm/lib/IR/LegacyPassManager.cpp:1481:16
        15 0x4dbbebbb in (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) build_docker/../contrib/llvm/llvm/lib/IR/LegacyPassManager.cpp:1550:27
        16 0x4dbbebbb in llvm::legacy::PassManagerImpl::run(llvm::Module&) build_docker/../contrib/llvm/llvm/lib/IR/LegacyPassManager.cpp:541:44
        17 0x4dbe454b in llvm::legacy::PassManager::run(llvm::Module&) build_docker/../contrib/llvm/llvm/lib/IR/LegacyPassManager.cpp:1677:14
        18 0x414405df in DB::JITCompiler::compile(llvm::Module&) build_docker/../src/Interpreters/JIT/CHJIT.cpp:78:22
        19 0x4143bb7d in DB::CHJIT::compileModule(std::__1::unique_ptr<llvm::Module, std::__1::default_delete<llvm::Module>>) build_docker/../src/Interpreters/JIT/CHJIT.cpp:378:29
        20 0x4143aded in DB::CHJIT::compileModule(std::__1::function<void (llvm::Module&)>) build_docker/../src/Interpreters/JIT/CHJIT.cpp:359:24
        21 0x4147b25e in DB::compileAggregateFunctions(DB::CHJIT&, std::__1::vector<DB::AggregateFunctionWithOffset, std::__1::allocator<DB::AggregateFunctionWithOffset>> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>) build_docker/../src/Interpreters/JIT/compileFunction.cpp:738:32
        22 0x3de0a23a in DB::Aggregator::compileAggregateFunctionsIfNeeded()::$_0::operator()() const build_docker/../src/Interpreters/Aggregator.cpp:661:49
        23 0x3de0a23a in std::__1::pair<std::__1::shared_ptr<DB::CompiledExpressionCacheEntry>, bool> DB::CacheBase<wide::integer<128ul, unsigned int>, DB::CompiledExpressionCacheEntry, UInt128Hash, DB::CompiledFunctionWeightFunction>::getOrSet<DB::Aggregator::compileAggregateFunctionsIfNeeded()::$_0>(wide::integer<128ul, unsigned int> const&, DB::Aggregator::compileAggregateFunctionsIfNeeded()::$_0&&) build_docker/../src/Common/CacheBase.h:125:24
        24 0x3de0a23a in DB::Aggregator::compileAggregateFunctionsIfNeeded() build_docker/../src/Interpreters/Aggregator.cpp:657:70

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/41046/490a2c75610c4bc3191d55226f8454b3c3d3919a/stateful_tests__msan_.html

Note that this is safe to do, but only for this method, since it had been
disabled anyway back in ClickHouse#27574, and I guess this MSan report may be
related.
This should address the issue with ASan:

- CI report - https://s3.amazonaws.com/clickhouse-test-reports/41046/490a2c75610c4bc3191d55226f8454b3c3d3919a/stateless_tests__asan__[1/2].html

    2022-09-08 19:39:40 kj/exception.c++:977: failed: expected offset < 65536 && offset > -65536; ExceptionCallback must be allocated on the stack.

- Discussion in ML - https://www.mail-archive.com/[email protected]/msg01451.html

- Fix capnproto/capnproto@c4eef80a13e8575d

    "Fix ASAN problems under Clang 15.

    It appears ASAN now by default tries to detect stack-use-after-return. This breaks our assumptions in requireOnStack() and totally breaks fibers.

    For requireOnStack() we can just skip the check in this case.

    For fibers, we need to implement the ASAN hints to tell it when we're switching fibers."

Signed-off-by: Azat Khuzhin <[email protected]>
azat added 2 commits September 9, 2022 15:37
Since it also does not fit into the timeouts [1]:

    2022-09-08 21:16:20 [ 377 ] DEBUG : Command:['docker', 'exec', '-u', 'root', 'roottestsendcrashreports_node_1', 'bash', '-c', 'pkill -SEGV clickhouse'] (cluster.py:95, run_and_check)
    ...
    2022-09-08 21:16:22 [ 377 ] DEBUG : run container_id:roottestsendcrashreports_node_1 detach:False nothrow:False cmd: ['cat', '/result.txt'] (cluster.py:1744, exec_in_container)
    ...
    2022-09-08 21:16:36 [ 377 ] DEBUG : Stdout:INITIAL_STATE (cluster.py:103, run_and_check)

And server logs:

    2022.09.08 21:16:21.112076 [ 228 ] {} <Fatal> BaseDaemon: ########################################
    2022.09.08 21:16:21.112170 [ 228 ] {} <Fatal> BaseDaemon: (version 22.9.1.1, build id: 0F7336E0A4D64134C51C8365DADCB78A9B39AA3B) (from thread 1) (no query) Received signal Segmentation fault (11)
    2022.09.08 21:16:21.112244 [ 228 ] {} <Fatal> BaseDaemon: Address: 0xde Access: read. Unknown si_code.
    2022.09.08 21:16:21.112321 [ 228 ] {} <Fatal> BaseDaemon: Stack trace: 0x7fbe21d09376 0x40a4f71a 0xe293a4b 0xdc5c51a 0x38dee227 0xdc326a0 0x38e319d9 0xdc2b4e2 0xdc25fdb 0x7fbe21b2c083 0xdb636ae
    2022.09.08 21:16:21.112419 [ 228 ] {} <Fatal> BaseDaemon: 3. pthread_cond_wait @ 0x7fbe21d09376 in ?
    2022.09.08 21:16:21.122914 [ 228 ] {} <Fatal> BaseDaemon: 4. ./build_docker/../contrib/libcxx/src/condition_variable.cpp:0: std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&) @ 0x40a4f71a in /usr/bin/clickhouse
    2022.09.08 21:16:21.233016 [ 228 ] {} <Fatal> BaseDaemon: 5.1. inlined from ./build_docker/../contrib/libcxx/include/atomic:952: unsigned long std::__1::__cxx_atomic_load<unsigned long>(std::__1::__cxx_atomic_base_impl<unsigned long> const*, std::__1::memory_order)
    2022.09.08 21:16:21.233135 [ 228 ] {} <Fatal> BaseDaemon: 5.2. inlined from ../contrib/libcxx/include/atomic:1582: std::__1::__atomic_base<unsigned long, false>::load(std::__1::memory_order) const
    2022.09.08 21:16:21.233183 [ 228 ] {} <Fatal> BaseDaemon: 5.3. inlined from ../contrib/libcxx/include/atomic:1586: std::__1::__atomic_base<unsigned long, false>::operator unsigned long() const
    2022.09.08 21:16:21.233234 [ 228 ] {} <Fatal> BaseDaemon: 5.4. inlined from ../src/Daemon/BaseDaemon.cpp:967: operator()
    2022.09.08 21:16:21.233303 [ 228 ] {} <Fatal> BaseDaemon: 5.5. inlined from ../contrib/libcxx/include/__mutex_base:402: void std::__1::condition_variable::wait<BaseDaemon::waitForTerminationRequest()::$_0>(std::__1::unique_lock<std::__1::mutex>&, BaseDaemon::waitForTerminationRequest()::$_0)
    2022.09.08 21:16:21.233334 [ 228 ] {} <Fatal> BaseDaemon: 5. ../src/Daemon/BaseDaemon.cpp:967: BaseDaemon::waitForTerminationRequest() @ 0xe293a4b in /usr/bin/clickhouse
    2022.09.08 21:16:21.350675 [ 228 ] {} <Fatal> BaseDaemon: 6. ./build_docker/../programs/server/Server.cpp:0: DB::Server::main(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) @ 0xdc5c51a in /usr/bin/clickhouse
    2022.09.08 21:16:21.394092 [ 228 ] {} <Fatal> BaseDaemon: 7. ./build_docker/../contrib/poco/Util/src/Application.cpp:0: Poco::Util::Application::run() @ 0x38dee227 in /usr/bin/clickhouse
    2022.09.08 21:16:21.654195 [ 228 ] {} <Fatal> BaseDaemon: 8. ./build_docker/../programs/server/Server.cpp:466: DB::Server::run() @ 0xdc326a0 in /usr/bin/clickhouse
    2022.09.08 21:16:21.666991 [ 228 ] {} <Fatal> BaseDaemon: 9. ./build_docker/../contrib/poco/Util/src/ServerApplication.cpp:0: Poco::Util::ServerApplication::run(int, char**) @ 0x38e319d9 in /usr/bin/clickhouse
    2022.09.08 21:16:21.916078 [ 228 ] {} <Fatal> BaseDaemon: 10. ./build_docker/../programs/server/Server.cpp:0: mainEntryClickHouseServer(int, char**) @ 0xdc2b4e2 in /usr/bin/clickhouse
    2022.09.08 21:16:21.929922 [ 228 ] {} <Fatal> BaseDaemon: 11. ./build_docker/../programs/main.cpp:0: main @ 0xdc25fdb in /usr/bin/clickhouse
    2022.09.08 21:16:21.929981 [ 228 ] {} <Fatal> BaseDaemon: 12. __libc_start_main @ 0x7fbe21b2c083 in ?
    2022.09.08 21:16:30.357032 [ 228 ] {} <Fatal> BaseDaemon: 13. _start @ 0xdb636ae in /usr/bin/clickhouse
    2022.09.08 21:16:31.383233 [ 228 ] {} <Fatal> BaseDaemon: Integrity check of the executable skipped because the reference checksum could not be read. (calculated checksum: 6200AC7C1270DC293DF3302E1C64399B)
    ...
    2022.09.08 21:16:40.564453 [ 228 ] {} <Information> SentryWriter: Sending crash report

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/41046/a0b85eaca8d4003c9fbc4571b30830d30f1984e9/integration_tests__asan__[3/3].html

Though another option is to increase the waiting time.

Signed-off-by: Azat Khuzhin <[email protected]>

azat commented Sep 9, 2022

azat force-pushed the DWARF-5-v2-clang15 branch from d5db2bc to 26b38a8 September 9, 2022 18:12
azat mentioned this pull request Sep 11, 2022

azat commented Sep 11, 2022

This does not help, but I will try one more time to see how cores are dumped without catching SIGTRAP/SIGABRT explicitly in gdb.

azat commented Sep 11, 2022

Stress test (msan) — Killed by signal (in clickhouse-server.log)

One interesting stack trace here is:

(lldb) bt 20
* thread #1, name = 'clickhouse-serv', stop reason = signal SIGTRAP
  * frame #0: 0x00007f494705e09f ld-linux-x86-64.so.2`dl_open_worker(a=0x00007f49470769e8) at dl-open.c:744:6
    frame #1: 0x00007f4946f7a928 libc.so.6`__GI__dl_catch_exception(exception=<unavailable>, operate=<unavailable>, args=<unavailable>) at dl-error-skeleton.c:208:8
    frame #2: 0x00007f494705d60a ld-linux-x86-64.so.2`_dl_open(file="", mode=-2147483646, caller_dlopen=<unavailable>, nsid=-2, argc=6, argv=0x00007ffcc986d4b8, env=0x00007ffcc986d4f0) at dl-open.c:837:17
    frame #3: 0x00007f4946f798c1 libc.so.6`do_dlopen(ptr=0x00007f47641693d0) at dl-libc.c:96:15
    frame #4: 0x00007f4946f7a928 libc.so.6`__GI__dl_catch_exception(exception=<unavailable>, operate=<unavailable>, args=<unavailable>) at dl-error-skeleton.c:208:8
    frame #5: 0x00007f4946f7a9f3 libc.so.6`__GI__dl_catch_error(objname=0x00007f47641693c0, errstring=0x00007f47641693c8, mallocedp=0x00007f47641693bf, operate=<unavailable>, args=<unavailable>) at dl-error-skeleton.c:227:19
    frame #6: 0x00007f4946f799f5 libc.so.6`__GI___libc_dlopen_mode at dl-libc.c:46:17
    frame #7: 0x00007f4946f799cc libc.so.6`__GI___libc_dlopen_mode(name=<unavailable>, mode=<unavailable>) at dl-libc.c:195
    frame #8: 0x00007f4946f46fb9 libc.so.6`init at backtrace.c:54:19
    frame #9: 0x00007f494701d4df libpthread.so.0`__pthread_once_slow(once_control=0x00007f4947009e68, init_routine=(libc.so.6`init at backtrace.c:53:1)) at pthread_once.c:116:7
    frame #10: 0x00007f4946f47104 libc.so.6`__GI___backtrace(array=<unavailable>, size=<unavailable>) at backtrace.c:111:3
    frame #11: 0x000000000b9a5f2f clickhouse`__interceptor_backtrace + 159
    frame #12: 0x000000005088a5e9 clickhouse`kj::getStackTrace(space=ArrayPtr<void *> @ 0x000000002a35b130, ignoreCount=3) at exception.c++:240:17
    frame #13: 0x0000000050897bab clickhouse`kj::Exception::extendTrace(this=0x00007f4764169860, ignoreCount=2, limit=<unavailable>) at exception.c++:811:19
    frame #14: 0x0000000050899dc2 clickhouse`kj::throwRecoverableException(exception=0x00007f4764169860, ignoreCount=<unavailable>) at exception.c++:1137:13
    frame #15: 0x0000000050886b51 clickhouse`kj::_::Debug::Fault::~Fault(this=0x00007f4764169ab0) at debug.c++:363:5
    frame #16: 0x000000005092db37 clickhouse`kj::ReadableDirectory::openFile(this=<unavailable>, path=PathPtr @ 0x00007f4764169a40) const at filesystem.c++:551:5
    frame #17: 0x00000000504c21d7 clickhouse`capnp::SchemaFile::newFromDirectory(baseDir=<unavailable>, path=<unavailable>, importPath=ArrayPtr<const kj::ReadableDirectory *const> @ 0x00007f4764169b30, displayNameOverride=<unavailable>) at schema-parser.c++:451:78
    frame #18: 0x00000000504c15d1 clickhouse`capnp::SchemaParser::parseFromDirectory(this=0x000070300081d038, baseDir=0x00007010000b2c90, path=<unavailable>, importPath=ArrayPtr<const kj::ReadableDirectory *const> @ 0x0000000031641940) const at schema-parser.c++:200:20
    frame #19: 0x00000000448caf4a clickhouse`DB::CapnProtoSchemaParser::getMessageSchema(this=0x000070300081d038, schema_info=0x00007f476416a210) + 1066
...
(lldb) p *path.parts.ptr
warning: `this' is not accessible (substituting 0)
(const kj::String) $34 = {
  content = (ptr = "nonexist.capnp", size_ = 15, disposer = 0x0000000008a1c2f0)
}

And indeed there is an int3 instruction:

    0x7f494705e093 <+1139>: jne    0x7f494705e450            ; <+2096> [inlined] update_tls_slotinfo at dl-open.c:732
    0x7f494705e099 <+1145>: testl  %r13d, %r13d
    0x7f494705e09c <+1148>: je     0x7f494705e09f            ; <+1151> at dl-open.c:744:6
    0x7f494705e09e <+1150>: int3
->  0x7f494705e09f <+1151>: movl   -0x54(%rbp), %eax
    0x7f494705e0a2 <+1154>: testl  %eax, %eax
    0x7f494705e0a4 <+1156>: jne    0x7f494705e410            ; <+2032> at dl-open.c:745:5

However, there should not be anything there; here is how it should look (when I run ./clickhouse local --stacktrace -q "select * from file('data.capnp', 'CapnProto', 'val1 char') settings format_schema='nonexist:Message'"):

    0x7ffff7fe5093 <+1139>: jne    0x7ffff7fe5450            ; <+2096> [inlined] update_tls_slotinfo at dl-open.c:732
    0x7ffff7fe5099 <+1145>: testl  %r13d, %r13d
    0x7ffff7fe509c <+1148>: je     0x7ffff7fe509f            ; <+1151> at dl-open.c:744:6
    0x7ffff7fe509e <+1150>: nop
->  0x7ffff7fe509f <+1151>: movl   -0x54(%rbp), %eax
    0x7ffff7fe50a2 <+1154>: testl  %eax, %eax
    0x7ffff7fe50a4 <+1156>: jne    0x7ffff7fe5410            ; <+2032> at dl-open.c:745:5

This looks like an SDT probe with nop patched to int3.

Okay, I think I found the reason: the binary is running with gdb attached, and since gdb cannot handle DWARF-5 (there are errors like Dwarf Error: DW_FORM_strx1 found in non-DWO CU), it does some messy things, in particular activating the SDT probes that libc has there: https://github.com/bminor/glibc/blob/c8f2a3e8038232f7707d11b4629f5d5cf32244fc/elf/dl-open.c#L736
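
The two listings differ in a single byte at offset +1150. As a small sketch of that difference, assuming the standard single-byte x86 encodings (0x90 for nop, 0xCC for int3), the state of a probe site can be read off its first byte. The ProbeSite enum and classifyProbeSite are hypothetical names for illustration, not part of gdb, glibc, or ClickHouse.

```cpp
#include <cstdint>

// Single-byte x86 opcodes that appear at the probe site in the listings.
constexpr std::uint8_t kNop = 0x90;  // placeholder left at an inactive SDT probe
constexpr std::uint8_t kInt3 = 0xCC; // software breakpoint; delivers SIGTRAP when executed

// Hypothetical classification of a probe site by its first byte.
enum class ProbeSite
{
    Inactive, // still the original nop
    Patched,  // a debugger or tracer replaced the nop with int3
    Unknown,  // anything else
};

constexpr ProbeSite classifyProbeSite(std::uint8_t opcode)
{
    if (opcode == kNop)
        return ProbeSite::Inactive;
    if (opcode == kInt3)
        return ProbeSite::Patched;
    return ProbeSite::Unknown;
}
```

If the tool that patched in the int3 does not handle the resulting trap, the process receives SIGTRAP, which matches the stop reason in the lldb output.
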

azat commented Sep 11, 2022

But anyway, clang-15 does not help; I will close this and open a new one if I try to test something else.

azat closed this Sep 11, 2022
azat mentioned this pull request Sep 11, 2022
azat deleted the DWARF-5-v2-clang15 branch September 11, 2022 19:02
azat added a commit to azat/ClickHouse that referenced this pull request Sep 11, 2022
ClickHouse changes to the folly parser:
- use camel_case
- add NOLINT
- avoid using folly:: (use std:: instead)
- avoid using boost:: (use std:: instead)

But note that it is now not enabled by default (as it was
initially), because you may need a recent debugger to support DWARF-5
correctly; to keep debugging easy, let's do this later.

A good example is gdb 10: even though it looks like it should support
it, it still produces some errors, like here [1]:

    Dwarf Error: DW_FORM_strx1 found in non-DWO CU [in module /usr/bin/clickhouse]

  [1]: ClickHouse#40772 (comment)

And not only does it complain, apparently it can also "activate" SDT probes
(replace "nop" with "int3"), and I believe this is what happens here
[2].

  [2]: ClickHouse#41063 (comment)

There you get int3 in the case where ClickHouse got SIGTRAP:

<details>

```
    0x7f494705e093 <+1139>: jne    0x7f494705e450            ; <+2096> [inlined] update_tls_slotinfo at dl-open.c:732
    0x7f494705e099 <+1145>: testl  %r13d, %r13d
    0x7f494705e09c <+1148>: je     0x7f494705e09f            ; <+1151> at dl-open.c:744:6
    0x7f494705e09e <+1150>: int3
->  0x7f494705e09f <+1151>: movl   -0x54(%rbp), %eax
    0x7f494705e0a2 <+1154>: testl  %eax, %eax
    0x7f494705e0a4 <+1156>: jne    0x7f494705e410            ; <+2032> at dl-open.c:745:5
```

But if I repeat the query it does not:

```
    0x7ffff7fe5093 <+1139>: jne    0x7ffff7fe5450            ; <+2096> [inlined] update_tls_slotinfo at dl-open.c:732
    0x7ffff7fe5099 <+1145>: testl  %r13d, %r13d
    0x7ffff7fe509c <+1148>: je     0x7ffff7fe509f            ; <+1151> at dl-open.c:744:6
    0x7ffff7fe509e <+1150>: nop
->  0x7ffff7fe509f <+1151>: movl   -0x54(%rbp), %eax
    0x7ffff7fe50a2 <+1154>: testl  %eax, %eax
    0x7ffff7fe50a4 <+1156>: jne    0x7ffff7fe5410            ; <+2032> at dl-open.c:745:5
```

</details>

Test command was:

    clickhouse local --stacktrace -q "select * from file('data.capnp', 'CapnProto', 'val1 char') settings format_schema='nonexist:Message'"

*P.S. I did this because I have libraries compiled with DWARF-5 (e.g. glibc), and the DWARF parser simply fails in my dev env.*

Refs: facebook/folly@490b287
(cherry picked from commit ee5696b)
(cherry picked from commit e03870b)
Signed-off-by: Azat Khuzhin <[email protected]>