Skip to content

Fix incorrect attach query in threadpool readers, get rid of static threadpools for reads/writes, make threadpool size for reads/writes configurable#41011

Merged
kssenii merged 19 commits intoClickHouse:masterfrom
kssenii:fix-thread-status
Sep 27, 2022
Merged

Fix incorrect attach query in threadpool readers, get rid of static threadpools for reads/writes, make threadpool size for reads/writes configurable#41011
kssenii merged 19 commits intoClickHouse:masterfrom
kssenii:fix-thread-status

Conversation

@kssenii
Copy link
Copy Markdown
Member

@kssenii kssenii commented Sep 5, 2022

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Improvements for reading from remote filesystems, made threadpool size for reads/writes configurable. Closes #41070.

@robot-ch-test-poll2 robot-ch-test-poll2 added the pr-not-for-changelog This PR should not be mentioned in the changelog label Sep 5, 2022
@kssenii
Copy link
Copy Markdown
Member Author

kssenii commented Sep 5, 2022

the hypothesis is confirmed - here is the same problem with threadpools as in #40821 ☹️ It was not visible before only because of this incorrect second thread status object

 ==95127==ERROR: AddressSanitizer: heap-use-after-free on address 0x61400004751c at pc 0x00000dbb5bad bp 0x7fcdfb1f5970 sp 0x7fcdfb1f5968 READ of size 4 at 0x61400004751c thread T176 (ThreadPool)
#0 0xdbb5bac in MemoryTracker::free(long) build_docker/../src/Common/MemoryTracker.cpp:312:49
#1 0xdbb5a38 in MemoryTracker::free(long) build_docker/../src/Common/MemoryTracker.cpp:353:22
#2 0xdbab827 in DB::ThreadStatus::~ThreadStatus() build_docker/../src/Common/ThreadStatus.cpp:148:20
#3 0xdcf4774 in ThreadFromGlobalPoolImpl::ThreadFromGlobalPoolImpl >::scheduleImpl(std::__1::function, int, std::__1::optional, bool)::'lambda0'()>(void&&)::'lambda'()::operator()() build_docker/../src/Common/ThreadPool.h:194:9
#4 0xdcf4774 in decltype(static_cast(fp)()) std::__1::__invoke::ThreadFromGlobalPoolImpl >::scheduleImpl(std::__1::function, int, std::__1::optional, bool)::'lambda0'()>(void&&)::'lambda'()&>(void&&) build_docker/../contrib/libcxx/include/type_traits:3640:23
#5 0xdce406b in std::__1::__function::__policy_func::operator()() const build_docker/../contrib/libcxx/include/__functional/function.h:843:16
#6 0xdce406b in std::__1::function::operator()() const build_docker/../contrib/libcxx/include/__functional/function.h:1184:12
#7 0xdce406b in ThreadPoolImpl::worker(std::__1::__list_iterator) build_docker/../src/Common/ThreadPool.cpp:294:17

@tavplubix
Copy link
Copy Markdown
Member

Let's also add assertions to ctor and dtor of ThreadStatus: chassert(!current_thread) and chassert(current_thread == this)

@kssenii kssenii changed the title Fix incorrect attach query in threadpool readers Fix incorrect attach query in threadpool readers, get rid of static threadpools for reads/writes, make threadpool size for reads/writes configurable Sep 7, 2022
Copy link
Copy Markdown
Member

@tavplubix tavplubix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but it would be great to add assertions into ThreadStatus (can be done in separate PR)

Co-authored-by: Alexander Tokmakov <[email protected]>
@kssenii kssenii requested a review from tavplubix September 26, 2022 10:50
@kssenii
Copy link
Copy Markdown
Member Author

kssenii commented Sep 27, 2022

AST fuzzer (asan)

#41745

2022.09.27 00:00:25.522497 [ 147 ] {} <Trace> BaseDaemon: Received signal -3
2022.09.27 00:00:25.523133 [ 402 ] {} <Fatal> BaseDaemon: ########################################
2022.09.27 00:00:25.523205 [ 402 ] {} <Fatal> BaseDaemon: (version 22.10.1.1, build id: 25DB6D4071A889C5095A909B9C8A817ADA945C3D) (from thread 183) (query_id: 9afecae0-dce0-4383-b96a-d1a770f9e59d::202209_5_9_1) (query: ) Received signal Unknown signal (-3)
2022.09.27 00:00:25.523308 [ 402 ] {} <Fatal> BaseDaemon: Sanitizer trap.
2022.09.27 00:00:25.523502 [ 402 ] {} <Fatal> BaseDaemon: Stack trace: 0xde118a7 0xe3c675a 0xdd0b306 0xdcee1e7 0xdcf0ffc 0xdcf1ad8 0x30770370 0x30767cff 0x306156e1 0x3061550e 0x304b2092 0x30206881 0x31a4b274 0x31d0c977 0x3183bc2a 0x3183b959 0x318473d7 0x31fc352b 0x31870568 0x31873b96 0xdf4ff20 0xdf59c6d 0xdf49e20 0xdf5390d 0x7f1ea9143609 0x7f1ea9068133
2022.09.27 00:00:25.565247 [ 402 ] {} <Fatal> BaseDaemon: 0.1. inlined from ./build_docker/../src/Common/StackTrace.cpp:332: StackTrace::tryCapture()
2022.09.27 00:00:25.565318 [ 402 ] {} <Fatal> BaseDaemon: 0. ./build_docker/../src/Common/StackTrace.cpp:293: StackTrace::StackTrace() @ 0xde118a7 in /workspace/clickhouse
2022.09.27 00:00:25.650307 [ 402 ] {} <Fatal> BaseDaemon: 1. ./build_docker/../src/Daemon/BaseDaemon.cpp:415: sanitizerDeathCallback() @ 0xe3c675a in /workspace/clickhouse
2022.09.27 00:00:29.821548 [ 402 ] {} <Fatal> BaseDaemon: 2. __sanitizer::Die() @ 0xdd0b306 in /workspace/clickhouse
2022.09.27 00:00:33.784548 [ 402 ] {} <Fatal> BaseDaemon: 3. ? @ 0xdcee1e7 in /workspace/clickhouse
2022.09.27 00:00:37.739243 [ 402 ] {} <Fatal> BaseDaemon: 4. __asan::ReportGenericError(unsigned long, unsigned long, unsigned long, unsigned long, bool, unsigned long, unsigned int, bool) @ 0xdcf0ffc in /workspace/clickhouse
2022.09.27 00:00:41.640965 [ 402 ] {} <Fatal> BaseDaemon: 5. __asan_report_load1 @ 0xdcf1ad8 in /workspace/clickhouse
2022.09.27 00:00:42.405360 [ 402 ] {} <Fatal> BaseDaemon: 6. ./build_docker/../src/Columns/ColumnVector.h:407: void DB::TargetSpecific::AVX512VBMI::vectorIndexImpl<DB::PODArray<char8_t, 4096ul, Allocator<false, false>, 15ul, 16ul>, char8_t>(DB::PODArray<char8_t, 4096ul, Allocator<false, false>, 15ul, 16ul> const&, DB::PODArray<char8_t, 4096ul, Allocator<false, false>, 15ul, 16ul> const&, unsigned long, DB::PODArray<char8_t, 4096ul, Allocator<false, false>, 15ul, 16ul>&) @ 0x30770370 in /workspace/clickhouse
2022.09.27 00:00:42.931621 [ 402 ] {} <Fatal> BaseDaemon: 7. ./build_docker/../src/Columns/ColumnVector.h:530: COW<DB::IColumn>::immutable_ptr<DB::IColumn> DB::ColumnVector<char8_t>::indexImpl<char8_t>(DB::PODArray<char8_t, 4096ul, Allocator<false, false>, 15ul, 16ul> const&, unsigned long) const @ 0x30767cff in /workspace/clickhouse
2022.09.27 00:00:43.682371 [ 402 ] {} <Fatal> BaseDaemon: 8. ./build_docker/../src/Columns/ColumnsCommon.h:0: COW<DB::IColumn>::immutable_ptr<DB::IColumn> DB::selectIndexImpl<DB::ColumnVector<char8_t> >(DB::ColumnVector<char8_t> const&, DB::IColumn const&, unsigned long) @ 0x306156e1 in /workspace/clickhouse
2022.09.27 00:00:44.231611 [ 402 ] {} <Fatal> BaseDaemon: 9. ./build_docker/../src/Columns/ColumnVector.cpp:577: DB::ColumnVector<char8_t>::index(DB::IColumn const&, unsigned long) const @ 0x3061550e in /workspace/clickhouse
2022.09.27 00:00:44.328537 [ 402 ] {} <Fatal> BaseDaemon: 10.1. inlined from ./build_docker/../contrib/boost/boost/smart_ptr/intrusive_ptr.hpp:194: boost::intrusive_ptr<DB::IColumn const>::operator*() const
2022.09.27 00:00:44.328571 [ 402 ] {} <Fatal> BaseDaemon: 10. ./build_docker/../src/Columns/ColumnLowCardinality.cpp:201: DB::ColumnLowCardinality::insertRangeFrom(DB::IColumn const&, unsigned long, unsigned long) @ 0x304b2092 in /workspace/clickhouse

@kssenii
Copy link
Copy Markdown
Member Author

kssenii commented Sep 27, 2022

Stress test (debug) — Cannot start clickhouse-server

clickhouse-server: ../src/Disks/IO/AsynchronousReadIndirectBufferFromRemoteFS.cpp:228: virtual bool DB::AsynchronousReadIndirectBufferFromRemoteFS::nextImpl(): Assertion `file_offset_of_buffer_end <= impl->getFileSize()' failed.

this error is related to #41783

@kssenii
Copy link
Copy Markdown
Member Author

kssenii commented Sep 27, 2022

Stress test (ubsan) — Cannot start clickhouse-server
Stress test (asan) — Cannot start clickhouse-server

Not sure but looks the reason is:

2022.09.27 02:32:34.830712 [ 338626 ] {} <Error> Application: Caught exception while loading metadata: Code: 80. DB::Exception: Unknown Index type 'annoy'. Available index types: hypothesis, bloom_filter, tokenbf_v1, ngrambf_v1, set, minmax: Cannot attach table `test_1`.`02354_annoy` from metadata file /var/lib/clickhouse/metadata/test_1/02354_annoy.sql from query ATTACH TABLE test_1.`02354_annoy` (`id` Int32, `embedding` Array(Float32), INDEX annoy_index embedding TYPE annoy(100) GRANULARITY 1) ENGINE = MergeTree ORDER BY id SETTINGS index_granularity = 5. (INCORRECT_QUERY), Stack trace (when copying this message, always include the lines below):

@tavplubix
Copy link
Copy Markdown
Member

Not sure but looks the reason is

#41729

@kssenii kssenii merged commit caf6a99 into ClickHouse:master Sep 27, 2022
@lesandie
Copy link
Copy Markdown
Contributor

Hi @kssenii

about this make threadpool size for reads/writes configurable are there any new clickhouse setting related?

I'm testing master with s3 table function ... related about this #40128

@kssenii
Copy link
Copy Markdown
Member Author

kssenii commented Sep 30, 2022

Yes, settings are in server config threadpool_remote(local)_fs_reader_pool_size and threadpool_remote(local)_fs_reader_queue_size, but these thread pools are used only for s3 as merge tree engine storage. For s3 table function you should use user setting max_download_threads.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-not-for-changelog This PR should not be mentioned in the changelog

Projects

None yet

Development

Successfully merging this pull request may close these issues.

makes pool_size and queue size configurable for S3 reader&writer

5 participants