
Fixed wrong code around Memory Profiler #9472

Merged
alexey-milovidov merged 10 commits into master from memory-profiler-fix
Mar 3, 2020
Conversation

@alexey-milovidov
Member

Changelog category (leave one):

  • Bug Fix

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Do not include in changelog. The feature "memory profiler" is not yet in release.

@alexey-milovidov
Member Author

This partially reverts #8765

@alexey-milovidov
Member Author

alexey-milovidov commented Mar 3, 2020

There were a few bugs:

  1. The check for signal overrun was not done before collecting a stack trace, which defeats the purpose of the check:

https://github.com/ClickHouse/ClickHouse/pull/8765/files#diff-1f307cce4cc43b590c8b9c2023ab213fR35

This leads to a deadlock.

  2. Bad calculation of the next profile step:

https://github.com/ClickHouse/ClickHouse/pull/8765/files#diff-d793b741301f371dc58b28c87a401fa3R116

  3. Suspicious code that looks like a race condition:

https://github.com/ClickHouse/ClickHouse/pull/8765/files#diff-d98bf273b2e4b64c454efaa43e721e10R31
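The first two issues can be illustrated with a minimal sketch (hypothetical code, not the actual ClickHouse implementation): the overrun check has to run before any stack trace is collected, and the next profile step has to advance to the next multiple of the step above the current total:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical sketch, not the real ClickHouse code.
// The overrun check must run BEFORE the expensive stack-trace collection;
// doing it afterwards cannot stop handlers from piling up.
static std::atomic<int> handlers_in_flight{0};

bool tryCollectSample()
{
    if (handlers_in_flight.fetch_add(1, std::memory_order_acq_rel) > 0)
    {
        // Another handler instance is already collecting: drop this sample
        // instead of blocking (blocking inside a signal handler is what
        // leads to deadlock).
        handlers_in_flight.fetch_sub(1, std::memory_order_acq_rel);
        return false;
    }
    /// ... collect the stack trace and write it to the trace pipe ...
    handlers_in_flight.fetch_sub(1, std::memory_order_acq_rel);
    return true;
}

// The next sampling threshold should be the next multiple of `step`
// strictly above the current allocated total, so samples are neither
// skipped nor emitted twice.
uint64_t nextProfileStep(uint64_t allocated, uint64_t step)
{
    return (allocated / step + 1) * step;
}
```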

The code was hard to review because of:

  1. ext::Singleton

https://github.com/ClickHouse/ClickHouse/pull/8765/files#diff-9fca6c43c4e7d9e86c5a4c532c62a249R1

A singleton class with a very unobvious purpose and no comments. We already have singletons in our code; creating yet another special-purpose singleton class and naming it just "Singleton" is very confusing. It also looks half-finished.

  2. Lack of comments in TraceCollector.

Two "collect" methods with unknown purpose:
https://github.com/ClickHouse/ClickHouse/pull/8765/files#diff-d98bf273b2e4b64c454efaa43e721e10R33

  3. Copy-paste in TraceCollector:

https://github.com/ClickHouse/ClickHouse/pull/8765/files#diff-bdbb6c9c9b356ce37fb0fbe0bdbfbd0dR120
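For contrast with an ad-hoc ext::Singleton wrapper, the conventional C++11 alternative is a function-local static ("Meyers singleton"), which is lazily initialized and thread-safe without a dedicated Singleton class. The stub type below is hypothetical, a stand-in for something like TraceCollector:

```cpp
// Hypothetical stub, not the real TraceCollector.
struct TraceCollectorStub
{
    int samples_collected = 0;
};

TraceCollectorStub & getTraceCollector()
{
    // Constructed on first call; initialization is thread-safe since C++11.
    static TraceCollectorStub instance;
    return instance;
}
```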

@alexey-milovidov
Member Author

The QueryProfiler test was broken in #9433
For an unknown reason, libunwind cannot unwind the stack above the signal handler if the signal arrived inside the clock_nanosleep function. Probably something is wrong with the unwind tables or the linker.

┌─trace─────────────────────────────────────────────────────────────────────────────────────────────────────┬─count()─┐
│ StackTrace::StackTrace(ucontext_t const&)
DB::(anonymous namespace)::signalHandler(int, siginfo_t*, void*)
 │       1 │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────┴─────────┘
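The mechanism can be illustrated with glibc's backtrace() (an illustration only; ClickHouse uses libunwind): unwinding out of a signal handler has to walk through the signal frame into the interrupted function, which only works if that function has unwind tables (CFI). An asm implementation without .cfi directives stops the walk at the signal frame, exactly as in the truncated trace above.

```cpp
#include <csignal>
#include <execinfo.h>

// Illustration only (glibc backtrace(), not libunwind): how many frames
// are visible from inside a signal handler depends on CFI coverage of the
// interrupted code.
static volatile int captured_frames = 0;

static void profilingHandler(int)
{
    void * frames[64];
    captured_frames = backtrace(frames, 64);   // walk the stack from the handler
}

int framesCapturedViaSignal()
{
    struct sigaction sa {};
    sa.sa_handler = profilingHandler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, nullptr);
    raise(SIGUSR1);   // the handler runs synchronously on this thread
    return captured_frames;
}
```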

@alexey-milovidov
Member Author

It requires further investigation.

@alexey-milovidov alexey-milovidov added no-docs-needed pr-bugfix Pull request with bugfix, not backported by default labels Mar 3, 2020
@alexey-milovidov
Member Author

alexey-milovidov commented Mar 3, 2020

$ grep memory_profiler *.sql
$ grep memory_profiler *.sh

Also, there was no test for the memory profiler.

@alexey-milovidov
Member Author

Ok, now it works perfectly:

SELECT ignore(groupArray(number)) FROM numbers(10000000)
SELECT 
    arrayStringConcat(arrayMap(x -> demangle(addressToSymbol(x)), trace), '\n') AS trace, 
    formatReadableSize(sum(size) AS s)
FROM system.trace_log
WHERE (event_date = today()) AND (query_id = 'd464dfed-cb8e-466e-a936-5e760437a7bf') AND (trace_type = 'Memory')
GROUP BY trace
ORDER BY s DESC

Row 1:
──────
trace:                         StackTrace::StackTrace()
MemoryTracker::alloc(long)
MemoryTracker::alloc(long)
Allocator<false, false>::realloc(void*, unsigned long, unsigned long, unsigned long)
void DB::PODArrayBase<8ul, 4096ul, Allocator<false, false>, 15ul, 16ul>::reserve<>(unsigned long) (.part.0)
DB::GroupArrayNumericImpl<unsigned long, DB::GroupArrayTrait<false, (DB::Sampler)0> >::insertResultInto(char const*, DB::IColumn&) const
DB::Block DB::Aggregator::prepareBlockAndFill<DB::Aggregator::prepareBlockAndFillWithoutKey(DB::AggregatedDataVariants&, bool, bool) const::'lambda'(std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&, std::__1::vector<DB::PODArray<char*, 4096ul, Allocator<false, false>, 15ul, 16ul>*, std::__1::allocator<DB::PODArray<char*, 4096ul, Allocator<false, false>, 15ul, 16ul>*> >&, std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&, bool)&>(DB::AggregatedDataVariants&, bool, unsigned long, DB::Aggregator::prepareBlockAndFillWithoutKey(DB::AggregatedDataVariants&, bool, bool) const::'lambda'(std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&, std::__1::vector<DB::PODArray<char*, 4096ul, Allocator<false, false>, 15ul, 16ul>*, std::__1::allocator<DB::PODArray<char*, 4096ul, Allocator<false, false>, 15ul, 16ul>*> >&, std::__1::vector<COW<DB::IColumn>::mutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::mutable_ptr<DB::IColumn> > >&, bool)&) const (.constprop.0)
DB::Aggregator::prepareBlockAndFillWithoutKey(DB::AggregatedDataVariants&, bool, bool) const
DB::ConvertingAggregatedToChunksTransform::initialize()
std::__1::__function::__func<DB::PipelineExecutor::addJob(DB::PipelineExecutor::ExecutionState*)::'lambda'(), std::__1::allocator<DB::PipelineExecutor::addJob(DB::PipelineExecutor::ExecutionState*)::'lambda'()>, void ()>::operator()()
DB::PipelineExecutor::executeSingleThread(unsigned long, unsigned long)
ThreadFromGlobalPool::ThreadFromGlobalPool<DB::PipelineExecutor::executeImpl(unsigned long)::'lambda0'()>(DB::PipelineExecutor::executeImpl(unsigned long)::'lambda0'()&&)::'lambda'()::operator()() const
ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>)
void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void ThreadPoolImpl<std::__1::thread>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda1'()> >(void*)
start_thread
clone
formatReadableSize(sum(size)): 255.08 MiB

Row 2:
──────
trace:                         StackTrace::StackTrace()
MemoryTracker::alloc(long)
MemoryTracker::alloc(long)
Allocator<false, false>::realloc(void*, unsigned long, unsigned long, unsigned long)
void DB::PODArrayBase<8ul, 32ul, DB::MixedArenaAllocator<4096ul, Allocator<false, false>, DB::AlignedArenaAllocator<8ul>, 8ul>, 0ul, 0ul>::reserveForNextSize<DB::Arena*&>(DB::Arena*&)
DB::IAggregateFunctionHelper<DB::GroupArrayNumericImpl<unsigned long, DB::GroupArrayTrait<false, (DB::Sampler)0> > >::addBatchSinglePlace(unsigned long, char*, DB::IColumn const**, DB::Arena*) const
DB::Aggregator::executeWithoutKeyImpl(char*&, unsigned long, DB::Aggregator::AggregateFunctionInstruction*, DB::Arena*) const
DB::Aggregator::executeOnBlock(std::__1::vector<COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::__1::allocator<COW<DB::IColumn>::immutable_ptr<DB::IColumn> > >, unsigned long, DB::AggregatedDataVariants&, std::__1::vector<DB::IColumn const*, std::__1::allocator<DB::IColumn const*> >&, std::__1::vector<std::__1::vector<DB::IColumn const*, std::__1::allocator<DB::IColumn const*> >, std::__1::allocator<std::__1::vector<DB::IColumn const*, std::__1::allocator<DB::IColumn const*> > > >&, bool&)
DB::AggregatingTransform::consume(DB::Chunk)
DB::AggregatingTransform::work()
std::__1::__function::__func<DB::PipelineExecutor::addJob(DB::PipelineExecutor::ExecutionState*)::'lambda'(), std::__1::allocator<DB::PipelineExecutor::addJob(DB::PipelineExecutor::ExecutionState*)::'lambda'()>, void ()>::operator()()
DB::PipelineExecutor::executeSingleThread(unsigned long, unsigned long)
ThreadFromGlobalPool::ThreadFromGlobalPool<DB::PipelineExecutor::executeImpl(unsigned long)::'lambda0'()>(DB::PipelineExecutor::executeImpl(unsigned long)::'lambda0'()&&)::'lambda'()::operator()() const
ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>)
void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void ThreadPoolImpl<std::__1::thread>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda1'()> >(void*)
start_thread
clone
formatReadableSize(sum(size)): 157.06 MiB

Row 3:
──────
trace:                         StackTrace::StackTrace()
MemoryTracker::alloc(long)
MemoryTracker::alloc(long)
operator new(unsigned long)
std::__1::deque<DB::PipelineExecutor::ExecutionState*, std::__1::allocator<DB::PipelineExecutor::ExecutionState*> >::__add_back_capacity()
DB::PipelineExecutor::prepareProcessor(unsigned long, unsigned long, std::__1::queue<DB::PipelineExecutor::ExecutionState*, std::__1::deque<DB::PipelineExecutor::ExecutionState*, std::__1::allocator<DB::PipelineExecutor::ExecutionState*> > >&, std::__1::unique_lock<std::__1::mutex>)
DB::PipelineExecutor::tryAddProcessorToStackIfUpdated(DB::PipelineExecutor::Edge&, std::__1::queue<DB::PipelineExecutor::ExecutionState*, std::__1::deque<DB::PipelineExecutor::ExecutionState*, std::__1::allocator<DB::PipelineExecutor::ExecutionState*> > >&, unsigned long)
DB::PipelineExecutor::prepareProcessor(unsigned long, unsigned long, std::__1::queue<DB::PipelineExecutor::ExecutionState*, std::__1::deque<DB::PipelineExecutor::ExecutionState*, std::__1::allocator<DB::PipelineExecutor::ExecutionState*> > >&, std::__1::unique_lock<std::__1::mutex>)
DB::PipelineExecutor::tryAddProcessorToStackIfUpdated(DB::PipelineExecutor::Edge&, std::__1::queue<DB::PipelineExecutor::ExecutionState*, std::__1::deque<DB::PipelineExecutor::ExecutionState*, std::__1::allocator<DB::PipelineExecutor::ExecutionState*> > >&, unsigned long)
DB::PipelineExecutor::prepareProcessor(unsigned long, unsigned long, std::__1::queue<DB::PipelineExecutor::ExecutionState*, std::__1::deque<DB::PipelineExecutor::ExecutionState*, std::__1::allocator<DB::PipelineExecutor::ExecutionState*> > >&, std::__1::unique_lock<std::__1::mutex>)
DB::PipelineExecutor::executeSingleThread(unsigned long, unsigned long)
ThreadFromGlobalPool::ThreadFromGlobalPool<DB::PipelineExecutor::executeImpl(unsigned long)::'lambda0'()>(DB::PipelineExecutor::executeImpl(unsigned long)::'lambda0'()&&)::'lambda'()::operator()() const
ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>)
void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void ThreadPoolImpl<std::__1::thread>::scheduleImpl<void>(std::__1::function<void ()>, int, std::__1::optional<unsigned long>)::'lambda1'()> >(void*)
start_thread
clone
formatReadableSize(sum(size)): 4.00 MiB

@alexey-milovidov
Member Author

We have our own clock_nanosleep from Musl.

@alexey-milovidov
Member Author

Almost certainly, the issue with the lld linker has the same cause: #9049

@alexey-milovidov
Member Author

Remove weak aliases from musl

It didn't help.

Now I think the issue is the lack of unwind tables (CFI) in the asm sources in Musl.

@alexey-milovidov
Member Author

And this solves the issue: 8d679b5

@alexey-milovidov alexey-milovidov changed the title from "Fixed wrong code aroung Memory Profiler" to "Fixed wrong code around Memory Profiler" Mar 3, 2020
@alexey-milovidov
Member Author

The new test "memory profiler" is only for release build.

