
Disable async_socket_for_remote/use_hedged_requests for buggy linux kernels#22109

Merged
alexey-milovidov merged 1 commit into ClickHouse:master from azat:nested-epoll-wait on Mar 29, 2021

Conversation


@azat azat commented Mar 25, 2021

Changelog category (leave one):

  • Bug Fix

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Disable async_socket_for_remote/use_hedged_requests for buggy linux kernels

Detailed description / Documentation draft:
async_socket_for_remote/use_hedged_requests use nested epolls, which
may not be reliable in 5.5+ [1], but this has been fixed in 5.7+ [2] and in 5.6.13+.

P.S. Marked as Bug Fix, but I am not sure about this.

Details

It is possible to lock up the pipeline. It is not that easy to reproduce, but it reproduces fairly "stably" with the following benchmark command on Linux 5.5:

$ i=0; while ./clickhouse benchmark -c 30 --timelimit=3 -d0 --cumulative --query "select * from remote('127.{2,3}', system, one)"; do echo Iteration=$i; ((i+=1)); done
Loaded 1 queries.

Note that both use_hedged_requests and async_socket_for_remote are
required to reproduce the issue.

Pipeline:

digraph
{
  rankdir="LR";
  { node [shape = box]
    n140624128624664[label="Remote(10 jobs) (Async)"];
    n140627179876376[label="Remote(11 jobs) (Finished)"];
    n140624730927768[label="ExpressionTransform(5 jobs) (NeedData)"];
    n140624152969048[label="ExpressionTransform(5 jobs) (Finished)"];
    n140624165957208[label="Resize(0 jobs) (NeedData)"];
    n140624293623256[label="LimitsCheckingTransform(10 jobs) (NeedData)"];
    n140627179694168[label="NullSource(0 jobs) (NeedData)"];
    n140624245605656[label="NullSource(0 jobs) (NeedData)"];
    n140624145010712[label="LazyOutputFormat(10 jobs) (NeedData)"];
  }
  n140624128624664 -> n140624730927768;
  n140627179876376 -> n140624152969048;
  n140624730927768 -> n140624165957208;
  n140624152969048 -> n140624165957208;
  n140624165957208 -> n140624293623256;
  n140624293623256 -> n140624145010712;
  n140627179694168 -> n140624145010712;
  n140624245605656 -> n140624145010712;
}

Threads stack traces:

WITH
    arrayMap(x -> demangle(addressToSymbol(x)), trace) AS trace_array,
    arrayStringConcat(trace_array, '\n') AS trace_string
SELECT
    thread_id,
    trace_string
FROM system.stack_trace
WHERE thread_id IN
(
    SELECT thread_id
    FROM system.processes
    ARRAY JOIN thread_ids AS thread_id
    WHERE query NOT LIKE '%system.%'
)

Row 1:
──────
thread_id:    6490
trace_string: pthread_cond_wait@@GLIBC_2.3.2
std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&)
DB::PipelineExecutor::executeStepImpl(unsigned long, unsigned long, std::__1::atomic<bool>*)
void std::__1::__function::__policy_invoker<void ()>::__call_impl<>(std::__1::__function::__policy_storage const*)
ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>)
void* std::__1::__thread_proxy<>(void*)
start_thread
__clone

Row 2:
──────
thread_id:    6537
trace_string: pthread_cond_wait@@GLIBC_2.3.2
std::__1::condition_variable::wait(std::__1::unique_lock<std::__1::mutex>&)
DB::PipelineExecutor::executeStepImpl(unsigned long, unsigned long, std::__1::atomic<bool>*)
void std::__1::__function::__policy_invoker<void ()>::__call_impl<>(std::__1::__function::__policy_storage const*)
ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>)
void* std::__1::__thread_proxy<>(void*)
start_thread
__clone

Row 3:
──────
thread_id:    6552
trace_string:
epoll_wait
DB::Epoll::getManyReady(int, epoll_event*, bool) const
DB::PollingQueue::wait(std::__1::unique_lock<std::__1::mutex>&)
DB::PipelineExecutor::executeImpl(unsigned long)
DB::PipelineExecutor::execute(unsigned long)
void std::__1::__function::__policy_invoker<void ()>::__call_impl<>(std::__1::__function::__policy_storage const*)
ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>)
void* std::__1::__thread_proxy<>(void*)
start_thread
__clone

Row 4:
──────
thread_id:    31278
trace_string: pthread_cond_timedwait@@GLIBC_2.3.2
Poco::SemaphoreImpl::waitImpl(long)
ConcurrentBoundedQueue<DB::Chunk>::tryPop(DB::Chunk&, unsigned long)
DB::LazyOutputFormat::getChunk(unsigned long)
DB::PullingAsyncPipelineExecutor::pull(DB::Chunk&, unsigned long)
DB::PullingAsyncPipelineExecutor::pull(DB::Block&, unsigned long)
DB::TCPHandler::processOrdinaryQueryWithProcessors()
DB::TCPHandler::runImpl()
DB::TCPHandler::run()
Poco::Net::TCPServerConnection::start()
Poco::Net::TCPServerDispatcher::run()
Poco::PooledThread::run()
Poco::ThreadImpl::runnableEntry(void*)
start_thread
__clone

And to be precise, the PipelineExecutor is waiting here [1]:

context->condvar.wait(lock, [&]
{
    return finished || context->wake_flag;
});

State of PipelineExecutor at this time:

task_queue = {
  queues = size=2 {
    [0] = size=0 {}
    [1] = size=0 {}
  }
  num_tasks = 0
}

async_task_queue = {
  epoll = {
    epoll_fd = 271
    events_count = {
      Value = 2
    }
    fd_description = "epoll"
  }
  pipe_fd = ([0] = 321, [1] = 324)
  is_finished = {
    Value = false
  }
  tasks = size=1 {
    [0] = {
      __cc = {
        first = 139970762314112
        second = (thread_num = 0, data = 0x00007f4d7b911180, fd = 82)
      }
    }
  }
}

num_waiting_async_tasks = 0
threads_queue = {
  stack = size=2 {
    [0] = 1
    [1] = 0
  }
  thread_pos_in_stack = size=2 {
    [0] = 1
    [1] = 0
  }
  stack_size = 2
}

Cc: @KochetovNicolai
Cc: @Avogar

@robot-clickhouse robot-clickhouse added the pr-bugfix Pull request with bugfix, not backported by default label Mar 25, 2021
@azat azat marked this pull request as draft March 25, 2021 07:06
@azat azat marked this pull request as ready for review March 25, 2021 20:40
@azat azat force-pushed the nested-epoll-wait branch from 9716fc4 to 8e445c5 on March 25, 2021 20:40
@alexey-milovidov alexey-milovidov merged commit f90a328 into ClickHouse:master Mar 29, 2021
robot-clickhouse pushed a commit that referenced this pull request Mar 29, 2021
@azat azat deleted the nested-epoll-wait branch March 29, 2021 17:12
alexey-milovidov added a commit that referenced this pull request Mar 30, 2021
Backport #22109 to 21.3: Disable async_socket_for_remote/use_hedged_requests for buggy linux kernels
robot-clickhouse pushed a commit that referenced this pull request Mar 30, 2021
robot-clickhouse pushed a commit that referenced this pull request Mar 30, 2021
@azat azat mentioned this pull request Apr 19, 2021
azat added a commit to azat/ClickHouse that referenced this pull request Jun 14, 2021
ClickHouse#22109 adds the check, but at compilation time, which is pointless;
move the check to runtime.

Remember that nested epoll is required for
async_socket_for_remote/use_hedged_requests; otherwise remote queries
may get stuck.

Labels

pr-bugfix Pull request with bugfix, not backported by default
