Skip to content

Is there a race in code? #7711

@nikitamikhaylov

Description

@nikitamikhaylov

Could you please explain this piece of code?

rocksdb/util/timer.h

Lines 78 to 105 in 0355d14

bool Start() {
MutexLock l(&mutex_);
if (running_) {
return false;
}
thread_.reset(new port::Thread(&Timer::Run, this));
running_ = true;
return true;
}
// Shutdown the Timer
bool Shutdown() {
{
MutexLock l(&mutex_);
if (!running_) {
return false;
}
CancelAllWithLock();
running_ = false;
cond_var_.SignalAll();
}
if (thread_) {
thread_->join();
}
return true;
}

I think there is a race on thread_ variable, because we can call Start(), then call Start() and Shutdown() in parallel. So, there will be a situation when running_=false, but thread_ is not nullptr, so if reset will be called a bit earlier than join it will lead to std::terminate according to C++ standart.

It reproduces in stress tests in ClickHouse with this trace https://gist.github.com/alesapin/3749e00d962a0843d61866e4565c5aee

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions