Skip to content

deadlock in system.text_log#12339

Closed
nikitamikhaylov wants to merge 11 commits intoClickHouse:masterfrom
nikitamikhaylov:text-log-fix
Closed

deadlock in system.text_log#12339
nikitamikhaylov wants to merge 11 commits intoClickHouse:masterfrom
nikitamikhaylov:text-log-fix

Conversation

@nikitamikhaylov
Copy link
Copy Markdown
Member

@nikitamikhaylov nikitamikhaylov commented Jul 9, 2020

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • Bug Fix

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Try to avoid deadlock in system.text_log. Closes #12325

@robot-clickhouse robot-clickhouse added the pr-bugfix Pull request with bugfix, not backported by default label Jul 9, 2020
@akuzm
Copy link
Copy Markdown
Contributor

akuzm commented Jul 9, 2020

It may be interesting to add a scoped class like getCurrentMemoryTrackerActionLock that disables memory allocation at all, i.e. makes all new() calls throw. This way we can be sure that there is no allocation.

}

queue.push_back(element);
queue.push_back(std::forward<T>(element));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should avoid copy-construction where possible, but not always -- depends on what kind of reference you pass. Maybe we should require rvalue reference input instead?

Also there is still reallocation of queue buffer itself -- we can probably grow it to max size right away. If you do this, don't forget also initialize the local queue in the flushing thread, that is swapped with this one.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I thought the problem was only with text_log. Other logs will pass lvalue as before.
  2. That is the problem.

@akuzm akuzm self-assigned this Jul 9, 2020
@nikitamikhaylov
Copy link
Copy Markdown
Member Author

@akuzm I forgot why we have to avoid allocations in log function? The memory tracker is already off via getCurrentMemoryTrackerActionLock. Also it is not possible to use std::move everywhere, so I returned back universal link with template parameter. And added a function to test it somehow. It seems working.

@nikitamikhaylov nikitamikhaylov marked this pull request as ready for review July 31, 2020 11:24
class FunctionPrintToLog : public IFunction
{
public:
static constexpr auto name = "printToLog";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I now run select printToLog(repeat('a', 1000)) from system.numbers to bring down the playground? It should require sysadmin priveleges and be disabled in read only mode.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

allow_introspection_functions

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also wanted this function, see #9434

Details:

  • only trace level;
  • no quantity parameter (I think it's harmful);
  • different name (log is also a verb);

Comment on lines +95 to +101
std::unique_lock<std::mutex> lock(text_log_mutex);
if (auto log = text_log.lock())
log->add(elem);
{
lock.unlock();
log->add(std::move(elem));
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can already imagine how I fail to notice the lock.unlock() inside the if, add some code after if and believe that it's under the lock. Let's restructure:

shared_ptr<TextLog> text_log_locked;
{
    std::lock_guard<std::mutex> lock(text_log_mutex);
    text_log_locked = text_log.lock();
}
if (text_log_locked)
{
   text_log_locked->add(elem);
}

@akuzm
Copy link
Copy Markdown
Contributor

akuzm commented Jul 31, 2020

@akuzm I forgot why we have to avoid allocations in log function? The memory tracker is already off via getCurrentMemoryTrackerActionLock. Also it is not possible to use std::move everywhere, so I returned back universal link with template parameter. And added a function to test it somehow. It seems working.

Idk, why do we have to use std::move then? Maybe we should leave only the locking changes in OwnSplitChannel?

@akuzm
Copy link
Copy Markdown
Contributor

akuzm commented Jul 31, 2020

The new function is probably not something we should backport. So I guess we should have a one-line PR with locking changes and bugfix category, and everything else in another PR. Or do we have to backport it at all? I remember Alexey fixed something, maybe it's enough for the back branches?

@akuzm
Copy link
Copy Markdown
Contributor

akuzm commented Jul 31, 2020

Idk, why do we have to use std::move then? Maybe we should leave only the locking changes in OwnSplitChannel?

But I think the changes themselves are OK and we can keep them. These logs can be used in surprisingly intensive ways e.g. when tracing all memory allocations -- I had to fix several performance problems before. So saving excessive data copying and allocation is helpful.


DataTypePtr getReturnTypeImpl(const DataTypes &) const override
{
return std::make_shared<DataTypeString>();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd return 0 of type UInt8.

@akuzm
Copy link
Copy Markdown
Contributor

akuzm commented Aug 31, 2020

The deadlock is fixed.
The std::move changes I'm not sure are needed at all.

What's left is the print function. @nikitamikhaylov do you want to continue? Maybe we should close this PR and concentrate on something more vital.

@nikitamikhaylov
Copy link
Copy Markdown
Member Author

The deadlock is fixed.

The std::move changes I'm not sure are needed at all.

What's left is the print function. @nikitamikhaylov do you want to continue? Maybe we should close this PR and concentrate on something more vital.

Of course, maybe I will create PR only with function in the near future.

@akuzm
Copy link
Copy Markdown
Contributor

akuzm commented Aug 31, 2020

Of course, maybe I will create PR only with function in the near future.

OK, I'll close this one then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-bugfix Pull request with bugfix, not backported by default

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Deadlock in system.text_log

4 participants