Fixing ThreadSanitizer data race error in the LIVE VIEW when accessing no_users_thread variable#7353
Conversation
akuzm
left a comment
There was a problem hiding this comment.
I have a hunch that a proper fix would involve rethinking the architecture, not adding more mutexes, but have no resources for an in-depth review now. Will merge as is.
I totally agree. Adding mutexes right now is a just quick fix. The implementation has to be revisited. Maybe the support for live views that are automatically dropped should be removed as this feature added the usage of an anti-pattern where we are dropping a table in a thread that later can't be joined properly. |
| { | ||
| std::lock_guard lock(no_users_thread_mutex); | ||
| if (no_users_thread.joinable()) | ||
| no_users_thread.detach(); |
There was a problem hiding this comment.
The fix is totally incorrect:
Calling std::thread::detach is just guaranteed error.
There was a problem hiding this comment.
I don't think it is totally incorrect but I fully agree that calling and using std::thread::detach is bad. So what would be a better way to asynchronously drop table? The use case here is that if TIMEOUT is specified during the creation of LIVE VIEW we want to drop it automatically if there are no users that are watching this view after some N number of seconds.
There was a problem hiding this comment.
Maybe just remove the TIMEOUT feature?
There was a problem hiding this comment.
So you think that there is nothing in ClickHouse where we could schedule a callback?
There was a problem hiding this comment.
Maybe we should do something like this?
There was a problem hiding this comment.
Ok, perfect. I will try it out and create a PR.
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en
Category (leave one):
Fixing ThreadSanitizer data race error in the LIVE VIEW when accessing no_users_thread variable.