Skip to content

Add config option system_tables_lazy_load.#9642

Closed
DeifyTheGod wants to merge 4 commits intoClickHouse:masterfrom
DeifyTheGod:master
Closed

Add config option system_tables_lazy_load.#9642
DeifyTheGod wants to merge 4 commits intoClickHouse:masterfrom
DeifyTheGod:master

Conversation

@DeifyTheGod
Copy link
Copy Markdown

@DeifyTheGod DeifyTheGod commented Mar 13, 2020

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Add config option system_tables_lazy_load that, if set to false, loads system tables with logs at the server startup.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shared->system_logs->*_log is nullptr if the log is not enabled in config (see SystemLogs::SystemLogs(...))

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rewrote to if (table) {do stuff}

@tavplubix tavplubix added can be tested pr-improvement Pull request with some product improvements labels Mar 13, 2020
@alexey-milovidov
Copy link
Copy Markdown
Member

00834_kill_mutation

  • is a flacky test.

Copy link
Copy Markdown
Member

@alexey-milovidov alexey-milovidov Mar 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but move try/catch inside if.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But there is struct SystemLogs that encapsulate the set of available system logs.
It's no good to duplicate the list of logs here.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it mean "cluster" initialization?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't understand, why it is supposed to be non empty?
If we get empty TSV result, it will be an empty string?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it is done after initializeSystemLogs, it's a race condition.

Copy link
Copy Markdown
Member

@alexey-milovidov alexey-milovidov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

coding style issue, correct is

if (foo)
{
}
else
{
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alexey-milovidov @DeifyTheGod maybe it worth thinking making it false by default?

Copy link
Copy Markdown
Member

@azat azat Apr 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, it is better to set shared->system_logs once everything initialized, since otherwise a lot of things can happen, for example deadlock which can be produced with existing data for some system tables and different structure, so that the rename should be performed, and in this case rename stacktrace will be the following:

Thread 1 (Thread 0x7f0c3f808040 (LWP 22383)):
6  DB::RWLockImpl::getLock () at ../dbms/src/Common/RWLock.cpp:213
7  0x0000000008dd245d in DB::IStorage::lockExclusively ()
8  0x0000000008b2cae2 in DB::InterpreterRenameQuery::execute ()
9  0x000000000501893b in DB::SystemLog<DB::TraceLogElement>::prepareTable ()
10 0x0000000004ffea92 in DB::SystemLogs::initializeSystemLogs (this=0x7f0c3f577e98, which=25)
11 0x0000000008aa0346 in DB::Context::createSystemLogs (this=0x7f0c3f578400)
12 0x0000000004f4dcdc in DB::Server::main (this=<optimized out>)
13 0x000000000bbc3013 in Poco::Util::Application::run () at ../contrib/poco/Util/src/Application.cpp:334
14 0x0000000004ffecae in DB::Server::run () at ../dbms/programs/server/Server.cpp:178
15 0x0000000004ff419c in mainEntryClickHouseServer (argc=3, argv=0x7f0c3f458bc0) at ../dbms/programs/server/Server.cpp:1060

So it has the following lock order:

  • Context::getLock (in createSysmteLogs)
  • IStorage::lockExclusively

However somewhere in parallel the following can happen:

Thread 16 (Thread 0x7f0c33dfb700 (LWP 22407)):
6  DB::Context::getLock (this=0x7f0c3f578400) at ../dbms/src/Interpreters/Context.cpp:466
7  0x0000000008aa28aa in DB::Context::getStoragePolicy (this=this@entry=0x7f0c3f578400, name=...) at ../dbms/src/Interpreters/Context.cpp:1707
8  0x0000000008f5fbc1 in DB::MergeTreeData::getStoragePolicy (this=0x7f0c37e31000)
9  0x0000000008f929a4 in DB::MergeTreeDataMergerMutator::getMaxSourcePartsSizeForMerge () at ../dbms/src/Storages/MergeTree/MergeTreeDataMergerMutator.cpp:179
10 0x0000000008e61046 in DB::StorageMergeTree::merge () at ../dbms/src/Storages/StorageMergeTree.cpp:560
11 0x0000000008e68c3b in DB::StorageMergeTree::mergeMutateTask (this=0x7f0c37e31000)
... BackgroundProcessingPool ...
27 0x00007f0c3f819fb7 in start_thread (arg=<optimized out>)
28 0x00007f0c3fa711af in clone (

While this has the following lock order:

  • IStorage::lockExclusively (in StorageMergeTree::merge)
  • Context::getLock

P.S. this is actually the real thing

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And acquire the lock after everything initialized of course

@DeifyTheGod DeifyTheGod reopened this Apr 7, 2020
@alexey-milovidov
Copy link
Copy Markdown
Member

Please don't worry about PR assigments by Ivan Blinkov. He has written a script that assigns pull request to random people. We decided that we will remove this script but Ivan has not stopped it yet.

}

<<<<<<< HEAD
=======
Copy link
Copy Markdown
Member

@azat azat Apr 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to squash all commits and force push, to at least avoid code that does not compile in the repo
(and besides now it has pretty heavy merges)

@alexey-milovidov or this is fine to? what is the policy?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@azat Yes, we should squash if the history become too messy.
But I can do it directly on GitHub.

system_logs->initializeSystemLogs(type);

auto lock = getLock();
shared->system_logs.swap(system_logs);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang reports:

2020-04-09 15:42:58 In file included from ../src/Interpreters/PartLog.h:3:
2020-04-09 15:42:58 ../src/Interpreters/SystemLog.h:86:5: error: definition of implicit copy constructor for 'SystemLogs' is deprecated because it has a user-declared destructor [-Werror,-Wdeprecated]
2020-04-09 15:42:58     ~SystemLogs();
2020-04-09 15:42:58     ^
2020-04-09 15:42:58 ../contrib/libcxx/include/type_traits:3695:9: note: in implicit copy constructor for 'DB::SystemLogs' first required here
2020-04-09 15:42:58     _Tp __t(_VSTD::move(__x));
2020-04-09 15:42:58         ^
2020-04-09 15:42:58 ../contrib/libcxx/include/optional:857:17: note: in instantiation of function template specialization 'std::__1::swap<DB::SystemLogs>' requested here
2020-04-09 15:42:58                 swap(this->__get(), __opt.__get());
2020-04-09 15:42:58                 ^
2020-04-09 15:42:58 ../src/Interpreters/Context.cpp:1570:25: note: in instantiation of member function 'std::__1::optional<DB::SystemLogs>::swap' requested here
2020-04-09 15:42:58     shared->system_logs.swap(system_logs);

I don't see any option except for adding default copy/assigment, i.e.:

diff --git a/src/Interpreters/SystemLog.h b/src/Interpreters/SystemLog.h
index 87da342ae1..5b3a3fcba2 100644
--- a/src/Interpreters/SystemLog.h
+++ b/src/Interpreters/SystemLog.h
@@ -72,6 +72,9 @@ class MetricLog;
 struct SystemLogs
 {
     SystemLogs(Context & global_context, const Poco::Util::AbstractConfiguration & config);
+    SystemLogs(const SystemLogs &) = default;
+    SystemLogs& operator=(const SystemLogs &) = default;
+
     ~SystemLogs();
 
     void shutdown();

@alexey-milovidov
Copy link
Copy Markdown
Member

Also we decided to rewrite it completely:

Alexey Milovidov, [09.04.20 21:16]
[In reply to nonexistence]
Да, у меня до этого были вопросы про порядок инициализации... сейчас посмотрю ещё раз.

Alexey Milovidov, [09.04.20 21:18]
Очень не нравится битовая маска enum с системными таблицами. Я не успел сформулировать это в комментарии на GitHub: https://github.com/ClickHouse/ClickHouse/pull/9642/files#diff-d01c9c952054d9bd8b82d552c516eec2R70

Плохо, потому что:
- вписывать все системные таблицы в этот список, пусть лучше будет одно место, где их список;
- битовая маска, не мотивировано, даже std::vector<std::string> лучше;
- ещё придумываются имена типа QUERY_LOG, но уже есть имена типа query_log.
Поэтому лучше просто удалить enum, сейчас подумаю, как сделать...

Alexey Milovidov, [09.04.20 21:19]
https://github.com/ClickHouse/ClickHouse/pull/9642/files#diff-b95fa9cb3e38c0792a6cd9b87ebfc12aR1556

Выглядит так, как будто сначала всё мучительно засовывается в битовую маску только для того, чтобы из неё сразу же всё достать обратно.

Alexey Milovidov, [09.04.20 21:21]
https://github.com/ClickHouse/ClickHouse/pull/9642/files#diff-28d8e47f9a229c958dad54e3b9b499f3R495

- before using system logs.

Но что защищает от того, что text_log может уже использоваться к этому моменту?

Alexey Milovidov, [09.04.20 21:22]
https://github.com/ClickHouse/ClickHouse/pull/9642/files#diff-c98d71e62ac2a4061e8da8ccbc2a5134R569

try/catch слегка избыточно, и так всё нормально напишется.

Alexey Milovidov, [09.04.20 21:22]
https://github.com/ClickHouse/ClickHouse/pull/9642/files#diff-c98d71e62ac2a4061e8da8ccbc2a5134R560

А в этой строчке ведь уже может использоваться text_log, part_log?

Alexey Milovidov, [09.04.20 21:24]
Можно удобнее всё сделать...

Кстати, а если инициализация будет асинхронная (через ту же очередь, что и всё остальное передаётся в SystemLog), но при старте сервера, это Ок?

nonexistence, [09.04.20 21:26]
[In reply to Alexey Milovidov]
Вроде да

Alexander Burmak, [10.04.20 11:32]
Да, выглядит как ОК. Главное чтобы таблицы создались до окончания старта сервера, когда ClickHouse начинает принимать входящие запросы.

@alexey-milovidov
Copy link
Copy Markdown
Member

Will be reimplemented in a different way.

@azat
Copy link
Copy Markdown
Member

azat commented Apr 19, 2020

Will be reimplemented in a different way.

By the way, will it be done in the nearest feature?
I'm kind of applied it to my tree, and that said that I'm using it (with some fixes).

@alexey-milovidov
Copy link
Copy Markdown
Member

Threre is a very small and simple patch from Alexander Burmak, I will apply it in nearest... hours.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-improvement Pull request with some product improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants