Support system replicas queries for distributed #4935
alesapin merged 4 commits into ClickHouse:master
Force-pushed from 39006fe to c13b82f
    target_table->checkPartitionCanBeDropped(partition);
}

ActionLock StorageMaterializedView::getActionLock(StorageActionBlockType type)
For the materialized view, maybe we should push the ActionLock down?
Using the same queries for Replicated and Distributed tables is confusing. It would be better to add separate queries, for example SYSTEM SYNC DISTRIBUTED, SYSTEM STOP DISTRIBUTED SENDS, etc.
Also, I'm not sure that queries of this type are useful for Distributed tables. They only trigger an additional findFiles call and fail if that function throws an exception. But StorageDistributedDirectoryMonitor already calls this function as frequently as possible in a background thread and sleeps only when findFiles throws. Which problem do these queries solve? Maybe the exponent in the backoff calculation is too big?
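To make the backoff question concrete, here is a minimal sketch of the sleep-time expression from the diff, min(default_sleep_time * 2^error_count, max_sleep_time). The function name backoffSleepTime is hypothetical, and the values in the note below (100 ms default, 30 s cap) are assumptions for illustration, not values confirmed by this PR. Unlike the snippet in the diff, the sketch clamps in floating point before converting back to an integer.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical helper mirroring the expression in the diff:
// min(default_sleep_time * 2^error_count, max_sleep_time).
inline std::chrono::milliseconds backoffSleepTime(
    std::chrono::milliseconds default_sleep_time,
    std::chrono::milliseconds max_sleep_time,
    std::size_t error_count)
{
    assert(default_sleep_time.count() >= 0 && max_sleep_time.count() >= 0);

    // The delay doubles with every consecutive error.
    const double scaled = static_cast<double>(default_sleep_time.count())
        * std::exp2(static_cast<double>(error_count));

    // Clamp while still in floating point, so a large error_count
    // cannot overflow the integer conversion below.
    const double capped = std::min(scaled, static_cast<double>(max_sleep_time.count()));

    return std::chrono::milliseconds{static_cast<std::int64_t>(capped)};
}
```

Under the assumed values, the delay goes 100, 200, 400, ... and reaches 25600 ms after eight consecutive errors; from the ninth error on it stays at the 30000 ms cap.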
        std::chrono::milliseconds{Int64(default_sleep_time.count() * std::exp2(error_count))},
        std::chrono::milliseconds{max_sleep_time});
    tryLogCurrentException(getLoggerName().data());
}
We need to write something to the log about it.
void StorageDistributedDirectoryMonitor::syncReplicaSends()
{
    if (quit || monitor_blocker.isCancelled())
        throw Exception("Cancelled sync distributed sync replica sends.", ErrorCodes::ABORTED);
Done
We don't wait here: https://github.com/yandex/ClickHouse/pull/4935/files#diff-8890e3b1de70b013b79201d37463a0d4R98.
We will call:
SYSTEM STOP DISTRIBUTED SENDS;
INSERT INTO distributed_xxx VALUES(1)(2)(3);
SYSTEM SYNC DISTRIBUTED;
INSERT INTO distributed_xxx VALUES(4)(5)(6);
SYSTEM SYNC DISTRIBUTED;
In my understanding this cannot happen, because ClickHouse uses hard links to synchronize blocks of replica data (https://github.com/yandex/ClickHouse/blob/master/dbms/src/Storages/Distributed/DistributedBlockOutputStream.cpp#L563). At the same time, the DirectoryMonitor lock is always acquired when
static ConnectionPoolPtr createPool(const std::string & name, const StorageDistributed & storage);

void syncReplicaSends();
The method name is misleading, because it's not about replicas.
(A Distributed table may look at shards that have no replicas at all.)
        throw Exception("Cancelled sync distributed sends.", ErrorCodes::ABORTED);

    std::unique_lock lock{mutex};
    findFiles();
The method findFiles must be renamed.
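The sync path in the snippet above (refuse if cancelled, take the monitor's mutex, then process whatever is pending) can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyDirectoryMonitor, enqueue, stopSends, startSends, flushPending, and sentBlocks are hypothetical names, not ClickHouse's API or the name chosen in this PR; std::runtime_error stands in for the Exception with ErrorCodes::ABORTED; and an in-memory queue stands in for files on disk.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <queue>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a directory monitor; not ClickHouse's actual class.
class ToyDirectoryMonitor
{
public:
    // Models an asynchronous INSERT: the block is only queued locally.
    void enqueue(std::string block)
    {
        std::lock_guard<std::mutex> lock{mutex};
        pending.push(std::move(block));
    }

    // Model stopping and restarting the background sends.
    void stopSends() { blocked = true; }
    void startSends() { blocked = false; }

    // Models the sync command: refuse while sends are blocked, otherwise
    // flush everything queued so far while holding the mutex.
    std::size_t flushPending()
    {
        if (blocked)
            throw std::runtime_error("Cancelled sync distributed sends.");

        std::lock_guard<std::mutex> lock{mutex};
        std::size_t flushed = 0;
        while (!pending.empty())
        {
            sent.push_back(std::move(pending.front()));
            pending.pop();
            ++flushed;
        }
        assert(pending.empty());
        return flushed;
    }

    // Not synchronized; intended for single-threaded inspection only.
    const std::vector<std::string> & sentBlocks() const { return sent; }

private:
    std::mutex mutex;
    std::atomic<bool> blocked{false};
    std::queue<std::string> pending;
    std::vector<std::string> sent;
};
```

In this model, calling stopSends, then enqueue, then flushPending throws, which matches the guard in the snippet; after startSends, flushPending returns the number of blocks that were waiting.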
Do you really need this command? In my opinion, it's much better to use synchronous distributed inserts (the setting
Force-pushed from ce78f90 to 80788cd
Motivation (from my friend): their ClickHouse cluster is currently deployed in Japan, the US, and China, and he needs ClickHouse to provide a way to synchronize data between different nodes on a regular basis. For now I recommend using the
This is the plan I gave him (translated with Google Translate):
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en
Category :
Short description :
Support SYSTEM SYNC REPLICA for distributed storage
Support SYSTEM START|STOP REPLICATED SENDS for distributed storage