CLICKHOUSE-606: query deduplication based on parts' UUID #17348
alesapin merged 5 commits into ClickHouse:master
Conversation
I have an awful failure case for this where it will never make any progress:
- 3 shard cluster
- shard 1 has part 1, shard 2 has part 2 and shard 3 has both part 1 and part 2 (2 concurrent part moves).
- the packet with part UUIDs is received first from shard 1
- then from shard 3
- now this query is re-sent, as it contains duplicate parts (in flight)
- on retry, the packet with part UUIDs is received from shard 2
- then from shard 3, which again contains duplicates
One solution is multiple retries, with a very awful worst case.
I'm OK with ignoring this case for now, but it is worth mentioning.
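The failure case above boils down to the initiator tracking which part UUIDs each shard has announced and flagging collisions. A minimal sketch of that duplicate check, with hypothetical names (not the actual ClickHouse implementation):

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical sketch of the duplicate detection the root executor performs:
// each shard announces the UUIDs of the parts it will read, and any UUID
// already announced by another shard marks the query for a retry that
// excludes those parts.
using PartUUID = std::string;

// Returns the UUIDs in `announced` that some other shard already announced;
// the rest are recorded in `seen`.
std::set<PartUUID> detectDuplicates(std::set<PartUUID> & seen,
                                    const std::vector<PartUUID> & announced)
{
    std::set<PartUUID> duplicates;
    for (const auto & uuid : announced)
        if (!seen.insert(uuid).second)  // insert fails => already announced
            duplicates.insert(uuid);
    return duplicates;
}
```

On a collision the initiator would re-send the query, asking the colliding replica to skip those parts; as the thread notes, bounding the number of such retries is what keeps the worst case from looping forever.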
One solution is multiple retries with a very awful worst case.
From the initial proposal #13574
duplicated parts can be found during query processing only within a short window of time:
- destination shard attaches the recently moved part
X - (DEDUPLICATION HAPPENS HERE) - source shard detaches the moving part
In multi-part movement this is a known tradeoff -> better to fail and retry later, when the switch is done.
Surely, the number of retries can be configured later if necessary, but I don't think we should proceed with this.
And max_distributed_connections=1 is a workaround for such a case, right?
Force-pushed from 99bfb0a to b1ca682
Force-pushed from b1ca682 to 17ffbe8
Better to make get*PartUUIDs const than to const_cast the context each time.
but there's initialization inside the getter:

    auto lock = getLock();
    if (!part_uuids)
        part_uuids = std::make_shared<PartUUIDs>();

hence this dirty trick...
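One conventional way to keep the getter const while still allowing lazy initialization, avoiding the const_cast, is to mark the lazily-created member and its mutex `mutable`. A simplified sketch with stand-in types (not the real Context class):

```cpp
#include <memory>
#include <mutex>
#include <set>
#include <string>

// Stand-in for the real PartUUIDs structure.
struct PartUUIDs
{
    std::set<std::string> uuids;
};

// Simplified stand-in for Context: `mutable` lets a const getter perform
// first-use initialization under the lock, so callers don't need to
// const_cast the context.
class Context
{
public:
    std::shared_ptr<PartUUIDs> getPartUUIDs() const
    {
        std::lock_guard<std::mutex> lock(mutex);
        if (!part_uuids)
            part_uuids = std::make_shared<PartUUIDs>();
        return part_uuids;
    }

private:
    mutable std::mutex mutex;
    mutable std::shared_ptr<PartUUIDs> part_uuids;
};
```

`mutable` is the standard idiom for caches and lazily built members that don't affect the object's observable state.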
As far as I know, currently, all parts have UUIDs. So we will send all of them each time?
Only when the setting allow_experimental_query_deduplication=1 is enabled. Later, when the part-moving logic is implemented, it will be possible to look at a memtable/ZK to tell which parts are switching and should be deduplicated.
programs/client/Client.cpp
Seems that the client should not get them (if so, LOGICAL_ERROR looks better), or am I missing something?
It potentially can, because the server can have parts with UUIDs, and if the setting is enabled, it will send them.
I'd just ignore this packet for now, rather than changing the protocol so that the client explicitly asks for those UUIDs to be sent back (which is done now via a setting).
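The "just ignore it" option can be illustrated with a toy packet dispatcher; the PacketType enum here is a stand-in, not ClickHouse's actual protocol definitions:

```cpp
// Hypothetical illustration of the two options discussed: when the client
// receives a PartUUIDs packet it doesn't expect, it can either skip it
// silently or treat it as a logical error. Names are illustrative only.
enum class PacketType { Data, Progress, PartUUIDs, EndOfStream };

// Returns true if the packet carries a payload the client should process;
// PartUUIDs packets are dropped instead of raising LOGICAL_ERROR.
bool clientShouldProcess(PacketType type)
{
    switch (type)
    {
        case PacketType::PartUUIDs:
            return false;  // ignore for now, as suggested above
        default:
            return true;
    }
}
```

Dropping the packet keeps the wire protocol unchanged; the stricter alternative would be to make the server send UUIDs only when the client explicitly asked for them.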
Force-pushed from 4dc30fe to 2b29c56
Force-pushed from 2b29c56 to 1e55c7e
committed changed submodule by mistake, fixed
Force-pushed from 0545759 to 98c3f25
* Add query data deduplication, excluding duplicated parts in MergeTree-family engines. Query deduplication is based on parts' UUIDs, which should first be enabled with the merge tree setting assign_part_uuids=1. The allow_experimental_query_deduplication setting enables part deduplication, default to false. A data part UUID is a mechanism for giving a data part a unique identifier. Having UUIDs and a deduplication mechanism provides the potential of moving parts between shards while preserving data consistency on the read path: duplicated UUIDs will cause the root executor to retry the query against one of the replicas, explicitly asking to exclude the encountered duplicated fingerprints during distributed query execution. NOTE: this implementation doesn't provide any knobs to lock a part and hence its UUID. Any mutation/merge will update the part's UUID.
* Add a _part_uuid virtual column, allowing UUIDs to be used in predicates.

Signed-off-by: Aleksei Semiglazov <[email protected]>

address comments
Force-pushed from 4cf8449 to d05c644
alesapin left a comment:
OK, the code is quite isolated, so we can merge it under the experimental flag.
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Distributed query deduplication is a follow-up to #16033 and partially resolves the proposal #13574.
Query deduplication is based on parts' UUIDs, which should first be enabled with the merge tree setting
assign_part_uuids=1.
The allow_experimental_query_deduplication setting enables part deduplication, default to false.
A data part UUID is a mechanism for giving a data part a unique identifier.
Having UUIDs and a deduplication mechanism provides the potential of moving parts
between shards while preserving data consistency on the read path:
duplicated UUIDs will cause the root executor to retry the query against one of the replicas, explicitly
asking to exclude the encountered duplicated fingerprints during distributed query execution.
NOTE: this implementation doesn't provide any knobs to lock a part and hence its UUID. Any mutation/merge
will update the part's UUID.