Improved IN PARTITION mutations#41344
Conversation
IN PARTITION mutations.IN PARTITION mutations
src/Storages/StorageMergeTree.cpp
Outdated
There was a problem hiding this comment.
Otherwise it will be empty (NULL).
There was a problem hiding this comment.
So for example 3 is not possible?
If getPartitionIdsAffectedByCommands can return empty or single element list - why it return list? May be it should use optional?
There was a problem hiding this comment.
This is a common code with StorageReplicatedMergeTree, we can't change it.
There was a problem hiding this comment.
Nobody asks to change it (I think the comment above was about reliability and readability of this code that relies on implicit invariant that probably does not even exist). Please add a comment that explains why this function always returns one partition for StorageMergeTree (I don't understand why). And throw LOGICAL_ERROR if it returned more than one partition.
There was a problem hiding this comment.
@tavplubix it's not me saying it returns only one partition for StorageMergeTree, but @filimonov . When it returns more than one partition, the optimization will not be applied.
401e96c to
1be12b8
Compare
|
Seems like this implementation is something completely different from the solution proposed by @filimonov. Is it expected? Please add a comment with something like design doc describing you solution and explaining how (and why) it works. |
1be12b8 to
47c9fd5
Compare
I fixed this in 47c9fd5, implementation is in sync with the proposal now. Mutation by partition map allows to stop iterating by |
| if (partition_id) | ||
| { | ||
| *out << "partition id: "; | ||
| writeEscapedString(*partition_id, *out); | ||
| *out << "\n"; | ||
| } |
There was a problem hiding this comment.
Do we really need to serialize partition_id here? Seems like we already have partition id inside commands
There was a problem hiding this comment.
We don't need serialization of it specifically, but we need a way to distinguish old storage-wide mutations from new. If we don't change format, we will try to recalculate paritition_id for older and possibly unfinished partition-specific mutations, skip unrelated partitions and stick on merges when in the same partition subset of parts is mutated beyond correct version, or stick being unable to mutate another subset beyond it because there is no related mutation for that. I think this serialization is a neat way to avoid that.
There was a problem hiding this comment.
subset of parts is mutated beyond correct version
AFAIK it should not be a problem
There was a problem hiding this comment.
I mean that we have different mutations:
- with
partition_idandIN PARTITIONexpression, which are new partition-specific mutations — these shall create partition-specific changes; - without
partition_idandIN PARTITIONexpression, which are old or new storage-wide mutations — these create storage-wide changes; - without
partition_idbut withIN PARTITIONexpression, which are old partition-specific mutations. They were supposed to make storage-wide changes earlier (before this patch) and may be partially finished on some parts, so we must continue working the same way on them. If we remove these guys from remaining really unaffected parts (parts_to_do), we would have to rollback unaffected parts which already mutated to that version, or we will have finished mutation and different versions in a partition which will prevent merges to happen until new mutation (in that partition or whole storage) appears.
Generally, I think it is a good thing to have partition set calculated at the moment when the mutation is created, anyway.
There was a problem hiding this comment.
I understand your point, but I'm not sure about
we must continue working the same way on them. If we remove these guys from remaining really unaffected parts (parts_to_do), we would have to rollback unaffected parts which already mutated to that version
I agree that it's undesirable to have some parts with data version greater than expected version for the partition. But AFAIK it will not break anything.
will prevent merges to happen until new mutation (in that partition or whole storage) appears
It will prevent only merges of parts with different data versions, seems like it's not critical. Also it probably can be addressed by taking last_mutation_by_partition into account in getCurrentMutationVersion.
I don't like serializing partition id for two reasons:
- It duplicates partition id that we already have in
IN PARTITIONclause (and we already serialize it). - It unconditionally changes format of mutation entries without any settings. It will make downgrade impossible once you run
IN PARTITIONmutation.IN PARTITIONis not a new feature, so this change is backward incompatible. We can introduce a setting likemerge_tree_mutation_entry_versionorserialize_partition_id_for_in_partition_mutationsfor compatibility, but I don't like this option. It's better to avoid changing format, and seems like it's possible to avoid it.
| /// This set have to be used with `currently_processing_in_background_mutex`. | ||
| DataParts currently_merging_mutating_parts; | ||
|
|
||
| /// Accessed under `currently_processing_in_background_mutex` lock. |
There was a problem hiding this comment.
@tavplubix it does not quite fit in this case because there is (at least) one usage of std::unique_lock instead of std::lock_guard in selectPartsToMerge() and we would not be able to build complete tree of function calls properly annotated.
There was a problem hiding this comment.
You can use TSA_SUPPRESS_WARNING_FOR_READ in selectPartsToMerge
There was a problem hiding this comment.
This is impossible because std::unique_lock has no thread annotations. There is nothing to suppress, however, but it is needed to mark mutex as locked for Clang instead. It would work if we only used std::lock_guard.
There was a problem hiding this comment.
It is possible and it works fine with std::unique_lock and TSA_SUPPRESS_WARNING_FOR_READ
1653b7e to
557858f
Compare
26f8045 to
a6ec470
Compare
…es errors in logs when run on replicated ClickHouse instances).
… generates errors in logs when run on replicated ClickHouse instances)." This reverts commit 735e062.
|
|
|
Problem with exceptions is caused by wrong interoperability of |
b5fd89d to
1004bed
Compare
@quickhouse how exactly? |
|
I apologize, I am completely exhausted and can't look at this in the near
future.
…On Sun, Jan 29, 2023, 8:11 PM Alexey Milovidov ***@***.***> wrote:
Improved IN PARTITION mutations.
@quickhouse <https://github.com/quickhouse> how exactly?
—
Reply to this email directly, view it on GitHub
<#41344 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AZ5CUDWSWFIWJX233NGUN6DWU2QCZANCNFSM6AAAAAAQNG5KOU>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
|
@alexey-milovidov the implementation of ALTER UPDATE/DELETE IN PARTITION for MergeTree was incorrect and was rolled back #40589 That PR was implementing that concept https://gist.github.com/filimonov/a2c79385159efe2935e616ac7492632b (discussed with @tavplubix ) But since Vladimir ( @quickhouse ) did not finish that task, I'm closing the PR we will reassign the task to finish it (we will send a new PR) |
|
@filimonov : please also fix the user experience of this feature, if the IN PARTITION is touching selected parts (by virtue of partition pruning), Current implementation would show all parts (I wonder if it is even processing them as I see the counter parts_to_do decreases over time !). My test was mutating 1 partition and I saw thousands of them in system.mutations. |
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Improved
IN PARTITIONmutations.