Move parts between storage volumes according to configured TTL expressions#7420
Move parts between storage volumes according to configured TTL expressions#7420alesapin merged 50 commits intoClickHouse:masterfrom excitoon-favorites:ttls
Conversation
lol |
alesapin
left a comment
There was a problem hiding this comment.
I didn't understand why we have so different logic for common TTL and MoveTTL. Common TTL is just special case of MoveTTL.
dbms/src/Parsers/ASTTTLElement.h
Outdated
There was a problem hiding this comment.
MoveDestinationType was more clear.
There was a problem hiding this comment.
Why it's here? It used not only with TTLs. Let's make it separate enum class,
There was a problem hiding this comment.
Seems like we haven't any test for TTL moves?
There was a problem hiding this comment.
the bad cast of there will be a number. Better tryGet and apparent exception in case of error.
There was a problem hiding this comment.
Sorry I can not get the idea, it looks like we do not do that in way more other places.
There was a problem hiding this comment.
That was a mistake.
There was a problem hiding this comment.
There was a problem hiding this comment.
Updated link: https://github.com/ClickHouse/ClickHouse/pull/7420/files/7a171b46f59b7fecd5286161c83a3a3aa199c47b#r337920407
In this case we will go into this code. And part won't be moved by any of policy or table TTL.
There was a problem hiding this comment.
Why we store expression here? It should be stored in table definition.
There was a problem hiding this comment.
This thing is basically an additional protection level, to not make wrong moves. Expressions from here are not being analysed for anything but rather used by keys for min/max values. If someone ALTER TTL expressions, data will not be moved according to stale TTLs.
There was a problem hiding this comment.
Why we need template here? Isn't TTLEntry a single class?
There was a problem hiding this comment.
Yes, that's right. In previous version it wasn't :-P
There was a problem hiding this comment.
Maybe rename this enum to something like PartDestinationType, because it is used not only for TTL. And in cases when it is used for TTL, structures that contain it already have TTL prefix.
…disk selection in `ReplicatedMergeTree`.
alesapin
left a comment
There was a problem hiding this comment.
Almost done. Need to fix naming, add comments and make small changes in code,
| ttl_entry.expression->execute(block); | ||
|
|
||
| auto & ttl_info = (column_name.empty() ? ttl_infos.table_ttl : ttl_infos.columns_ttl[column_name]); | ||
| remove_column = true; |
There was a problem hiding this comment.
Why we remove columns after execution? Looks like side effect of this function. At least we need to update comment.
There was a problem hiding this comment.
Because we don't want to spoil block with extra columns. In this case merge would complain about extra columns (different table structure).
There was a problem hiding this comment.
So we just store expression result in block, than use to update ttl_info, and after that we remove it. OK.
| if (!ttl_table.empty()) | ||
| out << "ttl: " << ttl_table << "\n"; | ||
|
|
||
| if (!ttl_move.empty()) |
There was a problem hiding this comment.
Why not at the end of the list?
There was a problem hiding this comment.
In order to group similar things together (ttl and move_ttl).
| return {}; | ||
| } | ||
|
|
||
| bool MergeTreeData::TTLEntry::isPartInDestination(const DiskSpace::StoragePolicyPtr & policy, const MergeTreeDataPart & part) const |
There was a problem hiding this comment.
Seems like it can be a static method of MergeTreeData with three arguments?
There was a problem hiding this comment.
Will it be the better solution? Idea was to separate scanning of ttl_entry fields, because they contain non-trivial logic of selecting something depending on entry type. I do not want to spread this things to MergeTreeData in order to have locality of data and logic of TTLEntry.
| DiskSpace::SpacePtr destination_ptr = ttl_entry->getDestination(storage_policy); | ||
| if (!destination_ptr) | ||
| { | ||
| if (ttl_entry->destination_type == PartDestinationType::VOLUME) |
There was a problem hiding this comment.
How is this possible? Disks/Volumes configuration cannot be changed without server restart. So why we don't check it during the creation of TTL expression? And here the exception is better.
There was a problem hiding this comment.
It is planned to allow user to change disks/volumes without restart, that shall be one of upcoming pull requests. Typically, we call this method when inserting and merging data. And it shall reserve any space even if space by TTL is unavailable.
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en.
For changelog. Remove if this is non-significant change.
Category (leave one):
Short description (up to few sentences):
Move parts between storage volumes according to configured TTL expressions.
Detailed description (optional):
...