Skip to content

Move parts between storage volumes according to configured TTL expressions#7420

Merged
alesapin merged 50 commits intoClickHouse:masterfrom
excitoon-favorites:ttls
Dec 11, 2019
Merged

Move parts between storage volumes according to configured TTL expressions#7420
alesapin merged 50 commits intoClickHouse:masterfrom
excitoon-favorites:ttls

Conversation

@excitoon
Copy link
Copy Markdown
Contributor

@excitoon excitoon commented Oct 22, 2019

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en.

For changelog. Remove if this is non-significant change.

Category (leave one):

  • New Feature

Short description (up to few sentences):

Move parts between storage volumes according to configured TTL expressions.

Detailed description (optional):

...

@amosbird
Copy link
Copy Markdown
Collaborator

I hereby not agree

lol

@alexey-milovidov alexey-milovidov changed the title Ttls Move parts between storage volumes according to configured TTL expressions. Oct 22, 2019
@alesapin alesapin self-assigned this Oct 22, 2019
Copy link
Copy Markdown
Member

@alesapin alesapin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't understand why we have so different logic for common TTL and MoveTTL. Common TTL is just special case of MoveTTL.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MoveDestinationType was more clear.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why it's here? It used not only with TTLs. Let's make it separate enum class,

Copy link
Copy Markdown
Contributor Author

@excitoon excitoon Oct 31, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const &

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we haven't any test for TTL moves?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes so far.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the bad cast of there will be a number. Better tryGet and apparent exception in case of error.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I can not get the idea, it looks like we do not do that in way more other places.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unclear message.

Copy link
Copy Markdown
Contributor Author

@excitoon excitoon Oct 31, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we need assignment?

Copy link
Copy Markdown
Contributor Author

@excitoon excitoon Oct 31, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

???

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was a mistake.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if can_move == false?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we store expression here? It should be stored in table definition.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This thing is basically an additional protection level, to not make wrong moves. Expressions from here are not being analysed for anything but rather used by keys for min/max values. If someone ALTER TTL expressions, data will not be moved according to stale TTLs.

@excitoon excitoon requested a review from alesapin November 11, 2019 12:19
@CurtizJ CurtizJ self-requested a review November 18, 2019 09:35
Copy link
Copy Markdown
Member

@CurtizJ CurtizJ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ok. Have some minor issues. Now waiting for tests.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we need template here? Isn't TTLEntry a single class?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's right. In previous version it wasn't :-P

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename this enum to something like PartDestinationType, because it is used not only for TTL. And in cases when it is used for TTL, structures that contain it already have TTL prefix.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay

@excitoon excitoon requested a review from CurtizJ December 1, 2019 15:21
@excitoon excitoon changed the title Move parts between storage volumes according to configured TTL expressions. Move parts between storage volumes according to configured TTL expressions Dec 2, 2019
Copy link
Copy Markdown
Member

@alesapin alesapin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost done. Need to fix naming, add comments and make small changes in code,

ttl_entry.expression->execute(block);

auto & ttl_info = (column_name.empty() ? ttl_infos.table_ttl : ttl_infos.columns_ttl[column_name]);
remove_column = true;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we remove columns after execution? Looks like side effect of this function. At least we need to update comment.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we don't want to spoil block with extra columns. In this case merge would complain about extra columns (different table structure).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we just store expression result in block, than use to update ttl_info, and after that we remove it. OK.

if (!ttl_table.empty())
out << "ttl: " << ttl_table << "\n";

if (!ttl_move.empty())
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not at the end of the list?

Copy link
Copy Markdown
Contributor Author

@excitoon excitoon Dec 5, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to group similar things together (ttl and move_ttl).

return {};
}

bool MergeTreeData::TTLEntry::isPartInDestination(const DiskSpace::StoragePolicyPtr & policy, const MergeTreeDataPart & part) const
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like it can be a static method of MergeTreeData with three arguments?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it be the better solution? Idea was to separate scanning of ttl_entry fields, because they contain non-trivial logic of selecting something depending on entry type. I do not want to spread this things to MergeTreeData in order to have locality of data and logic of TTLEntry.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

DiskSpace::SpacePtr destination_ptr = ttl_entry->getDestination(storage_policy);
if (!destination_ptr)
{
if (ttl_entry->destination_type == PartDestinationType::VOLUME)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this possible? Disks/Volumes configuration cannot be changed without server restart. So why we don't check it during the creation of TTL expression? And here the exception is better.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is planned to allow user to change disks/volumes without restart, that shall be one of upcoming pull requests. Typically, we call this method when inserting and merging data. And it shall reserve any space even if space by TTL is unavailable.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

@excitoon excitoon requested a review from alesapin December 5, 2019 08:05
Copy link
Copy Markdown
Member

@alesapin alesapin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

@alesapin alesapin merged commit b708f86 into ClickHouse:master Dec 11, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants