Polymorphic parts (compact format). #8290
alexey-milovidov merged 99 commits into ClickHouse:master
alesapin left a comment
Need one more test with compact parts and skip indices. Also, we need at least one detailed comment about compact parts.
static bool arrayHasNoElementsRead(const IColumn & column)
set min_insert_block_size_rows=1;
insert into mt_compact select number, 'aaa' from numbers(100);
select count() from system.parts where table = 'mt_compact' and database = currentDatabase() and active;
I'm wondering: will system.columns and part_columns be able to show proper sizes for those parts?
Shouldn't those parts be marked in system.parts somehow?
> I'm wondering: will system.columns and part_columns be able to show proper sizes for those parts?
No, per-column sizes will not be counted for compact parts. Computing the compressed size of every column is almost impossible. The uncompressed size could be counted, but it is not very useful, so for simplicity it is not implemented.
> Shouldn't those parts be marked in system.parts somehow?
The system.parts table will have a part_type column.
That will definitely be reported as a bug ("size on disk doesn't match the size in system.columns" etc.) :\ AFAIK a lot of users look in system.columns.
Compact parts are intended to make up only a small fraction of the total size of the table.
I've added a smoke test to check basic functionality with compact parts and skip indices.
👍 Great feature !! For
It will be implemented as in-memory parts with a WAL, which is an append-only file in Native format.
But the parts themselves remain atomic and immutable.
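The idea of in-memory parts backed by an append-only log can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `InMemoryPart`, `appendToLog` and `replayLog` are hypothetical names, and the record layout is a simple length-prefixed encoding, not the Native format mentioned above.

```cpp
#include <cstdint>
#include <istream>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory part: a name plus an opaque serialized payload.
struct InMemoryPart
{
    std::string name;
    std::string payload;
};

// Parts are shared as pointers to const, so they cannot change after creation.
using PartPtr = std::shared_ptr<const InMemoryPart>;

// Writes a string as an 8-byte length followed by the bytes (host byte order).
void writeString(std::ostream & out, const std::string & s)
{
    const uint64_t size = s.size();
    out.write(reinterpret_cast<const char *>(&size), sizeof(size));
    out.write(s.data(), static_cast<std::streamsize>(s.size()));
}

// Reads a string written by writeString; returns false at end of log
// or on a truncated record.
bool readString(std::istream & in, std::string & s)
{
    uint64_t size = 0;
    if (!in.read(reinterpret_cast<char *>(&size), sizeof(size)))
        return false;
    s.resize(size);
    return static_cast<bool>(in.read(&s[0], static_cast<std::streamsize>(size)));
}

// Appends one record to the end of the log; existing records are never rewritten.
void appendToLog(std::ostream & log, const InMemoryPart & part)
{
    writeString(log, part.name);
    writeString(log, part.payload);
    log.flush();
}

// Rebuilds the set of in-memory parts by reading the log from the start.
std::vector<PartPtr> replayLog(std::istream & log)
{
    std::vector<PartPtr> parts;
    InMemoryPart part;
    while (readString(log, part.name) && readString(log, part.payload))
        parts.push_back(std::make_shared<const InMemoryPart>(part));
    return parts;
}
```

In this sketch a restart would call `replayLog` on the log file to restore the parts that had not yet been merged into on-disk parts. A real implementation would also need checksums and handling of a partially written last record.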
void read(ReadBuffer & buffer, size_t from, size_t count)
{
    buffer.readStrict(reinterpret_cast<char *>(data() + from), count * sizeof(MarkInCompressedFile));
Looks dangerous because it neither checks nor resizes the array.
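One way to address this concern is to validate `from` and grow the array before reading. The sketch below is not the code from this PR: `Mark`, `StringSource` and `MarksArray` are hypothetical stand-ins for `MarkInCompressedFile`, `ReadBuffer` and the marks container, written only so that the example is self-contained.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real mark type: a pair of offsets.
struct Mark
{
    size_t offset_in_compressed_file = 0;
    size_t offset_in_decompressed_block = 0;
};

// Hypothetical stand-in for ReadBuffer: reads from an in-memory string.
class StringSource
{
public:
    explicit StringSource(std::string bytes_) : bytes(std::move(bytes_)) {}

    // Copies exactly n bytes or throws if fewer are available.
    void readStrict(char * to, size_t n)
    {
        if (n > bytes.size() - pos)
            throw std::runtime_error("unexpected end of data");
        std::memcpy(to, bytes.data() + pos, n);
        pos += n;
    }

private:
    std::string bytes;
    size_t pos = 0;
};

class MarksArray
{
public:
    size_t size() const { return marks.size(); }
    const Mark & operator[](size_t i) const { return marks[i]; }

    // Checks the start position and resizes the array when needed,
    // so the read can never write past the end of the storage.
    void read(StringSource & source, size_t from, size_t count)
    {
        if (from > marks.size())
            throw std::out_of_range("read would leave a gap in the marks array");
        if (count == 0)
            return;
        if (marks.size() < from + count)
            marks.resize(from + count);
        source.readStrict(reinterpret_cast<char *>(marks.data() + from), count * sizeof(Mark));
    }

private:
    std::vector<Mark> marks;
};
```

Whether to resize or to throw when the range does not fit is a design choice. Resizing is shown here because the comment mentions both options, but a caller that preallocates the array may prefer a strict check instead.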
Now let's enable them for system logs for size less than 1 MB.
But reduce scope, to avoid leaking too much memory, since there are old values in last_block_index_columns. The scope of the MemoryTracker::BlockerInThread has been increased in ClickHouse#8290
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en
Changelog category (leave one):
Changelog entry (up to few sentences, required except for Non-significant/Documentation categories):
Add a new compact format of parts in MergeTree-family tables, in which all columns are stored in one file. It helps to increase the performance of small and frequent inserts. The old format (one file per column) is now called wide. The data storage format is controlled by the settings min_bytes_for_wide_part and min_rows_for_wide_part.
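How the two settings could select the format is sketched below. The rule shown (a part is compact when it is below either threshold) is an assumption, not a quote of the merged code, and `Settings`, `PartType` and `choosePartType` are illustrative names; only the two setting names come from the changelog entry.

```cpp
#include <cstddef>

enum class PartType { Compact, Wide };

// Hypothetical settings holder; zero thresholds mean every part is wide.
struct Settings
{
    size_t min_bytes_for_wide_part = 0;
    size_t min_rows_for_wide_part = 0;
};

// Assumed rule: a part smaller than either threshold is written as compact,
// otherwise it is written in the wide (one file per column) format.
PartType choosePartType(const Settings & settings, size_t bytes, size_t rows)
{
    if (bytes < settings.min_bytes_for_wide_part || rows < settings.min_rows_for_wide_part)
        return PartType::Compact;
    return PartType::Wide;
}
```

Under this assumed rule, with `min_bytes_for_wide_part` set to 10485760 and `min_rows_for_wide_part` left at 0, a 1 KB insert would produce a compact part and a 20 MB part would be wide.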