Improve checksum checks for async INSERT into Distributed on the sender#18853

Merged
alexey-milovidov merged 9 commits into ClickHouse:master from azat:dist-send-checksums
Jan 15, 2021
Conversation

@azat
Member

@azat azat commented Jan 8, 2021

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

  • Check the per-block checksums of the distributed batch on the sender before sending (without reading the file twice: the checksums are verified while reading). This avoids the INSERT getting stuck on the receiver when the .bin file on the sender is truncated.
  • Avoid reading .bin files twice for batched INSERT (previously this was required to calculate rows/bytes so that squashing could be taken into account; now this information is included in the header, and backward compatibility is preserved).

Refs: #18369

@robot-clickhouse robot-clickhouse added the pr-improvement Pull request with some product improvements label Jan 8, 2021
@azat azat marked this pull request as draft January 8, 2021 00:34
@alexey-milovidov alexey-milovidov self-assigned this Jan 8, 2021
@azat azat changed the title Check per-block checksum of the distributed batch on the sender before sending Improve checksum checks for async INSERT into Distributed on the sender Jan 10, 2021
azat added 4 commits January 10, 2021 18:17
…e sending

This is already done for distributed_directory_monitor_batch_inserts=1,
so let's do the same for the non-batched mode; otherwise, if the file
is truncated, the receiver will just get stuck (it will wait for the
block, but the sender will never send it).
… on sending

Before this patch, StorageDistributedDirectoryMonitor read .bin files
in batch mode just to calculate the number of bytes/rows. This is very
inefficient, so let's just store them (rows/bytes) in the header.
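The idea of the commit above can be sketched as follows. This is an illustrative layout, not ClickHouse's actual wire format: the point is that a fixed-size header carrying precomputed rows/bytes lets a monitor process obtain batch statistics without re-reading and deserializing the whole payload.

```python
# Hypothetical file layout: a fixed-size header (rows, bytes) followed by the
# payload. A monitor can read batch statistics from the header alone.
import struct

HEADER_FMT = "<QQ"  # rows, bytes as little-endian uint64 (hypothetical layout)

def write_batch(path, blocks):
    """Write payload blocks prefixed by a header with precomputed rows/bytes."""
    rows = sum(r for r, _ in blocks)
    payload = b"".join(data for _, data in blocks)
    with open(path, "wb") as f:
        f.write(struct.pack(HEADER_FMT, rows, len(payload)))
        f.write(payload)

def read_batch_stats(path):
    """Read only the fixed-size header; no need to scan the payload."""
    with open(path, "rb") as f:
        rows, nbytes = struct.unpack(HEADER_FMT, f.read(struct.calcsize(HEADER_FMT)))
    return rows, nbytes
```

Backward compatibility in the real patch means old files without these fields must still be readable; this sketch omits that versioning concern.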
@azat azat force-pushed the dist-send-checksums branch 2 times, most recently from f070693 to b7a5bd3 Compare January 10, 2021 17:26
azat added 2 commits January 10, 2021 21:23
A buffer for reading from a compressed file that only checks the checksums
of the compressed blocks, without any decompression, so the result can be
proxied.

Before this patch, the DirectoryMonitor checked the compressed file by
reading it one more time (without this check, the receiver may get stuck
on a truncated file). This is inefficient; we can just verify the
checksums before sending.

Note, however, that this may decrease the batch size used for sending
over the network.
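The structure of such a checking pass-through reader can be sketched like this. The framing here is hypothetical (a 4-byte crc32 plus a 4-byte length per block; ClickHouse actually uses a 16-byte CityHash128 checksum), but it shows the key property: each block's checksum is verified while streaming, and the raw compressed bytes are forwarded unchanged, so a truncated file fails on the sender instead of hanging the receiver.

```python
# Sketch of a checksum-verifying pass-through: verify each block's checksum
# while streaming, forward the raw bytes verbatim (no decompression).
# zlib.crc32 and the 4+4-byte framing stand in for ClickHouse's real format.
import io
import struct
import zlib

def check_and_forward(src: io.BufferedIOBase, dst: io.BufferedIOBase) -> int:
    """Copy framed blocks from src to dst, raising on truncation or mismatch."""
    blocks = 0
    while True:
        prefix = src.read(8)  # 4-byte crc32 + 4-byte block length
        if not prefix:
            return blocks  # clean end of file
        if len(prefix) < 8:
            raise EOFError("truncated block prefix")
        checksum, length = struct.unpack("<II", prefix)
        block = src.read(length)
        if len(block) < length:
            raise EOFError("truncated block body")
        if zlib.crc32(block) != checksum:
            raise ValueError("checksum mismatch")
        dst.write(prefix)   # proxy the bytes verbatim...
        dst.write(block)    # ...no decompression needed
        blocks += 1
```

Because nothing is decompressed, the verification pass adds little beyond the cost of reading the file, which matches the "almost zero overhead" measurements later in this thread.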
@azat azat force-pushed the dist-send-checksums branch from b7a5bd3 to 2565d2a Compare January 10, 2021 18:23
@azat azat marked this pull request as ready for review January 10, 2021 18:24
@alexey-milovidov
Member

alexey-milovidov commented Jan 11, 2021

Fuzzer has found unrelated issue #18887

Since CompressedReadBufferBase now reads the checksum and the header at
once, it expects 25 bytes (16 + 9), not 16 (the checksum only).
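The 25-byte prefix mentioned above can be illustrated with a small parser: a 16-byte checksum followed by a 9-byte block header (a 1-byte codec method plus two little-endian uint32 sizes). The field meanings follow the ClickHouse compressed block layout as I understand it; treat this as a sketch, not a reference.

```python
# Sketch: split the 25-byte block prefix into checksum and header fields.
import struct

CHECKSUM_SIZE = 16
HEADER_SIZE = 9  # 1 (method) + 4 (compressed size) + 4 (decompressed size)

def parse_block_prefix(buf: bytes):
    """Parse the 16-byte checksum and the 9-byte header from one read."""
    if len(buf) < CHECKSUM_SIZE + HEADER_SIZE:
        raise EOFError("need 25 bytes (16 checksum + 9 header), got %d" % len(buf))
    checksum = buf[:CHECKSUM_SIZE]
    method, size_compressed, size_decompressed = struct.unpack_from(
        "<BII", buf, CHECKSUM_SIZE)
    return checksum, method, size_compressed, size_decompressed
```

Reading both pieces in one call is what changes the minimum expected read from 16 bytes to 25, as the comment above notes.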
@azat
Member Author

azat commented Jan 13, 2021

Here are some numbers; in general, they show that CheckingCompressedReadBuffer has almost zero overhead.

patched

create table data (key Int, value String) engine=Null();
create table dist (key Int, value String) engine=Distributed(test_shard_localhost, currentDatabase(), data);
system stop distributed sends dist;

-- insert into Null() directly, just forward blocks to the Null engine.
insert into dist select number, reinterpretAsString(number) from system.numbers limit 1e9 settings prefer_localhost_replica=1;
-- 0 rows in set. Elapsed: 16.888 sec. Processed 1.00 billion rows, 8.00 GB (59.23 million rows/s., 473.85 MB/s.)

-- async INSERT, with storing blocks on disk
insert into dist select number, reinterpretAsString(number) from system.numbers limit 1e9 settings prefer_localhost_replica=0;
-- 0 rows in set. Elapsed: 44.010 sec. Processed 1.00 billion rows, 8.00 GB (22.73 million rows/s., 181.83 MB/s.)

-- distributed_directory_monitor_batch_inserts=0
system flush distributed dist;
-- pending *.bin size is 7.5G
-- 0 rows in set. Elapsed: 22.445 sec.

-- distributed_directory_monitor_batch_inserts=1
system flush distributed dist;
-- first run
-- 0 rows in set. Elapsed: 50.845 sec.;
-- second run
-- 0 rows in set. Elapsed: 39.750 sec.;

upstream

-- distributed_directory_monitor_batch_inserts=0
system flush distributed dist;
-- 0 rows in set. Elapsed: 21.459 sec.

-- distributed_directory_monitor_batch_inserts=1
system flush distributed dist;
-- 0 rows in set. Elapsed: 45.488 sec.

And I believe that the batching-mode overhead can be eliminated by removing the extra file reading.

NOTE: the INSERT was done using the same version as system flush distributed dist; this is required to fill the metadata correctly.

@azat
Member Author

azat commented Jan 13, 2021

And I believe that batching mode overhead can be eliminated by removing extra file reading.

Confirmed: without splitting by block header it is 2x faster (since it does not require extra file reading, after the other patches in this series).

Before this patch, the batched mode of the DirectoryMonitor was 2x slower
than the non-batched mode; after it, they should perform more or less the
same.
@azat
Member Author

azat commented Jan 15, 2021

AST fuzzer — Logical error: 'Block structure mismatch in PipelineExecuting stream: different columns:

#19007

@alexey-milovidov alexey-milovidov merged commit ca825f3 into ClickHouse:master Jan 15, 2021
@azat azat deleted the dist-send-checksums branch January 15, 2021 18:13
azat added a commit to azat/ClickHouse that referenced this pull request Jan 19, 2021
- the sender will get ATTEMPT_TO_READ_AFTER_EOF (added in
  946c275) when the client just goes
  away, i.e. the server had been restarted, and it is incorrect to mark the
  file as broken in this case.

- since ClickHouse#18853 the file is checked on the sender locally, and
  if the file was truncated, CANNOT_READ_ALL_DATA will be thrown.
  But before ClickHouse#18853 the sender would not receive
  ATTEMPT_TO_READ_AFTER_EOF from the client if the file was truncated
  on the sender, since the client would just wait for more data, IOW just hang.

- and I don't see how ATTEMPT_TO_READ_AFTER_EOF can be received while
  reading a local file.