Do not check bit flips for big buffers (since the size can be corrupted)#18852
Do not check bit flips for big buffers (since the size can be corrupted)#18852alexey-milovidov merged 2 commits intoClickHouse:masterfrom
Conversation
| for (size_t bit_pos = 0; bit_pos < size * 8; ++bit_pos) | ||
| /// If size is too huge, then this may be caused by corruption. | ||
| /// And anyway this is pretty heavy, so avoid burning too much CPU here. | ||
| if (size < 1<<20) |
There was a problem hiding this comment.
It should take about a hour for one megabyte.
Compressed block of size 1 MiB is not huge but normal (for 1 MiB of uncompressable data).
Did you encounter this issue in practice?
Maybe also add a check that compressed size is not too bigger (more than 2 times) than uncompressed size (and do this check also in compression)?
There was a problem hiding this comment.
It should take about a hour for one megabyte.
Something like this (approx)?
WITH
1024 * 1024 AS size,
8 * size AS bits,
((5.3 * 1024) * 1024) * 1024 AS speed_of_cityhash64
SELECT ((bits * size) / (speed_of_cityhash64 / 2)) / 3600
┌─divide(divide(multiply(bits, size), divide(speed_of_cityhash64, 2)), 60)─┐
│ 0.8050314465408805 │
└──────────────────────────────────────────────────────────────────────────┘Corrupting file to exactly one megabyte is hard but possible.
1mb was chosen because it matches with the max_compress_block_size.
But we can use something like min_compress_block_size*2, it will take ~1min
Did you encounter this issue in practice?
Yes, in the #18853 (there is a note that it is likely will require this PR)
And you can see that the test failed with timeout, while w/ this patch they will pass
Maybe also add a check that compressed size is not too bigger (more than 2 times) than uncompressed size (and do this check also in compression)?
Hm interesting, yes this should be done (although 2x maybe too much, also I guess it need some padding), and also it should check that those size are not exceeded file size but this will require some refactoring.
Changelog category (leave one):