Make cache composable, allow not to evict certain files (regarding .idx, .mrk, ..), delete old cache version #36171
Conversation
alesapin
left a comment
Huge amount of work. It's hard to do a really detailed review, but in general it's a big improvement in flexibility.
    return offset;

if (file_segments_holder && current_file_segment_it != file_segments_holder->file_segments.end())
{
    // auto & file_segments = file_segments_holder->file_segments;
I want to leave this commented code, because it should be a good improvement to reduce the number of seeks, but it is very complex and needs to be done very carefully. I will not do it right now, but having it there will remind us that it needs to be done.
Alternatively, I can remove this commented code and add a TODO, if that is better.
if (allow_seeks)
{
    if (whence != SEEK_SET && whence != SEEK_CUR)
But when !allow_seeks, only SEEK_SET is allowed. So the current way seems more correct.
{
    return Range{
        .left = static_cast<size_t>(offset),
        .right = read_until_position ? std::optional{read_until_position - 1} : std::nullopt
Yes, it is a little inconsistent, but that is how it happened: the right bound is used as inclusive in the HTTP read buffer.
    };
if (objects.size() != 1)
    throw Exception(ErrorCodes::CANNOT_USE_CACHE, "Unable to read multiple objects, support not added");
Suggested change:
-    throw Exception(ErrorCodes::CANNOT_USE_CACHE, "Unable to read multiple objects, support not added");
+    throw Exception(ErrorCodes::CANNOT_USE_CACHE, "Read of multiple objects is unsupported");
But does it mean that *Log tables (which do appends) will not work?
Well, I think I can remove this exception and multiple objects will work fine; the cache key will just be deduced from the first path only. Anyway, this logic is not applied for s3: there we have buffer->isIntegratedWithFilesystemCache() == true and another implementation is used. For non-s3 I should add a test for *Log.
The changelog entry should explain what exactly is incompatible and how to do the upgrade.
the same
@kssenii, the Stateless tests flaky check (address) failure was related; you merged the PR with a flaky test.
/// This is a workaround of a read past EOF bug in linux kernel with pread()
if (file_size.has_value() && file_offset_of_buffer_end >= *file_size)
    return false;
 return false;
We should add a check for indentation to Style Check
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Make cache composable, allow not evicting certain files (such as .idx, .mrk, ..), delete the old cache version. Now it is possible to configure a cache over an Azure Blob Storage disk, over a Local disk, over a StaticWeb disk, etc. This PR is marked backward incompatible because the cache configuration changes, and in order for the cache to work the config file needs to be updated. Previously cached data will still be used with the new configuration, and the server will start up fine with the old cache configuration. Closes #36140. Closes #37889.
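A sketch of what the composable cache configuration described above could look like: a cache disk layered over an s3 disk via `<type>cache</type>`. The disk names, endpoint, credentials, path, and size here are placeholders for illustration; exact tag names and defaults should be checked against the ClickHouse documentation for your version.

```xml
<clickhouse>
    <storage_configuration>
        <disks>
            <!-- Underlying object storage disk (placeholder endpoint/credentials). -->
            <s3_disk>
                <type>s3</type>
                <endpoint>https://s3.example.com/bucket/data/</endpoint>
                <access_key_id>PLACEHOLDER_KEY</access_key_id>
                <secret_access_key>PLACEHOLDER_SECRET</secret_access_key>
            </s3_disk>
            <!-- Composable cache layered over the s3 disk. -->
            <s3_cache>
                <type>cache</type>
                <disk>s3_disk</disk>
                <path>/var/lib/clickhouse/disks/s3_cache/</path>
                <max_size>10Gi</max_size>
            </s3_cache>
        </disks>
        <policies>
            <s3_main>
                <volumes>
                    <main>
                        <!-- Tables using this policy read through the cache. -->
                        <disk>s3_cache</disk>
                    </main>
                </volumes>
            </s3_main>
        </policies>
    </storage_configuration>
</clickhouse>
```

The same `<type>cache</type>` wrapper should compose over other disk types (local, Azure, static web) in the same way, which is the flexibility this PR introduces.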