-
Notifications
You must be signed in to change notification settings - Fork 8.3k
Implement Query Condition Cache #69236
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
alexey-milovidov
merged 66 commits into
ClickHouse:master
from
zhongyuankai:query_condition_cache
Mar 3, 2025
Merged
Changes from all commits
Commits
Show all changes
66 commits
Select commit
Hold shift + click to select a range
f8ff179
mark filter cache
zhongyuankai 7210854
Complete basic functions
zhongyuankai 7753160
support analyzer
zhongyuankai 6d78d37
batter
zhongyuankai 92d10a3
rename
zhongyuankai dd29d53
Refactor
zhongyuankai ce76c0a
fix
zhongyuankai 7701ebe
Merge branch 'master' into query_condition_cache
zhongyuankai 409219b
resolve conflicts
zhongyuankai 0e065a5
fix
zhongyuankai c9591eb
fix style
zhongyuankai 6a077b7
fix test
zhongyuankai fc8ab23
fix test
zhongyuankai da59d70
Merge branch 'master' into query_condition_cache
zhongyuankai 7858f72
fix test
zhongyuankai 708dec0
batter
zhongyuankai ad1b4c1
fix style
zhongyuankai ca4d835
fix test
zhongyuankai 6f68b4c
fix prewhere dag hash
zhongyuankai b42fa34
fix test
zhongyuankai 014e00d
batter
zhongyuankai a11f45c
Merge branch 'master' into query_condition_cache
zhongyuankai 1879115
Merge remote-tracking branch 'ClickHouse/master' into query_condition…
rschu1ze c098b69
Some fixups
rschu1ze 377b992
Mini fixup
rschu1ze ade09eb
Merge remote-tracking branch 'ClickHouse/master' into query_condition…
rschu1ze a339c08
Fixups
rschu1ze 4078567
Fix build
rschu1ze 4432e4b
Fixups
rschu1ze d354d94
Update test
rschu1ze c1d5079
Fixups
rschu1ze 0acd87b
fix style
zhongyuankai d86801b
Merge branch 'master' into query_condition_cache
zhongyuankai 73bfce7
fix test
zhongyuankai 03803fc
Merge branch 'master' into query_condition_cache
zhongyuankai 477a4a5
add log
zhongyuankai 6d899cb
fix test
zhongyuankai b322ffb
add debug log
zhongyuankai 9ce438c
fix test
zhongyuankai b9d3a1d
not support old analyzer
zhongyuankai 55b5799
fix test
zhongyuankai 5e42c1e
Merge branch 'master' into query_condition_cache
zhongyuankai 316b841
fix build
zhongyuankai 85f7760
Merge branch 'master' into query_condition_cache
zhongyuankai 5f04d28
Disable some tests using query condition cache because it will cause …
zhongyuankai 5d39ae4
Merge branch 'master' into query_condition_cache
zhongyuankai c124551
fix build
zhongyuankai fd33743
fix fast test
zhongyuankai bb8e09c
add debug log
zhongyuankai 774d3a0
Merge branch 'master' into query_condition_cache
zhongyuankai 5785125
Merge branch 'master' into query_condition_cache
alexey-milovidov 3687aee
Merge branch 'master' into query_condition_cache
zhongyuankai d24e380
Merge branch 'master' into query_condition_cache
alexey-milovidov 62f2e69
entry weight
zhongyuankai f2bcff6
Merge branch 'master' into query_condition_cache
zhongyuankai c8e3f30
Update QueryConditionCache.cpp
alexey-milovidov 196bbcf
add log
zhongyuankai 7a458b5
Merge branch 'master' into query_condition_cache
zhongyuankai 66778a7
log
zhongyuankai c41f9c2
Merge branch 'master' into query_condition_cache
zhongyuankai 9dc1b8d
Merge branch 'master' into query_condition_cache
zhongyuankai 4dfb32c
fix test
zhongyuankai 94244b7
Avoid intersecting ranges
zhongyuankai 0dd838d
Merge branch 'master' into query_condition_cache
zhongyuankai 6e2622b
fix style
zhongyuankai d9a0fe8
Merge branch 'master' into query_condition_cache
zhongyuankai File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,101 @@ | ||
| #include <Interpreters/Cache/QueryConditionCache.h> | ||
| #include <Storages/MergeTree/MergeTreeData.h> | ||
| #include "Interpreters/Cache/FileSegmentInfo.h" | ||
|
|
||
| namespace ProfileEvents | ||
| { | ||
| extern const Event QueryConditionCacheHits; | ||
| extern const Event QueryConditionCacheMisses; | ||
| }; | ||
|
|
||
| namespace DB | ||
| { | ||
|
|
||
| QueryConditionCache::QueryConditionCache(const String & cache_policy, size_t max_size_in_bytes, double size_ratio) | ||
| : cache(cache_policy, max_size_in_bytes, 0, size_ratio) | ||
| { | ||
| } | ||
|
|
||
| std::optional<QueryConditionCache::MatchingMarks> QueryConditionCache::read(const UUID & table_id, const String & part_name, size_t condition_hash) | ||
| { | ||
| Key key = {table_id, part_name, condition_hash}; | ||
|
|
||
| if (auto entry = cache.get(key)) | ||
| { | ||
| ProfileEvents::increment(ProfileEvents::QueryConditionCacheHits); | ||
|
|
||
| std::lock_guard lock(entry->mutex); | ||
| return {entry->matching_marks}; | ||
| } | ||
|
|
||
| ProfileEvents::increment(ProfileEvents::QueryConditionCacheMisses); | ||
|
|
||
| return std::nullopt; | ||
| } | ||
|
|
||
| void QueryConditionCache::write(const UUID & table_id, const String & part_name, size_t condition_hash, const MarkRanges & mark_ranges, size_t marks_count, bool has_final_mark) | ||
| { | ||
| Key key = {table_id, part_name, condition_hash}; | ||
|
|
||
| auto load_func = [&](){ return std::make_shared<Entry>(marks_count); }; | ||
| auto [entry, _] = cache.getOrSet(key, load_func); | ||
|
|
||
| chassert(marks_count == entry->matching_marks.size()); | ||
|
|
||
| /// Set MarkRanges to false, so there is no need to read these marks again later. | ||
| { | ||
| std::lock_guard lock(entry->mutex); | ||
| for (const auto & mark_range : mark_ranges) | ||
| std::fill(entry->matching_marks.begin() + mark_range.begin, entry->matching_marks.begin() + mark_range.end, false); | ||
|
|
||
| if (has_final_mark) | ||
| entry->matching_marks[marks_count - 1] = false; | ||
|
|
||
| LOG_DEBUG( | ||
| logger, | ||
| "table_id: {}, part_name: {}, condition_hash: {}, marks_count: {}, has_final_mark: {}, (ranges: {})", | ||
| table_id, | ||
| part_name, | ||
| condition_hash, | ||
| marks_count, | ||
| has_final_mark, | ||
| toString(mark_ranges)); | ||
| } | ||
| } | ||
|
|
||
| void QueryConditionCache::clear() | ||
| { | ||
| cache.clear(); | ||
| } | ||
|
|
||
| void QueryConditionCache::setMaxSizeInBytes(size_t max_size_in_bytes) | ||
| { | ||
| cache.setMaxSizeInBytes(max_size_in_bytes); | ||
| } | ||
|
|
||
| bool QueryConditionCache::Key::operator==(const Key & other) const | ||
| { | ||
| return table_id == other.table_id && part_name == other.part_name && condition_hash == other.condition_hash; | ||
| } | ||
|
|
||
| QueryConditionCache::Entry::Entry(size_t mark_count) | ||
| : matching_marks(mark_count, true) /// by default, all marks potentially are potential matches | ||
| { | ||
| } | ||
|
|
||
| size_t QueryConditionCache::KeyHasher::operator()(const Key & key) const | ||
| { | ||
| SipHash hash; | ||
| hash.update(key.table_id); | ||
| hash.update(key.part_name); | ||
| hash.update(key.condition_hash); | ||
| return hash.get64(); | ||
| } | ||
|
|
||
| size_t QueryConditionCache::QueryConditionCacheEntryWeight::operator()(const Entry & entry) const | ||
| { | ||
| /// Estimate the memory size of `std::vector<bool>`, for bool values, only 1 bit per element. | ||
| size_t dynamic_memory = (entry.matching_marks.capacity() + 7) / 8; /// Round up to bytes. | ||
| return sizeof(decltype(entry.matching_marks)) + dynamic_memory; | ||
| } | ||
| } |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,66 @@ | ||
| #pragma once | ||
zhongyuankai marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| #include <Common/CacheBase.h> | ||
| #include <Storages/MergeTree/MarkRange.h> | ||
|
|
||
| namespace DB | ||
| { | ||
|
|
||
| /// Cache the mark filter corresponding to the query condition, | ||
| /// which helps to quickly filter out useless Marks and speed up the query when the index is not hit. | ||
| class QueryConditionCache | ||
| { | ||
| public: | ||
| /// 0 means none of the rows in the mark match the predicate. We can skip such marks. | ||
| /// 1 means at least one row in the mark matches the predicate. We need to read such marks. | ||
| using MatchingMarks = std::vector<bool>; | ||
|
|
||
| QueryConditionCache(const String & cache_policy, size_t max_size_in_bytes, double size_ratio); | ||
|
|
||
| /// Read the filter and return empty if it does not exist. | ||
| std::optional<MatchingMarks> read(const UUID & table_id, const String & part_name, size_t condition_hash); | ||
|
|
||
| /// Take out the mark filter corresponding to the query condition and set it to false on the corresponding mark. | ||
| void write(const UUID & table_id, const String & part_name, size_t condition_hash, const MarkRanges & mark_ranges, size_t marks_count, bool has_final_mark); | ||
|
|
||
| void clear(); | ||
|
|
||
| void setMaxSizeInBytes(size_t max_size_in_bytes); | ||
|
|
||
| private: | ||
| struct Key | ||
| { | ||
| const UUID table_id; | ||
| const String part_name; | ||
| const size_t condition_hash; | ||
|
|
||
| bool operator==(const Key & other) const; | ||
| }; | ||
|
|
||
| struct Entry | ||
| { | ||
| MatchingMarks matching_marks; | ||
| std::mutex mutex; | ||
|
|
||
| explicit Entry(size_t mark_count); | ||
| }; | ||
|
|
||
| struct KeyHasher | ||
| { | ||
| size_t operator()(const Key & key) const; | ||
| }; | ||
|
|
||
| struct QueryConditionCacheEntryWeight | ||
| { | ||
| size_t operator()(const Entry & entry) const; | ||
| }; | ||
|
|
||
| using Cache = CacheBase<Key, Entry, KeyHasher, QueryConditionCacheEntryWeight>; | ||
| Cache cache; | ||
|
|
||
| LoggerPtr logger = getLogger("QueryConditionCache"); | ||
| }; | ||
|
|
||
| using QueryConditionCachePtr = std::shared_ptr<QueryConditionCache>; | ||
|
|
||
| } | ||
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.