Skip to content

Dedicated Mark/Uncompressed cache for skip indices#27961

Merged
kitaisreal merged 2 commits intoClickHouse:masterfrom
amosbird:indexcache
Oct 15, 2021
Merged

Dedicated Mark/Uncompressed cache for skip indices#27961
kitaisreal merged 2 commits intoClickHouse:masterfrom
amosbird:indexcache

Conversation

@amosbird
Copy link
Collaborator

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Add dedicated Mark/Uncompressed cache for skip indices. Not sure how to test it properly.

Detailed description / Documentation draft:
.

@robot-clickhouse robot-clickhouse added the pr-improvement Pull request with some product improvements label Aug 21, 2021
@CLAassistant
Copy link

CLAassistant commented Sep 28, 2021

CLA assistant check
All committers have signed the CLA.

@kitaisreal kitaisreal self-assigned this Oct 1, 2021
@kitaisreal kitaisreal merged commit 97fa53d into ClickHouse:master Oct 15, 2021
if (auto index_uncompressed_cache = getContext()->getIndexUncompressedCache())
{
new_values["IndexUncompressedCacheBytes"] = index_uncompressed_cache->weight();
new_values["IndexUncompressedCacheCells"] = index_uncompressed_cache->count();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there are different metrics for bytes/cells/files but same metric for cache misses/hits.
I think that it worth adding new metrics for hits/misses for these separate caches.
@amosbird what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@azat I plan to test this pull request and add them. Also enable them by default.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@azat Yes, it'll be useful. @kitaisreal Great!

@azat
Copy link
Member

azat commented Oct 19, 2021

@amosbird BTW have you tested this on a real workload? Maybe you can share some numbers then?

@amosbird
Copy link
Collaborator Author

amosbird commented Oct 20, 2021

@amosbird BTW have you tested this on a real workload? Maybe you can share some numbers then?

We saw some improvement but it's not substantial because the related workload always requests realtime data and the cache hit rate is poor. It might be useful to fill the cache during insert/merge.


/// Size of cache for index marks (index of MergeTree skip indices). It is necessary.
/// Specify default value for index_mark_cache_size explicitly!
size_t index_mark_cache_size = config().getUInt64("index_mark_cache_size", 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This major performance degradation went unnoticed for two years.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, before this PR, there is no mark/uncompressed cache for secondary indices at all. I don't think it's a degradation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-improvement Pull request with some product improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants