Control cache downloads to avoid negative optimization of local caches #37516

Merged
kssenii merged 13 commits into ClickHouse:master from KinderRiven:improve_local_cache
May 27, 2022
@KinderRiven KinderRiven commented May 25, 2022

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Currently, ClickHouse downloads every remote file into the local cache, even files that are read only once, which causes frequent I/O on the local disk. In some scenarios this I/O is unnecessary and can make queries slower than they would be without the cache (negative optimization). As shown in the figure below, when we run SSB Q1-Q4, enabling the cache degrades performance.

image

To address this, the cache layer records recent data accesses in an LRU queue that covers an access window. Only frequently accessed cache blocks are saved locally. As shown in the figure, with this control in place the cache no longer causes significant negative optimization, while hot data is still cached locally.

image

The threshold can be set in the configuration file. It specifies how many times a piece of data must be accessed before it is cached; the default value is 0.

image
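
To make the idea above concrete, here is a minimal Python sketch of a threshold-based admission filter. It is not the code from this PR: the class name `AdmissionFilter`, the parameters `hits_threshold` and `max_records`, and the `(key, offset)` block identifier are all hypothetical. It also assumes that a block is cached on the access that brings its count up to the threshold, and that a threshold of 0 means every block is cached on first access.

```python
from collections import OrderedDict


class AdmissionFilter:
    """Hypothetical sketch: count recent accesses per block in a bounded LRU queue."""

    def __init__(self, hits_threshold, max_records):
        self.hits_threshold = hits_threshold  # assumed: 0 means "cache on first access"
        self.max_records = max_records        # assumed size of the access window
        self._hits = OrderedDict()            # (key, offset) -> access count, in LRU order

    def should_cache(self, key, offset):
        """Record one access and report whether the block is hot enough to cache."""
        if self.hits_threshold == 0:
            return True
        block = (key, offset)
        hits = self._hits.pop(block, 0) + 1
        self._hits[block] = hits  # re-insert at the most recently used end
        if len(self._hits) > self.max_records:
            self._hits.popitem(last=False)  # drop the least recently used record
        return hits >= self.hits_threshold


# Usage: with a threshold of 2, the first read is served remotely only,
# and the second read of the same block triggers caching.
flt = AdmissionFilter(hits_threshold=2, max_records=2)
print(flt.should_cache("file_a", 0))  # False
print(flt.should_cache("file_a", 0))  # True
```

Because the record queue is bounded, a block that is read once and then pushed out of the window starts counting from zero again, so one-off scans never reach the threshold.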

Information about CI checks: https://clickhouse.com/docs/en/development/continuous-integration/

@KinderRiven KinderRiven changed the title from "impl improve remote fs cache" to "Control cache downloads to avoid negative optimization of local caches" May 25, 2022
@robot-ch-test-poll robot-ch-test-poll added the pr-improvement Pull request with some product improvements label May 25, 2022
@kssenii kssenii self-assigned this May 25, 2022
@kssenii kssenii added the can be tested Allows running workflows for external contributors label May 25, 2022
@KinderRiven KinderRiven requested a review from kssenii May 27, 2022 02:13
@kssenii kssenii merged commit f5d6950 into ClickHouse:master May 27, 2022