Userspace page cache v2 by al13n321 · Pull Request #70509 · ClickHouse/ClickHouse

al13n321 · 2024-10-09T07:42:14Z

Changelog category (leave one):

New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

A new implementation of the Userspace Page Cache, which allows caching data in the in-process memory instead of relying on the OS page cache, which is useful when the data is stored on a remote virtual filesystem without backing with the local filesystem cache.

Un-overengineered it: got rid of the madvise(MADV_FREE) stuff, it was slow. It's just a CacheBase now. Size adjusted periodically by MemoryWorker thread, which runs every 50ms by default. Need to also add size adjustment on memory allocation path, similar to overcommit tracker.
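For readers who want a concrete picture of "just a cache whose size is adjusted periodically", here is a minimal sketch. `ToyPageCache` and `autoResizeTick` are hypothetical names invented for this illustration; they are not ClickHouse's CacheBase or MemoryWorker API, and the sizing formula (the memory limit minus non-cache memory, capped by a configured maximum) is an assumption. Eviction here is by insertion order only, since the toy has no read path.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical toy cache: size-bounded, evicts the oldest inserted entry first.
class ToyPageCache
{
public:
    explicit ToyPageCache(std::size_t max_bytes_) : max_bytes(max_bytes_) {}

    void put(const std::string & key, std::size_t bytes)
    {
        erase(key);
        entries.emplace_front(key, bytes);
        index[key] = entries.begin();
        used_bytes += bytes;
        evictToFit();
    }

    bool contains(const std::string & key) const { return index.count(key) != 0; }
    std::size_t usedBytes() const { return used_bytes; }

    // Called by the resizer; shrinking evicts immediately.
    void setMaxBytes(std::size_t new_max)
    {
        max_bytes = new_max;
        evictToFit();
    }

private:
    using Entries = std::list<std::pair<std::string, std::size_t>>;

    void erase(const std::string & key)
    {
        auto it = index.find(key);
        if (it == index.end())
            return;
        used_bytes -= it->second->second;
        entries.erase(it->second);
        index.erase(it);
    }

    void evictToFit()
    {
        while (used_bytes > max_bytes && !entries.empty())
        {
            assert(used_bytes >= entries.back().second);
            used_bytes -= entries.back().second;
            index.erase(entries.back().first);
            entries.pop_back();
        }
    }

    std::size_t max_bytes;
    std::size_t used_bytes = 0;
    Entries entries;
    std::unordered_map<std::string, Entries::iterator> index;
};

// One tick of a hypothetical periodic resizer (assumed formula, see lead-in).
inline std::size_t autoResizeTick(ToyPageCache & cache, std::size_t memory_limit, std::size_t rss, std::size_t cache_cap)
{
    const std::size_t non_cache = rss > cache.usedBytes() ? rss - cache.usedBytes() : 0;
    const std::size_t room = memory_limit > non_cache ? memory_limit - non_cache : 0;
    const std::size_t target = std::min(room, cache_cap);
    cache.setMaxBytes(target);
    return target;
}
```

In this sketch a worker thread would call `autoResizeTick` once per period with a fresh RSS reading; the allocation-path adjustment mentioned above is not modelled.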

robot-ch-test-poll3 · 2024-10-09T07:44:13Z

This is an automated comment for commit af4a9e5 with description of existing statuses. It's updated for the latest CI running

❌ Click here to open a full report in a separate page

Check name	Description	Status
Integration tests	The integration tests report. In parenthesis the package type is given, and in square brackets are the optional part/total tests	❌ failure
Stateless tests	Runs stateless functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc	❌ failure

Successful checks

Check name	Description	Status
AST fuzzer	Runs randomly generated queries to catch program errors. The build type is optionally given in parenthesis. If it fails, ask a maintainer for help	✅ success
Builds	There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS	✅ success
BuzzHouse (asan)	There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS	✅ success
BuzzHouse (msan)	There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS	✅ success
BuzzHouse (tsan)	There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS	✅ success
BuzzHouse (ubsan)	There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS	✅ success
ClickBench	Runs ClickBench with instant-attach table	✅ success
Compatibility check	Checks that clickhouse binary runs on distributions with old libc versions. If it fails, ask a maintainer for help	✅ success
Docker keeper image	The check to build and optionally push the mentioned image to docker hub	✅ success
Docker server image	The check to build and optionally push the mentioned image to docker hub	✅ success
Docs check	Builds and tests the documentation	✅ success
Fast test	Normally this is the first check that is run for a PR. It builds ClickHouse and runs most of stateless functional tests, omitting some. If it fails, further checks are not started until it is fixed. Look at the report to see which tests fail, then reproduce the failure locally as described here	✅ success
Flaky tests	Checks if newly added or modified tests are flaky by running them repeatedly, in parallel, with more randomization. Functional tests are run 100 times with address sanitizer, and additional randomization of thread scheduling. Integration tests are run up to 10 times. If at least once a new test has failed, or was too long, this check will be red. We don't allow flaky tests, read the doc	✅ success
Install packages	Checks that the built packages are installable in a clear environment	✅ success
Performance Comparison	Measure changes in query performance. The performance test report is described in detail here. In square brackets are the optional part/total tests	✅ success
Stress test	Runs stateless functional tests concurrently from several clients to detect concurrency-related errors	✅ success
Style check	Runs a set of checks to keep the code style clean. If some of tests failed, see the related log from the report	✅ success
Unit tests	Runs the unit tests for different release types	✅ success
Upgrade check	Runs stress tests on server version from last release and then tries to upgrade it to the version from the PR. It checks if the new server can successfully startup without any errors, crashes or sanitizer asserts	✅ success

al13n321 · 2024-10-10T03:47:58Z

I don't have good ideas for how to make autoresizing work. Queries fail with out-of-memory errors because MemoryTracker's "rss" adjustment races with memory deallocation: sometimes MemoryTracker clears the page cache, then updates "rss" to a high value (sampled before the clearing), then we allocate a little more memory, MemoryTracker clears the cache again, but "rss" is still too high and the query fails.
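The interleaving described above can be replayed deterministically in a simplified, single-threaded model. This is only a sketch: `ToyTracker` and `replayStaleRssRace` are hypothetical names, the numbers are made up, and the real MemoryTracker logic is more involved.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical model of a tracker whose "rss" can be overwritten by a stale sample.
struct ToyTracker
{
    std::size_t limit;
    std::size_t real_rss;      // ground truth
    std::size_t cache_bytes;   // part of real_rss held by the page cache
    std::size_t rss_estimate;  // value that allocations are checked against

    // The sampler reads RSS at one moment and publishes it at a later one.
    std::size_t sampleRss() const { return real_rss; }
    void publishRss(std::size_t sampled) { rss_estimate = sampled; }

    bool tryAllocate(std::size_t bytes)
    {
        if (rss_estimate + bytes > limit)
        {
            // Over the limit: drop the whole page cache and re-check.
            assert(cache_bytes <= real_rss);
            real_rss -= cache_bytes;
            rss_estimate -= std::min(rss_estimate, cache_bytes);
            cache_bytes = 0;
        }
        if (rss_estimate + bytes > limit)
            return false;
        real_rss += bytes;
        rss_estimate += bytes;
        return true;
    }
};

struct RaceOutcome
{
    bool first_ok;
    bool second_ok;
    std::size_t real_rss_at_failure;
};

inline RaceOutcome replayStaleRssRace()
{
    ToyTracker t{/*limit*/ 100, /*real_rss*/ 95, /*cache_bytes*/ 60, /*rss_estimate*/ 95};
    const std::size_t stale = t.sampleRss();  // sampler reads RSS before the cache is cleared
    const bool first = t.tryAllocate(10);     // clears the cache, then succeeds
    t.publishRss(stale);                      // stale, too-high value overwrites the estimate
    const bool second = t.tryAllocate(10);    // cache is already empty, estimate still too high
    return {first, second, t.real_rss};
}
```

In this model the second allocation is rejected even though actual usage is well under the limit, which mirrors the spurious out-of-memory failure described in the comment.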

al13n321 · 2024-10-10T03:51:36Z

Also not looking forward to dealing with the MergeTree read path and ReadBuffer spaghetti again...

antonio2368 · 2024-10-10T08:43:01Z

Maybe try something like purge_mib.run() so you instantly return the dirty pages to jemalloc?
The problem is that we don't have control over what jemalloc will reuse from its cache and what it will allocate from the OS.
But I see you already purge before autoResize.
I would just avoid purging on every tick of MemoryWorker. We could even introduce a soft limit so we start purging before we hit the hard limit, or use the allocated amount alongside RSS to decide when to do it.
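One way to read the soft-limit suggestion is as a small decision function evaluated on each tick. The sketch below is an assumption about how such a policy could look, not what the PR implements; `PurgePolicy`, `shouldPurge`, and the thresholds are hypothetical, and no jemalloc calls are made.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical policy knobs for deciding when to purge allocator dirty pages.
struct PurgePolicy
{
    std::size_t hard_limit;       // memory limit in bytes
    std::size_t soft_percent;     // start considering a purge at this percentage of the hard limit
    std::size_t min_reclaimable;  // skip the purge if RSS minus allocated is below this
};

inline bool shouldPurge(const PurgePolicy & policy, std::size_t rss, std::size_t allocated)
{
    assert(policy.soft_percent <= 100);
    const std::size_t soft_limit = policy.hard_limit / 100 * policy.soft_percent;
    if (rss < soft_limit)
        return false;  // far from the limit: avoid purging on every tick

    // The gap between RSS and allocated bytes approximates what a purge could return.
    const std::size_t reclaimable = rss > allocated ? rss - allocated : 0;
    return reclaimable >= policy.min_reclaimable;
}
```

With such a check the worker would purge only when RSS is near the limit and there is a meaningful amount of memory to give back, rather than unconditionally.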

clickhouse-gh · 2024-12-10T13:17:24Z

Dear @al13n321, this PR hasn't been updated for a while. Will you continue working on it? If not, please close it. Otherwise, ignore this message.

src/Common/MemoryWorker.cpp

al13n321 · 2025-01-18T00:10:35Z

Worked fine in a few manual tests in a toy cgroup with 8 GB memory limit, didn't get oom-killed or fail queries unnecessarily. Seems good to go.

(Also tried setting memory_worker_period_ms to 10s to hit the code path that shrinks cache on memory allocation (as opposed to MemoryWorker thread). The shrinking seems to work, but the server gets oom-killed often, even without userspace page cache. I'm not sure exactly why, but it's easy to imagine either jemalloc dirty memory or kernel memory fluctuating enough to explain it.)

clickhouse-gh · 2025-02-14T22:33:58Z

Workflow [PR], commit [abc94a5]

programs/server/Server.cpp

src/Common/CurrentMemoryTracker.cpp

src/Common/MemoryTracker.cpp

src/Common/MemoryWorker.cpp

src/Common/PageCache.cpp

src/Interpreters/Context.h

src/Interpreters/Context.cpp

src/IO/CachedInMemoryReadBufferFromFile.cpp

nikitamikhaylov · 2025-03-03T19:15:07Z

@al13n321 ASAN reports a problem in Integration tests (asan, flaky check) https://pastila.nl/?000207d3/5332b66a5836f334a35b887014bb05bd#Z6+P1Gn9Fv297FDSIBHLmw==

nikitamikhaylov · 2025-03-04T14:36:11Z

03367_bfloat16_tuple_final is failing in master.

robot-ch-test-poll1 added the pr-not-for-changelog label (This PR should not be mentioned in the changelog) Oct 9, 2024

al13n321 force-pushed the npc branch from 79a01f3 to a6df361 on October 9, 2024 07:44

kssenii self-assigned this Oct 9, 2024

al13n321 force-pushed the npc branch 2 times, most recently from 799d248 to bc67d1f on October 9, 2024 09:12

Userspace page cache v2

5c94876

al13n321 force-pushed the npc branch from bc67d1f to 5c94876 on October 10, 2024 04:52

al13n321 mentioned this pull request Nov 28, 2024

Remove flaky test_page_cache #72613

Merged

al13n321 added 2 commits December 23, 2024 19:34

Merge remote-tracking branch 'origin/master' into npc

14b153b

Conflicts

4849963

alexey-milovidov mentioned this pull request Dec 31, 2024

Roadmap 2025 #74046

Closed

76 tasks

al13n321 and others added 7 commits January 15, 2025 03:09

Merge remote-tracking branch 'origin/master' into npc

e847270

Improvements

1a4ec29

Fix test

b92a13e

Automatic style fix

3dd76a7

Remove flaky assert

107897a

Merge remote-tracking branch 'origin/npc' into npc

8463117

Unbreak test

f38bc53

antonio2368 reviewed Jan 17, 2025

src/Common/MemoryWorker.cpp (outdated, resolved)

al13n321 added 3 commits January 17, 2025 23:03

Small improvement

e4d8c8c

Merge remote-tracking branch 'origin/master' into npc

7080b0a

Tidy

b0f68c7

al13n321 marked this pull request as ready for review January 18, 2025 00:04

"""tidy"""

14fb417

al13n321 added 6 commits January 24, 2025 02:16

Fix

f9c578d

Make the cached buffer wrap ReadBufferFromRemoteFSGather rather than the other way around. It doesn't mix with distributed cache otherwise.

e7c70fd

Merge remote-tracking branch 'origin/master' into npc

3483a13

Add comment

5bf3a3b

Merge remote-tracking branch 'origin/master' into npc

af4a9e5

Merge remote-tracking branch 'origin/master' into npc

3df639a

alexey-milovidov unassigned antonio2368 and jkartseva Feb 21, 2025

Merge branch 'master' of github.com:ClickHouse/ClickHouse into npc

6d7e414

antonio2368 reviewed Feb 24, 2025

al13n321 added 2 commits February 24, 2025 20:22

Merge remote-tracking branch 'origin/master' into npc

81a2da4

Review comments.

a0fdae4

antonio2368 reviewed Feb 25, 2025

src/IO/CachedInMemoryReadBufferFromFile.cpp (resolved)

More review comments

6713db9

antonio2368 self-assigned this Feb 25, 2025

antonio2368 approved these changes Feb 25, 2025

al13n321 and others added 3 commits February 26, 2025 02:07

Double memory limit in the test again. Now it'll probably be flaky because it's too slow rather than because the memory limit is hit.

aa275bb

Merge branch 'master' into npc

e59a979

Merge branch 'master' into npc

c9f8b56

al13n321 added 2 commits March 4, 2025 02:08

Merge remote-tracking branch 'origin/master' into npc

52e4090

Fix AsynchronousBoundedReadBuffer calling thread-unsafe methods on 'impl' in parallel

abc94a5

nikitamikhaylov added this pull request to the merge queue Mar 4, 2025

Merged via the queue into master with commit 825f023 Mar 4, 2025
121 of 126 checks passed

nikitamikhaylov deleted the npc branch March 4, 2025 15:23

tavplubix mentioned this pull request Mar 4, 2025

Revert "Userspace page cache v2" #77113

Merged

robot-clickhouse-ci-2 added the pr-synced-to-cloud label (The PR is synced to the cloud repo) Mar 4, 2025

9 participants
