Implement combined disk and HTTP cache by aherrmann · Pull Request #7512 · bazelbuild/bazel

aherrmann · 2019-02-22T15:51:25Z

This PR adds the ability to enable disk and HTTP cache simultaneously. Specifically, if you set

build --remote_http_cache=http://some.remote.cache
build --disk_cache=/some/disk/cache

Then Bazel will look for cached items in the disk cache first. If an item is not found in the disk cache, it will be looked up in the HTTP cache. If it is found there, it will be copied into the disk cache. On put, Bazel will store items in both the disk and the HTTP cache.
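A minimal sketch of the lookup order described above, for readers skimming the thread. It assumes plain in-memory maps stand in for the disk and HTTP caches; `TwoTierCacheSketch` and its fields are hypothetical names for illustration and are not the classes added in this PR.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical two-tier cache; illustrative only, not Bazel's implementation. */
final class TwoTierCacheSketch {
  // Assumption: in-memory maps stand in for the disk cache and the HTTP cache.
  final Map<String, byte[]> disk = new ConcurrentHashMap<>();
  final Map<String, byte[]> http = new ConcurrentHashMap<>();

  /** Checks the disk tier first, then the HTTP tier; an HTTP hit is copied to disk. */
  Optional<byte[]> get(String key) {
    byte[] local = disk.get(key);
    if (local != null) {
      return Optional.of(local);
    }
    byte[] remote = http.get(key);
    if (remote == null) {
      return Optional.empty();
    }
    disk.put(key, remote); // populate the faster tier for the next lookup
    return Optional.of(remote);
  }

  /** Stores the item in both tiers. */
  void put(String key, byte[] value) {
    disk.put(key, value);
    http.put(key, value);
  }
}
```

The real change works on streams and futures rather than byte arrays; the sketch only captures the ordering of lookups and writes.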

We tested this change on a relatively large internal project and found that a fully cached, clean build was 5-10 times faster with the mixed cache (i.e. reading from disk cache) than with a pure remote cache, depending on the remote cache location.

cc @dajmaki @francesco-da

jin · 2019-02-22T18:59:37Z

Additionally assigning @ola-rozenfeld because @buchgr is unavailable.

aherrmann-da · 2019-02-25T09:11:16Z

CI fails on Ubuntu 14.04 with

ERRO[0559] error waiting for container: unexpected EOF
🚨 Error: The command exited with status 125

All other steps pass. I'm not sure, but this looks unrelated to this PR.

googlebot · 2019-02-26T08:31:22Z

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and have the pull request author add another comment and the bot will run again. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

aherrmann · 2019-02-26T08:32:54Z

+      // Write a temporary file first, and then rename, to avoid data corruption in case of a crash.
+      Path temp = toPath(UUID.randomUUID().toString());
+
+      try (OutputStream tempOut = temp.getOutputStream()) {


Using try-with-resources here caused tempOut to be closed too early, occasionally leading to StreamClosed errors. Thanks @moritzkiefer-da for catching that!
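To illustrate the failure mode, here is a small sketch that uses `java.util.concurrent.CompletableFuture` as a stand-in for the Guava `ListenableFuture` in the PR. `CloseOnCompletionSketch`, `TrackingStream`, and both method names are hypothetical. The first method closes the stream when the try block exits, before the asynchronous write has necessarily run. The second closes it in a completion callback.

```java
import java.io.ByteArrayOutputStream;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

/** Hypothetical illustration; not the code from this PR. */
final class CloseOnCompletionSketch {
  /** In-memory stream that rejects writes after close(), to make the bug visible. */
  static final class TrackingStream extends ByteArrayOutputStream {
    volatile boolean closed = false;

    @Override
    public synchronized void write(byte[] b, int off, int len) {
      if (closed) {
        throw new IllegalStateException("stream closed");
      }
      super.write(b, off, len);
    }

    @Override
    public void close() {
      closed = true;
    }
  }

  /** Problematic: the stream is closed as soon as the try block exits. */
  static CompletableFuture<Void> writeClosedTooEarly(
      TrackingStream out, byte[] data, Executor executor) {
    try (TrackingStream o = out) {
      return CompletableFuture.runAsync(() -> o.write(data, 0, data.length), executor);
    }
  }

  /** Safer: the stream is closed only after the write has finished or failed. */
  static CompletableFuture<Void> writeThenClose(
      TrackingStream out, byte[] data, Executor executor) {
    return CompletableFuture.runAsync(() -> out.write(data, 0, data.length), executor)
        .whenComplete((result, error) -> out.close());
  }
}
```

Whether the early close actually breaks the write depends on scheduling, which would be consistent with the errors showing up only occasionally.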

aherrmann · 2019-02-27T10:25:19Z

@googlebot recheck CLA

googlebot · 2019-02-27T10:25:22Z

So there's good news and bad news.

👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the cla/google commit status will not change from this state. It's up to you to confirm consent of all the commit author(s), set the cla label to yes (if enabled on your project), and then merge this pull request when appropriate.

ℹ️ Googlers: Go here for more info.

moritzkiefer-da · 2019-02-27T10:27:38Z

I agree to my commits being contributed 👍

buchgr

Hi!

Thanks for the PR! I think the overall idea is sound, but I think it's more future-proof to implement it slightly differently. Instead of having CombinedDiskHttpBlobStore extend DiskBlobStore, you should create a generic blob store implementation that uses composition and can check a list of blob stores in sequence. That way the code can be used more widely; for example, we would also like to use it with remote execution and the gRPC-based --remote_cache flag.
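A rough sketch of what such a composition-based store could look like. It assumes a simplified byte-array interface; `KeyValueStoreSketch`, `MapStoreSketch`, and `ChainedStoreSketch` are hypothetical names, and Bazel's SimpleBlobStore has a different, stream-based API. Back-filling earlier stores on a hit is also an assumption, mirroring the copy-to-disk behavior in the PR description.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical minimal store interface, for illustration only. */
interface KeyValueStoreSketch {
  Optional<byte[]> get(String key);

  void put(String key, byte[] value);
}

/** Simple in-memory store used as a stand-in for a disk, HTTP, or gRPC backend. */
final class MapStoreSketch implements KeyValueStoreSketch {
  private final Map<String, byte[]> items = new HashMap<>();

  @Override
  public Optional<byte[]> get(String key) {
    return Optional.ofNullable(items.get(key));
  }

  @Override
  public void put(String key, byte[] value) {
    items.put(key, value);
  }
}

/** Checks a list of stores in order, using composition rather than inheritance. */
final class ChainedStoreSketch implements KeyValueStoreSketch {
  private final List<KeyValueStoreSketch> stores;

  ChainedStoreSketch(List<KeyValueStoreSketch> stores) {
    this.stores = new ArrayList<>(stores);
  }

  @Override
  public Optional<byte[]> get(String key) {
    for (int i = 0; i < stores.size(); i++) {
      Optional<byte[]> hit = stores.get(i).get(key);
      if (hit.isPresent()) {
        // Assumption: copy the hit into every store that was checked before this one.
        for (int j = 0; j < i; j++) {
          stores.get(j).put(key, hit.get());
        }
        return hit;
      }
    }
    return Optional.empty();
  }

  @Override
  public void put(String key, byte[] value) {
    for (KeyValueStoreSketch store : stores) {
      store.put(key, value);
    }
  }
}
```

A byte-array interface like this hides the underlying file, so a store built this way cannot stream directly to or from disk.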

Also please add some unit and integration tests.

francesco-da · 2019-02-27T11:12:36Z

@buchgr We can't treat the disk store too opaquely, because we want access to the file itself so that we can stream to it and then from it (or the reverse) efficiently, without intermediate buffers. That's why we extended it: to reuse the functions that compute the file names and similar.

I think if we were to treat it opaquely, we'd have to tweak some of the types in SimpleBlobStore a bit.

buchgr

Fair points. Overall looks good then. Can you add some tests, please?

buchgr · 2019-02-27T12:23:56Z

@@ -0,0 +1,148 @@
+// Copyright 2017 The Bazel Authors. All rights reserved.


buchgr · 2019-02-27T12:27:48Z

+
+  @Override
+  public ListenableFuture<Boolean> get(String key, OutputStream out) {
+    boolean use_bsDisk = super.containsKey(key);


foundOnDisk?

buchgr · 2019-02-27T12:30:34Z

+
+  public CombinedDiskHttpBlobStore(Path root, SimpleBlobStore bsHttp) {
+    super(root);
+    this.bsHttp = bsHttp;


Preconditions.checkNotNull

buchgr · 2019-02-27T12:30:45Z

+
+import com.google.devtools.build.lib.vfs.Path;
+
+/** A {@link SimpleBlobStore} implementation combining two blob stores.


line break after **

googlebot · 2019-02-28T09:25:54Z

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and have the pull request author add another comment and the bot will run again. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

aherrmann · 2019-02-28T09:35:20Z

@buchgr Thanks! I've addressed your review comments.

> Can you add some tests, please?

Yes, I'm happy to add some tests, though I'm afraid I'll need some pointers. I tried adding two simple test cases that check whether put stores data in both caches and whether get copies from the HTTP cache to the disk cache. I checked the existing test suite for examples of blob store tests. Unfortunately, I couldn't find a test case for the disk cache. However, I think I figured out how to use InMemoryFileSystem to set up a disk cache for testing. But I also couldn't find an example of a test case that tests a working HTTP cache. The existing test cases only seem to operate against a dummy HTTP server that doesn't actually store or read cached items. I'm not familiar with Bazel's test suite, so it's perfectly possible that I overlooked something obvious. Could you point me to an example of how to test the HTTP cache?

googlebot · 2019-02-28T12:43:21Z

So there's good news and bad news.

👍 The good news is that everyone that needs to sign a CLA (the pull request submitter and all commit authors) have done so. Everything is all good there.

😕 The bad news is that it appears that one or more commits were authored or co-authored by someone other than the pull request submitter. We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that here in the pull request.

Note to project maintainer: This is a terminal state, meaning the cla/google commit status will not change from this state. It's up to you to confirm consent of all the commit author(s), set the cla label to yes (if enabled on your project), and then merge this pull request when appropriate.

ℹ️ Googlers: Go here for more info.

buchgr · 2019-02-28T13:51:01Z

> Unfortunately, I couldn't find a test case for the disk cache.

Ouch :-). It's time to add some then.

> However, I think I figured out how to use InMemoryFileSystem to set up a disk cache for testing.

Great. Yep, that's what to use for testing.

> But I also couldn't find an example of a test case that tests a working HTTP cache. The existing test cases only seem to operate against a dummy HTTP server that doesn't actually store or read cached items.

Correct. It's a bit of a mess (it's an ongoing cleanup effort). The unit tests currently mostly test error scenarios. We do have integration tests that provide a working http cache here: https://source.bazel.build/bazel/+/master:src/test/shell/bazel/remote/remote_execution_http_test.sh?q=remote_execution_http_test.sh i.e. look at test_cc_binary_http_cache.

Hope that helps.

aherrmann · 2019-02-28T15:00:33Z

@buchgr Thank you, yes, that helps a lot.

I had been looking in the wrong place. I found some integration tests for the disk cache in src/test/shell/bazel/disk_cache_test.sh. I used those and the HTTP cache tests that you pointed to as a guide and added tests for the combined cache.

buchgr · 2019-03-04T13:53:27Z

+  mkdir -p a
+  cat > a/BUILD <<EOF
+package(default_visibility = ["//visibility:public"])
+cc_binary(


I know we are using cc_binary throughout the remote execution tests, but we are currently working towards removing it completely where it's not necessary and using a faster genrule (i.e., one that writes hello world to some file) instead, in order to speed up our tests.

Makes sense. I've changed the test case to use a genrule instead.

buchgr · 2019-03-04T13:54:51Z

+  cp -f bazel-bin/a/test ${TEST_TMPDIR}/test_expected
+
+  # Fetch from disk cache
+  bazel clean --expunge


--expunge shouldn't be necessary here. We are trying to avoid --expunge in tests to make them run faster.

buchgr · 2019-03-04T13:55:12Z

+    || fail "Disk cache generated different result"
+
+  # Fetch from http cache
+  bazel clean --expunge


buchgr · 2019-03-04T13:55:40Z

+  mkdir $cache
+
+  # Copy from http cache to disk cache
+  bazel clean --expunge


buchgr · 2019-03-04T13:55:49Z

+    || fail "HTTP cache generated different result"
+
+  # Fetch from disk cache
+  bazel clean --expunge


buchgr · 2019-03-04T13:59:31Z

+  bazel clean --expunge
+  bazel build $disk_flags //a:test &> $TEST_log \
+    || fail "Failed to fetch //a:test from disk cache"
+  expect_log "remote cache hit"


"1 remote cache hit"? same below

Yes, this is more concrete. I've changed it.

aherrmann requested review from buchgr, ola-rozenfeld and philwo as code owners February 22, 2019 15:51

googlebot added the cla: yes label Feb 22, 2019

aherrmann-da mentioned this pull request Feb 22, 2019

Implement combined disk and HTTP cache #7511

Closed

jin added the team-Remote-Exec Issues and PRs for the Execution (Remote) team label Feb 22, 2019

jin assigned buchgr and ola-rozenfeld Feb 22, 2019

googlebot added cla: no and removed cla: yes labels Feb 26, 2019

aherrmann commented Feb 26, 2019

buchgr suggested changes Feb 27, 2019

aherrmann commented Feb 27, 2019

Comment thread src/main/java/com/google/devtools/build/lib/remote/blobstore/CombinedDiskHttpBlobStore.java

buchgr suggested changes Feb 27, 2019

aherrmann force-pushed the combined-cache-pr branch from efbd7d9 to 14a8449 Compare February 28, 2019 09:25

aherrmann added a commit to aherrmann/bazel that referenced this pull request Feb 28, 2019

Address review comments

dfd6ed3

aherrmann force-pushed the combined-cache-pr branch from 14a8449 to 9d07c7d Compare February 28, 2019 12:43

aherrmann added a commit to aherrmann/bazel that referenced this pull request Feb 28, 2019

Address review comments

9a09390

Implement combined disk and HTTP cache

49aa8ec

moritzkiefer-da and others added 3 commits February 28, 2019 15:53

Don’t use try-with-resources for the tempOut stream

5626569

This will result in the stream being closed immediately rather than when the future finishes running, which leads to StreamClosed errors.

Address review comments

7044156

Combined disk and http cache integration test

e399b7a

aherrmann force-pushed the combined-cache-pr branch from 9d07c7d to e399b7a Compare February 28, 2019 14:57

buchgr reviewed Mar 4, 2019

Address review comments

3595fd7

buchgr approved these changes Mar 6, 2019

bazel-io closed this in 76370d5 Mar 6, 2019

buchgr mentioned this pull request Mar 8, 2019

Allow using remote cache and local_disk_cache together #5811

Closed

keith mentioned this pull request Apr 15, 2019

Remote cache + disk cache requests incorrect action keys #8052

Closed

kastiglione mentioned this pull request Apr 29, 2019

Change execution log proto to differentiate between remote and disk cache hits #8192

Closed

JaredNeil mentioned this pull request May 1, 2019

--disk_cache overrides --noremote_upload_local_results #8216

Closed

Globegitter mentioned this pull request May 9, 2019

Release 0.26 - May 2019 #7499

Closed

aherrmann mentioned this pull request May 17, 2019

Download performace regression between Bazel 0.24.0 and Bazel 0.25.2 #8383

Closed

SrodriguezO mentioned this pull request Jun 20, 2019

Allow for multi-level caching with the gRPC cache and the local disk cache #8690

Closed
