Description of the problem / feature request:
When building with --remote_download_toplevel, some actions that are definitely in the cache are missed.
Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
% bazel clean
% bazel build <target> --remote_download_toplevel
INFO: Elapsed time: 306.725s, Critical Path: 44.48s
INFO: 1004 processes: 167 remote cache hit, 9 internal, 828 darwin-sandbox.
INFO: Build completed successfully, 1004 total actions
% bazel clean
% bazel build <target> --remote_download_toplevel
INFO: Elapsed time: 100.161s, Critical Path: 75.20s
INFO: 1066 processes: 902 remote cache hit, 64 internal, 100 darwin-sandbox.
INFO: Build completed successfully, 1066 total actions
The second build should have a 100% hit rate, but does not. At the end of this report is a build log diff [1] showing a rebuild where all the actions are identical.
Anecdotally, the hit rate increases on each iteration of clean+build. On the second build, maybe 10% of actions are missed, and they seem to be fairly deep in the action graph: usually library linking and the like. On the third, far fewer are missed. It seems like actions whose inputs are present only in the cache (vs. locally present) end up generating different hashes.
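To put numbers on the hit rates across iterations, here is a minimal Python sketch that parses the `INFO: N processes: ...` summary lines pasted above. The function names `parse_process_summary` and `remote_hit_rate` are hypothetical helpers, not part of Bazel, and excluding `internal` actions from the denominator is my assumption about which actions are cacheable.

```python
import re


def parse_process_summary(line):
    """Split a Bazel 'N processes: ...' summary line into (total, counts by label)."""
    match = re.search(r"(\d+) processes: (.*?)\.?\s*$", line)
    if match is None:
        raise ValueError("not a process summary line: %r" % line)
    total = int(match.group(1))
    counts = {}
    for part in match.group(2).split(","):
        # Each part looks like "167 remote cache hit" or "9 internal".
        number, _, label = part.strip().partition(" ")
        counts[label] = int(number)
    return total, counts


def remote_hit_rate(line):
    """Fraction of non-internal actions served from the remote cache.

    Assumption: 'internal' actions are never cacheable, so they are left
    out of the denominator.
    """
    total, counts = parse_process_summary(line)
    cacheable = total - counts.get("internal", 0)
    if cacheable == 0:
        return 0.0
    return counts.get("remote cache hit", 0) / cacheable


if __name__ == "__main__":
    summaries = [
        "INFO: 1004 processes: 167 remote cache hit, 9 internal, 828 darwin-sandbox.",
        "INFO: 1066 processes: 902 remote cache hit, 64 internal, 100 darwin-sandbox.",
    ]
    for summary in summaries:
        print("%.1f%% remote cache hits" % (100 * remote_hit_rate(summary)))
```

Running this on the summary line of each clean+build iteration makes it easy to see whether the hit rate keeps converging toward 100%.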
What operating system are you running Bazel on?
OSX 10.14 + Xcode 11.3.1
What's the output of bazel info release?
The issue does not occur with release 4.0.0 but does occur with release 4.1.0rc1 (#13099) as well as with master.
Have you found anything relevant by searching the web?
No
Any other information, logs, or outputs that you want to share?
[1] https://gist.github.com/jlaxson/aec2813d19a8494b237e15093e0359a3
This is a side-by-side diff (left is the first build, when the cache is totally empty; right is the clean+rebuild) with two actions excerpted. The first was a cache hit and shows no actual_outputs in the event. The second is a lib made from that object, and it was a cache miss despite all inputs being the same.
--announce_rc output:
INFO: Options provided by the client:
Inherited 'common' options: --isatty=1 --terminal_columns=234
INFO: Reading rc options for 'build' from /Users/me/repo/.bazelrc:
Inherited 'common' options: --experimental_allow_tags_propagation --experimental_repo_remote_exec
INFO: Reading rc options for 'build' from /Users/me/repo/.bazelrc:
'build' options: --cxxopt=-std=c++14 --host_cxxopt=-std=c++14 --platforms=//:macos -c opt --copt=-g --features=-static_link_cpp_runtimes --dynamic_mode=fully --build_tag_filters=-halide_generator --experimental_cc_shared_library --experimental_guard_against_concurrent_changes --define open_source_build=true --per_file_copt=tensorflow/.*@-g0
INFO: Reading rc options for 'build' from /Users/me/.bazelrc:
'build' options: --sandbox_writable_path=/Users/me/.ccache --repository_cache=/Users/me/Library/Caches/bazel_repository --distdir=/Users/me/Library/Caches/bazel_dist
INFO: Found applicable config definition common:gcs_cache in file /Users/me/repo/.bazelrc: --google_default_credentials
INFO: Found applicable config definition common:gcs_cache in file /Users/me/.bazelrc: --google_credentials=/creds_file.json
INFO: Found applicable config definition build:gcs_cache in file /User/me/.bazelrc: --remote_cache=https://storage.googleapis.com/cache-bucket