[8.6.0] Add option to continue with local execution if remote cache is unavailable.#28001
Conversation
…lable. Currently, if the endpoint specified by `--remote_cache` is not available at the start of the build, the build fails with a message saying that the remote cache is not available. This change introduces a new flag `--incompatible_remote_local_fallback_for_remote_cache`, which when combined with `--remote_local_fallback`, allow build to continue build even if the remote cache is not available initially. Fixes bazelbuild#27734, bazelbuild#25965. Closes bazelbuild#27846. PiperOrigin-RevId: 844783391 Change-Id: Id414458b27c2673318ac9767d07bd54887a74e45
There was a problem hiding this comment.
Code Review
This pull request introduces a new flag, --incompatible_remote_local_fallback_for_remote_cache, to allow builds to continue with local execution if the remote cache is unavailable, instead of failing. The changes in RemoteOptions.java add the new flag, and RemoteSpawnCache.java is modified to handle RemoteExecutionCapabilitiesException by falling back to local execution when the flag is enabled. A new shell test is also added to verify this new fallback behavior.
My review identifies a critical issue in RemoteSpawnCache.java. The new fallback logic can lead to a memory leak if results are not configured for upload, as a LocalExecution object is not properly closed. I've provided a suggestion to fix this leak by ensuring the object is closed in the fallback path when no upload will occur.
| boolean shouldLocalFallback = | ||
| options.remoteLocalFallbackForRemoteCache && options.remoteLocalFallback; | ||
| if (!shouldLocalFallback) { | ||
| if (thisExecution != null) { | ||
| thisExecution.close(); | ||
| } | ||
| throw createExecExceptionFromRemoteExecutionCapabilitiesException(e); | ||
| } |
There was a problem hiding this comment.
When shouldLocalFallback is true, this exception is swallowed, and execution continues. This is the intended behavior for the fallback mechanism.
However, there's a potential resource leak. If shouldUploadLocalResults is false, the lookup method will eventually return SpawnCache.NO_RESULT_NO_STORE. In this scenario, the thisExecution object (an instance of LocalExecution), which was created and stored in inFlightExecutions, is never closed or removed from the map. This will lead to a memory leak, which could eventually cause an OutOfMemoryError.
This might be a pre-existing issue (e.g., for CacheNotFoundException), but this change makes it more likely to be triggered when a remote cache is unavailable.
To fix this, we should explicitly close thisExecution when we fall back and we know we are not going to upload the results.
boolean shouldLocalFallback =
options.remoteLocalFallbackForRemoteCache && options.remoteLocalFallback;
if (!shouldLocalFallback) {
if (thisExecution != null) {
thisExecution.close();
}
throw createExecExceptionFromRemoteExecutionCapabilitiesException(e);
}
if (!shouldUploadLocalResults && thisExecution != null) {
thisExecution.close();
}90b7bf1
### What does this PR do? Bump `.bazelversion` from 8.5.1 to 8.6.0. ### Motivation Selected changes between 8.5.1 and 8.6.0: - Fix visibility for implicit deps of parent rules (bazelbuild/bazel#28722) - Force rctx.{download_and,}extract to create user-readable files (bazelbuild/bazel#28551) - Fix disk cache failures on concurrent read-write access on Windows (bazelbuild/bazel#28529) - Add a target_type argument to ctx.actions.symlink (bazelbuild/bazel#28538) - Compensate for Windows filesystems lacking junction support (bazelbuild/bazel#28367) (our fix) - Add short_uncached and detailed_uncached options to --test_summary (bazelbuild/bazel#28343) - Add --experimental_strict_repo_env option (bazelbuild/bazel#28189) - Make overlaid files executable in http_archive (bazelbuild/bazel#28277) - Add bazel mod show_repo --all_repos and --all_visible_repos (bazelbuild/bazel#28012) - Enable --experimental_retain_test_configuration_across_testonly (bazelbuild/bazel#28115) - Add option to continue with local execution if remote cache is unavailable (bazelbuild/bazel#28001)
### What does this PR do? Bump `.bazelversion` from 8.5.1 to 8.6.0. ### Motivation Selected changes between 8.5.1 and 8.6.0: - Fix visibility for implicit deps of parent rules (bazelbuild/bazel#28722) - Force rctx.{download_and,}extract to create user-readable files (bazelbuild/bazel#28551) - Fix disk cache failures on concurrent read-write access on Windows (bazelbuild/bazel#28529) - Add a target_type argument to ctx.actions.symlink (bazelbuild/bazel#28538) - Compensate for Windows filesystems lacking junction support (bazelbuild/bazel#28367) (our fix) - Add short_uncached and detailed_uncached options to --test_summary (bazelbuild/bazel#28343) - Add --experimental_strict_repo_env option (bazelbuild/bazel#28189) - Make overlaid files executable in http_archive (bazelbuild/bazel#28277) - Add bazel mod show_repo --all_repos and --all_visible_repos (bazelbuild/bazel#28012) - Enable --experimental_retain_test_configuration_across_testonly (bazelbuild/bazel#28115) - Add option to continue with local execution if remote cache is unavailable (bazelbuild/bazel#28001)
### What does this PR do? Bump `bazel` version from 8.5.1 to 8.6.0 to benefit from a series of improvements and fixes. Ours (bazelbuild/bazel#28367) allows to re-enable "convenience symlinks" for Windows users and makes [`path.realpath`](https://bazel.build/rules/lib/builtins/path#realpath) succeed when sharing a folder between a Linux host and a Windows VM. ### Motivation Selected changes between 8.5.1 and 8.6.0: - 💡 bazelbuild/bazel#28001 - bazelbuild/bazel#28012 - 💡 bazelbuild/bazel#28189 - bazelbuild/bazel#28277 - bazelbuild/bazel#28343 - 🐕 bazelbuild/bazel#28367 - bazelbuild/bazel#28529 - bazelbuild/bazel#28538 - bazelbuild/bazel#28551 - bazelbuild/bazel#28722 Co-authored-by: regis.desgroppes <[email protected]>
Currently, if the endpoint specified by
--remote_cacheis not available at the start of the build, the build fails with a message saying that the remote cache is not available.This change introduces a new flag
--incompatible_remote_local_fallback_for_remote_cache, which when combined with--remote_local_fallback, allow build to continue build even if the remote cache is not available initially.Fixes #27734, #25965.
Closes #27846.
PiperOrigin-RevId: 844783391
Change-Id: Id414458b27c2673318ac9767d07bd54887a74e45
Commit 461c074