[7.2.0] Add option to track sandbox stashes in memory #22637
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This change introduces the flag
--experimental_inmemory_sandbox_stashesto track in memory the contents of the stashes stored with--reuse_sandbox_directories=trueWith the old behavior Bazel has to perform a lot of I/O to read the contents of each stash before reusing it in a new action. Essentially, it checks every directory and subdirectory in the stashed sandbox to find out which files are different to the inputs of the new action about to be executed.
With in-memory stashes we associate each stash to the symlinks that needed to be created for that action. We also store the time at which these symlinks were created. In a background thread after the action has finished executing we stat every directory and for the ones that have changed (this should be rare) we update the contents. Because we only read the contents of the directories that have changed we do much less I/O than before.
If an action purposefully changes a sandbox symlink in-place, this won't be detected by statting the directory. I don't know any use case for this since the symlink itself is an implementation detail to achieve sandboxing. For this reason, manipulation of sandbox symlinks in-place is not supported.
Depending on the build this change might have a significant effect on memory. It should generally improve wall time further in builds where
--reuse_sandbox_directoriesalready improved wall time.This change also introduces a minor optimization which is to associate each stash with the target that it was originally created for. When a new action wants to reuse a stash and there is more than one available, it will take the stash whose target is closest to its own. This is done with the assumption that targets that are closer together in the graph will have more inputs in common.
Fixes #22309 .
Closes #22559.
PiperOrigin-RevId: 640142481
Change-Id: Iece2d718df47f403e2fe91c1bd887604eceee8ee
Commit 1c0135c