Skip to content

Commit d705928

Browse files
joelebacopybara-github
authored andcommitted
Fix a potential infinite loop in the case of an interruption.
**The Issue** Some external users reported the following sequence: 1. Build starts 2. Build interrupted very early on 3. Another build is started. The command line says "A previous command is running", while the server is stuck. What happened under the hood: The issue could be reproduced very reliably by placing a breakpoint here[1] and interrupt the build. Bazel is in the middle of the recursive `IncrementalPackageRoots.registerAndPlantMissingSymlinks` method when it received the interruption. One important detail: we only add a NestedSet to the `donePackagesRef` set when the _method_ is done successfully. When there's an interruption, we always bail early and never actually reach this line where the NestedSet is added to the set[2]. Without deduplication, this could lead to what feels like an finite loop if the packages are structured like so: ``` [[A], [B, [A]]] ``` In this case, NestedSet `[A]` represents a common child of many NestedSets and would be repeated again and again. We've indeed observed this in a real build, making it unable to finish within any reasonable timeframe. **The Solution** It was overly restrictive to only commit a NestedSet into the de-dup set _after_ all of its symlinks have been planted. It only makes sense if we're planting the symlinks for multiple top-level targets at the same time and want to avoid the situation where a top-level target is allowed to enter execution without all of its symlinks planted. We're already avoiding this situation by design by planting the symlinks for 1 single top-level target at a time. To avoid the near-infinite loop caused by a repeated NestedSet, we add each NestedSet to the de-duplication set the very first time it's seen. **Changes in this CL** - [Bug-fixing] Add a NestedSet to the de-duplication set the very first time it's seen. - [Code simplicity] 1 single blocking `Future.get()` instead of 1 for each recursive layer. Fixes #22586. --- [1] https://github.com/bazelbuild/bazel/blob/193b114287b3e20850a4b106b889771dfa63a601/src/main/java/com/google/devtools/build/lib/skyframe/IncrementalPackageRoots.java#L253 [2] https://github.com/bazelbuild/bazel/blob/193b114287b3e20850a4b106b889771dfa63a601/src/main/java/com/google/devtools/build/lib/skyframe/IncrementalPackageRoots.java#L256 PiperOrigin-RevId: 640524271 Change-Id: I63c39d7c8f27abaf9229396af1424e775cf5f85f
1 parent c53bbda commit d705928

File tree

1 file changed

+30
-24
lines changed

1 file changed

+30
-24
lines changed

src/main/java/com/google/devtools/build/lib/skyframe/IncrementalPackageRoots.java

Lines changed: 30 additions & 24 deletions
Original file line numberDiff line numberDiff line change
@@ -158,7 +158,7 @@ public static IncrementalPackageRoots createAndRegisterToEventBus(
158158
* </pre>
159159
*
160160
* We'd plant the symlink to "noclash" first, then wait to see whether we need "foo" or "Foo". If
161-
* we end up needing both, throw an error. See {@link #registerAndPlantMissingSymlinks}.
161+
* we end up needing both, throw an error. See {@link #recursiveRegisterAndPlantMissingSymlinks}.
162162
*/
163163
public void eagerlyPlantSymlinksToSingleSourceRoot() throws AbruptExitException {
164164
try {
@@ -193,7 +193,7 @@ public PackageRootLookup getPackageRootLookup() {
193193
// a symlink and starting an action that requires that symlink. This race condition is possible
194194
// because of the various memoizations we use to avoid repeated work.
195195
@Subscribe
196-
public void topLevelTargetReadyForSymlinkPlanting(TopLevelTargetReadyForSymlinkPlanting event)
196+
public void lazilyPlantSymlinks(TopLevelTargetReadyForSymlinkPlanting event)
197197
throws AbruptExitException {
198198
if (allowExternalRepositories || !maybeConflictingBaseNamesLowercase.isEmpty()) {
199199
Set<NestedSet.Node> donePackagesLocalRef;
@@ -206,10 +206,28 @@ public void topLevelTargetReadyForSymlinkPlanting(TopLevelTargetReadyForSymlinkP
206206
donePackagesLocalRef = donePackages;
207207
lazilyPlantedSymlinksLocalRef = lazilyPlantedSymlinks;
208208
}
209-
registerAndPlantMissingSymlinks(
209+
210+
// Initial capacity: arbitrarily chosen.
211+
// This list doesn't need to be thread-safe, as items are added sequentially.
212+
List<ListenableFuture<Void>> futures = new ArrayList<>(128);
213+
recursiveRegisterAndPlantMissingSymlinks(
210214
event.transitivePackagesForSymlinkPlanting(),
211215
donePackagesLocalRef,
212-
lazilyPlantedSymlinksLocalRef);
216+
lazilyPlantedSymlinksLocalRef,
217+
futures);
218+
219+
// Now wait on the futures. After that, we can be sure that the symlinks have been planted.
220+
try {
221+
Futures.whenAllSucceed(futures).call(() -> null, directExecutor()).get();
222+
} catch (InterruptedException e) {
223+
Thread.currentThread().interrupt();
224+
// Bail
225+
} catch (ExecutionException e) {
226+
if (e.getCause() instanceof AbruptExitException) {
227+
throw (AbruptExitException) e.getCause();
228+
}
229+
throw new IllegalStateException("Unexpected exception", e);
230+
}
213231
}
214232
}
215233

@@ -224,16 +242,17 @@ public void analysisFinished(AnalysisPhaseCompleteEvent unused) {
224242
* <p>There are 2 possibilities: either we're planting symlinks to the external repos, or there's
225243
* potentially conflicting symlinks detected.
226244
*/
227-
private void registerAndPlantMissingSymlinks(
228-
NestedSet<Package> packages, Set<Node> donePackagesRef, Set<Path> lazilyPlantedSymlinksRef)
229-
throws AbruptExitException {
245+
private void recursiveRegisterAndPlantMissingSymlinks(
246+
NestedSet<Package> packages,
247+
Set<Node> donePackagesRef,
248+
Set<Path> lazilyPlantedSymlinksRef,
249+
List<ListenableFuture<Void>> futures) {
230250
// Optimization to prune subsequent traversals.
231251
// A false negative does not affect correctness.
232-
if (donePackagesRef.contains(packages.toNode())) {
252+
if (!donePackagesRef.add(packages.toNode())) {
233253
return;
234254
}
235255

236-
List<ListenableFuture<Void>> futures = new ArrayList<>(packages.getLeaves().size());
237256
synchronized (symlinkPlantingPool) {
238257
// Some other thread shut down the executor, exit now.
239258
if (symlinkPlantingPool.isShutdown()) {
@@ -246,22 +265,9 @@ private void registerAndPlantMissingSymlinks(
246265
}
247266
}
248267
for (NestedSet<Package> transitive : packages.getNonLeaves()) {
249-
registerAndPlantMissingSymlinks(transitive, donePackagesRef, lazilyPlantedSymlinksRef);
250-
}
251-
// Now wait on the futures.
252-
try {
253-
Futures.whenAllSucceed(futures).call(() -> null, directExecutor()).get();
254-
} catch (InterruptedException e) {
255-
Thread.currentThread().interrupt();
256-
return; // Bail
257-
} catch (ExecutionException e) {
258-
if (e.getCause() instanceof AbruptExitException) {
259-
throw (AbruptExitException) e.getCause();
260-
}
261-
throw new IllegalStateException("Unexpected exception", e);
268+
recursiveRegisterAndPlantMissingSymlinks(
269+
transitive, donePackagesRef, lazilyPlantedSymlinksRef, futures);
262270
}
263-
// Only update the memoization set now, after the symlinks are confirmed planted.
264-
donePackagesRef.add(packages.toNode());
265271
}
266272

267273
private Void plantSingleSymlinkForPackage(Package pkg, Set<Path> lazilyPlantedSymlinksRef)

0 commit comments

Comments
 (0)