bazel 5.0.0rc3 crashes with local and remote actions being done #14433

@philsc

Description of the problem / feature request:

I'm trying out 5.0.0rc3 in our CI environment and saw the following crash:

(11:09:44) INFO: Caught InterruptedException from ExecException for remote branch of sensors/tools/linear_range/_objs/LinearRangeControllerModule_static/linear_range_controller_module.pic.o, which may cause a crash.
(11:10:01) FATAL: bazel crashed due to an internal error. Printing stack trace:
java.lang.AssertionError: Neither branch of sensors/tools/linear_range/_objs/LinearRangeControllerModule_static/linear_range_controller_module.pic.o completed. Local was cancelled and done and remote was not cancelled and done.
    at com.google.devtools.build.lib.dynamic.DynamicSpawnStrategy.waitBranches(DynamicSpawnStrategy.java:345)
    at com.google.devtools.build.lib.dynamic.DynamicSpawnStrategy.exec(DynamicSpawnStrategy.java:733)
    at com.google.devtools.build.lib.actions.SpawnStrategy.beginExecution(SpawnStrategy.java:47)
    at com.google.devtools.build.lib.exec.SpawnStrategyResolver.beginExecution(SpawnStrategyResolver.java:68)
    at com.google.devtools.build.lib.rules.cpp.CppCompileAction.beginExecution(CppCompileAction.java:1430)
    at com.google.devtools.build.lib.actions.Action.execute(Action.java:133)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$5.execute(SkyframeActionExecutor.java:907)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.continueAction(SkyframeActionExecutor.java:1076)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.run(SkyframeActionExecutor.java:1031)
    at com.google.devtools.build.lib.skyframe.ActionExecutionState.runStateMachine(ActionExecutionState.java:152)
    at com.google.devtools.build.lib.skyframe.ActionExecutionState.getResultOrDependOnFuture(ActionExecutionState.java:91)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor.executeAction(SkyframeActionExecutor.java:492)
    at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.checkCacheAndExecuteIfNeeded(ActionExecutionFunction.java:856)
    at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.computeInternal(ActionExecutionFunction.java:349)
    at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.compute(ActionExecutionFunction.java:169)
    at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:590)
    at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:382)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
(11:10:03) Failed with return code 37.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

To hit this, I set .bazelversion to 5.0.0rc3 and ran our entire build, which uses a remote Buildfarm cluster.
I don't know of a simpler way to reproduce it.

What operating system are you running Bazel on?

Everything's on x86_64 Linux.

What's the output of bazel info release?

$ bazel info release
Starting local Bazel server and connecting to it...
release 5.0.0rc3

Have you found anything relevant by searching the web?

I couldn't find anything pertinent.

Any other information, logs, or outputs that you want to share?

Not at this time.

Labels: P1, team-Remote-Exec, type: bug
