High rate of spurious CI failures on macOS machines #14459

@fmeum

Description of the problem / feature request:

I run daily CI checks in my rulesets' GitHub Actions pipeline. The macOS pipelines, running on macos-latest, fail every few days with two kinds of spurious failures that I have never been able to reproduce locally:

Issue 1:

Starting local Bazel server and connecting to it...
... still trying to connect to local Bazel server after 10 seconds ...
... still trying to connect to local Bazel server after 20 seconds ...
... still trying to connect to local Bazel server after 30 seconds ...
... still trying to connect to local Bazel server after 40 seconds ...
... still trying to connect to local Bazel server after 50 seconds ...
... still trying to connect to local Bazel server after 60 seconds ...
... still trying to connect to local Bazel server after 70 seconds ...
... still trying to connect to local Bazel server after 80 seconds ...
... still trying to connect to local Bazel server after 90 seconds ...
... still trying to connect to local Bazel server after 100 seconds ...
... still trying to connect to local Bazel server after 110 seconds ...
FATAL: couldn't connect to server (1753) after 120 seconds.
Error: Process completed with exit code 37.

Issue 2:

ERROR: /Users/runner/work/rules_jni/rules_jni/tests/libjvm_stub/BUILD.bazel:116:12: Target '//libjvm_stub:HelloFromJava' depends on toolchain '@local_config_cc//:cc-compiler-darwin', which cannot be found: error loading package '@local_config_cc//': cannot load '@local_config_cc_toolchains//:osx_archs.bzl': no such file'
ERROR: Analysis of target '//libjvm_stub:HelloFromJava' failed; build aborted: Analysis failed

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

I have no way to reproduce these failures consistently, but they occur every few days in rules_jni's scheduled CI runs.

What operating system are you running Bazel on?

macOS 10.15

What's the output of bazel info release?

Over time, I have hit the issues on 4.2.2, 5.0.0rc3 and various last_green builds.

Have you found anything relevant by searching the web?

Any other information, logs, or outputs that you want to share?

I can make arbitrary changes to the CI config if that helps to gather more information on the cause of these issues.
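
One change of this kind could be a small wrapper around the Bazel invocation that records the exit code of every attempt and retries when the server fails to start. The following is only a sketch under stated assumptions: the helper name `run_with_retry` is hypothetical, and treating exit code 37 (the code shown in the Issue 1 log) as retryable is an assumption, not documented Bazel guidance. It does not address Issue 2, which surfaces as an analysis error rather than a startup failure.

```python
import subprocess
import sys
import time

# Exit code seen in the Issue 1 log; treating it as retryable is an assumption.
RETRYABLE_EXIT_CODES = frozenset({37})


def run_with_retry(cmd, max_attempts=3, delay_seconds=0.0,
                   retryable=RETRYABLE_EXIT_CODES):
    """Run cmd, retrying while it exits with a retryable code.

    Returns (final_exit_code, history), where history lists the exit code
    of every attempt so the CI log shows how often a retry was needed.
    """
    history = []
    for attempt in range(1, max_attempts + 1):
        code = subprocess.run(cmd).returncode
        history.append(code)
        print(f"attempt {attempt}: exit code {code}", file=sys.stderr)
        if code not in retryable:
            break  # success or a non-retryable failure: stop immediately
        if attempt < max_attempts:
            time.sleep(delay_seconds)
    return history[-1], history


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python retry.py bazel test //...
    final_code, _ = run_with_retry(sys.argv[1:])
    sys.exit(final_code)
```

The per-attempt history is printed to stderr, so a scheduled run that only succeeds on the second or third attempt still leaves a trace of the spurious failure in the job log.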
