Bazel server crashing abruptly in 7.4+ #24389

@luispadron

Description of the bug:

Since upgrading to 7.4.0, and subsequently 7.4.1, we've been hitting server crashes at random during CI (and local) builds:

Server terminated abruptly (error code: 14, error message: 'Socket closed', log file: '/private/var/tmp/_bazel_build/a50428ec9d717925b6582741c513dab3/server/jvm.out')
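
For anyone trying to count how often this shows up across a CI fleet, here is a minimal sketch that pulls the error code, message, and log path out of a line like the one above. The regex is modelled only on the single message quoted in this report, so treat the exact wording as an assumption; `parse_server_crash` and `ServerCrash` are hypothetical names, not part of Bazel.

```python
import re
from typing import NamedTuple, Optional

# Pattern modelled on the one message quoted in this report (assumption:
# other Bazel versions or platforms may word the message differently).
_CRASH_RE = re.compile(
    r"Server terminated abruptly \(error code: (?P<code>\d+), "
    r"error message: '(?P<message>[^']*)', "
    r"log file: '(?P<log_file>[^']*)'\)"
)


class ServerCrash(NamedTuple):
    """Fields extracted from a 'Server terminated abruptly' line."""
    code: int
    message: str
    log_file: str


def parse_server_crash(line: str) -> Optional[ServerCrash]:
    """Return the parsed crash fields, or None if the line does not match."""
    m = _CRASH_RE.search(line)
    if m is None:
        return None
    return ServerCrash(int(m["code"]), m["message"], m["log_file"])


# The message from this report, used as sample input.
sample = (
    "Server terminated abruptly (error code: 14, error message: "
    "'Socket closed', log file: '/private/var/tmp/_bazel_build/"
    "a50428ec9d717925b6582741c513dab3/server/jvm.out')"
)

crash = parse_server_crash(sample)
if crash is not None:
    print(crash.code, crash.message, crash.log_file)
```

Running this over collected CI build logs would give a per-day count and the `jvm.out` path to pull from each affected machine.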

Which category does this issue belong to?

No response

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

I have not found a way to reliably reproduce this issue, but we're seeing about 1-3 of these per day on our ~300-machine CI fleet.

Which operating system are you running Bazel on?

macOS

What is the output of bazel info release?

7.4.1

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse HEAD ?

No response

If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.

No response

Have you found anything relevant by searching the web?

There was a GC threading issue in 7.4.0 that also caused a server crash, but that was fixed in 7.4.1, so this might be another case of that issue or something new entirely. I do not see any

Any other information, logs, or outputs that you want to share?

Attached is the Bazel server JVM log from the build during which the server terminated, along with the .ips file generated from the crash.

jvm_logs_and_ips.zip

...

Thread 51 crashed with ARM Thread State (64-bit):
    x0: 0x0000000000000000   x1: 0x0000000000000000   x2: 0x0000000000000000   x3: 0x0000000000000000
    x4: 0x0000000000000001   x5: 0x00000001777affc8   x6: 0x000000000000002e   x7: 0x0000000000000000
    x8: 0xb48a3bbbfd395f71   x9: 0xb48a3bba8a426f71  x10: 0x0000000000000002  x11: 0x00000000fffffffd
   x12: 0x0000010000000000  x13: 0x0000000000000000  x14: 0x0000000000000000  x15: 0x0000000000000000
   x16: 0x0000000000000148  x17: 0x000000020168e4e8  x18: 0x0000000000000000  x19: 0x0000000000000006
   x20: 0x00000001777b3000  x21: 0x000000000000e707  x22: 0x00000001777b30e0  x23: 0x000000000000000a
   x24: 0x0000000102ade007  x25: 0x00000001305a5e00  x26: 0x0000000000000000  x27: 0x00000000000007d0
   x28: 0x00000000ffffffff   fp: 0x00000001777b1070   lr: 0x000000018f13dc20
    sp: 0x00000001777b1050   pc: 0x000000018f1055f0 cpsr: 0x40001000
   far: 0x0000000000000000  esr: 0x56000080  Address size fault

Labels: P2, team-OSS, type: bug
