ppc64 backend segfault on trunk #12482

@jmid

As part of multicoretests we are observing segfaults in code produced by the newly restored ppc64 backend.
The test triggering it is a property-based test of arrays against a model.
Unlike the previous torture tests, the test that crashes here is sequential and runs in a single domain.
A branch is available here: https://github.com/ocaml-multicore/multicoretests/tree/ppc64-crash-repro
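
For readers unfamiliar with the setup, here is a rough sketch of what "testing arrays against a model" means. It is not the multicoretests code and does not use the STM library: the `cmd` type only mirrors a few of the constructors that show up in the counter example further down, and `run_sut`, `run_model`, `agree`, and the array size passed to `agree` are hypothetical names and choices made for illustration.

```ocaml
(* Hypothetical subset of the commands; the real test has more. *)
type cmd =
  | Length
  | Get of int
  | Set of int * char
  | Fill of int * int * char
  | Sort

(* Observable result of running one command. *)
type res =
  | Int of int
  | Char of char
  | Unit
  | Invalid (* the operation raised Invalid_argument *)

(* Run a command against the real, mutable array. *)
let run_sut (a : char array) (c : cmd) : res =
  try
    match c with
    | Length -> Int (Array.length a)
    | Get i -> Char (Array.get a i)
    | Set (i, x) -> Array.set a i x; Unit
    | Fill (i, l, x) -> Array.fill a i l x; Unit
    | Sort -> Array.sort Char.compare a; Unit
  with Invalid_argument _ -> Invalid

(* Run the same command against a purely functional model (a list). *)
let run_model (m : char list) (c : cmd) : char list * res =
  let n = List.length m in
  match c with
  | Length -> (m, Int n)
  | Get i ->
    if i < 0 || i >= n then (m, Invalid) else (m, Char (List.nth m i))
  | Set (i, x) ->
    if i < 0 || i >= n then (m, Invalid)
    else (List.mapi (fun j y -> if j = i then x else y) m, Unit)
  | Fill (i, l, x) ->
    if i < 0 || l < 0 || i > n - l then (m, Invalid)
    else (List.mapi (fun j y -> if j >= i && j < i + l then x else y) m, Unit)
  | Sort -> (List.sort Char.compare m, Unit)

(* Check that array and model agree on every result and on the final state. *)
let agree (size : int) (cmds : cmd list) : bool =
  let a = Array.make size 'a' in
  let rec go m = function
    | [] -> Array.to_list a = m
    | c :: rest ->
      let r_sut = run_sut a c in
      let m', r_model = run_model m c in
      r_sut = r_model && go m' rest
  in
  go (List.init size (fun _ -> 'a')) cmds
```

On a correct backend `agree` should hold for any command list, so a segfault (rather than a `false` result) points at the generated code or the runtime, not at the array operations being modelled.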

Here's an example output: https://ocaml-multicoretests.ci.dev:8100/job/2023-08-16/132015-ci-ocluster-build-8fbc23#L343

random seed: 257065471
generated error fail pass / total     time test name

[ ]    0    0    0    0 / 1000     0.0s STM Array test sequential
[ ]    0    0    0    0 / 1000     0.0s STM Array test sequential (generating)File "src/array/dune", line 4, characters 7-16:
4 |  (name stm_tests)
           ^^^^^^^^^
(cd _build/default/src/array && ./stm_tests.exe --verbose)
Command got signal SEGV.

To recreate:

  • clone the above repo branch
  • opam install dune qcheck-core
  • dune build @ci -j1 --no-buffer

I don't have direct access to a ppc-machine ATM, so the above has been produced via CI-golf... 🤓
I may get access to one next week.

Context

  • The test runs with a hard-coded seed and crashes deterministically in the CI on the first generated input.
    Below I include the long counter example produced with this seed: a 283-element cmd list!
    (The output was produced by adding a fork so that the crash does not take down the testing process.)
    The other sizes we have observed crashes on are also around 283 or bigger (the smallest was 278 elements).
    I suspect the particular commands and their order are less significant and instead serve to build up a certain heap structure.
  • While chasing this I've observed occasional messages of: double free or corruption (out), free(): invalid pointer, and Fatal error: allocation failure during minor GC, which may indicate memory corruption.
  • We have observed similar crashes in tests of Bytes and Array.Floatarray and even an infinite loop. The observations have been tracked in [ocaml5-issue] Crashes and hangs on ppc64 trunk/5.2 ocaml-multicore/multicoretests#380.
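
The "adding a fork" trick mentioned in the first bullet can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used in the repro branch: `outcome` and `run_in_child` are hypothetical names, and the snippet assumes a Unix-like system with the `unix` library linked in (e.g. `(libraries unix)` in dune).

```ocaml
(* Requires the unix library. Names here are hypothetical. *)
type outcome =
  | Finished of int (* child exited normally with this exit code *)
  | Killed of int   (* child was killed by this signal (OCaml numbering) *)
  | Stopped of int  (* child was stopped by this signal *)

(* Run [f] in a forked child so that a crash in [f] only kills the child,
   and report to the parent how the child ended. *)
let run_in_child (f : unit -> unit) : outcome =
  (* Flush first so buffered output is not duplicated in the child. *)
  flush_all ();
  match Unix.fork () with
  | 0 ->
    (* Child: run the property; map an uncaught exception to exit code 1. *)
    let code = try f (); 0 with _ -> 1 in
    (* _exit skips at_exit handlers inherited from the parent. *)
    Unix._exit code
  | pid ->
    (* Parent: wait for the child and classify its termination status. *)
    let _, status = Unix.waitpid [] pid in
    match status with
    | Unix.WEXITED c -> Finished c
    | Unix.WSIGNALED s -> Killed s
    | Unix.WSTOPPED s -> Stopped s
```

With a wrapper like this, a segfaulting property would show up in the parent as `Killed Sys.sigsegv` instead of terminating the whole test runner, which leaves the parent free to print or shrink the offending input.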
Long counter example
random seed: 257065471
generated error  fail  pass / total     time test name

[ ]     0     0     0     0 / 10000     0.0s STM Array test sequential
[ ]     0     0     0     0 / 10000     0.0s STM Array test sequential (generating)
[✗]     1     0     1     0 / 10000    10.8s STM Array test sequential

--- Failure --------------------------------------------------------------------

Test STM Array test sequential failed (5 shrink steps):

 Mem 'm'
 Fill (6, 4, '|')
 Sub (9, 8)
 To_list
 Fill (1, 0, 'R')
 Get 4
 Set (15, ')')
 Length
 Fill (7, 15, '$')
 Fill (8, 7, 'H')
 Length
 Copy
 To_seq
 Mem 'b'
 Get 5
 Copy
 Sort
 Sort
 Fill (6, 3, 'G')
 To_seq
 Fill (6, 6, 'w')
 To_seq
 Copy
 Fill (4, 5, 'u')
 Fill (1, 71, 'U')
 Length
 Fill (2, 4, 'U')
 Fill (23, 9, 'j')
 To_list
 Set (50, 'z')
 To_list
 Sort
 Sub (11, 12)
 Get 8
 Set (9, 'B')
 Mem '4'
 Mem 'n'
 Sub (4, 9)
 Length
 Fill (2, 14, 'C')
 Sort
 Length
 Fill (9, 2, '=')
 To_list
 Sub (13, 9)
 Set (9, 'A')
 To_list
 To_seq
 Get 6
 Copy
 Copy
 To_list
 Set (1, '\'')
 Sub (6, 10)
 Fill (6, 5, 'b')
 Fill (14, 50, '|')
 Sort
 Get 13
 Length
 Get 7
 Mem 'd'
 Mem '`'
 Sub (8, 13)
 To_seq
 To_list
 Copy
 Get 3
 To_seq
 Sort
 To_list
 Fill (7, 4, 'K')
 Mem '<'
 To_list
 Fill (8, 4, '/')
 Fill (2, 11, 'M')
 Fill (3, 12, 'E')
 Sort
 To_list
 Sort
 Mem '<'
 Fill (6, 1, ',')
 Set (4, '+')
 To_seq
 Set (2, 'H')
 To_seq
 Copy
 Get 7
 Sub (6, 3)
 Get 1
 Copy
 Get 12
 Fill (12, 7, '>')
 To_seq
 Set (2, 'c')
 Fill (6, 4, '&')
 Set (1, 'B')
 Mem 'x'
 Length
 Mem '%'
 Fill (4, 5, 'Q')
 Sub (65, 8)
 Copy
 Length
 Sort
 Get 0
 To_seq
 Mem '>'
 Sub (6, 0)
 To_seq
 Fill (5, 8, '4')
 To_list
 Get 8
 Get 53
 Fill (11, 2, ',')
 Copy
 Fill (12, 0, 's')
 Sub (4, 0)
 Set (29, 'i')
 Length
 Fill (1, 3, 'E')
 Sub (99, 8)
 Mem ' '
 Sort
 Sort
 Get 2
 Copy
 Sub (5, 6)
 Sort
 Sort
 To_seq
 Get 0
 To_list
 Sort
 Copy
 Mem 's'
 Length
 Get 3
 To_seq
 Length
 To_seq
 Get 1
 To_seq
 To_seq
 Get 1
 Mem '3'
 Fill (11, 1, 'o')
 To_seq
 Mem ':'
 Copy
 Sub (15, 14)
 To_seq
 Fill (41, 2, 'h')
 Mem '#'
 Set (3, 's')
 Get 4
 Get 6
 Sort
 To_seq
 Sub (6, 5)
 Sub (5, 0)
 Sub (6, 11)
 To_list
 To_seq
 Copy
 Fill (5, 7, 'Y')
 Fill (15, 9, '_')
 Copy
 Sub (8, 6)
 Fill (11, 76, '>')
 Sub (9, 1)
 Get 0
 To_list
 Length
 Mem 'r'
 Sub (9, 3)
 Get 5
 Copy
 Fill (1, 6, 'N')
 To_list
 Get 4
 To_list
 Sort
 Length
 Set (15, '6')
 To_seq
 To_seq
 Copy
 To_list
 To_seq
 Sub (5, 4)
 To_list
 Get 14
 Get 7
 Fill (11, 1, 'u')
 Mem 'E'
 Set (74, 'T')
 Sub (0, 7)
 Sub (9, 13)
 To_seq
 To_list
 Fill (6, 3, '6')
 To_seq
 Fill (9, 2, ':')
 Mem '}'
 Mem '9'
 To_seq
 Get 9
 To_seq
 Sort
 To_seq
 Mem 'z'
 Copy
 Copy
 Length
 Length
 Mem 'I'
 Fill (81, 22, 'V')
 Sub (98, 0)
 Mem '4'
 Get 12
 Get 6
 To_list
 Sub (13, 4)
 Copy
 Sort
 Copy
 Sub (4, 5)
 Sub (2, 1)
 To_seq
 Copy
 Sub (1, 0)
 Mem '2'
 Mem 'S'
 Mem 'X'
 Get 8
 To_seq
 Length
 Sort
 Mem '1'
 Length
 Set (1, '#')
 Sub (2, 6)
 Sort
 Sort
 Length
 Fill (10, 1, 'H')
 Fill (63, 1, 'x')
 Get 78
 Mem '7'
 Set (8, '6')
 Fill (1, 6, 'U')
 Get 14
 Mem '+'
 To_list
 Fill (13, 0, 'b')
 Length
 Copy
 Copy
 Mem 'D'
 Fill (78, 8, 'k')
 To_list
 Get 9
 Sort
 Fill (81, 7, '.')
 To_list
 Sub (15, 5)
 Set (2, 'L')
 Sort
 To_list
 Copy
 Mem 'v'
 Mem '4'
 Get 9
 Fill (8, 7, 'w')
 Mem 'E'
 Fill (0, 4, 'u')
 To_seq
 Set (2, '\\')
 Fill (75, 12, 'H')
 Get 19
 To_seq
 To_seq
 Fill (2, 11, 'g')
 Length
