Skip to content

Move bytecode debug info from the OCaml heap to static allocations#1

Closed
gasche wants to merge 1 commit intotrunkfrom
fix-bytecode-debug-info
Closed

Move bytecode debug info from the OCaml heap to static allocations#1
gasche wants to merge 1 commit intotrunkfrom
fix-bytecode-debug-info

Conversation

@gasche
Copy link
Owner

@gasche gasche commented Aug 3, 2015

Xavier found a GC bug in the bytecode debug-info handling, due to the fact that

void read_main_debug_info(struct debug_info *di)

is passed a internal pointer in the middle of a custom block inside the OCaml heap. The function then allocate, the custom block is moved, we write at the old location and the global caml_debug_info list ends up corrupted. The bug only manifest itself when such custom blocks are allocated on the minor heap (so not in the current trunk or older OCaml version, but it shows up when adding the pull request ocaml#92 "Allow finalizers on custom blocks in the minor heap, fix PR#3612" of Pierre Chambart).

This commit fixes this issue by moving debug_info chunks from the OCaml heap to the C land, stored in a dynamic table. They are allocated when caml_add_debug_info is called, and removed when caml_remove_debug_info is called.

It may be the wrong approach, however: maybe we need debug_info to remain in OCaml land because its lifetime is harder to predict than it seems (can backtraces point to debug_info referring to a piece of bytecode that has "removed"?).

There is currently a GC bug in the bytecode debug-info handling, due
to the fact that
  void read_main_debug_info(struct debug_info *di)
is passed a internal pointer in the middle of a custom block inside
the OCaml heap. The bug only manifest itself if such custom blocks
are allocated on the minor heap.

This commit fixes this issue by moving debug_info chunks from the
OCaml heap to the C land, stored in a dynamic table. They are
allocated when caml_add_debug_info is called, and removed when
caml_remove_debug_info is called.

It may be the wrong approach, however: maybe we need debug_info to
remain in OCaml land because its lifetime is harder to predict than it
seems (can backtraces point to debug_info referring to a piece of
bytecode that has "removed"?).
@whitequark
Copy link

Nevermind my previous comment, I was inattentive. No, that can't happen, because raw backtraces store code pointers directly (which may harmlessly point into nowhere) and during conversion of raw backtraces to symbolic backtraces the debug information is copied out. No unloading may happen during that.

There's a possible error condition here (load code → capture a raw backtrace → unload code → load something else in it place → convert the raw backtrace to symbolic form), but it's probably not important.

@gasche
Copy link
Owner Author

gasche commented Aug 22, 2015

Merged in trunk.

@gasche gasche closed this Aug 22, 2015
gasche pushed a commit that referenced this pull request Jan 28, 2018
manual: last minute  fixes
gasche pushed a commit that referenced this pull request Aug 11, 2020
This job did actually do two different things:

1. Check that the compiler can be built without the world.opt target

2. Run sanitizers

This commit thus splits the extra-checks job into two separate
ones that are defined as Jenkins pipeline jobs named sanitizers and
step-by-step-build.
gasche pushed a commit that referenced this pull request Sep 15, 2021
The macro \libdocitem{Bigarray} was including `refman/extension/bigarray.etex`.
Rewriting the macro to input `library/#1.tex` instead of `#1.tex`
avoids this issue and should be hopefully less brittle in the future.
gasche pushed a commit that referenced this pull request Jul 13, 2024
The toplevel printer detects cycles by keeping a hashtable of values
that it has already traversed.

However, some OCaml runtime types (at least bigarrays) may be
partially uninitialized, and hashing them at arbitrary program points
may read uninitialized memory. In particular, the OCaml testsuite
fails when running with a memory-sanitizer enabled, as bigarray
printing results in reads to uninitialized memory:

```
==133712==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x4e6d11 in caml_ba_hash /var/home/edwin/git/ocaml/runtime/bigarray.c:486:45
    #1 0x52474a in caml_hash /var/home/edwin/git/ocaml/runtime/hash.c:251:35
    #2 0x599ebf in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1065:14
    #3 0x5a909a in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
    #4 0x540ccb in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
    #5 0x7f0910abb087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    #6 0x7f0910abb14a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    #7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)

  Uninitialized value was created by a heap allocation
    #0 0x47d306 in malloc (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x47d306) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)
    #1 0x4e7960 in caml_ba_alloc /var/home/edwin/git/ocaml/runtime/bigarray.c:246:12
    #2 0x4e801f in caml_ba_create /var/home/edwin/git/ocaml/runtime/bigarray.c:673:10
    #3 0x59b8fc in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
    #4 0x5a909a in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
    #5 0x540ccb in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
    #6 0x7f0910abb087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    #7 0x7f0910abb14a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    #8 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)

SUMMARY: MemorySanitizer: use-of-uninitialized-value /var/home/edwin/git/ocaml/runtime/bigarray.c:486:45 in caml_ba_hash
```

The only use of hashing in genprintval is to avoid cycles, that is, it
is only useful for OCaml values that contain other OCaml values
(including possibly themselves). Bigarrays cannot introduce cycles,
and they are always printed as "<abstr>" anyway.

The present commit proposes to be more conservative in which values
are hashed by the cycle detector to avoid this issue: we skip hashing
any value with tag above No_scan_tag -- which may not contain any
OCaml values.

Suggested-by: Gabriel Scherer <[email protected]>
Signed-off-by: Edwin Török <[email protected]>
gasche added a commit that referenced this pull request Jul 30, 2024
…l#13294)

The toplevel printer detects cycles by keeping a hashtable of values
that it has already traversed.

However, some OCaml runtime types (at least bigarrays) may be
partially uninitialized, and hashing them at arbitrary program points
may read uninitialized memory. In particular, the OCaml testsuite
fails when running with a memory-sanitizer enabled, as bigarray
printing results in reads to uninitialized memory:

```
==133712==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x4e6d11 in caml_ba_hash /var/home/edwin/git/ocaml/runtime/bigarray.c:486:45
    #1 0x52474a in caml_hash /var/home/edwin/git/ocaml/runtime/hash.c:251:35
    #2 0x599ebf in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1065:14
    #3 0x5a909a in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
    #4 0x540ccb in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
    #5 0x7f0910abb087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    #6 0x7f0910abb14a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    #7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)

  Uninitialized value was created by a heap allocation
    #0 0x47d306 in malloc (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x47d306) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)
    #1 0x4e7960 in caml_ba_alloc /var/home/edwin/git/ocaml/runtime/bigarray.c:246:12
    #2 0x4e801f in caml_ba_create /var/home/edwin/git/ocaml/runtime/bigarray.c:673:10
    #3 0x59b8fc in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
    #4 0x5a909a in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
    #5 0x540ccb in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
    #6 0x7f0910abb087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    #7 0x7f0910abb14a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    #8 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)

SUMMARY: MemorySanitizer: use-of-uninitialized-value /var/home/edwin/git/ocaml/runtime/bigarray.c:486:45 in caml_ba_hash
```

The only use of hashing in genprintval is to avoid cycles, that is, it
is only useful for OCaml values that contain other OCaml values
(including possibly themselves). Bigarrays cannot introduce cycles,
and they are always printed as "<abstr>" anyway.

The present commit proposes to be more conservative in which values
are hashed by the cycle detector to avoid this issue: we skip hashing
any value with tag above No_scan_tag -- which may not contain any
OCaml values.

Suggested-by: Gabriel Scherer <[email protected]>

Signed-off-by: Edwin Török <[email protected]>
Co-authored-by: Edwin Török <[email protected]>
gasche pushed a commit that referenced this pull request Aug 1, 2024
Found by -fsanitize=memory -fsanitize-memory-track-origins:
```
> ==102752==WARNING: MemorySanitizer: use-of-uninitialized-value
>     #0 0x7f2ba7fb4ea4 in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:496:18
>     #1 0x7f2ba7fbc016 in caml_ml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:1207:9
>     ocaml#2 0x59ba5c in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
>     ocaml#3 0x5a9220 in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
>     ocaml#4 0x540d6b in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
>     ocaml#5 0x7f2ba8120087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#6 0x7f2ba812014a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 617637580ee48eff08a2bce790e1667ad09f3b69)
>
>   Uninitialized value was stored to memory at
>     #0 0x7f2ba7fb4e9d in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:497:69
>     #1 0x7f2ba7fbc016 in caml_ml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:1207:9
>     ocaml#2 0x59ba5c in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
>     ocaml#3 0x5a9220 in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
>     ocaml#4 0x540d6b in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
>     ocaml#5 0x7f2ba8120087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#6 0x7f2ba812014a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 617637580ee48eff08a2bce790e1667ad09f3b69)
>
>   Uninitialized value was created by an allocation of 'buf' in the stack frame
>     #0 0x7f2ba7fb3dbc in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:402:7
>
```

This is in fact an EV_LIFECYCLE with EV_RING_STOP, which has 0
additional data, and thus msg_length 2:
```
runtime/runtime_events.c:      EV_RUNTIME, (ev_message_type){.runtime=EV_LIFECYCLE}, EV_RING_STOP, 0,
```

Attempting to read from `buf[2]` would read uninitialized data.

Signed-off-by: Edwin Török <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants