Skip to content

Warn user when a type variable in a type constraint has been instantiated.#6

Closed
hnrgrgr wants to merge 2 commits intotrunkfrom
unknown repository
Closed

Warn user when a type variable in a type constraint has been instantiated.#6
hnrgrgr wants to merge 2 commits intotrunkfrom
unknown repository

Conversation

@hnrgrgr
Copy link

@hnrgrgr hnrgrgr commented Jan 31, 2014

It seems to be a common misunderstanding amongst beginners that type variables in type constraint are unification variables and that they should be explicitly quantified when desired.

This small patch adds a warning when such non-quantified variable has been instantiated by the type checker. The verification is made /a posteriori/ by looking for type constraints in the typed tree.

I'm not sure this warning is really a desired feature, but it is probably worth discussion. For instance, ocamlyacc-generated files use such variables on purpose.

@gasche
Copy link
Member

gasche commented Jan 31, 2014

If we accepted '_a in programs and didn't warn on those, we could enable the warning for everyone -- and progressively convert codebases to use '_a for non-generalized inference variables. With the current practices it seems hard to enable the warning except in constrained educational settings, but I think it's still good to have it around.

@trefis
Copy link
Contributor

trefis commented Jan 31, 2014

@gasche do you know why '_a is not accepted?

With the current practices it seems hard to enable the warning except in constrained educational settings

I disagree. I'd be surprised to see people use 'a while expecting it to be unified with int.

I really like the idea of this PR btw!

@gasche
Copy link
Member

gasche commented Jan 31, 2014

Let's formulate this differently: you cannot willingly forget 20 years of (admittedly dubious but voluntary) flexible variable semantics and claim that it's actually a mistake to use them that way. It may seem surprising to you but some people are familiar with the current semantics of type variables and have used it in their codebases -- note that there is no alternative, while it is now possible to explicitly enforce rigid variables with ('a . ...). There even are people that consider the current flexible semantics is the right default.
Warnings are generally used to deprecate a feature, send a signal that "from now on you should avoid that" -- unused variables, (or), resolution of ambiguous record labels by the shadowing rule. It is not possible in this case, or rather would severely restrict expressivity of type annotations. In absence of an alternative to avoid the warning when flexible named variables is what you want, it should be considered not a general recommendation for all codebases, but a specialized recommendation for codebases that are simple enough to not need that -- educational purposes.

I like the idea as well. I did a code review but, besides the universal pattern Thomas spotted already, I couldn't comment much, as I'm not familiar with the type-checker codebase. Alain may have more to tell. I wondered whether this should be a delayed check or not, but my understanding is that the implementation only check variables at "safe points" (after checking a toplevel phrase or complete module structure) where it is a precondition that all local inference constraints have been solved.

@hnrgrgr
Copy link
Author

hnrgrgr commented Jan 31, 2014

Disabling the warning for type variables like '_a would indeed be consistent with warnings triggered by unused (value) variables. Unfortunatly '_a is already used in pretty-printing of weak type variables.

@lefessan
Copy link
Contributor

lefessan commented Feb 1, 2014

Actually, there would be some consistency in having no warnings on '_a, as such variables indicates that they are actually not polymorphic, and should be eventually instantiated. So, we should accept such variables, and warn, on the contrary, if they are not instantiated, while not warning if they are.

@lpw25
Copy link
Contributor

lpw25 commented Feb 3, 2014

Note that -- because of structural types -- some things can only be expressed in OCaml using unification variables. For example, this (nonsense) function can only be written using unification variables:

# let rec f n (x : < m: 'b. 'b -> 'a > as 'a) =
    if n < 0 then x
    else if n = 0 then f (n - 2) (x#m 5)
    else f (n - 2) (x#m 6.0);;
val f : int -> (< m : 'b. 'b -> 'a > as 'a) -> 'a = <fun>

I'm not sure that we should add a warning which goes off for any possible definition of perfectly valid functions.

I also think, that unification variables are used more often than people think: almost every use of as in a type expression for instance.

My preferred approach to helping people to understand the difference between unification variables and type variables would be to have the compiler accept the syntax:

val id: 'a. 'a -> 'a

for value specifications, and also to print value specifications with that syntax by default. This would mean that type variables were consistently bound in OCaml, and unification variables were consistently unbound.

@lpw25
Copy link
Contributor

lpw25 commented Feb 3, 2014

I think your patch still gives a warning for uses of as like this:

# let rec f (x : < m: int; .. > as 'a) : 'a option =
    if x#m < 0 then None
    else Some x;;
val f : (< m : int; .. > as 'a) -> 'a option = <fun>

which I think are very common.

@hnrgrgr
Copy link
Author

hnrgrgr commented Feb 3, 2014

Indeed, your example should not trigger a warning. I forgot that case, thanks. I have updated the patch to handle this case. Maybe not the best implementation, but it should be sufficient to test.

@lpw25
Copy link
Contributor

lpw25 commented Feb 3, 2014

I think your patch still gives a warning for uses of as like this

Oops, I had only seen the changes to extract_vars, and completely missed the addition of extract_aliases. Your updated patch should be fine for the examples I showed.

@hnrgrgr
Copy link
Author

hnrgrgr commented Feb 3, 2014

Unfortunatly, it isn't.

@hnrgrgr
Copy link
Author

hnrgrgr commented Jul 9, 2014

Here is an updated version, with proper 'scope handling' for aliased variables.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if I don't like it, the implicit coding rules of the compiler forbid this | and the rule have been enforced in other merge request.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many occurrences of '|' on the first branch in the compiler code base. I don't think anyone would complain about this style.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Thank you, I'm happy to hear it. I will stop removing them from my patch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If anything, I would complain about the absence of the first |. But I've learned not to complain too much.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, changing just this is more of a concern, as the diff is unnecessary!

stedolan pushed a commit to stedolan/ocaml that referenced this pull request Aug 18, 2015
AltGr pushed a commit to AltGr/ocaml that referenced this pull request Nov 14, 2015
mshinwell referenced this pull request in mshinwell/ocaml Jan 18, 2016
@damiendoligez damiendoligez added this to the 4.04-or-later milestone Jan 27, 2016
@damiendoligez damiendoligez removed this from the 4.04 milestone Aug 3, 2016
@garrigue garrigue self-assigned this Feb 22, 2017
@mshinwell
Copy link
Contributor

@garrigue It's been more than three years since the most recent update of this patch, as far as I can see. Is it going to be merged? If not then I think we should close the pull request until someone steps forward to bring the code to a mergeable state.

@garrigue
Copy link
Contributor

We are not going to introduce this change, as it is not "backward compatible" (i.e., doesn't fit well with the existing codebase).
But we're still reflecting on improvements to the handling of type variables.

jfehrle added a commit to jfehrle/ocaml that referenced this pull request Aug 6, 2020
gretay-js pushed a commit to gretay-js/ocaml that referenced this pull request May 28, 2021
Emit low-level location information for FDO: use discriminators (2/2)
punchagan pushed a commit to punchagan/ocaml that referenced this pull request Feb 23, 2022
Add units for metrics and switch to v2 schema
edwintorok added a commit to edwintorok/ocaml that referenced this pull request Jul 11, 2024
Reported by `-fsanitize=memory`:
```
> ==102752==WARNING: MemorySanitizer: use-of-uninitialized-value
>     #0 0x7f2ba7fb4ea4 in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:496:18
>     #1 0x7f2ba7fbc016 in caml_ml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:1207:9
>     ocaml#2 0x59ba5c in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
>     ocaml#3 0x5a9220 in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
>     ocaml#4 0x540d6b in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
>     ocaml#5 0x7f2ba8120087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#6 0x7f2ba812014a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 617637580ee48eff08a2bce790e1667ad09f3b69)
>
>   Uninitialized value was stored to memory at
>     #0 0x7f2ba7fb4e9d in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:497:69
>     #1 0x7f2ba7fbc016 in caml_ml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:1207:9
>     ocaml#2 0x59ba5c in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
>     ocaml#3 0x5a9220 in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
>     ocaml#4 0x540d6b in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
>     ocaml#5 0x7f2ba8120087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#6 0x7f2ba812014a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 617637580ee48eff08a2bce790e1667ad09f3b69)
>
>   Uninitialized value was created by an allocation of 'buf' in the stack frame
>     #0 0x7f2ba7fb3dbc in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:402:7
>
```

This is in fact an EV_LIFECYCLE with EV_RING_STOP, which has 0
additional data, and thus msg_length 2:
```
runtime/runtime_events.c:      EV_RUNTIME, (ev_message_type){.runtime=EV_LIFECYCLE}, EV_RING_STOP, 0,
```

Attempting to read from `buf[2]` would read uninitialized data (or
potentially beyond the end of the buffer).

Check `msg_length` before reading.

Signed-off-by: Edwin Török <[email protected]>
edwintorok added a commit to edwintorok/ocaml that referenced this pull request Jul 11, 2024
Bigarrays are printed in the toplevel as `<abstr>`, but ObjTbl.mem
computes a hash. However bigarrays are uninitialized when allocated,
so this seems to be a genuine bug:
```
==133712==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x4e6d11 in caml_ba_hash /var/home/edwin/git/ocaml/runtime/bigarray.c:486:45
    #1 0x52474a in caml_hash /var/home/edwin/git/ocaml/runtime/hash.c:251:35
    ocaml#2 0x599ebf in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1065:14
    ocaml#3 0x5a909a in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
    ocaml#4 0x540ccb in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
    ocaml#5 0x7f0910abb087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    ocaml#6 0x7f0910abb14a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    ocaml#7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)

  Uninitialized value was created by a heap allocation
    #0 0x47d306 in malloc (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x47d306) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)
    #1 0x4e7960 in caml_ba_alloc /var/home/edwin/git/ocaml/runtime/bigarray.c:246:12
    ocaml#2 0x4e801f in caml_ba_create /var/home/edwin/git/ocaml/runtime/bigarray.c:673:10
    ocaml#3 0x59b8fc in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
    ocaml#4 0x5a909a in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
    ocaml#5 0x540ccb in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
    ocaml#6 0x7f0910abb087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    ocaml#7 0x7f0910abb14a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    ocaml#8 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)

SUMMARY: MemorySanitizer: use-of-uninitialized-value /var/home/edwin/git/ocaml/runtime/bigarray.c:486:45 in caml_ba_hash
```

Suppress it for now until a solution is found.

The testsuite passes now:
```
Summary:
  1334 tests passed
   103 tests skipped
     0 tests failed
     0 tests not started (parent test skipped or failed)
     0 unexpected errors
  1437 tests considered
```

Signed-off-by: Edwin Török <[email protected]>
edwintorok added a commit to edwintorok/ocaml that referenced this pull request Jul 11, 2024
Bigarrays are printed in the toplevel as `<abstr>`, but ObjTbl.mem
computes a hash. However bigarrays are uninitialized when allocated,
so this seems to be a genuine bug:
```
==133712==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x4e6d11 in caml_ba_hash /var/home/edwin/git/ocaml/runtime/bigarray.c:486:45
    #1 0x52474a in caml_hash /var/home/edwin/git/ocaml/runtime/hash.c:251:35
    ocaml#2 0x599ebf in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1065:14
    ocaml#3 0x5a909a in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
    ocaml#4 0x540ccb in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
    ocaml#5 0x7f0910abb087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    ocaml#6 0x7f0910abb14a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    ocaml#7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)

  Uninitialized value was created by a heap allocation
    #0 0x47d306 in malloc (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x47d306) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)
    #1 0x4e7960 in caml_ba_alloc /var/home/edwin/git/ocaml/runtime/bigarray.c:246:12
    ocaml#2 0x4e801f in caml_ba_create /var/home/edwin/git/ocaml/runtime/bigarray.c:673:10
    ocaml#3 0x59b8fc in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
    ocaml#4 0x5a909a in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
    ocaml#5 0x540ccb in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
    ocaml#6 0x7f0910abb087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    ocaml#7 0x7f0910abb14a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    ocaml#8 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)

SUMMARY: MemorySanitizer: use-of-uninitialized-value /var/home/edwin/git/ocaml/runtime/bigarray.c:486:45 in caml_ba_hash
```

Suppress it for now until a solution is found.

The testsuite passes now:
```
Summary:
  1334 tests passed
   103 tests skipped
     0 tests failed
     0 tests not started (parent test skipped or failed)
     0 unexpected errors
  1437 tests considered
```

Signed-off-by: Edwin Török <[email protected]>
edwintorok added a commit to edwintorok/ocaml that referenced this pull request Jul 11, 2024
Reported by `-fsanitize=memory`:
```
> ==102752==WARNING: MemorySanitizer: use-of-uninitialized-value
>     #0 0x7f2ba7fb4ea4 in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:496:18
>     #1 0x7f2ba7fbc016 in caml_ml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:1207:9
>     ocaml#2 0x59ba5c in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
>     ocaml#3 0x5a9220 in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
>     ocaml#4 0x540d6b in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
>     ocaml#5 0x7f2ba8120087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#6 0x7f2ba812014a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 617637580ee48eff08a2bce790e1667ad09f3b69)
>
>   Uninitialized value was stored to memory at
>     #0 0x7f2ba7fb4e9d in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:497:69
>     #1 0x7f2ba7fbc016 in caml_ml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:1207:9
>     ocaml#2 0x59ba5c in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
>     ocaml#3 0x5a9220 in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
>     ocaml#4 0x540d6b in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
>     ocaml#5 0x7f2ba8120087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#6 0x7f2ba812014a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 617637580ee48eff08a2bce790e1667ad09f3b69)
>
>   Uninitialized value was created by an allocation of 'buf' in the stack frame
>     #0 0x7f2ba7fb3dbc in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:402:7
>
```

This is in fact an EV_LIFECYCLE with EV_RING_STOP, which has 0
additional data, and thus msg_length 2:
```
runtime/runtime_events.c:      EV_RUNTIME, (ev_message_type){.runtime=EV_LIFECYCLE}, EV_RING_STOP, 0,
```

Attempting to read from `buf[2]` would read uninitialized data (or
potentially beyond the end of the buffer).

Check `msg_length` before reading.

Signed-off-by: Edwin Török <[email protected]>
edwintorok added a commit to edwintorok/ocaml that referenced this pull request Jul 11, 2024
Bigarrays are printed in the toplevel as `<abstr>`, but ObjTbl.mem
computes a hash. However bigarrays are uninitialized when allocated,
so this seems to be a genuine bug:
```
==133712==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x4e6d11 in caml_ba_hash /var/home/edwin/git/ocaml/runtime/bigarray.c:486:45
    #1 0x52474a in caml_hash /var/home/edwin/git/ocaml/runtime/hash.c:251:35
    ocaml#2 0x599ebf in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1065:14
    ocaml#3 0x5a909a in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
    ocaml#4 0x540ccb in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
    ocaml#5 0x7f0910abb087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    ocaml#6 0x7f0910abb14a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    ocaml#7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)

  Uninitialized value was created by a heap allocation
    #0 0x47d306 in malloc (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x47d306) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)
    #1 0x4e7960 in caml_ba_alloc /var/home/edwin/git/ocaml/runtime/bigarray.c:246:12
    ocaml#2 0x4e801f in caml_ba_create /var/home/edwin/git/ocaml/runtime/bigarray.c:673:10
    ocaml#3 0x59b8fc in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
    ocaml#4 0x5a909a in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
    ocaml#5 0x540ccb in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
    ocaml#6 0x7f0910abb087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    ocaml#7 0x7f0910abb14a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    ocaml#8 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)

SUMMARY: MemorySanitizer: use-of-uninitialized-value /var/home/edwin/git/ocaml/runtime/bigarray.c:486:45 in caml_ba_hash
```

The hashing is only needed to avoid recursion, skip it when the OCaml
value doesn't contain further nested OCaml values, i.e. when it wouldn't
be scanned by the GC.

The testsuite passes now:
```
Summary:
  1335 tests passed
   102 tests skipped
     0 tests failed
     0 tests not started (parent test skipped or failed)
     0 unexpected errors
  1437 tests considered
```

Suggested-by: Gabriel Scherer <[email protected]>
Signed-off-by: Edwin Török <[email protected]>
edwintorok added a commit to edwintorok/ocaml that referenced this pull request Jul 14, 2024
Found by -fsanitize=memory -fsanitize-memory-track-origins:
```
> ==102752==WARNING: MemorySanitizer: use-of-uninitialized-value
>     #0 0x7f2ba7fb4ea4 in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:496:18
>     #1 0x7f2ba7fbc016 in caml_ml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:1207:9
>     ocaml#2 0x59ba5c in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
>     ocaml#3 0x5a9220 in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
>     ocaml#4 0x540d6b in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
>     ocaml#5 0x7f2ba8120087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#6 0x7f2ba812014a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 617637580ee48eff08a2bce790e1667ad09f3b69)
>
>   Uninitialized value was stored to memory at
>     #0 0x7f2ba7fb4e9d in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:497:69
>     #1 0x7f2ba7fbc016 in caml_ml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:1207:9
>     ocaml#2 0x59ba5c in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
>     ocaml#3 0x5a9220 in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
>     ocaml#4 0x540d6b in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
>     ocaml#5 0x7f2ba8120087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#6 0x7f2ba812014a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 617637580ee48eff08a2bce790e1667ad09f3b69)
>
>   Uninitialized value was created by an allocation of 'buf' in the stack frame
>     #0 0x7f2ba7fb3dbc in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:402:7
>
```

This is in fact an EV_LIFECYCLE with EV_RING_STOP, which has 0
additional data, and thus msg_length 2:
```
runtime/runtime_events.c:      EV_RUNTIME, (ev_message_type){.runtime=EV_LIFECYCLE}, EV_RING_STOP, 0,
```

Attempting to read from `buf[2]` would read uninitialized data.

Signed-off-by: Edwin Török <[email protected]>
edwintorok added a commit to edwintorok/ocaml that referenced this pull request Jul 14, 2024
Found by -fsanitize=memory -fsanitize-memory-track-origins:
```
> ==102752==WARNING: MemorySanitizer: use-of-uninitialized-value
>     #0 0x7f2ba7fb4ea4 in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:496:18
>     #1 0x7f2ba7fbc016 in caml_ml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:1207:9
>     ocaml#2 0x59ba5c in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
>     ocaml#3 0x5a9220 in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
>     ocaml#4 0x540d6b in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
>     ocaml#5 0x7f2ba8120087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#6 0x7f2ba812014a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 617637580ee48eff08a2bce790e1667ad09f3b69)
>
>   Uninitialized value was stored to memory at
>     #0 0x7f2ba7fb4e9d in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:497:69
>     #1 0x7f2ba7fbc016 in caml_ml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:1207:9
>     ocaml#2 0x59ba5c in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
>     ocaml#3 0x5a9220 in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
>     ocaml#4 0x540d6b in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
>     ocaml#5 0x7f2ba8120087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#6 0x7f2ba812014a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 617637580ee48eff08a2bce790e1667ad09f3b69)
>
>   Uninitialized value was created by an allocation of 'buf' in the stack frame
>     #0 0x7f2ba7fb3dbc in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:402:7
>
```

This is in fact an EV_LIFECYCLE with EV_RING_STOP, which has 0
additional data, and thus msg_length 2:
```
runtime/runtime_events.c:      EV_RUNTIME, (ev_message_type){.runtime=EV_LIFECYCLE}, EV_RING_STOP, 0,
```

Attempting to read from `buf[2]` would read uninitialized data.

Signed-off-by: Edwin Török <[email protected]>
edwintorok added a commit to edwintorok/ocaml that referenced this pull request Jul 14, 2024
The toplevel printer detects cycles by keeping a hashtable of values
that it has already traversed.

However, some OCaml runtime types (at least bigarrays) may be
partially uninitialized, and hashing them at arbitrary program points
may read uninitialized memory. In particular, the OCaml testsuite
fails when running with a memory-sanitizer enabled, as bigarray
printing results in reads to uninitialized memory:

```
==133712==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x4e6d11 in caml_ba_hash /var/home/edwin/git/ocaml/runtime/bigarray.c:486:45
    #1 0x52474a in caml_hash /var/home/edwin/git/ocaml/runtime/hash.c:251:35
    ocaml#2 0x599ebf in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1065:14
    ocaml#3 0x5a909a in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
    ocaml#4 0x540ccb in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
    ocaml#5 0x7f0910abb087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    ocaml#6 0x7f0910abb14a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    ocaml#7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)

  Uninitialized value was created by a heap allocation
    #0 0x47d306 in malloc (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x47d306) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)
    #1 0x4e7960 in caml_ba_alloc /var/home/edwin/git/ocaml/runtime/bigarray.c:246:12
    ocaml#2 0x4e801f in caml_ba_create /var/home/edwin/git/ocaml/runtime/bigarray.c:673:10
    ocaml#3 0x59b8fc in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
    ocaml#4 0x5a909a in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
    ocaml#5 0x540ccb in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
    ocaml#6 0x7f0910abb087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    ocaml#7 0x7f0910abb14a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    ocaml#8 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)

SUMMARY: MemorySanitizer: use-of-uninitialized-value /var/home/edwin/git/ocaml/runtime/bigarray.c:486:45 in caml_ba_hash
```

The only use of hashing in genprintval is to avoid cycles, that is, it
is only useful for OCaml values that contain other OCaml values
(including possibly themselves). Bigarrays cannot introduce cycles,
and they are always printed as "<abstr>" anyway.

The present commit proposes to be more conservative in which values
are hashed by the cycle detector to avoid this issue: we skip hashing
any value with tag above No_scan_tag -- which may not contain any
OCaml values.

Suggested-by: Gabriel Scherer <[email protected]>
Signed-off-by: Edwin Török <[email protected]>
edwintorok added a commit to edwintorok/ocaml that referenced this pull request Jul 15, 2024
Found by -fsanitize=memory -fsanitize-memory-track-origins:
```
> ==102752==WARNING: MemorySanitizer: use-of-uninitialized-value
>     #0 0x7f2ba7fb4ea4 in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:496:18
>     #1 0x7f2ba7fbc016 in caml_ml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:1207:9
>     ocaml#2 0x59ba5c in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
>     ocaml#3 0x5a9220 in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
>     ocaml#4 0x540d6b in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
>     ocaml#5 0x7f2ba8120087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#6 0x7f2ba812014a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 617637580ee48eff08a2bce790e1667ad09f3b69)
>
>   Uninitialized value was stored to memory at
>     #0 0x7f2ba7fb4e9d in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:497:69
>     #1 0x7f2ba7fbc016 in caml_ml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:1207:9
>     ocaml#2 0x59ba5c in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
>     ocaml#3 0x5a9220 in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
>     ocaml#4 0x540d6b in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
>     ocaml#5 0x7f2ba8120087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#6 0x7f2ba812014a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 617637580ee48eff08a2bce790e1667ad09f3b69)
>
>   Uninitialized value was created by an allocation of 'buf' in the stack frame
>     #0 0x7f2ba7fb3dbc in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:402:7
>
```

This is in fact an EV_LIFECYCLE with EV_RING_STOP, which has 0
additional data, and thus msg_length 2:
```
runtime/runtime_events.c:      EV_RUNTIME, (ev_message_type){.runtime=EV_LIFECYCLE}, EV_RING_STOP, 0,
```

Attempting to read from `buf[2]` would read uninitialized data.

Signed-off-by: Edwin Török <[email protected]>
edwintorok added a commit to edwintorok/ocaml that referenced this pull request Jul 24, 2024
Found by -fsanitize=memory -fsanitize-memory-track-origins:
```
> ==102752==WARNING: MemorySanitizer: use-of-uninitialized-value
>     #0 0x7f2ba7fb4ea4 in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:496:18
>     #1 0x7f2ba7fbc016 in caml_ml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:1207:9
>     ocaml#2 0x59ba5c in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
>     ocaml#3 0x5a9220 in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
>     ocaml#4 0x540d6b in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
>     ocaml#5 0x7f2ba8120087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#6 0x7f2ba812014a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 617637580ee48eff08a2bce790e1667ad09f3b69)
>
>   Uninitialized value was stored to memory at
>     #0 0x7f2ba7fb4e9d in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:497:69
>     #1 0x7f2ba7fbc016 in caml_ml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:1207:9
>     ocaml#2 0x59ba5c in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
>     ocaml#3 0x5a9220 in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
>     ocaml#4 0x540d6b in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
>     ocaml#5 0x7f2ba8120087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#6 0x7f2ba812014a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
>     ocaml#7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 617637580ee48eff08a2bce790e1667ad09f3b69)
>
>   Uninitialized value was created by an allocation of 'buf' in the stack frame
>     #0 0x7f2ba7fb3dbc in caml_runtime_events_read_poll /var/home/edwin/git/ocaml/otherlibs/runtime_events/runtime_events_consumer.c:402:7
>
```

This is in fact an EV_LIFECYCLE with EV_RING_STOP, which has 0
additional data, and thus msg_length 2:
```
runtime/runtime_events.c:      EV_RUNTIME, (ev_message_type){.runtime=EV_LIFECYCLE}, EV_RING_STOP, 0,
```

Attempting to read from `buf[2]` would read uninitialized data.

Signed-off-by: Edwin Török <[email protected]>
nojb pushed a commit that referenced this pull request Jul 24, 2024
The toplevel printer detects cycles by keeping a hashtable of values
that it has already traversed.

However, some OCaml runtime types (at least bigarrays) may be
partially uninitialized, and hashing them at arbitrary program points
may read uninitialized memory. In particular, the OCaml testsuite
fails when running with a memory-sanitizer enabled, as bigarray
printing results in reads to uninitialized memory:

```
==133712==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x4e6d11 in caml_ba_hash /var/home/edwin/git/ocaml/runtime/bigarray.c:486:45
    #1 0x52474a in caml_hash /var/home/edwin/git/ocaml/runtime/hash.c:251:35
    #2 0x599ebf in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1065:14
    #3 0x5a909a in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
    #4 0x540ccb in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
    #5 0x7f0910abb087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    #6 0x7f0910abb14a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    #7 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)

  Uninitialized value was created by a heap allocation
    #0 0x47d306 in malloc (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x47d306) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)
    #1 0x4e7960 in caml_ba_alloc /var/home/edwin/git/ocaml/runtime/bigarray.c:246:12
    #2 0x4e801f in caml_ba_create /var/home/edwin/git/ocaml/runtime/bigarray.c:673:10
    #3 0x59b8fc in caml_interprete /var/home/edwin/git/ocaml/runtime/interp.c:1058:14
    #4 0x5a909a in caml_main /var/home/edwin/git/ocaml/runtime/startup_byt.c:575:9
    #5 0x540ccb in main /var/home/edwin/git/ocaml/runtime/main.c:37:3
    #6 0x7f0910abb087 in __libc_start_call_main (/lib64/libc.so.6+0x2a087) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    #7 0x7f0910abb14a in __libc_start_main@GLIBC_2.2.5 (/lib64/libc.so.6+0x2a14a) (BuildId: 8f53abaad945a669f2bdcd25f471d80e077568ef)
    #8 0x441804 in _start (/var/home/edwin/git/ocaml/runtime/ocamlrun+0x441804) (BuildId: 7a60eef57e1c2baf770bc38d10d6c227e60ead37)

SUMMARY: MemorySanitizer: use-of-uninitialized-value /var/home/edwin/git/ocaml/runtime/bigarray.c:486:45 in caml_ba_hash
```

The only use of hashing in genprintval is to avoid cycles, that is, it
is only useful for OCaml values that contain other OCaml values
(including possibly themselves). Bigarrays cannot introduce cycles,
and they are always printed as "<abstr>" anyway.

The present commit proposes to be more conservative in which values
are hashed by the cycle detector to avoid this issue: we skip hashing
any value with tag above No_scan_tag -- which may not contain any
OCaml values.

Suggested-by: Gabriel Scherer <[email protected]>

Signed-off-by: Edwin Török <[email protected]>
Co-authored-by: Edwin Török <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.