Ensure that register typing constraints are respected at join points by mshinwell · Pull Request #555 · ocaml/ocaml

mshinwell · 2016-04-19T11:45:31Z

It has been known for some time that great care has to be taken when using Obj.magic to ensure that, at join points in the control flow graph, a register "known to hold only immediates" does not end up being used to hold a value that might be a pointer. If this happens, it will not be registered as a GC root, which can cause it to become invalid. (The typical fix is something like using None instead of "unit".)

After having fixed a piece of code having that problem this morning it became apparent that the following function, which does not use Obj.magic, suffers from the same problem:

type 'a typ = Int : int typ | Ptr : int list typ

let f (type a) (t : a typ) (p : int list) : a =
  match t with
  | Int -> 100
  | Ptr -> p

If the function "f" is inlined into another function then the situation above can arise, as exhibited in the test case on this pull request, which segfaults.

The only reasonable solution it would appear at present is to ensure that at join points we respect the register typing. This may produce slightly worse code in some cases. For such cases the correct solution to solve this problem is probably the propagation of more information about what values might be held in which identifiers right through from the type checker.

I think this is probably serious enough to warrant putting in 4.03.

ghost · 2016-04-19T13:05:06Z

asmcomp/cmm.ml

+    (* Float unboxing code must be sure to avoid this case. *)
+    assert false
+
+let ge_component comp1 comp2 =


Couldn't this be implemented as comp1 = lub_component comp1 comp2? That would avoid having to check the code twice

I tend to think it's clearer written out, actually.

lpw25 · 2016-04-19T13:15:37Z

Patch looks correct to me.

damiendoligez · 2016-04-19T15:22:08Z

Devil's advocate here: the bug has been here since we introduced GADTs in 4.00.0. Is it really urgent to fix it now? What's your assessment of the risk level for this change?

gasche · 2016-04-19T18:39:30Z

My understanding is that the issue is sensibly more severe in presence of flambda than it was in previous releases, as it will aggressively inline these cases.

mshinwell · 2016-04-20T12:29:20Z

I'm also kind of surprised this hasn't caused trouble before, given how long the GADT support has been merged for.

It doesn't really seem an ideal change just before a release. On balance I think I'm pretty convinced about the soundness of the change, but I'm not completely sure about potential performance degradation. @xavierleroy : do you have an opinion on that?

gasche · 2016-04-20T12:47:16Z

Can you characterize the problematic joint points? Are they exactly those that match on a GADT at the source level? If so, that would make it easier to reason about potential performance impact (most of the code doesn't use GADTs and would be unaffected, and I understand there are a few performance-sensitive uses of GADTs that could be inspected manually).

Have you tried tracing the cases where the join decreases precision, for example on a compiler bootstrap?

mshinwell · 2016-04-20T14:19:33Z

I'm not sure I have an exact characterisation of the problematic join points, but I think a GADT match must be involved, with one of the branches returning a result that forces the backend to allocate a register of type "Int" for that result (in fact one of type "Addr" would probably cause it to go wrong too) and another branch returning a value that might be boxed. In some cases the backend would then use the "Int" register as the result for the whole match, which might then go wrong if the live range for that register then extends forwards across a GC point (and if the boxed branch is taken in the match).

For performance, it's actually cases that do not use GADTs, but might end up getting penalised, which we're worried about. One way we might be able to assess this is using OCamlPro's benchmarking suite which we were using for flambda, perhaps.

mshinwell · 2016-04-20T14:33:42Z

@garrigue Do you have an opinion on this one?

lpw25 · 2016-04-20T14:41:19Z

Patch looks correct to me.

Actually one thing I am not sure about: Is Addr supposed to include Int? At the moment if one branch returns Addr and another returns Int then they will both be stored in an Addr register. Is that correct? If Addr is allowed to hold Int then it is not clear to me what the difference between Val and Addr is.

mshinwell · 2016-04-20T14:48:14Z

Val is registered with the GC, Addr is not, as far as I recall.

mshinwell · 2016-04-20T14:48:26Z

(I will check about having Int in Addr)

mshinwell · 2016-04-20T15:01:38Z

Hmm, so I think if we use Addr occasionally rather than Int, it isn't that much of a big deal. It will inhibit CSE at certain points. That might be all, in fact, with the current code.

bluddy · 2016-04-20T15:12:34Z

Not to hijack the discussion, but could someone explain to me what the join points are?

mshinwell · 2016-04-20T15:15:40Z

When two control flow paths come together, such as at the end of a conditional expression.

bluddy · 2016-04-20T15:54:41Z

Thanks. And normally join points are places where we may register GC roots?

mshinwell · 2016-04-20T16:15:28Z

No. The roots are effectively "registered" at a point where one OCaml function may call another. The problem is that at the join point, when generating code, a decision has to be made about which register to use to hold the result of the conditional. In some cases the backend reuses the result register from one of the two code paths (and generates a move at the end of the other code path into that register). However this is wrong if the reused register will not be registered as a root (at any subsequent point where it might need to be, before it becomes dead) and the value being moved into it on the other code path may need to be registered.

bluddy · 2016-04-20T17:01:32Z

Awesome. Thanks again.

xavierleroy · 2016-04-20T17:04:49Z

asmcomp/cmm.ml

+  | Int, Val -> Val
+  | Addr, Int -> Addr
+  | Addr, Addr -> Addr
+  | Addr, Val -> Val


"Addr" denotes a derived pointer that might point in the middle of a heap block. The GC is unable to treat such derived pointers correctly: it cannot follow them and it cannot update them after moving the underlying object. This is why it is a compile-time fatal error to have a variable of type Addr live across a GC or call point.

Now, if you have a conditional where x : Addr in one branch and x : Val in the other, x may be a derived pointer and therefore must be given type Addr after the join point. If it dies before the next GC / call point, everything will be fine. Otherwise it is a serious code generation error and the compiler should bomb.

In summary: Addr, Val -> Addr and symmetrically.

Ah yes, my mistake, the relation is indeed the other way around.

mshinwell · 2016-04-20T17:11:02Z

Patch updated following @xavierleroy 's comment

xavierleroy · 2016-04-20T17:11:26Z

Apart the Addr corner case, the patch looks good to me. I'm not worried at all about performance degradation: most C-- variables, and especially most of those involved in a join point, are of type Val already.

xavierleroy · 2016-04-20T17:16:29Z

One last, cosmetic suggestion. While checking the code I drew the lattice of types on a sheet of paper. What about including it in a comment as ASCII art? E.g.

      Addr     Float
       |
      Val
       |
      Int

mshinwell · 2016-04-20T17:21:26Z

Good idea. I've added to the comment a note about [Addr] as well.

stedolan · 2016-04-20T23:24:21Z

testsuite/tests/asmcomp/register_typing.ml

+let allocate_garbage () =
+  for i = 0 to 100 do
+    ignore (Array.make 200 0.0)
+  done


Whether or not this function allocates anything on the minor heap is dependent on word size.

On 32-bit machines, a 200-float array is 400 words, bigger than Max_young_wosize, and nothing is allocated on the minor heap. Is this the intent?

stedolan · 2016-04-20T23:33:31Z

Even in theory, what is the possible performance degradation of this patch? That the frame tables may be slightly larger?

lpw25 · 2016-04-21T09:42:20Z

Even in theory, what is the possible performance degradation of this patch?

Additional roots for the GC to try to mark even though they are always integers.

xavierleroy · 2016-04-21T16:46:33Z

Yes, larger frame tables, and a bit more time spent in the GC scanning the stack.

…#555)

runtime: CAML_TRACE_VERSION is now set to a Multicore specific value

…10354 and PR#10387) (ocaml#555) * Refactor Proc.op_is_pure and fix Mach.operation_can_raise These two predicates test properties of Mach operations. For the standard operations, the results are independent of the target platform. For the platform-specific operations, results vary. The "is pure" predicate was implemented in a platform-specific file, $ARCH/proc.ml. The treatment of standard operations was duplicated. The "can raise" predicate was implemented in a platform-independent file, mach.ml. All specific operations were assumed not to raise, which is incorrect for ARM and ARM64 and their "shiftcheckbound" specific operation. (See ocaml#10339.) This commit refactors the two predicates as follows: - `Arch.operation_is_pure` and `Arch.operation_can_raise` predicates are added to each platform description file. They deal with specific operations only. - `Mach.operation_is_pure` was added and `Mach.operation_can_raise` was fixed to implement the test for standard operations and delegate to the Arch functions for specific operations. * Fix handling of exception-raising specific operations during spilling The ARM and ARM64 architectures have `Ishiftcheckbound` specific instructions that can raise an Invalid_argument exception. This exception can, in turn, be handled in the same function (see PR#10339 for an example). However, these specific instructions were not taken into account during insertion of spill code in asmcomp/spill.ml. The consequence is a variable that is reloaded from stack in the exception handler, yet has not been spilled to stack before. This commit fixes the issue by using the recently-fixed `Mach.operation_can_raise` predicate to handle operations during spill code insertion, instead of an ad-hoc pattern-matching. Fixes: PR#10339 * Fix handling of exception-raising specific operations during liveness analysis (ocaml#10387) This is a follow-up to 15e6354 and ocaml#10354. Use the `Mach.operation_can_raise` predicate instead of an ad-hoc pattern matching to spot operations where the variables live at entry to the enclosing exception handler must also be live. The ad-hoc pattern matching was missing the `Ishiftcheckbound` specific operations of ARM and ARM64. * Handle flambda-backend operations * Update cfg * Iprobe_is_enabled is pure * Exhaustive match * Fix after rebase * Fix arm64 after rebase and cleanup Co-authored-by: Xavier Leroy <[email protected]> Co-authored-by: Xavier Leroy <[email protected]>

fe8a98b flambda-backend: Save Mach as Cfg after Selection (ocaml#624) 2b205d8 flambda-backend: Clean up algorithms (ocaml#611) 524f0b4 flambda-backend: Initial refactoring of To_cmm (ocaml#619) 0bf75de flambda-backend: Refactor and correct the "is pure" and "can raise" (port upstream PR#10354 and PR#10387) (ocaml#555) d234bfd flambda-backend: Cpp mangling is now a configuration option (ocaml#614) 20fc614 flambda-backend: Check that stack frames are not too large (ocaml#10085) (ocaml#561) 5fc2e95 flambda-backend: Allow CSE of immutable loads across stores (port upstream PR#9562) (ocaml#562) 2a650de flambda-backend: Backport commit fc95347 from trunk (ocaml#584) 31651b8 flambda-backend: Improved ARM64 code generation (port upstream PR#9937) (ocaml#556) f0b6d68 flambda-backend: Simplify processing and remove dead code (error paths) in asmlink (port upstream PR#9943) (ocaml#557) 90c6746 flambda-backend: Improve code-generation for inlined comparisons (port upstream PR#10228) (ocaml#563) git-subtree-dir: ocaml git-subtree-split: fe8a98b

…port upstream PR#10354 and PR#10387) (ocaml#555) * Refactor Proc.op_is_pure and fix Mach.operation_can_raise These two predicates test properties of Mach operations. For the standard operations, the results are independent of the target platform. For the platform-specific operations, results vary. The "is pure" predicate was implemented in a platform-specific file, $ARCH/proc.ml. The treatment of standard operations was duplicated. The "can raise" predicate was implemented in a platform-independent file, mach.ml. All specific operations were assumed not to raise, which is incorrect for ARM and ARM64 and their "shiftcheckbound" specific operation. (See ocaml#10339.) This commit refactors the two predicates as follows: - `Arch.operation_is_pure` and `Arch.operation_can_raise` predicates are added to each platform description file. They deal with specific operations only. - `Mach.operation_is_pure` was added and `Mach.operation_can_raise` was fixed to implement the test for standard operations and delegate to the Arch functions for specific operations. * Fix handling of exception-raising specific operations during spilling The ARM and ARM64 architectures have `Ishiftcheckbound` specific instructions that can raise an Invalid_argument exception. This exception can, in turn, be handled in the same function (see PR#10339 for an example). However, these specific instructions were not taken into account during insertion of spill code in asmcomp/spill.ml. The consequence is a variable that is reloaded from stack in the exception handler, yet has not been spilled to stack before. This commit fixes the issue by using the recently-fixed `Mach.operation_can_raise` predicate to handle operations during spill code insertion, instead of an ad-hoc pattern-matching. Fixes: PR#10339 * Fix handling of exception-raising specific operations during liveness analysis (ocaml#10387) This is a follow-up to 15e6354 and ocaml#10354. Use the `Mach.operation_can_raise` predicate instead of an ad-hoc pattern matching to spot operations where the variables live at entry to the enclosing exception handler must also be live. The ad-hoc pattern matching was missing the `Ishiftcheckbound` specific operations of ARM and ARM64. * Handle flambda-backend operations * Update cfg * Iprobe_is_enabled is pure * Exhaustive match * Fix after rebase * Fix arm64 after rebase and cleanup Co-authored-by: Xavier Leroy <[email protected]> Co-authored-by: Xavier Leroy <[email protected]>

mshinwell added the consider-for-release label Apr 19, 2016

ghost reviewed Apr 19, 2016
View reviewed changes

damiendoligez added this to the 4.03.0 milestone Apr 20, 2016

xavierleroy reviewed Apr 20, 2016
View reviewed changes

Ensure that register typing constraints are respected at join points

37de8a1

mshinwell force-pushed the join_reg_typing branch from f522408 to 37de8a1 Compare April 20, 2016 17:21

stedolan reviewed Apr 20, 2016
View reviewed changes

damiendoligez added the approved label Apr 22, 2016

mshinwell merged commit 180e378 into ocaml:4.03 Apr 22, 2016

mshinwell mentioned this pull request Jun 1, 2016

Ensure that register typing constraints are respected (again) #600

Merged

mshinwell mentioned this pull request Jun 23, 2016

Simplify identity matches #632

Closed

lthls mentioned this pull request Jan 23, 2018

Replace constant pointers by regular integers #1580

Merged

mshinwell added a commit to mshinwell/ocaml that referenced this pull request Jul 12, 2021

Move everything except Code_id.t out of Function_declaration.t (ocaml…

80ab5af

…#555)

EduardoRFS pushed a commit to esy-ocaml/ocaml that referenced this pull request Jul 23, 2021

Merge pull request ocaml#555 from Engil/caml_trace_version_update

26402bf

runtime: CAML_TRACE_VERSION is now set to a Multicore specific value

EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024

Fix typo in rs_00_guidelines.md (ocaml#555)

be1f2f2

Conversation

mshinwell commented Apr 19, 2016

Uh oh!

ghost Apr 19, 2016

Choose a reason for hiding this comment

Uh oh!

mshinwell Apr 19, 2016

Choose a reason for hiding this comment

Uh oh!

lpw25 commented Apr 19, 2016

Uh oh!

damiendoligez commented Apr 19, 2016

Uh oh!

gasche commented Apr 19, 2016

Uh oh!

mshinwell commented Apr 20, 2016

Uh oh!

gasche commented Apr 20, 2016

Uh oh!

mshinwell commented Apr 20, 2016

Uh oh!

mshinwell commented Apr 20, 2016

Uh oh!

lpw25 commented Apr 20, 2016

Uh oh!

mshinwell commented Apr 20, 2016

Uh oh!

mshinwell commented Apr 20, 2016

Uh oh!

mshinwell commented Apr 20, 2016

Uh oh!

bluddy commented Apr 20, 2016

Uh oh!

mshinwell commented Apr 20, 2016

Uh oh!

bluddy commented Apr 20, 2016

Uh oh!

mshinwell commented Apr 20, 2016

Uh oh!

bluddy commented Apr 20, 2016

Uh oh!

xavierleroy Apr 20, 2016

Choose a reason for hiding this comment

Uh oh!

mshinwell Apr 20, 2016

Choose a reason for hiding this comment

Uh oh!

mshinwell commented Apr 20, 2016

Uh oh!

xavierleroy commented Apr 20, 2016

Uh oh!

xavierleroy commented Apr 20, 2016

Uh oh!

mshinwell commented Apr 20, 2016

Uh oh!

stedolan Apr 20, 2016

Choose a reason for hiding this comment

Uh oh!

stedolan commented Apr 20, 2016

Uh oh!

lpw25 commented Apr 21, 2016

Uh oh!

xavierleroy commented Apr 21, 2016

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants