Skip to content

Ensure that register typing constraints are respected at join points#555

Merged
mshinwell merged 1 commit intoocaml:4.03from
mshinwell:join_reg_typing
Apr 22, 2016
Merged

Ensure that register typing constraints are respected at join points#555
mshinwell merged 1 commit intoocaml:4.03from
mshinwell:join_reg_typing

Conversation

@mshinwell
Copy link
Contributor

It has been known for some time that great care has to be taken when using Obj.magic to ensure that, at join points in the control flow graph, a register "known to hold only immediates" does not end up being used to hold a value that might be a pointer. If this happens, it will not be registered as a GC root, which can cause it to become invalid. (The typical fix is something like using None instead of "unit".)

After having fixed a piece of code having that problem this morning it became apparent that the following function, which does not use Obj.magic, suffers from the same problem:

type 'a typ = Int : int typ | Ptr : int list typ

let f (type a) (t : a typ) (p : int list) : a =
  match t with
  | Int -> 100
  | Ptr -> p

If the function "f" is inlined into another function then the situation above can arise, as exhibited in the test case on this pull request, which segfaults.

The only reasonable solution it would appear at present is to ensure that at join points we respect the register typing. This may produce slightly worse code in some cases. For such cases the correct solution to solve this problem is probably the propagation of more information about what values might be held in which identifiers right through from the type checker.

I think this is probably serious enough to warrant putting in 4.03.

(* Float unboxing code must be sure to avoid this case. *)
assert false

let ge_component comp1 comp2 =
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't this be implemented as comp1 = lub_component comp1 comp2? That would avoid having to check the code twice

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to think it's clearer written out, actually.

@lpw25
Copy link
Contributor

lpw25 commented Apr 19, 2016

Patch looks correct to me.

@damiendoligez
Copy link
Member

Devil's advocate here: the bug has been here since we introduced GADTs in 4.00.0. Is it really urgent to fix it now? What's your assessment of the risk level for this change?

@gasche
Copy link
Member

gasche commented Apr 19, 2016

My understanding is that the issue is sensibly more severe in presence of flambda than it was in previous releases, as it will aggressively inline these cases.

@damiendoligez damiendoligez added this to the 4.03.0 milestone Apr 20, 2016
@mshinwell
Copy link
Contributor Author

I'm also kind of surprised this hasn't caused trouble before, given how long the GADT support has been merged for.

It doesn't really seem an ideal change just before a release. On balance I think I'm pretty convinced about the soundness of the change, but I'm not completely sure about potential performance degradation. @xavierleroy : do you have an opinion on that?

@gasche
Copy link
Member

gasche commented Apr 20, 2016

Can you characterize the problematic joint points? Are they exactly those that match on a GADT at the source level? If so, that would make it easier to reason about potential performance impact (most of the code doesn't use GADTs and would be unaffected, and I understand there are a few performance-sensitive uses of GADTs that could be inspected manually).

Have you tried tracing the cases where the join decreases precision, for example on a compiler bootstrap?

@mshinwell
Copy link
Contributor Author

I'm not sure I have an exact characterisation of the problematic join points, but I think a GADT match must be involved, with one of the branches returning a result that forces the backend to allocate a register of type "Int" for that result (in fact one of type "Addr" would probably cause it to go wrong too) and another branch returning a value that might be boxed. In some cases the backend would then use the "Int" register as the result for the whole match, which might then go wrong if the live range for that register then extends forwards across a GC point (and if the boxed branch is taken in the match).

For performance, it's actually cases that do not use GADTs, but might end up getting penalised, which we're worried about. One way we might be able to assess this is using OCamlPro's benchmarking suite which we were using for flambda, perhaps.

@mshinwell
Copy link
Contributor Author

@garrigue Do you have an opinion on this one?

@lpw25
Copy link
Contributor

lpw25 commented Apr 20, 2016

Patch looks correct to me.

Actually one thing I am not sure about: Is Addr supposed to include Int? At the moment if one branch returns Addr and another returns Int then they will both be stored in an Addr register. Is that correct? If Addr is allowed to hold Int then it is not clear to me what the difference between Val and Addr is.

@mshinwell
Copy link
Contributor Author

Val is registered with the GC, Addr is not, as far as I recall.

@mshinwell
Copy link
Contributor Author

(I will check about having Int in Addr)

@mshinwell
Copy link
Contributor Author

Hmm, so I think if we use Addr occasionally rather than Int, it isn't that much of a big deal. It will inhibit CSE at certain points. That might be all, in fact, with the current code.

@bluddy
Copy link
Contributor

bluddy commented Apr 20, 2016

Not to hijack the discussion, but could someone explain to me what the join points are?

@mshinwell
Copy link
Contributor Author

When two control flow paths come together, such as at the end of a conditional expression.

@bluddy
Copy link
Contributor

bluddy commented Apr 20, 2016

Thanks. And normally join points are places where we may register GC roots?

@mshinwell
Copy link
Contributor Author

No. The roots are effectively "registered" at a point where one OCaml function may call another. The problem is that at the join point, when generating code, a decision has to be made about which register to use to hold the result of the conditional. In some cases the backend reuses the result register from one of the two code paths (and generates a move at the end of the other code path into that register). However this is wrong if the reused register will not be registered as a root (at any subsequent point where it might need to be, before it becomes dead) and the value being moved into it on the other code path may need to be registered.

@bluddy
Copy link
Contributor

bluddy commented Apr 20, 2016

Awesome. Thanks again.

asmcomp/cmm.ml Outdated
| Int, Val -> Val
| Addr, Int -> Addr
| Addr, Addr -> Addr
| Addr, Val -> Val
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Addr" denotes a derived pointer that might point in the middle of a heap block. The GC is unable to treat such derived pointers correctly: it cannot follow them and it cannot update them after moving the underlying object. This is why it is a compile-time fatal error to have a variable of type Addr live across a GC or call point.

Now, if you have a conditional where x : Addr in one branch and x : Val in the other, x may be a derived pointer and therefore must be given type Addr after the join point. If it dies before the next GC / call point, everything will be fine. Otherwise it is a serious code generation error and the compiler should bomb.

In summary: Addr, Val -> Addr and symmetrically.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, my mistake, the relation is indeed the other way around.

@mshinwell
Copy link
Contributor Author

Patch updated following @xavierleroy 's comment

@xavierleroy
Copy link
Contributor

Apart the Addr corner case, the patch looks good to me. I'm not worried at all about performance degradation: most C-- variables, and especially most of those involved in a join point, are of type Val already.

@xavierleroy
Copy link
Contributor

One last, cosmetic suggestion. While checking the code I drew the lattice of types on a sheet of paper. What about including it in a comment as ASCII art? E.g.

      Addr     Float
       |
      Val
       |
      Int

@mshinwell
Copy link
Contributor Author

Good idea. I've added to the comment a note about [Addr] as well.

let allocate_garbage () =
for i = 0 to 100 do
ignore (Array.make 200 0.0)
done
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whether or not this function allocates anything on the minor heap is dependent on word size.

On 32-bit machines, a 200-float array is 400 words, bigger than Max_young_wosize, and nothing is allocated on the minor heap. Is this the intent?

@stedolan
Copy link
Contributor

Even in theory, what is the possible performance degradation of this patch? That the frame tables may be slightly larger?

@lpw25
Copy link
Contributor

lpw25 commented Apr 21, 2016

Even in theory, what is the possible performance degradation of this patch?

Additional roots for the GC to try to mark even though they are always integers.

@xavierleroy
Copy link
Contributor

Yes, larger frame tables, and a bit more time spent in the GC scanning the stack.

@mshinwell mshinwell merged commit 180e378 into ocaml:4.03 Apr 22, 2016
mshinwell added a commit to mshinwell/ocaml that referenced this pull request Jul 12, 2021
EduardoRFS pushed a commit to esy-ocaml/ocaml that referenced this pull request Jul 23, 2021
runtime: CAML_TRACE_VERSION is now set to a Multicore specific value
stedolan pushed a commit to stedolan/ocaml that referenced this pull request May 24, 2022
…10354 and PR#10387) (ocaml#555)

* Refactor Proc.op_is_pure and fix Mach.operation_can_raise

These two predicates test properties of Mach operations.  For the
standard operations, the results are independent of the target
platform.  For the platform-specific operations, results vary.

The "is pure" predicate was implemented in a platform-specific file,
$ARCH/proc.ml. The treatment of standard operations was duplicated.

The "can raise" predicate was implemented in a platform-independent file,
mach.ml.  All specific operations were assumed not to raise, which is
incorrect for ARM and ARM64 and their "shiftcheckbound" specific operation.
(See ocaml#10339.)

This commit refactors the two predicates as follows:
- `Arch.operation_is_pure` and `Arch.operation_can_raise` predicates
  are added to each platform description file.  They deal with
  specific operations only.
- `Mach.operation_is_pure` was added and `Mach.operation_can_raise`
  was fixed to implement the test for standard operations and
  delegate to the Arch functions for specific operations.

* Fix handling of exception-raising specific operations during spilling

The ARM and ARM64 architectures have `Ishiftcheckbound` specific instructions
that can raise an Invalid_argument exception.  This exception can, in turn,
be handled in the same function (see PR#10339 for an example).

However, these specific instructions were not taken into account
during insertion of spill code in asmcomp/spill.ml.  The consequence
is a variable that is reloaded from stack in the exception handler,
yet has not been spilled to stack before.

This commit fixes the issue by using the recently-fixed
`Mach.operation_can_raise` predicate to handle operations during spill
code insertion, instead of an ad-hoc pattern-matching.

Fixes: PR#10339

* Fix handling of exception-raising specific operations during liveness analysis (ocaml#10387)

This is a follow-up to 15e6354 and ocaml#10354.

Use the `Mach.operation_can_raise` predicate instead of an ad-hoc
pattern matching to spot operations where the variables live at entry
to the enclosing exception handler must also be live.

The ad-hoc pattern matching was missing the `Ishiftcheckbound`
specific operations of ARM and ARM64.

* Handle flambda-backend operations

* Update cfg

* Iprobe_is_enabled is pure

* Exhaustive match

* Fix after rebase

* Fix arm64 after rebase and cleanup

Co-authored-by: Xavier Leroy <[email protected]>
Co-authored-by: Xavier Leroy <[email protected]>
stedolan pushed a commit to stedolan/ocaml that referenced this pull request May 24, 2022
fe8a98b flambda-backend: Save Mach as Cfg after Selection (ocaml#624)
2b205d8 flambda-backend: Clean up algorithms (ocaml#611)
524f0b4 flambda-backend: Initial refactoring of To_cmm (ocaml#619)
0bf75de flambda-backend: Refactor and correct the "is pure" and "can raise" (port upstream PR#10354 and PR#10387) (ocaml#555)
d234bfd flambda-backend: Cpp mangling is now a configuration option (ocaml#614)
20fc614 flambda-backend: Check that stack frames are not too large (ocaml#10085) (ocaml#561)
5fc2e95 flambda-backend: Allow CSE of immutable loads across stores (port upstream PR#9562) (ocaml#562)
2a650de flambda-backend: Backport commit fc95347 from trunk (ocaml#584)
31651b8 flambda-backend: Improved ARM64 code generation (port upstream PR#9937) (ocaml#556)
f0b6d68 flambda-backend: Simplify processing and remove dead code (error paths) in asmlink (port upstream PR#9943) (ocaml#557)
90c6746 flambda-backend: Improve code-generation for inlined comparisons (port upstream PR#10228) (ocaml#563)

git-subtree-dir: ocaml
git-subtree-split: fe8a98b
lpw25 pushed a commit to lpw25/ocaml that referenced this pull request Jun 21, 2022
…port upstream PR#10354 and PR#10387) (ocaml#555)

* Refactor Proc.op_is_pure and fix Mach.operation_can_raise

These two predicates test properties of Mach operations.  For the
standard operations, the results are independent of the target
platform.  For the platform-specific operations, results vary.

The "is pure" predicate was implemented in a platform-specific file,
$ARCH/proc.ml. The treatment of standard operations was duplicated.

The "can raise" predicate was implemented in a platform-independent file,
mach.ml.  All specific operations were assumed not to raise, which is
incorrect for ARM and ARM64 and their "shiftcheckbound" specific operation.
(See ocaml#10339.)

This commit refactors the two predicates as follows:
- `Arch.operation_is_pure` and `Arch.operation_can_raise` predicates
  are added to each platform description file.  They deal with
  specific operations only.
- `Mach.operation_is_pure` was added and `Mach.operation_can_raise`
  was fixed to implement the test for standard operations and
  delegate to the Arch functions for specific operations.

* Fix handling of exception-raising specific operations during spilling

The ARM and ARM64 architectures have `Ishiftcheckbound` specific instructions
that can raise an Invalid_argument exception.  This exception can, in turn,
be handled in the same function (see PR#10339 for an example).

However, these specific instructions were not taken into account
during insertion of spill code in asmcomp/spill.ml.  The consequence
is a variable that is reloaded from stack in the exception handler,
yet has not been spilled to stack before.

This commit fixes the issue by using the recently-fixed
`Mach.operation_can_raise` predicate to handle operations during spill
code insertion, instead of an ad-hoc pattern-matching.

Fixes: PR#10339

* Fix handling of exception-raising specific operations during liveness analysis (ocaml#10387)

This is a follow-up to 15e6354 and ocaml#10354.

Use the `Mach.operation_can_raise` predicate instead of an ad-hoc
pattern matching to spot operations where the variables live at entry
to the enclosing exception handler must also be live.

The ad-hoc pattern matching was missing the `Ishiftcheckbound`
specific operations of ARM and ARM64.

* Handle flambda-backend operations

* Update cfg

* Iprobe_is_enabled is pure

* Exhaustive match

* Fix after rebase

* Fix arm64 after rebase and cleanup

Co-authored-by: Xavier Leroy <[email protected]>
Co-authored-by: Xavier Leroy <[email protected]>
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants