-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Remove the SIGTRAP-based bounds checking on POWER #12540
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
(note that I'll be AFK from this evening until september 20th, so there is no hurry to review this and I won't be able to address any comments until then) |
xavierleroy
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a very quick look, this looks good, but you need to handle the "far" case where the checkbound instruction is so far from the end of the function that the range of the conditional jump is exceeded. You can see how it's done in the ARM64 port. But I'm sure I wrote this code (for POWER) at some point in the past, so if I can find it soon I'll add it to this PR.
|
I didn't find my old code, and I'm pretty sure it was incomplete anyway, but I added the branch relaxation part, and also fixed the backtraces, which were not quite right. The OCaml test suite passes. I didn't try to re-run #12482 because I'm not familiar with OPAM switches based on working sources for the core system. |
|
There's a variant of this code that uses the "conditional branch and link" POWER instructions to share a single call to caml_ml_array_bound_error, even in -g mode. This has the potential to reduce code size in -g mode. However, it interacts negatively (= currently generates wrong code) with the instruction scheduling pass, so I'm not inclined to pursue this direction. |
|
This should also restore FreeBSD ppc64le support, since the only thing stopping that was the SIGTRAP handling (#10837). I can test this on there when back from my ICFP travels in a week. |
|
I can confirm that this fixes the ppc64 crashes we have observed on We're still able to trigger (non-crashing) failures on our model-based test of let init_sut () = Float.Array.make floatarray_size 1.0Passing it through a Seq-List-Seq conversion dance Here's a stand alone reproducer: let floatarray_size = 16
let reference = List.init floatarray_size (fun _ -> 1.0)
let _ =
for i=1 to 10_000 do
let t = Float.Array.make floatarray_size 1.0 in
assert (Seq.equal Float.equal (List.to_seq reference) (List.to_seq (List.of_seq (Float.Array.to_seq t))))
doneOn my Linux box this runs flawlessly, but on ppc64 this fails as follows: |
|
Well spotted. This seems to be caused by |
|
@xavierleroy: thanks for the relaxation code. I have rebased the PR and credited you for these changes. |
Thank you! A quick retest on a POWER9 machine is positive. We could wait for an extra external review (@mshinwell ? @kayceesrk ? @avsm ?), but my feeling is that this PR is good enough already. What would be nice, independently of the extra review, is an OPAM-wide testing, and I don't know how to arrange for that. |
|
I'll review the PR this week. |
kayceesrk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reviewed the PR by comparing the changes with the arm64 backend. I believe that the code is doing the right thing.
asmcomp/power/arch.ml
Outdated
| | Ipoll_far of { return_label : cmm_label option } | ||
| (* poll point in large functions *) | ||
| | Icheckbound_far (* bounds check in large functions *) | ||
| | Icheckbound_imm_far of int (* bounds check in large functions *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not suggesting a change in this PR, but observing that arm64/arch.ml calls these operations
| Ifar_alloc
| Ifar_poll
| Ifar_intop_checkbound
| Ifar_intop_imm_checkbound
It will be useful to standardize the names of these operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't write the ARM64 branch relaxation code, so I cannot comment on the choice of names that was made there. Let me just point out that the relaxation code was first developed for POWER, with the name Ialloc_far, before it was applied to ARM64, with the Ifar_* names. Then, poll points were added, with Ipoll_far name in POWER. So, I'm just being consistent with the earlier POWER names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not suggesting that we standardize to the ARM64 names. I prefer the *_far scheme used by POWER.
| end | ||
|
|
||
| let bound_error_label env dbg = | ||
| if !Clflags.debug then begin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not familiar with this code. I'm reviewing the code by comparing this with the arm64 backend. I wondered why this code is different from the arm64 version of this function:
Lines 164 to 175 in 22084b7
| let bound_error_label env dbg = | |
| if !Clflags.debug || env.bound_error_sites = [] then begin | |
| let lbl_bound_error = new_label() in | |
| let lbl_frame = record_frame_label env Reg.Set.empty (Dbg_other dbg) in | |
| env.bound_error_sites <- | |
| { bd_lbl = lbl_bound_error; | |
| bd_frame = lbl_frame; | |
| } :: env.bound_error_sites; | |
| lbl_bound_error | |
| end else begin | |
| let bd = List.hd env.bound_error_sites in bd.bd_lbl | |
| end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be frank, the ARM64 code smells, e.g. the use of List.hd. The code I wrote for POWER is clearer, if I can say so myself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the record, I intend to work on unifying this logic between platforms (but not in this PR). I have also noticed riscv64 will always create new labels regardless of Cflags.debug...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @dustanddreams. This would be useful.
This switches the code generation back to the usual "compare and branch" logic used by all other native backends.
|
Rebased with |
|
Thanks all for the contributions to this PR. Time to merge! |
As mentioned in #12482, the
SIGTRAP-based logic to perform array bounds checks on POWER systems is a bit brittle and causes some tests to misbehave.This PR switches POWER back to the usual "compare and branch to
caml_ml_array_bound_error" logic used by all other native backends.Also linking to #12276 where this code was introduced.