Revised handing of calling conventions for external C functions#9752
Revised handing of calling conventions for external C functions#9752xavierleroy merged 3 commits intoocaml:trunkfrom
Conversation
| incr int; l | ||
| end else begin | ||
| let l = stack_slot (make_stack !ofs) Int in | ||
| ofs := !ofs + (if macosx then 4 else 8); |
There was a problem hiding this comment.
Hmm wouldn't it be a better idea to put it in a variable?
let size_int32 = if macosx then 4 else size_int
There was a problem hiding this comment.
Not with the name size_int32, which for me means a solid "4" and nothing else. Possibly with the name size_int32_argument_on_stack. But I prefer the original if macosx then 4 else 8, it's documentation in code ("under macOS this stack slot spans 4 bytes, otherwise it spans 8 bytes").
|
@ctk21 / @kayceesrk - doesn't this change also address the problem we saw in multicore leading to dra27/ocaml-multicore@3b56c5b and ocaml-multicore/ocaml-multicore#352? |
Hm, what about using an (From the land of idle domain-unspecific comments: the patch would be nicer if the arguments to the |
Probably not. Your problem, as far as I understand it, is that you can't poll for pending signals (and possibly run arbitrary OCaml code as signal handlers) at every program point in the Mach representation. For I thought we agreed (at the latest dev meeting we had in person) that insertion of polling points should be performed earlier, on a higher-level representation than Mach or Cmm ? |
I thought about it, of course. But this is not an option like the |
For that specific one, the problem was that the GC is unaware at that specific point that
Indeed, and that plan hasn't changed - this was something which came up completing the Windows port of the native runtime for multicore, which is necessarily over the old/current implementation in Mach. |
Seconding to say that the plan hasn't changed. We're implementing the pass at Lambda. |
|
@xavierleroy Review request received, will try to do this in the next couple of days. |
|
Thank you @dra27 and @kayceesrk for the update concerning the polling points. Coming back to @dra27's initial comment: even before this PR it is conceptually possible for |
|
@xavierleroy I've read most of this now, but I'm finding the use of
I think I might be missing something here, so any clarification would be appreciated... Addendum: something else I've just noticed is that the expansion from |
It's essentially the same code we had before. This PR concentrates on Take ARM 32 bits as an example. Today we have and in the PR we have The new You're right to be suspicious, but this code works in the PR just like it works today. Taking your ARM examples, an unboxed int64 result is materialized as two pseudoregisters of type int, hence For soft FP, I would need to run special tests to check (nobody uses ARM softFP any more, to the point that we might remove support from OCaml), but I think the same thing happens: a float result is also materialized as two pseudoregisters of type int, hence In other 32-bit ports we're not as lucky and we just special-case the "return int64" case. For example the i386 port has such a special case in My initial attempt at this PR was also using |
You're right, something is weird here, in the case of ARM soft FP. We should use |
|
I was able to test on ARM with EABI calling convention (where FP arguments and results are passed in pairs of integer registers). The test found an unrelated problem (see #9782) but otherwise the calling conventions seem correctly implemented. |
|
I would like to merge this PR sooner than later so as to unblock #9699 . @mshinwell, would it be possible for you to complete your review by the end of the week? |
|
@xavierleroy Yes, today or tomorrow I hope! |
mshinwell
left a comment
There was a problem hiding this comment.
I've made a few minor suggestions, but this looks good. A tricky one to review.
|
|
||
| method private iextcall (func, alloc) = | ||
| Iextcall { func; alloc; label_after = Cmm.new_label (); } | ||
| method private iextcall func ty_res ty_args = |
There was a problem hiding this comment.
I found it confusing that ty_res and ty_args are in this order; could we reverse them? I think this would make the code that follows easier to read.
There was a problem hiding this comment.
It's like a C function type :-) I'm not very keen on it but I think it does the job, so I left the code as is.
asmcomp/cmm.mli
Outdated
| | XInt (**r OCaml value, word-sized integer *) | ||
| | XInt32 (**r 32-bit integer *) | ||
| | XInt64 (**r 64-bit integer *) | ||
| | XFloat (**r double-precision FP number *) |
There was a problem hiding this comment.
The formatting here seems off, with a missing space after XFloat, and varying numbers of spaces before the closing comment delimiters...
asmcomp/cmm_helpers.ml
Outdated
| fill_fields (idx + 2) el) in | ||
| Clet(VP.create id, | ||
| Cop(Cextcall("caml_alloc", typ_val, true, None), | ||
| Cop(Cextcall("caml_alloc", typ_val, [XInt;XInt], true, None), |
There was a problem hiding this comment.
I wonder if we might write [] instead of [XInt;Xint] here, as in other places.
There was a problem hiding this comment.
Well spotted. Done.
asmcomp/cmm_helpers.ml
Outdated
|
|
||
| let bswap16 arg dbg = | ||
| (Cop(Cextcall("caml_bswap16_direct", typ_int, false, None), | ||
| (Cop(Cextcall("caml_bswap16_direct", typ_int, [XInt], false, None), |
There was a problem hiding this comment.
Same comment as for caml_alloc, above.
asmcomp/power/proc.ml
Outdated
| int := !int + 2; | ||
| [| reg_lower; reg_upper |] | ||
| end else begin | ||
| let size_int64 = size_int * 2 in |
There was a problem hiding this comment.
Could we assert here that size_int = 4?
There was a problem hiding this comment.
This size_int * 2 is cut and pasted from the original code, but I don't like it, actually. I just put 8 for the two occurrences of size_int64, it's clearer and closer to the informal explanations this way.
340d4fb to
bed52e9
Compare
Introduce the type Cmm.exttype to precisely describe arguments to external C functions, especially unboxed numerical arguments. Annotate Cmm.Cextcall with the types of the arguments (Cmm.exttype list). An empty list means "all arguments have default type XInt". Annotate Mach.Iextcall with the type of the result (Cmm.machtype) and the types of the arguments (Cmm.exttype list). Change (slightly) the API for describing calling conventions in Proc: - loc_external_arguments now takes a Cmm.exttype list, in order to know more precisely the types of the arguments. - loc_arguments, loc_parameters, loc_results, loc_external_results now take a Cmm.machype instead of an array of pseudoregisters. (Only the types of the pseudoregisters mattered anyway.) Update the implementations of module Proc accordingly, in every port. Introduce a new overridable method in Selectgen, insert_move_extcall_arg, to produce the code that moves an argument of an external C function to the locations returned by Proc.loc_external_arguments. Revise the selection of external calls accordingly (method emit_extcall_args in Selectgen).
…tions Unboxed arguments of type `int32` that are passed on the stack are passed in 32-bit words instead of 64-bit words as in the AAPCS64 ABI. To support this, we introduce a new specific operation, `Imove32`, that compiles down to 32-bit moves or 32-bit stack loads or 32-bit stack stores. In the Selection pass, method `insert_move_extcall_arg`, we generate `Imove32` instructions when required, i.e. if the argument is an unboxed `int32` and needs to be passed on stack. We then update `Proc.loc_external_arguments` to use 32-bit stack words for `int32` arguments.
bed52e9 to
cc25fc9
Compare
|
Thanks a lot for the review. I squashed some intermediate commits and am now running a round of CI precheck. I'll merge when CI is green everywhere. |
Revised handing of calling conventions for external C functions
Revised handing of calling conventions for external C functions
Revised handing of calling conventions for external C functions
Revised handing of calling conventions for external C functions
Revised handing of calling conventions for external C functions
Executive summary
This PR changes the way calling conventions for external C functions are described for every ocamlopt target architecture, and the way code is generated to call external C functions. The motivation is to support iOS/macOS running on ARM 64 bits, whose calling conventions cannot be described with the current ocamlopt code but fits easily in the proposed approach. This fixes a problem that blocks the merge of #9699.
C calling conventions
The calling conventions for external functions implemented in C are more complex than those for internal functions implemented in OCaml, because 1- we don't get to choose our conventions and must adapt to those of the target platform, and 2- external C functions take a greater variety of unboxed numerical arguments: unboxed double-precision FP numbers, but also 64-bit, 32-bit, and platform-native unboxed integers.
The two problematic cases are:
Proposed approach
The first commit of this PR changes the way external C functions are called, as follows:
Cmm.exttypeto precisely describe arguments to external C functions, especially unboxed numerical arguments: 32-bit integers, 64-bit integers, native integers, or double FP numbers.Cmm.Cextcallwith the types of the arguments (Cmm.exttype list). An empty list means "all arguments have default typeXInt", which is the most common case.Mach.Iextcallwith the type of the result (Cmm.machtype) and the types of the arguments (Cmm.exttype list). (The asymmetry between the result types and the argument types is intended. The result type describes a value in words the OCaml runtime system understands, distinguishing e.g. well-formed values from raw integers. The argument types describe arguments in words the C compiler understands.)loc_external_argumentsnow takes aCmm.exttype list, in order to know more precisely the types of the arguments.loc_arguments,loc_parameters,loc_results,loc_external_resultsnow take aCmm.machypeinstead of an array of pseudoregisters. (Only the types of the pseudoregisters mattered anyway.)calling_conventionsfunction suffices, modulo a conversion fromCmm.exttype listtoCmm.machtypeSelectgen, namedinsert_move_extcall_arg, to produce the code that moves an argument of an external C function to the locations returned byProc.loc_external_arguments.emit_extcall_argsinSelectgen).The second commit reaps the benefits of the new approach. It modifies the ARM64 port to support the iOS calling conventions properly. To this end, a new
Imove32specific operation is introduced, and generated ininsert_move_extcall_argwhen anXInt32argument is passed on stack.