Skip to content

Special constructor for %sys_argv primitive#10570

Merged
lthls merged 2 commits intoocaml:trunkfrom
Keryan-dev:prim-argv
May 23, 2025
Merged

Special constructor for %sys_argv primitive#10570
lthls merged 2 commits intoocaml:trunkfrom
Keryan-dev:prim-argv

Conversation

@Keryan-dev
Copy link
Contributor

This fixes a bug in the case caml_sys_argv would be called directly by the user.

external argv :  _ -> _ = "caml_sys_argv"

This would end up as a function with two arguments because it would get caught in the special case of

ocaml/lambda/translprim.ml

Lines 651 to 652 in 7f58ef0

| External prim, args when prim = prim_sys_argv ->
Lprim(Pccall prim, Lconst (const_int 0) :: args, loc)

that was meant to handle %sys_argv.
Somehow, the additional argument doesn't seem to bother the compiler but it might lead to runtime errors.

The additional constructor avoid this by allowing caml_sys_argv to be handled as any other C call.

@lthls
Copy link
Contributor

lthls commented Aug 17, 2021

To reproduce the bug:

# external f : unit -> string array = "caml_sys_argv";;
external f : unit -> string array = "caml_sys_argv"
# let _ = f ();;
Segmentation fault

This caused a compilation failure in the obytelib library when compiling with Flambda 2 (which depends more strictly on the primitive arities). A pull request has already been made there to work around this issue (bvaugon/obytelib#5).

@xavierleroy
Copy link
Contributor

Playing devil's advocate:

Mis-using a %prim primitive can cause all sorts of run-time crashes. It's an unsafe feature that no amount of hacking on the compiler will make safe. Why focus on this run-time crash in particular?

Suggesting a simpler fix: in the current code it might work to just ignore the extra arguments:

- Lprim(Pccall prim, Lconst (const_int 0) :: args, loc)
+ Lprim(Pccall prim, [Lconst (const_int 0)], loc)

@Keryan-dev
Copy link
Contributor Author

Playing deviler advocate:
Should the user want to have side effects in those arguments, would we just ignore those for this particular primitive ?

@lthls
Copy link
Contributor

lthls commented Aug 26, 2021

To clarify a bit:

  • We haven't actually observed a runtime crash in the wild.
  • What we observed is a mismatch in the backend between the declared arity of a C primitive and the number of arguments it is given (in Lambda terms, that would have been Lprim (Pccall { prim_arity = 1; _}, [arg1; arg2], _)). The code in Flambda 2 is less forgiving of this kind of errors than the current compiler, so this leads to a compilation error.
  • The only code we've seen that actually exports caml_sys_argv as an external (and not the %sys_argv primitive) is obytelib, and it does it by re-exporting (almost) all the primitives returned by ocamlrun -p. We've already suggested a patch to special-case caml_sys_argv.

This patch does not fix a real bug, but brings consistency to the code translating external primitives so we thought it was worth submitting as a PR here in addition to the PR on the Flambda 2 branch.

Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discussed this issue again with @zapashcanon (we are both attending the Mirage retreat): we looked for an OCaml implementation of the OCaml bytecode interpreter, and found obytelib which suffers from a minor bug coming from this issue (see bvaugon/obytelib#5 ).

The PR code looks clean to me -- it adds a distinction between two distinct things. I am in favor of merging.

let primitive_needs_event_after = function
| Primitive (prim,_) -> lambda_primitive_needs_event_after prim
| External _ -> true
| External _ | Sys_argv -> true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: the Sys_argv case should go on line 938 below, where the other non-external primitives are handled.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree: External and Sys_argv are both compiled to Pccall, which makes it natural to handle them in the same place.
Lines 937-938 are mostly about generic application primitives (plus Lazy_force), so I don't think Sys_argv fits particularly well there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually caml_sys_argv returns a constant directly (it does not raise an exception or allocate), so we could return false here if I understand correctly. In any case, I don't like the proposed code organization because it has no apparent structure (if I add a new primitive, where should it go? There are two groups now). Before there was "External _" and "everything else", this is clear and I think we should follow it. If you have a different structure in mind, there must be a comment that explains what it is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand it:

  • If it calls a C function, it should have an event. This covers External and Sys_argv. The code in lambda_primitive_needs_event_after specifies that if the primitive does not allocate or raise, an event is not needed, but caml_sys_argv is declared with ~alloc:true so for consistency it should get an event.
  • If it calls an OCaml function, it should have an event. That is the case for Lazy_force, the various Send primitives, and the Apply primitives.

Basically, I don't agree with your classification of lines 937-938 as "everything else". To me it looks like "everything that calls an arbitrary OCaml function", which is not the case for Sys_argv. If you want to put it in lines 939-943, then fine (this does look like "everything else" to me), although we should probably update the declaration of prim_sys_argv to ~alloc:false for consistency if we do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've pushed a commit which adds a Changes entry and a comment here. I will merge later this week if I don't forget.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this is worse than before because (1) whether a primitive is implemented by a C call or not is not a property of the primitive at the Lambda level, only of the implementation choices further down, (2) this is more work to classify as we have to check if stuff emits a C call or not. But you seem very convinced that this new distinction is a good idea -- it is probably not just your esprit de contradiction at my earlier comment -- and you know the middle-end better than I do... oh well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep in mind that this is not even the Lambda level, this is an intermediary step restricted to Translprim. In terms of Lambda, Sys_argv doesn't exist, we only see a regular Pccall.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also: the whole point of this function is to decide whether the primitive needs an event or not. The primitive needs an event if it can call arbitrary OCaml code, either directly or by running a GC. So you cannot avoid, at this point, having to know exactly how each primitive is implemented; otherwise you cannot make the right decision.
The conservative decision (for the cases where you're not sure if you need an event or not) is to always add an event. Previously, the only case fitting that was the External case (where we would always add an event, even if the primitive was declared with [@noalloc]). So in a way, this is the right place to put constructors where you do not really want to find out if you need an event or not. All the other cases have clear reasons why they require an event or not, and adding unrelated constructors to them would only make the code more confusing.

@lthls lthls merged commit 64ef2d0 into ocaml:trunk May 23, 2025
24 checks passed
@lthls
Copy link
Contributor

lthls commented May 23, 2025

Thanks @gasche for moving this PR forward (and thanks @Keryan-dev for submitting the patch).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants