Avoid removing effectful expressions in Closure, and eliminate more non-effectful ones.#983
Conversation
|
This might be buggy, since is_pure_clambda incorrectly considers Pdivint as pure while it can raise an exception (would need to check if the previous uses of is_pure_clambda are affected). |
|
The bug is_pure_clambda is a real one. Currently, the following does not raise when compiled with ocamlopt (and it should): type t = {x: int; y:int}
let f x = {x; y = x/0}.x
let () = ignore (f 1) |
|
Fixed Closure.is_pure_clambda. |
|
Wouldn't it be interesting to avoid catch-all patterns on instructions or primitive in this function? It would force us to consider each case and could uncover other similar bugs today, or tomorrow when a new instruction is added. |
|
@gasche Agreed. Note that the dangerous catch-all is on primitives (they are assumed to be pure "by default"), not on expressions. But yes, being explicit would be a good idea in both cases. |
|
Btw, array accesses look a bit fishy as well. At the level of Clambda, the checkbound is not yet explicit, and is_pure_clambda still assumes loads from array to be pure. |
|
Indeed: type t = {x: int; y:int}
let x = {x = 1; y = [||].(0)}.xdoesn't raise when compiled with ocamlopt. |
|
Also suspicious: Plazyforce, Prevapply, Pdirapply, Pmodint, Pdivbint, Pmodbint, Pstring_load_*, Pbigstring_load_*. |
|
On the other hand, I don't see why Pduprecord is considered non pure. Does anyone know? |
|
Closure.is_pure is also suspicious... |
|
Pduprecord, at least if it corresponds to |
|
Mmh, I don't see how the "memcopy + updates" version would break invariants expected from pure expressions (in this context, "pure" is used to avoid evaluating an expression when its result is not observed; not e.g. to change the evaluation order (which would require to mark all memory reads as also "impure"). |
|
Ok, I've committed a version with an exhaustive match on primitives. I did not try to take the "unsafe" arguments into account. |
|
Can this code just use Semantics_of_primitives instead, which already has a carefully-checked exhaustive list (not that I promise it has no bugs)? (We may need a flag to allow it to accept primitives that are illegal in Flambda mode, but that's easy.) |
|
Yes, indeed! I'm a bit reluctant to have Closure depends on middle-end/. Can we merge this module into Lambda? One could imagine that either Simplif or alternative backend a-la Bucklescript use the information as well. |
|
I think the middle-end dependency may be there already (for Debuginfo), so I'm not sure we should worry about that. If it really must be moved I'd rather it stayed as its own file and was moved properly with git to avoid introducing any merge conflicts with other branches in progress. |
|
Can you comment on: ? Aren't Prevapply/Pdirapply instead eliminated by About Psequand/Pseqor: would you be ok to support them in Semantics_of_primitive (returning |
The middle-end is not linked in the bytecode compiler, only in tools/objinfo. |
|
Ok, implemented @mshinwell's suggestion of reusing Semantics_of_primitives (after moving it to bytecomp/). I removed error cases in this function, so that it can be applied in more contexts. |
|
@alainfrisch Can you check that [Flambda_invariants] checks for absence of the primitives for which you have removed the error cases? (And make it check if it doesn't already.) |
|
@mshinwell This is the case for Psequand, Pseqor, Pdirapply, Prevapply, but not Ploc. At least this is implemented in Flambda_invariants.primitive_invariants (but I cannot guarantee this function is applied). Is it worth adding a case for Ploc? |
|
Yeah, I think we might as well. The more checks the better... |
|
Ok, done. (I'm not sure, but it seems Plazyforce is also expanded by translcore. And contrary to the error message in flambda_invariants, it seems that Prevapply/Pdirapply are handled by Simplif.) |
|
Feel free to fix the message! |
|
The message might be sign that flambda does some extra useless work; it would be good to check first. |
| done | ||
|
|
||
| let ignore_useless_args () = | ||
| let f x _y = int_of_float (cos x) in |
There was a problem hiding this comment.
I think f should have [@@inlined always] to ensure that we're testing what you wanted to test, no?
There was a problem hiding this comment.
I've added [@inlined always] to the call site (the attribute on the definition would be @@inline, no?).
| No_effects, No_coeffects | ||
| | Prevapply | ||
| | Pdirapply | ||
| | Pdirapply -> |
There was a problem hiding this comment.
Please un-indent these three lines to follow the rest of the file...
There was a problem hiding this comment.
The indentation is coherent with was ocp-indent does, considering the definitions in the toplevel .ocp-indent. It is also coherent with the style used elsewhere in the compiler (except in newest flambda code, I guess), and I find it more readable than the style used elsewhere in this file. Since the file moves to bytecomp, I'd rather apply a style similar to other files in this directory, i.e. apply ocp-indent on the whole file. Would you oppose to that?
There was a problem hiding this comment.
I've done it (cf last commit), let me know if you are ok.
There was a problem hiding this comment.
I will manage (no doubt it will cause a merge conflict somewhere though, but not to worry)
asmcomp/closure.ml
Outdated
| | Lprim(_, args,_) -> List.for_all is_pure args | ||
| | Levent(lam, _ev) -> is_pure lam | ||
| | Lprim(p, args,_) -> is_pure_prim p && List.for_all is_pure args | ||
| | Levent(lam, _ev) -> is_pure lam (* can this happen in native code? *) |
There was a problem hiding this comment.
I think Levent may be present in a very few cases on the gdb branch, but I would have to check. I would just remove this comment---it probably makes things more robust to include this case correctly, anyway.
asmcomp/closure.ml
Outdated
| | Pfield n, [ Uprim(Pmakeblock _, ul, _) ], [approx] | ||
| when n < List.length ul -> | ||
| (List.nth ul n, field_approx n approx) | ||
|
|
There was a problem hiding this comment.
Gratuitous whitespace change.
Changes
Outdated
| and "0 mod <expr>" in the case when <expr> was a non-constant | ||
| evaluating to zero (Mark Shinwell) | ||
|
|
||
| - Eliminate more useless pure computations when inlining functions |
There was a problem hiding this comment.
Ok, and moved to the Bug Fix section.
|
@alainfrisch Please change the |
When a function that ignores some of its parameters is inlined, the corresponding arguments are still evaluated if they are not determined to be "pure". This commit improves this purity criterion by reusing an existing function that detects more cases. Only the non flambda pipeline is adapted by this commit, but I guess the same optimization is already part of flambda (CI will tell!) or would be easily added.
…ed in more contexts.
ce3e04f to
1600aa8
Compare
|
@alainfrisch I'm not sure about the |
|
Ploc evaluates to a compile-time constant, no? Which effect could it have, and why could it not be moved? |
|
Well it seems to me that either However (reasonably enough) it seemed you were in favour of having |
I guess that by moving, you mean "changing its source location", while I was rather interpreting moving as changing its position in "the tree" (Parsetree, Typedtree, Lambda, etc). Currently, Ploc is expanded during the Typedtree->Lambda translation, but it could easily be done later, since the Lprim node also keeps the source location. "Moving" Ploc within Lambda is fine as long as you don't change the Location.t attached to the Lprim node. |
|
By moving I meant a code motion transformation. However from what you say it sounds like it's ok to do that anyway, as long as you don't change the location. I think this is fine then. |
…tations Avoid removing effectful expressions in Closure, and eliminate more non-effectful ones.
b11eea1 flambda-backend: Introduce Import_info (ocaml#1036) bc5b135 flambda-backend: Fix `ocamlobjinfo` on flambda2 .cmx files (ocaml#1029) c8babbd flambda-backend: Compilation_unit optimisations (ocaml#1035) e8d3e22 flambda-backend: Use 4.14.0 opam switch for building (includes upgrading ocamlformat to 0.24.1) (ocaml#1030) eb14a86 flambda-backend: Port PR81 from ocaml-jst (ocaml#1024) 131bc12 flambda-backend: Merge ocaml-jst 2022-12-13 (ocaml#1022) 06c189a flambda-backend: Make stack allocation the default (ocaml#1013) 98debd5 flambda-backend: Initial support for value slots not of value kind (ocaml#946) deb1714 flambda-backend: Add is_last flag to closinfo words (ocaml#938) d07fce1 flambda-backend: Disable poll insertion in Configure (ocaml#967) 0f1ce0e flambda-backend: Regenerate ocaml/configure autoconf 2.69 (instead of 2.71) (ocaml#1012) 27132d8 flambda-backend: Fix for spurious typing error related to expanding through functor arguments (ocaml#997) 724fb68 flambda-backend: Use `Compilation_unit.t` instead of `Ident.t` for globals (ocaml#871) 396d5b8 flambda-backend: Add a test for frametable setup in natdynlinked libraries (ocaml#983) b73ab12 flambda-backend: Fix invocation of `caml_shared_startup` in native dynlink (ocaml#980) 7c7d75a flambda-backend: Fix split_default_wrapper which did not trigger anymore with flambda2 (ocaml#970) 8fb75bd flambda-backend: Port ocaml#11727 and ocaml#11732 (ocaml#965) fdb7987 flambda-backend: Fix include functor issue after 4.14 merge. (ocaml#948) 9745cdb flambda-backend: Print -dprofile/-dtimings output to stdout like 4.12 (ocaml#943) 5f51f21 flambda-backend: Merge pull request ocaml#932 from mshinwell/4.14-upgrade 841687d flambda-backend: Run make alldepend in ocaml/ (ocaml#936) 72a7658 flambda-backend: Remove reformatting changes only in dynlink/dune (preserving PR889 and adjusting to minimise diff) 6d758cd flambda-backend: Revert whitespace changes in dune files, to match upstream c86bf6e flambda-backend: Remove duplicate tests for polling 971dbeb flambda-backend: Testsuite fixes 32f8356 flambda-backend: Topeval fix for symbols patch befea01 flambda-backend: Compilation fixes / rectify merge faults a84543f flambda-backend: Merge ocaml-jst 8e65056 flambda-backend: Merge ocaml-jst 4d70045 flambda-backend: Remove filename from system frametable (amd64) (ocaml#920) 5e57b7d flambda-backend: Bugfix for runtime frame_descr logic for C frames (ocaml#918) 6423d5e flambda-backend: Merge pull request ocaml#914 from mshinwell/merge-ocaml-jst-2022-10-24 ead605c flambda-backend: Add a missing Extract_exception (ocaml#916) c8f1481 flambda-backend: Resolve conflicts and add specialise/specialised attributes to Builtin_attributes cf4d0d3 flambda-backend: Merge fixes (ocaml#21) c2f742f flambda-backend: Re-enable some tests for Flambda2 (ocaml#881) 3d38d13 flambda-backend: Long frames in frametable (ocaml#797) 85aec7b flambda-backend: Add loop attribute to Builtin_attributes c0f16e3 flambda-backend: Compilation fixes 90dea23 flambda-backend: Merge flambda-backend/main 5acc6ea flambda-backend: Fixes after merge e501946 flambda-backend: Merge ocaml-jst 115083b flambda-backend: Merge ocaml-jst 9943b2e flambda-backend: Revert "Revert "Transform tail-recursive functions into recursive continuations (ocaml#893)"" (ocaml#909) ce339f1 flambda-backend: Fix alloc modes and call kinds for overapplications (ocaml#902) e6a317c flambda-backend: Revert "Transform tail-recursive functions into recursive continuations (ocaml#893)" 853c488 flambda-backend: Transform tail-recursive functions into recursive continuations (ocaml#893) 5a977e4 flambda-backend: Fix missing End_region primitives on switch arms (ocaml#898) 7fa7f9d flambda-backend: Add missing dependencies to Dune files (ocaml#889) 3cd36f0 flambda-backend: Have Lambda `Pgetglobal` and `Psetglobal` take `Compilation_unit.t` (ocaml#896) 7565915 flambda-backend: [@poll error] attribute (ocaml#745) 9eb9448 flambda-backend: Backport the main safepoints PRs (ocaml#740) 689bdda flambda-backend: Add strict mode for ocamldep (ocaml#892) git-subtree-dir: ocaml git-subtree-split: b11eea1
…#983) * stay on current page when using version switch in package docs * Optional argument tricks --------- Co-authored-by: Cuihtlauac ALVARADO <[email protected]>
When a function that ignores some of its parameters is inlined, the
corresponding arguments are still evaluated if they are not determined
to be "pure". This commit improves this purity criterion by reusing
an existing function that detects more cases. Only the non flambda
pipeline is adapted by this commit, but I guess the same optimization
is already part of flambda (CI will tell!) or would be easily added.
This optimization is triggered on the example discussed on the
caml-list in Dec. 2016 ("Closing the performance gap to C").
[EDIT] This PR evolved from adding a simple optimization to fixing a series of bugs introduced
by badly estimating the effectfulness of primitives in the Closure pass. Fixing these bugs is arguably more important than the new optimization.