Support "exception" under or-patterns (PR#6422) by trefis · Pull Request #305 · ocaml/ocaml

trefis · 2015-11-23T14:49:22Z

Cf. http://caml.inria.fr/mantis/view.php?id=6422

The implementation

Texp_match was modified to merge the value and exception cases, the distinction doesn't make sense if we allow exception both exception and value patterns in the same branch.
Tpat_exception was introduced to allow just that.

The splitting is done later, in Translcore.transl_match, to be able to share the body of branches with both value and exception patterns.
That code should be fairly straightforward.

Slight correction on what was just said: the splitting does still happen (temporarily) in typecore, to be able to run the exhaustivity checks performed by parmatch. This is required because the exhaustivity of value and exception branches should be checked separately.

So where are exception patterns allowed ?

In the same places as before and under or-patterns.
The only other places where an exception pattern might make sense is under Tpat_alias or Tpat_lazy.

The first one is actually problematic, consider:

match ... with
| (SomeConstructor (_, _) | exception SomeException) as x -> ...

which type do you give to x?

As for exceptions under lazy patterns, this might be possible, but can always be added at a later time and is not covered by this pull request.

Last comments

The careful reviewer will notice the apparition of Typedtree.pat_equiv, this was used during developpement to check the correctness of Typedtree.split_pattern; the assertion is still present in Translcore.transl_match but can be removed before merging (along with the pat_equiv function).

I added some comments in parmatch and typecore where I have some doubts about the correctness of my implementation.

yallop · 2015-11-23T15:12:09Z

I'm very pleased to see this proposal.

There's a regression in the behaviour of match patterns without value cases, like this:

match f () with exception Not_found -> ()

This is currently disallowed:

# match f () with exception Not_found -> ();;
Characters 0-41:
  match f () with exception Not_found -> ();;
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error: None of the patterns in this 'match' expression match values.

following discussion on Mantis, but with your patch it's accepted.

The first one is actually problematic, consider:
match ... with
| (SomeConstructor (_, _) | exception SomeException) as x -> ...
which type do you give to x?

I don't think we should support aliasing exception patterns in this way.

trefis · 2015-11-23T15:34:04Z

There's a regression in the behaviour of match patterns without value cases

Ah indeed, I had misread the previous code. Thank you!
I fixed and added a test case.

I don't think we should support aliasing exception patterns in this way.

I agree. (Your comment seem to suggest there is another way though, what would it be?)

alainfrisch · 2015-11-24T11:50:02Z

As a side note, it seems all uses of Typedtree.pat_bound_idents actually only use the first projection of returned tuples. Shouldn't it be simplified accordingly (i.e. do the List.map fst in Typedtree)?

alainfrisch · 2015-11-24T11:54:37Z

typing/parmatch.ml

If you're saying that Tpat_exception should not occur here and its not clear what this function should do if it still appeared, you should probably assert false, no?

Fair point, I will update the commit.

alainfrisch · 2015-11-24T12:03:26Z

I only browsed the commit quickly. Is it indeed the case that guards are duplicated (i.e. the code is generated twice for them)?

trefis · 2015-11-24T13:26:40Z

As a side note, it seems all uses of Typedtree.pat_bound_idents actually only use the first projection of returned tuples. Shouldn't it be simplified accordingly (i.e. do the List.map fst in Typedtree)?

Indeed, I didn't check for that.
I could do it in this PR if you want, but I'd also be happy rebasing my PR if the change is done before.

Is it indeed the case that guards are duplicated (i.e. the code is generated twice for them)?

Yes. I might have missed something but I don't know how to handle code such as

exception Found of int

let guarded f =
  match f () with
  | Some i | exception (Found i) when i > 0 -> print_endline "positive!"
  | _ -> print_endline "onoes"

if the guard is shared. Indeed when the condition is not verified the code path differs for the value and exception branches.
The lambda code generated by this PR for the previous example looks like this:

(function f/1209
  (catch
    (catch
      (try (exit -5 (apply f/1209 0a)) with exn/1229
        (catch
          (if (== (field 0 exn/1229) (field 2 (global Test!)))
            (let (i/1210 =a (field 1 exn/1229))
              (if (> i/1210 0) (exit -4 i/1210) (exit 7)))
            (exit 7))
         with (7) (reraise exn/1229)))
     with (-5 val/1228)
      (catch
        (if val/1228
          (let (i/1210 =a (field 0 val/1228))
            (if (> i/1210 0) (exit -4 i/1210) (exit 6)))
          (exit 6))
       with (6) (apply (field 30 (global Pervasives!)) "onoes")))
   with (-4 i/1210)
    (apply (field 30 (global Pervasives!)) "positive")))

alainfrisch · 2015-11-24T14:48:33Z

but I'd also be happy rebasing my PR if the change is done before

Committed to trunk.

the code path differs for the value and exception branches

I think @chambart had a proposal to extend "static jumps" in intermediate languages with "first class label parameters" (i.e. you could pass to a jump target a symbolic reference to another target, which it could jump to). This might allow to cover such cases without duplicating the code.

I'm not completely decided on which is the best approach: duplicate the code or reject this situation (guards in presence of or-combined exception and non-exception patterns), at least for now. If we keep the duplication, I think we need to rename all identifiers, since some optimizations (e.g. in simplif.ml) might assume that all let-bound identifiers in the lambda code are different. (I'm not sure this would break code in practice, but let's be on the safe side.)

trefis · 2015-11-24T15:01:34Z

I'm not completely decided on which is the best approach: duplicate the code or reject this situation (guards in presence of or-combined exception and non-exception patterns), at least for now. If we keep the duplication, I think we need to rename all identifiers, since some optimizations (e.g. in simplif.ml) might assume that all let-bound identifiers in the lambda code are different.

@lpw25 just proposed another alternative which is to wrap the guard in a function. We suspect this to be easier to implement that the renaming necessary if we keep the code duplicated.
Once flambda is merged it will probably do the inlining (i.e. duplication) for us.

What do you think, should I try this?

PS: I don't like the idea of rejecting guards in this context. It is a minor restriction but just as arbitrary as the one rejecting exceptions under or-patterns and eventually someone is going to complain about it.

alainfrisch · 2015-11-24T15:18:44Z

What do you think, should I try this?

People complain about implicit allocations (even though there are already a bunch of them), so this might not be very consensual. Even if flambda could remove the allocation, this won't be guaranteed and this won't be available in bytecode. (For js_of_ocaml, the overhead could be quite big.) I've no strong opinion on it, though.

chambart · 2015-11-24T19:03:14Z

Duplicating code is something that needs to be done quite carefully. First you effectively need to substitute every variable, but you also must take care of not duplicating any string literal.

exception Exit

let r = ref ""

let guarded f =
  match f () with
  | true | exception Exit when r := "hello"; true -> !r
  | _ -> "other"

let a = guarded (fun () -> raise Exit)
let b = guarded (fun () -> true)
assert(a == b)

This is of course quite an involved example.
Generating a function is probably the easier way of duplicating code, and with flambda you can ensure that a function is effectively inlined, and that the closure allocation is removed.

One other way is to compile the pattern matching in a cps style using catch but I suppose it represent a huge change in the pattern matching compilation. By the way, you already have some way of building some first class label by returning an identifier of the label (encoded as an int) and branching with a swtich rather than an jump.

hhugo · 2015-11-24T23:13:02Z

$cat test.ml 
exception String of string
let _ = match input_line stdin with
  | str | exception (String str) -> str
  | exception _ -> ""

$ ocamlc -c test.ml 
>> Fatal error: Bytegen.comp_expr: var val_1210
Fatal error: exception Misc.Fatal_error
Called from unknown location
Called from unknown location
EXIT STATUS 2

alainfrisch · 2015-11-25T08:19:00Z

The lambda code generated by @hhugo example is:

(setglobal Test!
  (let (String/1203 = (makeblock 248 "Test.String" (caml_fresh_oo_id 0)))
    (seq
      (catch
        (catch
          (try
            (exit -2
              (apply (field 65 (global Pervasives!))
                (field 22 (global Pervasives!))))
           with exn/1211
            (if (== (field 0 exn/1211) String/1203)
              (let (str/1206 =a (field 1 exn/1211)) (exit -1 val/1210)) ""))
         with (-2 val/1210) (exit -1 val/1210))
       with (-1 str/1206) str/1206)
      (makeblock 0 String/1203))))

The same code compiled with -g works and produces:

(setglobal Test!
  (let (String/1203 = (makeblock 248 "Test.String" (caml_fresh_oo_id 0)))
    (seq
      (catch
        (catch
          (try
            (exit -2
              (after ../test.ml(2):42-58
                (apply (field 65 (global Pervasives!))
                  (field 22 (global Pervasives!)))))
           with exn/1211
            (let (tag/1212 =a (field 0 exn/1211))
              (if (== tag/1212 String/1203)
                (let (str/1206 =a (field 1 exn/1211)) (exit -1 str/1206))
                (before ../test.ml(4):125-127 ""))))
         with (-2 val/1210) (let (str/1206 =a val/1210) (exit -1 str/1206)))
       with (-1 str/1206) str/1206)
      (makeblock 0 String/1203))))

The problem is very probably related to the reuse of the same identifier, which indeed breaks assumptions in simplif.mf.

alainfrisch · 2015-11-25T08:28:04Z

By the way, you already have some way of building some first class label by returning an identifier of the label (encoded as an int) and branching with a swtich rather than an jump.

This seems to be an interesting approach, although it probably requires some non-trivial refactoring.
Note that we would need to pass along the exn value and invent a dummy value when the guard is evaluated for non-exceptional cases.

trefis · 2015-11-25T16:24:05Z

Thank you!
I actually missed that problem because the only tests I ran with names (instead of wildcards) where from the toplevel, which generates the same lambda code as ocamlc -g...

I added a new commit with the fix and @hhugo 's test. I will squash it with the other commits once it is reviewed and you guys are happy with it (and I'm not sure you will be, I don't know much about that part of the compiler).

alainfrisch · 2015-11-25T16:33:56Z

Duplicating affects the sharing of string literals, as @chambart noticed. I don't know how bad this is considered (esp. now that string are mostly immutable, the physical equality on them is no longer specified).

As for the implementation strategy, between duplicating the code (with proper renaming) or introducing a local function (hoping it will be inlined), I think I have a preference for duplicating: since guards are usually very small, this is likely to be ok, and the impact on bytecode and js_of_ocaml (assuming it can deal with this new construction!) will be better.

trefis · 2015-11-25T16:36:49Z

I expect that defining a local function with an inline_attribute of Always_inline should be enough to be sure that it is always inlined (which is better than hoping it to be).
I don't know if it is the case at the moment but will apparently definitely be true with flambda, so I don't mind trying that.

alainfrisch · 2015-11-25T16:40:21Z

It will not be inlined in bytecode and thus not by js_of_ocaml (which does not do any non-trivial inlining for now). If we force inlining anyway, I don't see any problem with duplicating the code (modulo string literals).

trefis · 2015-11-25T16:43:00Z

If we force inlining anyway, I don't see any problem with duplicating the code (modulo string literals).

My plan was to let flambda introduce the sharing before duplication, but indeed that does not work for bytecode (although @chambart's example already works with ocamlc, the problem only happens with the native backend).

maranget · 2015-11-26T10:04:46Z

As a principle of OCaml Pattern Matching : thou shalst not duplicate.
Now I look at the PR.

gasche · 2015-11-27T18:19:04Z

I'm not completely decided on which is the best approach: duplicate the code or reject this situation (guards in presence of or-combined exception and non-exception patterns)

I'm rather of the opinion than rejecting this situation is the best move (and that creating an auxiliary function would be better than duplicating the code). "Mixing return and exception cases under when-guards is not supported.".

(A change proposal is easily shot down if the implementation has a defect. Duplicating code and implicit allocation are defects. The only way out is to accept a design limitation instead. There is little love for when guards above or-patterns anyway.)

alainfrisch · 2015-12-07T09:32:56Z

"Mixing return and exception cases under when-guards is not supported."

Ok, it seems this is the way to go to get this PR merged quickly. We can always revisit later to drop the limitation.

trefis · 2015-12-07T11:37:07Z

Ok, it seems this is the way to go to get this PR merged quickly.

What do you mean by quickly? If you mean 4.03 I can do the required change this week, otherwise it might take a few more weeks before I get back on this.

alainfrisch · 2015-12-07T11:40:40Z

I'm in favor of including this in 4.03 if you can do the change quickly, and according to @gasche reply, I think he would be on the same line. So unless another core dev objects to it, indeed, this could go in 4.03.

hhugo · 2015-12-07T11:59:35Z

Note that I'm still not sure how would js_of_ocaml handle this.

alainfrisch · 2015-12-07T12:48:07Z

Note that I'm still not sure how would js_of_ocaml handle this.

Well, this could indeed be a blocking point. We don't want libraries to start using this new feature and become unusable for js_of_ocaml. Can you check with your fellow js_of_ocaml developpers if there is some hope to have this supported in the near future?

trefis · 2015-12-09T12:49:48Z

I just noticed the following line in the manual (in the chapter about language extensions, section "Exception cases in pattern matching"):

A new form of exception patterns is allowed, only as a toplevel pattern under a match...with pattern-matching (other occurrences are rejected by the type-checker).

Should it be updated?

alainfrisch · 2015-12-09T12:51:03Z

Should it be updated?

Yes, that would be great. The new text should clarify that the restriction has been relaxed in 4.03.

nojb · 2015-12-10T18:26:53Z

This PR seems to have introduced a bug in the exhaustiveness checker. See PR#7083.

PR7083: fix assertion failure introduced by #305

gasche · 2015-12-11T12:28:34Z

Thomas ( @trefis ) proposed in #343 a potential fix for the exhaustiveness issue. However, a preliminary review seems to suggest (to me) that we don't properly understand the implications of this change (the cause for the bug and the impact of the fix) to assess that it is correct.

I have second thoughts about having merged this PR in trunk. For now the only person that would be able to carefully review the code to check that it is correct is Luc ( @maranget ), and the timeframe to do this before the 4.03 release is too short. I would be glad to help (I'm starting to understand this part of the compiler better, and I think @trefis is as well), but I don't have the time before the release either.

Finally, someone remarked to me recently that this PR is a real language change that we hadn't discussed during the last development meeting (while we somewhat agreed to restrict ourselves, for large changes, to features that had been approved at that time). One last point against this PR.

I am thinking of reverting the present PR ( #305 ). I would like to work on warnings for pattern-matching in the first half of 2016 with @maranget , and @trefis if he has the time to do so. Then would be a better time to reconsider this change and its implications, and make sure we have a correct implementation. I think the feature is nice, but it's not absolutely essential that we rush it into 4.03.

alainfrisch · 2015-12-11T12:49:17Z

I'm not sure about the timeframe argument. Are we still considering flambda in 4.03? If so, it will probably take several weeks or perhaps months before the release, and a lot of testing with a lot of code in the wild will hopefully happen, which should allow to find any regression caused by this one as well. I don't see the point of postponing nice additions which are ready to merge. If they are included only in a few months, this will simply delay so much the testing and people motivated by the change now might have less motivation or time to go back to them.

All this could be alleviated if the feature freeze came along the branching for 4.03 so that the trunk remains open for inclusion of ready additions (even if they do not go to 4.03), but I know this is not the chosen policy.

alainfrisch · 2015-12-11T12:53:11Z

To add to the previous note: I'm pretty sure the bug would not have been found if the feature had not be included in trunk.

trefis · 2015-12-11T12:58:45Z

It might not have been clear in my previous messages but I was actually surprised this was considered for 4.03. So I don't particularly mind if it's delayed, I'm just mildly annoyed by your change of mind :) (there are other things I could have been working on, getting the native toplevel to work with flambda for example).

Are we still considering flambda in 4.03?

The last update I got was "flambda should be merged on Friday the 18th".
Of course this has been moving delayed by a week each week, so :)

gasche · 2015-12-11T13:00:05Z

I'm sorry, indeed it was my mistake to be too hasty merging it. I should have realized the importance of @maranget review and not taken a decision without this oversight.

maranget · 2015-12-11T13:19:00Z

I have probably over-reacted on this.
The fix should be relatively simple considering that pressure_variant should probably be called
on value and exception matchings separately, just like unused clause and exhaustivity are called.

Consider that those two matchings are independent as regards semantics. Grouping them in syntax is
let us say "by accident"

trefis · 2015-12-11T13:24:20Z

This is indeed what is done for the other checks.

For pressure_variants I went another way because I didn't consider that variants could be present deep inside exception patterns. That means that contrary to my bold claims #343 is not correct and should be reverted.
I will try to send an updated pull request later today.

This week we merged several changes from Thomas Refis, to allow the use of exception patterns under or-patterns, to write code such as match foo x with | None | exception Not_found -> ... | Some -> ... Unfortunately, I failed to properly assess the impact of this change, and in particular to make sure that Luc Maranget had properly reviewed this code -- any change to the pattern-matching machinery should be reviewed by Luc. The problem that I had not foreseen and that he would have immediately realized is that, while adapting the pattern-matching *compiler* is relatively easy (Thomas inserted a transformation at the right place to separate exception patterns from the others and handle them separately, using the staticraise construct used by the pattern-matching compiler to avoid duplicating the right-hand-side branch), adapting the pattern-matching warnings machinery is both more subtle and easier to overlook (it may fail silently and nobody notices, unlike wrong code production). This part of the compiler is subtle and best understood by Luc, but he does not have the time to do a proper review of those changes in the timeframe for the 4.03 feature freeze (mid-December). I believe the right move in this case, implemented in the present commit, is to revert the change from trunk (this is not a feature that we must *imperatively* have in 4.03), do a proper job of understanding the changes, and integrate the change when we are confident it is ready. I hope to do this in 2016, together with Luc Maranget and Thomas Refis -- hopefully this would allow Thomas and I to be more confident when changing the pattern-matching machinery in the future. Revert "Merge pull request #343 from trefis/pr7083" This reverts commit 22681b8, reversing changes made to a24e4ed. Revert "Merge pull request #341 from trefis/or-exception" This reverts commit f8f68bd, reversing changes made to 1534fe8. Revert "Merge pull request #305 from trefis/or-exception" This reverts commit cfeda89, reversing changes made to 77cf36c.

gasche · 2015-12-12T10:25:05Z

So (after some further discussion with @trefis) I went ahead and reverted this change from trunk ( d071da2 ). I would like to work on it again in early 2016, so hopefully there will be an updated PR at that time.

See ocaml/ocaml#305

* Fix `Inline_attribute.equal` The current implementation defers to `==`, which is a problem because `Unroll 1 == Unroll 1` is false. Signed-off-by: Luke Maurer <[email protected]> * Don't use `==` at all in `Inline_attribute.equal`

trefis force-pushed the or-exception branch from 181e024 to a5b8943 Compare November 23, 2015 15:32

alainfrisch reviewed Nov 24, 2015
View reviewed changes

trefis force-pushed the or-exception branch from a5b8943 to c0400a6 Compare November 24, 2015 13:42

trefis force-pushed the or-exception branch from c0400a6 to baf5f96 Compare November 24, 2015 16:11

hhugo mentioned this pull request Nov 24, 2015

Check OCaml compiler change : is it compatible with Js_of_ocaml ? ocsigen/js_of_ocaml#384

Closed

trefis force-pushed the or-exception branch from 6fa79c2 to 6802708 Compare November 26, 2015 13:52

gasche merged commit cfeda89 into ocaml:trunk Dec 9, 2015

trefis mentioned this pull request Dec 9, 2015

manual: post GPR#305 update (allow exceptions under or-patterns) #341

Merged

alainfrisch added a commit that referenced this pull request Dec 11, 2015

Merge pull request #343 from trefis/pr7083

22681b8

PR7083: fix assertion failure introduced by #305

gasche mentioned this pull request Dec 11, 2015

PR7083: fix assertion failure introduced by #305 #343

Merged

gasche added the after-next-release label Dec 11, 2015

hhugo mentioned this pull request Jan 22, 2016

Compiler: attempt to fix match_with_exn ocsigen/js_of_ocaml#402

Merged

314eter added a commit to 314eter/language-ocaml that referenced this pull request Jun 2, 2016

Add exception pattern

50675fe

See ocaml/ocaml#305

bobzhang mentioned this pull request Aug 12, 2016

Upgrade to OCaml 4.06.1 rescript-lang/rescript#633

Closed

AndreasDahl mentioned this pull request Jan 4, 2017

Add exception pattern (#20) atom-ocaml/language-ocaml-fix#1

Merged

trefis mentioned this pull request Jan 15, 2018

Allow exception-patterns under or-patterns #1568

Merged

bobzhang mentioned this pull request Apr 18, 2018

Upgrade to 4.06.1 rescript-lang/rescript#2755

Closed

bobzhang mentioned this pull request Nov 13, 2018

to-do list for upgrade rescript-lang/rescript#3000

Closed

37 tasks

This was referenced Mar 14, 2019

Support "exception" under or-patterns #6422

Closed

Typing crash when pattern matching with polymorphic variants + exception cases #7083

Closed

chambart added a commit to chambart/ocaml-1 that referenced this pull request Jan 4, 2022

Remove unused closure_var_uses in Denv (ocaml#305)

a0e44f5

Conversation

trefis commented Nov 23, 2015

The implementation

So where are exception patterns allowed ?

Last comments

Uh oh!

yallop commented Nov 23, 2015

Uh oh!

trefis commented Nov 23, 2015

Uh oh!

alainfrisch commented Nov 24, 2015

Uh oh!

alainfrisch Nov 24, 2015

Choose a reason for hiding this comment

Uh oh!

trefis Nov 24, 2015

Choose a reason for hiding this comment

Uh oh!

alainfrisch commented Nov 24, 2015

Uh oh!

trefis commented Nov 24, 2015

Uh oh!

alainfrisch commented Nov 24, 2015

Uh oh!

trefis commented Nov 24, 2015

Uh oh!

alainfrisch commented Nov 24, 2015

Uh oh!

chambart commented Nov 24, 2015

Uh oh!

hhugo commented Nov 24, 2015

Uh oh!

alainfrisch commented Nov 25, 2015

Uh oh!

alainfrisch commented Nov 25, 2015

Uh oh!

trefis commented Nov 25, 2015

Uh oh!

alainfrisch commented Nov 25, 2015

Uh oh!

trefis commented Nov 25, 2015

Uh oh!

alainfrisch commented Nov 25, 2015

Uh oh!

trefis commented Nov 25, 2015

Uh oh!

maranget commented Nov 26, 2015

Uh oh!

gasche commented Nov 27, 2015

Uh oh!

alainfrisch commented Dec 7, 2015

Uh oh!

trefis commented Dec 7, 2015

Uh oh!

alainfrisch commented Dec 7, 2015

Uh oh!

hhugo commented Dec 7, 2015

Uh oh!

alainfrisch commented Dec 7, 2015

Uh oh!

trefis commented Dec 9, 2015

Uh oh!

alainfrisch commented Dec 9, 2015

Uh oh!

nojb commented Dec 10, 2015

Uh oh!

gasche commented Dec 11, 2015

Uh oh!

alainfrisch commented Dec 11, 2015

Uh oh!

alainfrisch commented Dec 11, 2015

Uh oh!

trefis commented Dec 11, 2015

Uh oh!

gasche commented Dec 11, 2015

Uh oh!

maranget commented Dec 11, 2015

Uh oh!

trefis commented Dec 11, 2015

Uh oh!