Skip to content

Support "exception" under or-patterns (PR#6422)#305

Merged
gasche merged 3 commits intoocaml:trunkfrom
trefis:or-exception
Dec 9, 2015
Merged

Support "exception" under or-patterns (PR#6422)#305
gasche merged 3 commits intoocaml:trunkfrom
trefis:or-exception

Conversation

@trefis
Copy link
Contributor

@trefis trefis commented Nov 23, 2015

Cf. http://caml.inria.fr/mantis/view.php?id=6422

The implementation

Texp_match was modified to merge the value and exception cases, the distinction doesn't make sense if we allow exception both exception and value patterns in the same branch.
Tpat_exception was introduced to allow just that.

The splitting is done later, in Translcore.transl_match, to be able to share the body of branches with both value and exception patterns.
That code should be fairly straightforward.

Slight correction on what was just said: the splitting does still happen (temporarily) in typecore, to be able to run the exhaustivity checks performed by parmatch. This is required because the exhaustivity of value and exception branches should be checked separately.

So where are exception patterns allowed ?

In the same places as before and under or-patterns.
The only other places where an exception pattern might make sense is under Tpat_alias or Tpat_lazy.

The first one is actually problematic, consider:

match ... with
| (SomeConstructor (_, _) | exception SomeException) as x -> ...

which type do you give to x?

As for exceptions under lazy patterns, this might be possible, but can always be added at a later time and is not covered by this pull request.

Last comments

The careful reviewer will notice the apparition of Typedtree.pat_equiv, this was used during developpement to check the correctness of Typedtree.split_pattern; the assertion is still present in Translcore.transl_match but can be removed before merging (along with the pat_equiv function).

I added some comments in parmatch and typecore where I have some doubts about the correctness of my implementation.

@yallop
Copy link
Member

yallop commented Nov 23, 2015

I'm very pleased to see this proposal.

There's a regression in the behaviour of match patterns without value cases, like this:

match f () with exception Not_found -> ()

This is currently disallowed:

# match f () with exception Not_found -> ();;
Characters 0-41:
  match f () with exception Not_found -> ();;
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error: None of the patterns in this 'match' expression match values.

following discussion on Mantis, but with your patch it's accepted.

The first one is actually problematic, consider:

match ... with
| (SomeConstructor (_, _) | exception SomeException) as x -> ...

which type do you give to x?

I don't think we should support aliasing exception patterns in this way.

@trefis
Copy link
Contributor Author

trefis commented Nov 23, 2015

There's a regression in the behaviour of match patterns without value cases

Ah indeed, I had misread the previous code. Thank you!
I fixed and added a test case.

I don't think we should support aliasing exception patterns in this way.

I agree. (Your comment seem to suggest there is another way though, what would it be?)

@alainfrisch
Copy link
Contributor

As a side note, it seems all uses of Typedtree.pat_bound_idents actually only use the first projection of returned tuples. Shouldn't it be simplified accordingly (i.e. do the List.map fst in Typedtree)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're saying that Tpat_exception should not occur here and its not clear what this function should do if it still appeared, you should probably assert false, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point, I will update the commit.

@alainfrisch
Copy link
Contributor

I only browsed the commit quickly. Is it indeed the case that guards are duplicated (i.e. the code is generated twice for them)?

@trefis
Copy link
Contributor Author

trefis commented Nov 24, 2015

As a side note, it seems all uses of Typedtree.pat_bound_idents actually only use the first projection of returned tuples. Shouldn't it be simplified accordingly (i.e. do the List.map fst in Typedtree)?

Indeed, I didn't check for that.
I could do it in this PR if you want, but I'd also be happy rebasing my PR if the change is done before.

Is it indeed the case that guards are duplicated (i.e. the code is generated twice for them)?

Yes. I might have missed something but I don't know how to handle code such as

exception Found of int

let guarded f =
  match f () with
  | Some i | exception (Found i) when i > 0 -> print_endline "positive!"
  | _ -> print_endline "onoes"

if the guard is shared. Indeed when the condition is not verified the code path differs for the value and exception branches.
The lambda code generated by this PR for the previous example looks like this:

(function f/1209
  (catch
    (catch
      (try (exit -5 (apply f/1209 0a)) with exn/1229
        (catch
          (if (== (field 0 exn/1229) (field 2 (global Test!)))
            (let (i/1210 =a (field 1 exn/1229))
              (if (> i/1210 0) (exit -4 i/1210) (exit 7)))
            (exit 7))
         with (7) (reraise exn/1229)))
     with (-5 val/1228)
      (catch
        (if val/1228
          (let (i/1210 =a (field 0 val/1228))
            (if (> i/1210 0) (exit -4 i/1210) (exit 6)))
          (exit 6))
       with (6) (apply (field 30 (global Pervasives!)) "onoes")))
   with (-4 i/1210)
    (apply (field 30 (global Pervasives!)) "positive")))

@alainfrisch
Copy link
Contributor

but I'd also be happy rebasing my PR if the change is done before

Committed to trunk.

the code path differs for the value and exception branches

I think @chambart had a proposal to extend "static jumps" in intermediate languages with "first class label parameters" (i.e. you could pass to a jump target a symbolic reference to another target, which it could jump to). This might allow to cover such cases without duplicating the code.

I'm not completely decided on which is the best approach: duplicate the code or reject this situation (guards in presence of or-combined exception and non-exception patterns), at least for now. If we keep the duplication, I think we need to rename all identifiers, since some optimizations (e.g. in simplif.ml) might assume that all let-bound identifiers in the lambda code are different. (I'm not sure this would break code in practice, but let's be on the safe side.)

@trefis
Copy link
Contributor Author

trefis commented Nov 24, 2015

I'm not completely decided on which is the best approach: duplicate the code or reject this situation (guards in presence of or-combined exception and non-exception patterns), at least for now. If we keep the duplication, I think we need to rename all identifiers, since some optimizations (e.g. in simplif.ml) might assume that all let-bound identifiers in the lambda code are different.

@lpw25 just proposed another alternative which is to wrap the guard in a function. We suspect this to be easier to implement that the renaming necessary if we keep the code duplicated.
Once flambda is merged it will probably do the inlining (i.e. duplication) for us.

What do you think, should I try this?

PS: I don't like the idea of rejecting guards in this context. It is a minor restriction but just as arbitrary as the one rejecting exceptions under or-patterns and eventually someone is going to complain about it.

@alainfrisch
Copy link
Contributor

What do you think, should I try this?

People complain about implicit allocations (even though there are already a bunch of them), so this might not be very consensual. Even if flambda could remove the allocation, this won't be guaranteed and this won't be available in bytecode. (For js_of_ocaml, the overhead could be quite big.) I've no strong opinion on it, though.

@chambart
Copy link
Contributor

Duplicating code is something that needs to be done quite carefully. First you effectively need to substitute every variable, but you also must take care of not duplicating any string literal.

exception Exit

let r = ref ""

let guarded f =
  match f () with
  | true | exception Exit when r := "hello"; true -> !r
  | _ -> "other"

let a = guarded (fun () -> raise Exit)
let b = guarded (fun () -> true)
assert(a == b)

This is of course quite an involved example.
Generating a function is probably the easier way of duplicating code, and with flambda you can ensure that a function is effectively inlined, and that the closure allocation is removed.

One other way is to compile the pattern matching in a cps style using catch but I suppose it represent a huge change in the pattern matching compilation. By the way, you already have some way of building some first class label by returning an identifier of the label (encoded as an int) and branching with a swtich rather than an jump.

@hhugo
Copy link
Contributor

hhugo commented Nov 24, 2015

$cat test.ml 
exception String of string
let _ = match input_line stdin with
  | str | exception (String str) -> str
  | exception _ -> ""
$ ocamlc -c test.ml 
>> Fatal error: Bytegen.comp_expr: var val_1210
Fatal error: exception Misc.Fatal_error
Called from unknown location
Called from unknown location
EXIT STATUS 2

@alainfrisch
Copy link
Contributor

The lambda code generated by @hhugo example is:

(setglobal Test!
  (let (String/1203 = (makeblock 248 "Test.String" (caml_fresh_oo_id 0)))
    (seq
      (catch
        (catch
          (try
            (exit -2
              (apply (field 65 (global Pervasives!))
                (field 22 (global Pervasives!))))
           with exn/1211
            (if (== (field 0 exn/1211) String/1203)
              (let (str/1206 =a (field 1 exn/1211)) (exit -1 val/1210)) ""))
         with (-2 val/1210) (exit -1 val/1210))
       with (-1 str/1206) str/1206)
      (makeblock 0 String/1203))))

The same code compiled with -g works and produces:

(setglobal Test!
  (let (String/1203 = (makeblock 248 "Test.String" (caml_fresh_oo_id 0)))
    (seq
      (catch
        (catch
          (try
            (exit -2
              (after ../test.ml(2):42-58
                (apply (field 65 (global Pervasives!))
                  (field 22 (global Pervasives!)))))
           with exn/1211
            (let (tag/1212 =a (field 0 exn/1211))
              (if (== tag/1212 String/1203)
                (let (str/1206 =a (field 1 exn/1211)) (exit -1 str/1206))
                (before ../test.ml(4):125-127 ""))))
         with (-2 val/1210) (let (str/1206 =a val/1210) (exit -1 str/1206)))
       with (-1 str/1206) str/1206)
      (makeblock 0 String/1203))))

The problem is very probably related to the reuse of the same identifier, which indeed breaks assumptions in simplif.mf.

@alainfrisch
Copy link
Contributor

By the way, you already have some way of building some first class label by returning an identifier of the label (encoded as an int) and branching with a swtich rather than an jump.

This seems to be an interesting approach, although it probably requires some non-trivial refactoring.
Note that we would need to pass along the exn value and invent a dummy value when the guard is evaluated for non-exceptional cases.

@trefis
Copy link
Contributor Author

trefis commented Nov 25, 2015

Thank you!
I actually missed that problem because the only tests I ran with names (instead of wildcards) where from the toplevel, which generates the same lambda code as ocamlc -g...

I added a new commit with the fix and @hhugo 's test. I will squash it with the other commits once it is reviewed and you guys are happy with it (and I'm not sure you will be, I don't know much about that part of the compiler).

@alainfrisch
Copy link
Contributor

Duplicating affects the sharing of string literals, as @chambart noticed. I don't know how bad this is considered (esp. now that string are mostly immutable, the physical equality on them is no longer specified).

As for the implementation strategy, between duplicating the code (with proper renaming) or introducing a local function (hoping it will be inlined), I think I have a preference for duplicating: since guards are usually very small, this is likely to be ok, and the impact on bytecode and js_of_ocaml (assuming it can deal with this new construction!) will be better.

@trefis
Copy link
Contributor Author

trefis commented Nov 25, 2015

I expect that defining a local function with an inline_attribute of Always_inline should be enough to be sure that it is always inlined (which is better than hoping it to be).
I don't know if it is the case at the moment but will apparently definitely be true with flambda, so I don't mind trying that.

@alainfrisch
Copy link
Contributor

It will not be inlined in bytecode and thus not by js_of_ocaml (which does not do any non-trivial inlining for now). If we force inlining anyway, I don't see any problem with duplicating the code (modulo string literals).

@trefis
Copy link
Contributor Author

trefis commented Nov 25, 2015

If we force inlining anyway, I don't see any problem with duplicating the code (modulo string literals).

My plan was to let flambda introduce the sharing before duplication, but indeed that does not work for bytecode (although @chambart's example already works with ocamlc, the problem only happens with the native backend).

@maranget
Copy link
Contributor

As a principle of OCaml Pattern Matching : thou shalst not duplicate.
Now I look at the PR.

@gasche
Copy link
Member

gasche commented Nov 27, 2015

I'm not completely decided on which is the best approach: duplicate the code or reject this situation (guards in presence of or-combined exception and non-exception patterns)

I'm rather of the opinion than rejecting this situation is the best move (and that creating an auxiliary function would be better than duplicating the code). "Mixing return and exception cases under when-guards is not supported.".

(A change proposal is easily shot down if the implementation has a defect. Duplicating code and implicit allocation are defects. The only way out is to accept a design limitation instead. There is little love for when guards above or-patterns anyway.)

@alainfrisch
Copy link
Contributor

"Mixing return and exception cases under when-guards is not supported."

Ok, it seems this is the way to go to get this PR merged quickly. We can always revisit later to drop the limitation.

@trefis
Copy link
Contributor Author

trefis commented Dec 7, 2015

Ok, it seems this is the way to go to get this PR merged quickly.

What do you mean by quickly? If you mean 4.03 I can do the required change this week, otherwise it might take a few more weeks before I get back on this.

@alainfrisch
Copy link
Contributor

I'm in favor of including this in 4.03 if you can do the change quickly, and according to @gasche reply, I think he would be on the same line. So unless another core dev objects to it, indeed, this could go in 4.03.

@hhugo
Copy link
Contributor

hhugo commented Dec 7, 2015

Note that I'm still not sure how would js_of_ocaml handle this.

@alainfrisch
Copy link
Contributor

Note that I'm still not sure how would js_of_ocaml handle this.

Well, this could indeed be a blocking point. We don't want libraries to start using this new feature and become unusable for js_of_ocaml. Can you check with your fellow js_of_ocaml developpers if there is some hope to have this supported in the near future?

@gasche gasche merged commit cfeda89 into ocaml:trunk Dec 9, 2015
@trefis
Copy link
Contributor Author

trefis commented Dec 9, 2015

I just noticed the following line in the manual (in the chapter about language extensions, section "Exception cases in pattern matching"):

A new form of exception patterns is allowed, only as a toplevel pattern under a match...with pattern-matching (other occurrences are rejected by the type-checker).

Should it be updated?

@alainfrisch
Copy link
Contributor

Should it be updated?

Yes, that would be great. The new text should clarify that the restriction has been relaxed in 4.03.

@nojb
Copy link
Contributor

nojb commented Dec 10, 2015

This PR seems to have introduced a bug in the exhaustiveness checker. See PR#7083.

alainfrisch added a commit that referenced this pull request Dec 11, 2015
PR7083: fix assertion failure introduced by #305
@gasche
Copy link
Member

gasche commented Dec 11, 2015

Thomas ( @trefis ) proposed in #343 a potential fix for the exhaustiveness issue. However, a preliminary review seems to suggest (to me) that we don't properly understand the implications of this change (the cause for the bug and the impact of the fix) to assess that it is correct.

I have second thoughts about having merged this PR in trunk. For now the only person that would be able to carefully review the code to check that it is correct is Luc ( @maranget ), and the timeframe to do this before the 4.03 release is too short. I would be glad to help (I'm starting to understand this part of the compiler better, and I think @trefis is as well), but I don't have the time before the release either.

Finally, someone remarked to me recently that this PR is a real language change that we hadn't discussed during the last development meeting (while we somewhat agreed to restrict ourselves, for large changes, to features that had been approved at that time). One last point against this PR.

I am thinking of reverting the present PR ( #305 ). I would like to work on warnings for pattern-matching in the first half of 2016 with @maranget , and @trefis if he has the time to do so. Then would be a better time to reconsider this change and its implications, and make sure we have a correct implementation. I think the feature is nice, but it's not absolutely essential that we rush it into 4.03.

@alainfrisch
Copy link
Contributor

I'm not sure about the timeframe argument. Are we still considering flambda in 4.03? If so, it will probably take several weeks or perhaps months before the release, and a lot of testing with a lot of code in the wild will hopefully happen, which should allow to find any regression caused by this one as well. I don't see the point of postponing nice additions which are ready to merge. If they are included only in a few months, this will simply delay so much the testing and people motivated by the change now might have less motivation or time to go back to them.

All this could be alleviated if the feature freeze came along the branching for 4.03 so that the trunk remains open for inclusion of ready additions (even if they do not go to 4.03), but I know this is not the chosen policy.

@alainfrisch
Copy link
Contributor

To add to the previous note: I'm pretty sure the bug would not have been found if the feature had not be included in trunk.

@trefis
Copy link
Contributor Author

trefis commented Dec 11, 2015

It might not have been clear in my previous messages but I was actually surprised this was considered for 4.03. So I don't particularly mind if it's delayed, I'm just mildly annoyed by your change of mind :) (there are other things I could have been working on, getting the native toplevel to work with flambda for example).

Are we still considering flambda in 4.03?

The last update I got was "flambda should be merged on Friday the 18th".
Of course this has been moving delayed by a week each week, so :)

@gasche
Copy link
Member

gasche commented Dec 11, 2015

I'm sorry, indeed it was my mistake to be too hasty merging it. I should have realized the importance of @maranget review and not taken a decision without this oversight.

@maranget
Copy link
Contributor

I have probably over-reacted on this.
The fix should be relatively simple considering that pressure_variant should probably be called
on value and exception matchings separately, just like unused clause and exhaustivity are called.

Consider that those two matchings are independent as regards semantics. Grouping them in syntax is
let us say "by accident"

@trefis
Copy link
Contributor Author

trefis commented Dec 11, 2015

This is indeed what is done for the other checks.

For pressure_variants I went another way because I didn't consider that variants could be present deep inside exception patterns. That means that contrary to my bold claims #343 is not correct and should be reverted.
I will try to send an updated pull request later today.

gasche added a commit that referenced this pull request Dec 12, 2015
This week we merged several changes from Thomas Refis, to allow the
use of exception patterns under or-patterns, to write code such as

  match foo x with
    | None | exception Not_found -> ...
    | Some -> ...

Unfortunately, I failed to properly assess the impact of this change,
and in particular to make sure that Luc Maranget had properly reviewed
this code -- any change to the pattern-matching machinery should be
reviewed by Luc.

The problem that I had not foreseen and that he would have immediately
realized is that, while adapting the pattern-matching *compiler* is
relatively easy (Thomas inserted a transformation at the right place
to separate exception patterns from the others and handle them
separately, using the staticraise construct used by the
pattern-matching compiler to avoid duplicating the
right-hand-side branch), adapting the pattern-matching warnings
machinery is both more subtle and easier to overlook (it may fail
silently and nobody notices, unlike wrong code production). This part
of the compiler is subtle and best understood by Luc, but he does not
have the time to do a proper review of those changes in the timeframe
for the 4.03 feature freeze (mid-December).

I believe the right move in this case, implemented in the present
commit, is to revert the change from trunk (this is not a feature that
we must *imperatively* have in 4.03), do a proper job of understanding
the changes, and integrate the change when we are confident it is
ready. I hope to do this in 2016, together with Luc Maranget and
Thomas Refis -- hopefully this would allow Thomas and I to be more
confident when changing the pattern-matching machinery in the future.

Revert "Merge pull request #343 from trefis/pr7083"
This reverts commit 22681b8, reversing
changes made to a24e4ed.

Revert "Merge pull request #341 from trefis/or-exception"
This reverts commit f8f68bd, reversing
changes made to 1534fe8.

Revert "Merge pull request #305 from trefis/or-exception"
This reverts commit cfeda89, reversing
changes made to 77cf36c.
@gasche
Copy link
Member

gasche commented Dec 12, 2015

So (after some further discussion with @trefis) I went ahead and reverted this change from trunk ( d071da2 ). I would like to work on it again in early 2016, so hopefully there will be an updated PR at that time.

314eter added a commit to 314eter/language-ocaml that referenced this pull request Jun 2, 2016
mshinwell pushed a commit to mshinwell/ocaml that referenced this pull request Feb 12, 2021
* Fix `Inline_attribute.equal`

The current implementation defers to `==`, which is a problem because
`Unroll 1 == Unroll 1` is false.

Signed-off-by: Luke Maurer <[email protected]>

* Don't use `==` at all in `Inline_attribute.equal`
poechsel pushed a commit to poechsel/ocaml that referenced this pull request Mar 1, 2021
* Fix `Inline_attribute.equal`

The current implementation defers to `==`, which is a problem because
`Unroll 1 == Unroll 1` is false.

Signed-off-by: Luke Maurer <[email protected]>

* Don't use `==` at all in `Inline_attribute.equal`
poechsel pushed a commit to poechsel/ocaml that referenced this pull request Mar 1, 2021
* Fix `Inline_attribute.equal`

The current implementation defers to `==`, which is a problem because
`Unroll 1 == Unroll 1` is false.

Signed-off-by: Luke Maurer <[email protected]>

* Don't use `==` at all in `Inline_attribute.equal`
chambart added a commit to chambart/ocaml-1 that referenced this pull request Jan 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants