Support "exception" under or-patterns (PR#6422)#305
Conversation
|
I'm very pleased to see this proposal. There's a regression in the behaviour of match patterns without value cases, like this: match f () with exception Not_found -> ()This is currently disallowed: # match f () with exception Not_found -> ();;
Characters 0-41:
match f () with exception Not_found -> ();;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error: None of the patterns in this 'match' expression match values.following discussion on Mantis, but with your patch it's accepted.
I don't think we should support aliasing exception patterns in this way. |
Ah indeed, I had misread the previous code. Thank you!
I agree. (Your comment seem to suggest there is another way though, what would it be?) |
|
As a side note, it seems all uses of |
typing/parmatch.ml
Outdated
There was a problem hiding this comment.
If you're saying that Tpat_exception should not occur here and its not clear what this function should do if it still appeared, you should probably assert false, no?
There was a problem hiding this comment.
Fair point, I will update the commit.
|
I only browsed the commit quickly. Is it indeed the case that guards are duplicated (i.e. the code is generated twice for them)? |
Indeed, I didn't check for that.
Yes. I might have missed something but I don't know how to handle code such as exception Found of int
let guarded f =
match f () with
| Some i | exception (Found i) when i > 0 -> print_endline "positive!"
| _ -> print_endline "onoes"if the guard is shared. Indeed when the condition is not verified the code path differs for the value and exception branches. (function f/1209
(catch
(catch
(try (exit -5 (apply f/1209 0a)) with exn/1229
(catch
(if (== (field 0 exn/1229) (field 2 (global Test!)))
(let (i/1210 =a (field 1 exn/1229))
(if (> i/1210 0) (exit -4 i/1210) (exit 7)))
(exit 7))
with (7) (reraise exn/1229)))
with (-5 val/1228)
(catch
(if val/1228
(let (i/1210 =a (field 0 val/1228))
(if (> i/1210 0) (exit -4 i/1210) (exit 6)))
(exit 6))
with (6) (apply (field 30 (global Pervasives!)) "onoes")))
with (-4 i/1210)
(apply (field 30 (global Pervasives!)) "positive"))) |
Committed to trunk.
I think @chambart had a proposal to extend "static jumps" in intermediate languages with "first class label parameters" (i.e. you could pass to a jump target a symbolic reference to another target, which it could jump to). This might allow to cover such cases without duplicating the code. I'm not completely decided on which is the best approach: duplicate the code or reject this situation (guards in presence of or-combined exception and non-exception patterns), at least for now. If we keep the duplication, I think we need to rename all identifiers, since some optimizations (e.g. in simplif.ml) might assume that all let-bound identifiers in the lambda code are different. (I'm not sure this would break code in practice, but let's be on the safe side.) |
@lpw25 just proposed another alternative which is to wrap the guard in a function. We suspect this to be easier to implement that the renaming necessary if we keep the code duplicated. What do you think, should I try this? PS: I don't like the idea of rejecting guards in this context. It is a minor restriction but just as arbitrary as the one rejecting exceptions under or-patterns and eventually someone is going to complain about it. |
People complain about implicit allocations (even though there are already a bunch of them), so this might not be very consensual. Even if flambda could remove the allocation, this won't be guaranteed and this won't be available in bytecode. (For js_of_ocaml, the overhead could be quite big.) I've no strong opinion on it, though. |
|
Duplicating code is something that needs to be done quite carefully. First you effectively need to substitute every variable, but you also must take care of not duplicating any string literal. exception Exit
let r = ref ""
let guarded f =
match f () with
| true | exception Exit when r := "hello"; true -> !r
| _ -> "other"
let a = guarded (fun () -> raise Exit)
let b = guarded (fun () -> true)
assert(a == b)This is of course quite an involved example. One other way is to compile the pattern matching in a cps style using |
|
|
The lambda code generated by @hhugo example is: The same code compiled with -g works and produces: The problem is very probably related to the reuse of the same identifier, which indeed breaks assumptions in simplif.mf. |
This seems to be an interesting approach, although it probably requires some non-trivial refactoring. |
|
Thank you! I added a new commit with the fix and @hhugo 's test. I will squash it with the other commits once it is reviewed and you guys are happy with it (and I'm not sure you will be, I don't know much about that part of the compiler). |
|
Duplicating affects the sharing of string literals, as @chambart noticed. I don't know how bad this is considered (esp. now that string are mostly immutable, the physical equality on them is no longer specified). As for the implementation strategy, between duplicating the code (with proper renaming) or introducing a local function (hoping it will be inlined), I think I have a preference for duplicating: since guards are usually very small, this is likely to be ok, and the impact on bytecode and js_of_ocaml (assuming it can deal with this new construction!) will be better. |
|
I expect that defining a local function with an |
|
It will not be inlined in bytecode and thus not by js_of_ocaml (which does not do any non-trivial inlining for now). If we force inlining anyway, I don't see any problem with duplicating the code (modulo string literals). |
My plan was to let flambda introduce the sharing before duplication, but indeed that does not work for bytecode (although @chambart's example already works with ocamlc, the problem only happens with the native backend). |
|
As a principle of OCaml Pattern Matching : thou shalst not duplicate. |
I'm rather of the opinion than rejecting this situation is the best move (and that creating an auxiliary function would be better than duplicating the code). "Mixing return and exception cases under when-guards is not supported.". (A change proposal is easily shot down if the implementation has a defect. Duplicating code and implicit allocation are defects. The only way out is to accept a design limitation instead. There is little love for |
Ok, it seems this is the way to go to get this PR merged quickly. We can always revisit later to drop the limitation. |
What do you mean by quickly? If you mean 4.03 I can do the required change this week, otherwise it might take a few more weeks before I get back on this. |
|
I'm in favor of including this in 4.03 if you can do the change quickly, and according to @gasche reply, I think he would be on the same line. So unless another core dev objects to it, indeed, this could go in 4.03. |
|
Note that I'm still not sure how would js_of_ocaml handle this. |
Well, this could indeed be a blocking point. We don't want libraries to start using this new feature and become unusable for js_of_ocaml. Can you check with your fellow js_of_ocaml developpers if there is some hope to have this supported in the near future? |
|
I just noticed the following line in the manual (in the chapter about language extensions, section "Exception cases in pattern matching"):
Should it be updated? |
Yes, that would be great. The new text should clarify that the restriction has been relaxed in 4.03. |
|
This PR seems to have introduced a bug in the exhaustiveness checker. See PR#7083. |
PR7083: fix assertion failure introduced by #305
|
Thomas ( @trefis ) proposed in #343 a potential fix for the exhaustiveness issue. However, a preliminary review seems to suggest (to me) that we don't properly understand the implications of this change (the cause for the bug and the impact of the fix) to assess that it is correct. I have second thoughts about having merged this PR in trunk. For now the only person that would be able to carefully review the code to check that it is correct is Luc ( @maranget ), and the timeframe to do this before the 4.03 release is too short. I would be glad to help (I'm starting to understand this part of the compiler better, and I think @trefis is as well), but I don't have the time before the release either. Finally, someone remarked to me recently that this PR is a real language change that we hadn't discussed during the last development meeting (while we somewhat agreed to restrict ourselves, for large changes, to features that had been approved at that time). One last point against this PR. I am thinking of reverting the present PR ( #305 ). I would like to work on warnings for pattern-matching in the first half of 2016 with @maranget , and @trefis if he has the time to do so. Then would be a better time to reconsider this change and its implications, and make sure we have a correct implementation. I think the feature is nice, but it's not absolutely essential that we rush it into 4.03. |
|
I'm not sure about the timeframe argument. Are we still considering flambda in 4.03? If so, it will probably take several weeks or perhaps months before the release, and a lot of testing with a lot of code in the wild will hopefully happen, which should allow to find any regression caused by this one as well. I don't see the point of postponing nice additions which are ready to merge. If they are included only in a few months, this will simply delay so much the testing and people motivated by the change now might have less motivation or time to go back to them. All this could be alleviated if the feature freeze came along the branching for 4.03 so that the trunk remains open for inclusion of ready additions (even if they do not go to 4.03), but I know this is not the chosen policy. |
|
To add to the previous note: I'm pretty sure the bug would not have been found if the feature had not be included in trunk. |
|
It might not have been clear in my previous messages but I was actually surprised this was considered for 4.03. So I don't particularly mind if it's delayed, I'm just mildly annoyed by your change of mind :) (there are other things I could have been working on, getting the native toplevel to work with flambda for example).
The last update I got was "flambda should be merged on Friday the 18th". |
|
I'm sorry, indeed it was my mistake to be too hasty merging it. I should have realized the importance of @maranget review and not taken a decision without this oversight. |
|
I have probably over-reacted on this. Consider that those two matchings are independent as regards semantics. Grouping them in syntax is |
|
This is indeed what is done for the other checks. For |
This week we merged several changes from Thomas Refis, to allow the
use of exception patterns under or-patterns, to write code such as
match foo x with
| None | exception Not_found -> ...
| Some -> ...
Unfortunately, I failed to properly assess the impact of this change,
and in particular to make sure that Luc Maranget had properly reviewed
this code -- any change to the pattern-matching machinery should be
reviewed by Luc.
The problem that I had not foreseen and that he would have immediately
realized is that, while adapting the pattern-matching *compiler* is
relatively easy (Thomas inserted a transformation at the right place
to separate exception patterns from the others and handle them
separately, using the staticraise construct used by the
pattern-matching compiler to avoid duplicating the
right-hand-side branch), adapting the pattern-matching warnings
machinery is both more subtle and easier to overlook (it may fail
silently and nobody notices, unlike wrong code production). This part
of the compiler is subtle and best understood by Luc, but he does not
have the time to do a proper review of those changes in the timeframe
for the 4.03 feature freeze (mid-December).
I believe the right move in this case, implemented in the present
commit, is to revert the change from trunk (this is not a feature that
we must *imperatively* have in 4.03), do a proper job of understanding
the changes, and integrate the change when we are confident it is
ready. I hope to do this in 2016, together with Luc Maranget and
Thomas Refis -- hopefully this would allow Thomas and I to be more
confident when changing the pattern-matching machinery in the future.
Revert "Merge pull request #343 from trefis/pr7083"
This reverts commit 22681b8, reversing
changes made to a24e4ed.
Revert "Merge pull request #341 from trefis/or-exception"
This reverts commit f8f68bd, reversing
changes made to 1534fe8.
Revert "Merge pull request #305 from trefis/or-exception"
This reverts commit cfeda89, reversing
changes made to 77cf36c.
* Fix `Inline_attribute.equal` The current implementation defers to `==`, which is a problem because `Unroll 1 == Unroll 1` is false. Signed-off-by: Luke Maurer <[email protected]> * Don't use `==` at all in `Inline_attribute.equal`
* Fix `Inline_attribute.equal` The current implementation defers to `==`, which is a problem because `Unroll 1 == Unroll 1` is false. Signed-off-by: Luke Maurer <[email protected]> * Don't use `==` at all in `Inline_attribute.equal`
* Fix `Inline_attribute.equal` The current implementation defers to `==`, which is a problem because `Unroll 1 == Unroll 1` is false. Signed-off-by: Luke Maurer <[email protected]> * Don't use `==` at all in `Inline_attribute.equal`
Cf. http://caml.inria.fr/mantis/view.php?id=6422
The implementation
Texp_matchwas modified to merge the value and exception cases, the distinction doesn't make sense if we allow exception both exception and value patterns in the same branch.Tpat_exceptionwas introduced to allow just that.The splitting is done later, in
Translcore.transl_match, to be able to share the body of branches with both value and exception patterns.That code should be fairly straightforward.
Slight correction on what was just said: the splitting does still happen (temporarily) in typecore, to be able to run the exhaustivity checks performed by parmatch. This is required because the exhaustivity of value and exception branches should be checked separately.
So where are exception patterns allowed ?
In the same places as before and under or-patterns.
The only other places where an exception pattern might make sense is under
Tpat_aliasorTpat_lazy.The first one is actually problematic, consider:
which type do you give to
x?As for exceptions under lazy patterns, this might be possible, but can always be added at a later time and is not covered by this pull request.
Last comments
The careful reviewer will notice the apparition of
Typedtree.pat_equiv, this was used during developpement to check the correctness ofTypedtree.split_pattern; the assertion is still present inTranslcore.transl_matchbut can be removed before merging (along with thepat_equivfunction).I added some comments in parmatch and typecore where I have some doubts about the correctness of my implementation.