Add String.{split,rsplit} by dbuenzli · Pull Request #10 · ocaml/ocaml

dbuenzli · 2014-02-14T16:35:36Z

I recently cut and pasted a restricted version of String.split into (library) code at least thrice (well, maybe if we didn't get stack traces as strings I wouldn't need it...). Since I see String.trim was introduced in 4.00.0, I'm tempted to propose this one.

Below you'll find the behaviour of String.split on a few edge cases. It coincides with the behaviour of Str.(split_delim (regexp str) s) except in a few cases (see the Str = comments). In these cases I think String.split is better behaved and more regular, since what Str.split_delim does is not captured by the simple textual specification of String.split that I give in the documentation, which is:

Update: after discussion, the docs and invariants are now as follows.
Update: added rsplit for consistency with String's design.

    [String.split sep s] is the list of all (possibly empty)
    substrings of [s] that are delimited by matches of the non empty
    separator string [sep].

    Matching separators in [s] starts from the beginning of [s] and once
    one is found, the separator is skipped and matching starts again
    (i.e. separator matches can't overlap). If there is no separator 
    match in [s], [[s]] is returned.

    The invariants [String.concat sep (String.split sep s) = s] and
    [String.split sep s <> []] always hold.

  (* String.split *) 
  assert (try ignore (String.split "" ""); false with Invalid_argument _ -> true); 
  assert (try ignore (String.split "" "123"); false with Invalid_argument _ -> true);
  assert (String.split "," "" = [""]); 
                       (* Str = [] *) 
  assert (String.split "," "," = [""; ""]); 
  assert (String.split "," ",," = [""; ""; ""]); 
  assert (String.split "," ",,," = [""; ""; ""; ""]); 
  assert (String.split "," "123" = ["123"]); 
  assert (String.split "," ",123" = [""; "123"]);
  assert (String.split "," "123," = ["123"; ""]); 
  assert (String.split "," "1,2,3" = ["1"; "2"; "3"]); 
  assert (String.split "," "1, 2, 3" = ["1"; " 2"; " 3"]); 
  assert (String.split "," ",1,2,,3," = [""; "1"; "2"; ""; "3"; ""]);
  assert (String.split "," ", 1, 2,, 3," = [""; " 1"; " 2"; ""; " 3"; ""]);
  assert (String.split "<>" "" = [""]); 
                        (* Str = [] *) 
  assert (String.split "<>" "<>" = [""; ""]);
  assert (String.split "<>" "<><>" = [""; ""; ""]);
  assert (String.split "<>" "<><><>" = [""; ""; ""; ""]);
  assert (String.split "<>" "123" = [ "123" ]);
  assert (String.split "<>" "<>123" = [""; "123"]);
  assert (String.split "<>" "123<>" = ["123"; ""]);
  assert (String.split "<>" "1<>2<>3" = ["1"; "2"; "3"]);
  assert (String.split "<>" "1<> 2<> 3" = ["1"; " 2"; " 3"]);
  assert (String.split "<>" "<>1<>2<><>3<>" = [""; "1"; "2"; ""; "3"; ""]);
  assert (String.split "<>" "<> 1<> 2<><> 3<>" = [""; " 1"; " 2"; ""; " 3";""]);
  assert (String.split "<>" ">>><>>>><>>>><>>>>" = 
          [">>>"; ">>>"; ">>>"; ">>>" ]);
  assert (String.split "<->" "<->>->" = [""; ">->"]);
  assert (String.split "aa" "aa" = [""; ""]); 
  assert (String.split "aa" "aaa" = [""; "a"]);
  assert (String.split "aa" "aaaa" = [""; ""; ""]);
  assert (String.split "aa" "aaaaa" = [""; ""; "a"]);
  assert (String.split "aa" "aaaaaa" = [""; ""; ""; ""]);
  (* String.rsplit *)
  assert (try ignore (String.rsplit "" ""); false with Invalid_argument _ -> true); 
  assert (try ignore (String.rsplit "" "123"); false with Invalid_argument _ -> true);
  assert (String.rsplit "," "" = [""]); 
                       (* Str = [] *) 
  assert (String.rsplit "," "," = [""; ""]); 
  assert (String.rsplit "," ",," = [""; ""; ""]); 
  assert (String.rsplit "," ",,," = [""; ""; ""; ""]); 
  assert (String.rsplit "," "123" = ["123"]); 
  assert (String.rsplit "," ",123" = [""; "123"]);
  assert (String.rsplit "," "123," = ["123"; ""]); 
  assert (String.rsplit "," "1,2,3" = ["1"; "2"; "3"]); 
  assert (String.rsplit "," "1, 2, 3" = ["1"; " 2"; " 3"]); 
  assert (String.rsplit "," ",1,2,,3," = [""; "1"; "2"; ""; "3"; ""]);
  assert (String.rsplit "," ", 1, 2,, 3," = [""; " 1"; " 2"; ""; " 3"; ""]);
  assert (String.rsplit "<>" "" = [""]); 
                        (* Str = [] *) 
  assert (String.rsplit "<>" "<>" = [""; ""]);
  assert (String.rsplit "<>" "<><>" = [""; ""; ""]);
  assert (String.rsplit "<>" "<><><>" = [""; ""; ""; ""]);
  assert (String.rsplit "<>" "123" = [ "123" ]);
  assert (String.rsplit "<>" "<>123" = [""; "123"]);
  assert (String.rsplit "<>" "123<>" = ["123"; ""]);
  assert (String.rsplit "<>" "1<>2<>3" = ["1"; "2"; "3"]);
  assert (String.rsplit "<>" "1<> 2<> 3" = ["1"; " 2"; " 3"]);
  assert (String.rsplit "<>" "<>1<>2<><>3<>" = [""; "1"; "2"; ""; "3"; ""]);
  assert (String.rsplit "<>" "<> 1<> 2<><> 3<>" = [""; " 1"; " 2"; ""; " 3";""]);
  assert (String.rsplit "<>" ">>><>>>><>>>><>>>>" = 
          [">>>"; ">>>"; ">>>"; ">>>" ]);
  assert (String.rsplit "<->" "<->>->" = [""; ">->"]);
  assert (String.rsplit "aa" "aa" = [""; ""]); 
  assert (String.rsplit "aa" "aaa" = ["a"; ""]);
  assert (String.rsplit "aa" "aaaa" = [""; ""; ""]);
  assert (String.rsplit "aa" "aaaaa" = ["a"; ""; ""]);
  assert (String.rsplit "aa" "aaaaaa" = [""; ""; ""; ""]);
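
The specification and assertions above can be reproduced with a small reference sketch. This is not the code from the pull request: the helper `matches_at` and the loop structure are assumptions made for illustration, and the naive scan favours clarity over speed.

```ocaml
(* Hypothetical reference sketch; not the implementation from the patch. *)

(* [matches_at sep s i] is true iff [sep] occurs in [s] at index [i]. *)
let matches_at sep s i =
  let n = String.length sep in
  i >= 0 && i + n <= String.length s && String.sub s i n = sep

(* Left-to-right, non-overlapping matching. *)
let split sep s =
  if sep = "" then invalid_arg "split: empty separator";
  let n = String.length sep and len = String.length s in
  (* [start] is where the current substring begins, [i] the scan position;
     [acc] holds the substrings found so far, in reverse order. *)
  let rec loop acc start i =
    if i + n > len then List.rev (String.sub s start (len - start) :: acc)
    else if matches_at sep s i then
      loop (String.sub s start (i - start) :: acc) (i + n) (i + n)
    else loop acc start (i + 1)
  in
  loop [] 0 0

(* Right-to-left, non-overlapping matching. *)
let rsplit sep s =
  if sep = "" then invalid_arg "rsplit: empty separator";
  let n = String.length sep in
  (* [stop] is one past the end of the current substring, [i] the candidate
     start index of a separator match; results accumulate in order. *)
  let rec loop acc stop i =
    if i < 0 then String.sub s 0 stop :: acc
    else if matches_at sep s i then
      loop (String.sub s (i + n) (stop - i - n) :: acc) i (i - n)
    else loop acc stop (i - 1)
  in
  loop [] (String.length s) (String.length s - n)
```

The two functions only differ when the separator can overlap with itself: with this sketch, `split "aa" "aaa"` gives `[""; "a"]` while `rsplit "aa" "aaa"` gives `["a"; ""]`.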

dbuenzli · 2014-02-14T16:58:49Z

Just one note, part of the difference in behaviour between String.split and Str.split_delim can be captured by the fact that for any sep and s in which sep does not occur, String.split guarantees the following:

assert (String.split sep sep = [""; ""])
assert (String.split sep s = [ s ])

whereas Str.split_delim has different behaviour whenever either s or sep is the empty string in these cases.

protz · 2014-02-14T17:15:29Z

stdlib/string.ml

Since your split function implements the (missing) char_list_of_string function in the particular case that sep is the empty string, it may be worth mentioning it in the .mli file.

Not really, first you don't get chars and then you also get a list with an empty string first and last.

protz · 2014-02-14T17:20:30Z

While I trust you that the code is correct, I found the addition particularly hard to proofread given that there is absolutely no comment in the file. Could you either describe here the algorithm you're implementing or add a few comments here and there to say what i, k and others do? (Granted, I'm playing the naive guy, I do have a rough idea of what these variables do, but I still think it would help your chances of being integrated.)

Thanks!

gasche · 2014-02-14T17:39:50Z

With my library designer hat on, I can confirm that this is the right design for a split function. I wrote a full blog post about it last summer, which I somehow forgot to publish. An invariant which you could emphasize more is that the returned list of strings is always nonempty: some implementations of split get it wrong and have split sep "" = [].

I haven't looked at the code, and I have no opinion as to whether that should go into stdlib.

tomjridge · 2014-02-14T17:55:58Z

I have no view on the code, but the invariant mentioned in the documentation is the invariant I use with split-like functions, so I also think this is the right design. Like gasche, I also make use of the second invariant.

In general, I would like to see more functions in stdlib, providing there is general agreement over the specification, the name, the order of arguments etc. (perhaps I am too optimistic?)

alainfrisch · 2014-02-14T18:06:09Z

What about exposing functions to search for the first or last occurrence (within a given range) of a substring? split can be implemented quite trivially on top of them, and they are useful on their own.

(One argument against this approach is that a general purpose "search substring" algorithm could be well served by a more clever algorithm (KMP or Boyer-Moore), but this might be counter-productive for their use in 'split', where the delimiter is typically very short.)
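
A minimal sketch of what such search primitives might look like, with split layered on top. The names `find_first` and `find_last`, their labelled `~start`/`~stop` range arguments, and the naive scan are all assumptions for illustration; nothing here is taken from the patch.

```ocaml
(* Hypothetical search helpers; names and signatures are assumptions. *)

(* [find_first ~start ~stop sub s] is the smallest index [i >= start] such
   that [sub] occurs in [s] at [i] and ends at or before [stop]. *)
let find_first ~start ~stop sub s =
  let m = String.length sub in
  let start = max 0 start and stop = min stop (String.length s) in
  let rec go i =
    if i + m > stop then None
    else if String.sub s i m = sub then Some i
    else go (i + 1)
  in
  go start

(* [find_last] is the same search, scanning from the end of the range. *)
let find_last ~start ~stop sub s =
  let m = String.length sub in
  let start = max 0 start and stop = min stop (String.length s) in
  let rec go i =
    if i < start then None
    else if String.sub s i m = sub then Some i
    else go (i - 1)
  in
  go (stop - m)

(* Left-to-right split built on [find_first]. *)
let split sep s =
  if sep = "" then invalid_arg "split: empty separator";
  let len = String.length s and m = String.length sep in
  let rec go acc start =
    match find_first ~start ~stop:len sep s with
    | None -> List.rev (String.sub s start (len - start) :: acc)
    | Some i -> go (String.sub s start (i - start) :: acc) (i + m)
  in
  go [] 0
```

Swapping the body of `find_first` for a smarter search algorithm would leave `split` unchanged, which is the separation the comment above is weighing.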

dbuenzli · 2014-02-14T18:09:26Z

@protz I tried to follow the programming guidelines... joking aside, I did add some comments, but it's still hard to see that all edge cases are handled correctly. @gasche I added that invariant to the doc.

gasche · 2014-02-14T18:10:44Z

In fact, I spoke too fast. An edge case of an edge case is slightly wrong in this patch.

When I wrote my comment, I thought that the non-emptiness invariant was implied by the invariants Daniel asserted, as concat sep (split sep s) = s forces the returned list to be non-empty whenever s is non-empty. When s is empty, split sep "" is specified to return [""] whenever "" "does not contain" sep. But this leaves split "" "" unspecified, and it remains underspecified even if you additionally require the result to be non-empty.

I think the return value of split "" "" should be either an error or [""] (my preference), but not [""; ""].

dbuenzli · 2014-02-14T18:35:37Z

@gasche A little bit hard to follow you but "" is a substring of "" so this property mentioned above

assert (String.split sep s = [ s ])

if sep is not in s shall not apply. Now it seems you prefer to have split "" "" = [""], but this would be an exception to the rule that for any sep:

assert (String.split sep sep = [""; ""])

gasche · 2014-02-14T19:49:58Z

My reasoning is the following: if you split s1 and s2 on the same separator sep, you can merge back the splitting results as follows (relying on the non-emptiness invariant): if split sep s1 is of the form left @ [last] and split sep s2 is of the form [first] @ right, then split sep (s1 ^ s2) is equal to left @ (split sep (last ^ first)) @ right.

This is coherent with split sep s = [s] when sep is not in s, with split sep sep = [""; ""] when sep is not empty (this gives split sep (sep^sep) = [""; ""; ""] as expected), but doesn't work if split "" "" is anything else than [""]. More generally, it casts doubt over your treatment of split "" s.
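
The merge property can be tried out on single-character separators using the standard library's `String.split_on_char` as a stand-in (it was added to the stdlib in OCaml 4.04, after this discussion). The helper names `split_last` and `merge` are hypothetical, and the sketch says nothing about multi-character or empty separators.

```ocaml
(* Stand-in for the proposed function, restricted to the separator ','. *)
let split = String.split_on_char ','

(* [split_last l] decomposes a non-empty list as [(left, last)]. *)
let rec split_last = function
  | [] -> invalid_arg "split_last: empty list"
  | [x] -> ([], x)
  | x :: tl ->
    let left, last = split_last tl in
    (x :: left, last)

(* Rebuild the split of [s1 ^ s2] from the two separate results. *)
let merge s1 s2 =
  let left, last = split_last (split s1) in
  match split s2 with
  | [] -> assert false (* ruled out by the non-emptiness invariant *)
  | first :: right -> left @ split (last ^ first) @ right
```

For example, `merge "1,2" "3,4"` and `split "1,23,4"` both evaluate to `["1"; "23"; "4"]`. Both decompositions in `merge` rely on the results being non-empty lists.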

alainfrisch · 2014-02-14T20:28:51Z

Just mentioning in case this matters: the behavior does not seem to be specified by the discussed invariants in the case where the delimiter string can overlap with itself (e.g. split "aaa" "aa").

dbuenzli · 2014-02-14T20:36:08Z

@alainfrisch you meant split "aa" "aaa". You are right, we need to specify the matching algorithm more clearly: starting from the beginning of the string, look for sep and then cut. The current implementation is wrong in my eyes (I'm doing the matching from the end to avoid a List.rev). I will change that.

dbuenzli · 2014-02-14T21:17:54Z

@gasche Interesting, I just note however that if you compare the resulting concatenation of the string, you get the same string i.e:

String.concat sep ((String.split sep s1) @ (String.split sep s2)) = 
String.concat sep (String.split sep (s1 ^ s2))

Besides note that with the current implementation the property you mention doesn't hold for any s in split "" s. Just to make things clear here's an example:

# let s1 = split "" "12";;
val s1 : string list = [""; "1"; "2"; ""]
# let s2 = split "" "34";;
val s2 : string list = [""; "3"; "4"; ""]
# split "" "1234";;
- : string list = [""; "1"; "2"; "3"; "4"; ""]
# [""; "1"] @ split "" "23" @ ["4"; ""];;
- : string list = [""; "1"; ""; "2"; "3"; ""; "4"; ""]

So either:

1. I keep it as it is now.
2. I fix split "" s so that split "" "" = [""] and split "" s has no leading and trailing empty string.
3. I require sep to be non-empty.

My gripe with 2. is that it treats "" specially w.r.t. how split works with other, non-empty separators. 2. would be fine with me if leading and trailing seps did not generate empty strings, i.e. if we had for all sep and s with sep not in s:

assert (String.split sep sep = [""]); (* rather than [""; ""] as it is now *)
assert (String.split sep (sep ^ s) = [s]); (* rather than [""; s] as it is now *)
assert (String.split sep (s ^ sep) = [s]); (* rather than [s; ""] as it is now *)

But we would lose the round trip with String.concat, so I prefer not.

I tend to be in favor of 3. Splitting with the empty string seems a bad idea anyway, especially since, as we see, it invites misunderstandings. What do you think?

gasche · 2014-02-14T21:25:14Z

The round-trip to concat is an absolute requirement. I agree that either (2) or (3) are acceptable.

My suspicion is that users would find split "" "abc" = ["a"; "b"; "c"] a useful behavior in many situations, and it doesn't break the natural contracts for the function, so I would be tempted to accept it. You could always fail specifically in the split "" "" case.

Another way to find peace as an obsessive API designer is to let the code speak: find a way to write the function that seems simple (even better, "elegant"), and respect the way it handles edge cases. Your treatment of empty separators in the current implementation is a bit ad-hoc, but maybe another implementation could "explain" by its conciseness how splitting on empty strings should be done.
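
For reference, the behaviour suggested above for an empty separator amounts to breaking a string into one-character strings. A minimal sketch, with the hypothetical name `explode` (not part of the proposal), using `List.init` from later stdlib versions (OCaml 4.06 and up):

```ocaml
(* Hypothetical helper: one single-character string per byte of [s]. *)
let explode s = List.init (String.length s) (fun i -> String.make 1 s.[i])
```

It round-trips with `String.concat ""`, but `explode ""` is the empty list, so the empty input is exactly where the non-emptiness invariant would need special treatment.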

alainfrisch · 2014-02-14T21:36:54Z

How often do we need to split with anything else than a single character? Treating only this case makes it easy to specify the exact behavior and leads to a more efficient implementation.
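
To make the comparison concrete, here is a sketch of the single-character variant. The name `split_char` is hypothetical; scanning from the end is just one way to build the result in order without a final reversal.

```ocaml
(* Hypothetical single-character split. *)
let split_char c s =
  (* [stop] is one past the end of the current substring and [i] the index
     being examined; scanning right to left builds the list in order. *)
  let rec go acc stop i =
    if i < 0 then String.sub s 0 stop :: acc
    else if s.[i] = c then
      go (String.sub s (i + 1) (stop - i - 1) :: acc) i (i - 1)
    else go acc stop (i - 1)
  in
  go [] (String.length s) (String.length s - 1)
```

With one character there is no overlap question and no empty-separator case, so the concat round trip and the non-emptiness of the result are the whole specification.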

dbuenzli · 2014-02-14T22:08:05Z

Actually the "restricted version" of String.split I was talking about did only that, and sure, it was much simpler... But one of the advantages of having a string separator is that you can still split UTF-8 strings at a given UTF-8 encoded char.

dbuenzli · 2014-02-14T23:30:42Z

I gave up on 2. failing to be able to give a reasonable, terse, explanation. I updated my PR with 3.

Also I now do the match from beginning to end. I tried to explain the match strategy in the doc to make it clear that matches don't overlap and run from start to end. So here are the properties of what is currently implemented (I added @alainfrisch's overlapping example):

let () = 
  assert (try ignore (String.split "" ""); false with Invalid_argument _ -> true); 
  assert (try ignore (String.split "" "123"); false with Invalid_argument _ -> true);
  assert (String.split "," "" = [""]); 
                       (* Str = [] *) 
  assert (String.split "," "," = [""; ""]); 
  assert (String.split "," ",," = [""; ""; ""]); 
  assert (String.split "," ",,," = [""; ""; ""; ""]); 
  assert (String.split "," "123" = ["123"]); 
  assert (String.split "," ",123" = [""; "123"]);
  assert (String.split "," "123," = ["123"; ""]); 
  assert (String.split "," "1,2,3" = ["1"; "2"; "3"]); 
  assert (String.split "," "1, 2, 3" = ["1"; " 2"; " 3"]); 
  assert (String.split "," ",1,2,,3," = [""; "1"; "2"; ""; "3"; ""]);
  assert (String.split "," ", 1, 2,, 3," = [""; " 1"; " 2"; ""; " 3"; ""]);
  assert (String.split "<>" "" = [""]); 
                        (* Str = [] *) 
  assert (String.split "<>" "<>" = [""; ""]);
  assert (String.split "<>" "<><>" = [""; ""; ""]);
  assert (String.split "<>" "<><><>" = [""; ""; ""; ""]);
  assert (String.split "<>" "123" = [ "123" ]);
  assert (String.split "<>" "<>123" = [""; "123"]);
  assert (String.split "<>" "123<>" = ["123"; ""]);
  assert (String.split "<>" "1<>2<>3" = ["1"; "2"; "3"]);
  assert (String.split "<>" "1<> 2<> 3" = ["1"; " 2"; " 3"]);
  assert (String.split "<>" "<>1<>2<><>3<>" = [""; "1"; "2"; ""; "3"; ""]);
  assert (String.split "<>" "<> 1<> 2<><> 3<>" = [""; " 1"; " 2"; ""; " 3";""]);
  assert (String.split "<>" ">>><>>>><>>>><>>>>" = 
          [">>>"; ">>>"; ">>>"; ">>>" ]);
  assert (String.split "<->" "<->>->" = [""; ">->"]);
  assert (String.split "aa" "aa" = [""; ""]); 
  assert (String.split "aa" "aaa" = [""; "a"]);
  assert (String.split "aa" "aaaa" = [""; ""; ""]);
  assert (String.split "aa" "aaaaa" = [""; ""; "a"]);
  assert (String.split "aa" "aaaaaa" = [""; ""; ""; ""]);
  ()

dbuenzli · 2014-02-15T00:21:28Z

I think a good case can be made for (3) if you read the current description of the matching procedure in the documentation. You cannot skip the empty separator --- its length is 0 --- so it makes no sense to have an empty separator, the matching procedure would loop.

ygrek · 2014-02-15T01:41:36Z

In general, I would like to see more functions in stdlib, providing there is general agreement over the specification, the name, the order of arguments etc. (perhaps I am too optimistic?)

In extlib :

split : string -> string -> (string * string)
nsplit : string -> string -> string list

and the order of arguments is reversed..

PS Having the invariants being discussed here is surely nice

dbuenzli · 2014-02-15T08:40:56Z

@ygrek I never found extlib to be particularly well designed, so I'm not sure it should be taken as an example:

Having the argument reversed is not what you want in the average case. I guess it must come from dogma that the main "object" should be the first argument, which is a wrong dogma in a language with currying. You want to write List.rev_map (String.split ",") (lines file) and not List.rev_map (fun s -> String.split s ",") (lines file).
A case could be maybe made for the name, but I note that if you look at perl, ruby, python, racket, scala, javascript to name a few, all have that function named split
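
The currying argument can be illustrated with the stdlib's `String.split_on_char` (added in OCaml 4.04, after this thread), which also takes the separator first. The reversed-order wrapper `split_sep_last` is hypothetical and exists only for contrast.

```ocaml
let lines = ["a,b"; "c,d,e"; ""]

(* Separator first: partial application yields the mapped function directly. *)
let fields = List.map (String.split_on_char ',') lines

(* Separator last (hypothetical wrapper): an explicit closure is needed. *)
let split_sep_last s sep = String.split_on_char sep s
let fields' = List.map (fun s -> split_sep_last s ',') lines
```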

Chris00 · 2014-02-15T08:54:17Z

Having the argument reversed is not what you want in the average case. I
guess it must come from dogma that the main "object" should be the first
argument, which is a wrong dogma in a language with currying. You want to
write List.rev_map (String.split ",") (lines file) and not List.rev_map
(fun s -> String.split s ",") (lines file).

Since the two arguments have the same type, it would be good that the
separator be labeled.

val split : sep:string -> string -> string list

dbuenzli · 2014-02-15T08:58:03Z

@Chris00, agreed but my understanding of the stdlib is that the String module shall have no labels and StringLabels have them, that's what my patch does.
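
The same String / StringLabels convention can be seen today with `split_on_char`, which was added to the stdlib later (OCaml 4.04) and is used here only as an illustration of the labelling scheme, not as the function from this patch:

```ocaml
(* Unlabelled version in String. *)
let unlabelled = String.split_on_char ',' "1,2,3"

(* Labelled version in StringLabels: the separator is passed as [~sep]. *)
let labelled = StringLabels.split_on_char ~sep:',' "1,2,3"

(* In a full application, the labelled argument may come in any position. *)
let commuted = StringLabels.split_on_char "1,2,3" ~sep:','
```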

gasche · 2014-02-15T10:01:55Z

I gave up on 2. failing to be able to give a reasonable, terse, explanation. I updated my PR with 3.

Same with me. The code really says sep = "" should return an infinite list of empty strings, so forbidding it is more reasonable.

ygrek · 2014-02-15T11:05:31Z

My point was about "agreed upon" names and conventions; one should probably also take batteries into account. The proposed design is no doubt superior.

gasche · 2014-02-15T11:19:22Z

Indeed, using nsplit would make life easier for extlib or batteries users.

dbuenzli · 2014-02-15T11:37:34Z

I'm not sure it's a good idea to take into account what these extended libraries do. Since there are many of them and they don't agree on their names, we can say that their names are not necessarily the "agreed upon" names. For example js core's split is this split but restricted to char separators. I think that the fact that every other programming language out there uses split for that function is a quite good argument.

dbuenzli · 2014-02-15T11:43:55Z

Also if we agree that the argument order of this split is the right one, using nsplit would actually be more confusing and error prone to extlib or batteries users.

pw374 · 2014-02-15T12:18:34Z

A split that would be quite more useful to me would take a function as a splitter rather than a string (or a char).

Although I have no hope at all that it will ever make it to stdlib, please let me share with you my idea of a far more powerful design (however it probably has poor performance compared to classic split functions for classic cases):

let split f s =
  let res = ref [] in
  let b = Buffer.create 42 in
  let rec loop i s =
    if i >= String.length s then
      let bc = Buffer.contents b in
      Buffer.clear b;
      res := bc :: !res;
    else
      match f i s with
      | `Split ->
         let bc = Buffer.contents b in
         Buffer.clear b;
         res := bc :: !res;
         loop (i+1) s
      | `Continue ->
         Buffer.add_char b (s.[i]);
         loop (i+1) s
      | `Split_with (add, new_i, new_s) ->
         Buffer.add_string b add;
         let bc = Buffer.contents b in
         Buffer.clear b;
         res := bc :: !res;
         loop new_i new_s
  in
  loop 0 s;
  List.rev !res
(* val split :
   (int ->
   string -> [< `Continue | `Split | `Split_with of string * int * string ]) ->
   string -> string list *)

let _ = (* split on " " *)
  split (fun i s -> match s.[i] with ' ' -> `Split | _ -> `Continue) "hello foo bar"
(* ["hello"; "foo"; "bar"] *)

let _ = (* split on "foo" *)
  split
    (fun i s ->
       if String.length s >= i+3 && String.sub s i 3 = "foo" then
         `Split_with ("", i+3, s)
       else `Continue)
    "hello foo bar"
(* ["hello "; " bar"] *)

let split_on_string sep s =
  if sep = "" then invalid_arg "split_on_string";
  let ls = String.length sep in 
  split
    (fun i s ->
       if String.length s >= i+ls && String.sub s i ls = sep then
         `Split_with ("", i+ls, s)
       else `Continue)
    s
(* val split_on_string : string -> string -> string list *)

let _ =
  split_on_string "foo" "hello foo bar baz"
(* ["hello "; " bar baz"] *)

let _ =
  split_on_string "foo" ""
(* [""] *)

let _ =
  split_on_string "" ""
(* Exception: Invalid_argument "split_on_string". *)

let split_on_spaces s =
  split
    (fun i s ->
       match s.[i] with
       | ' ' ->
         let rec loop i =
           if String.length s > i && s.[i] = ' ' then
             loop (i+1)
           else
             `Split_with("", i, s)
         in loop i
       | _ -> `Continue 
    )
    s
(* val split_on_spaces : string -> string list *)

let _ = split_on_spaces "plop                plop          "
(* ["plop"; "plop"; ""] *)

let split_sentences s =
  split
    (fun i s ->
       match s.[i] with
       | '.' | '!' | '?' ->
         let b = Buffer.create 42 in
         let rec loop i =
           if String.length s > i then
             match s.[i] with
             | '.' | '!' | '?' | ' ' ->
               Buffer.add_char b s.[i];
               loop (i+1)
             | _ ->
               `Split_with(Buffer.contents b, i, s)
           else
             `Split_with(Buffer.contents b, i, s)
         in loop i
       | _ -> `Continue)
    s
(* val split_sentences : string -> string list *)

let _ = split_sentences "Bonjour, comment ça va?! Ça va. Merci. Au revoir! Déjà?? Oui!!"
(* ["Bonjour, comment ça va?! "; "Ça va. "; "Merci. "; "Au revoir! "; "Déjà?? "; "Oui!!"; ""] *)

alainfrisch · 2014-02-15T12:59:56Z

@dbuenzli: regarding "you could still split utf-8 strings at a given UTF-8 encoded char", this is also true with a single char delimiter, as long as the delimiter is actually an ASCII character. I believe it is by far the most common case even when manipulating utf-8 strings. Considering the advantages of the simple version (simpler and more efficient implementation, simple specification, no risk of confusing the two arguments), I think I'd be in favor of that one.

Another useful function would be a "split2", which splits a string in two at the first occurrence of a delimiter.
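
A minimal sketch of such a function, assuming a string delimiter and an option result; the signature, the `None` case when the delimiter is absent, and the rejection of the empty delimiter are assumptions, since the comment only names the idea.

```ocaml
(* Hypothetical [split2]: cut [s] in two at the first occurrence of [sep]. *)
let split2 sep s =
  if sep = "" then invalid_arg "split2: empty separator";
  let m = String.length sep and len = String.length s in
  (* Naive left-to-right scan for the first occurrence of [sep]. *)
  let rec find i =
    if i + m > len then None
    else if String.sub s i m = sep then
      Some (String.sub s 0 i, String.sub s (i + m) (len - i - m))
    else find (i + 1)
  in
  find 0
```

For instance, `split2 "=" "key=a=b"` evaluates to `Some ("key", "a=b")`, leaving later occurrences of the delimiter untouched in the second component.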

dbuenzli · 2014-02-15T14:09:31Z

@alainfrisch There are quite a few unicode delimiters (for example the unicode line or paragraph separators (U+2028, U+2029), or the various dashes) that are well beyond the ASCII repertoire. I still think that having a sequence of bytes for delimiters is a good thing to have (I'll again invoke the argument of other programming languages which all allow you to split with a string). I think the current specification I gave in the documentation is not hard to comprehend; if you don't think so please tell how it can be improved.

Regarding split2, which I would rather call cut or something like that: we can certainly add a lot of things (e.g. rsplit, which would match from the end). However, I'm just trying to make one single addition that I miss very much to the very lean and conservative standard library (qualities I actually see as virtues), without opening a Pandora's box. I'd rather have only that one in than propose much more and get everything rejected.


da6ff04 Accept [@ocaml.local] without -extension, and move autogenerated attrs to [@extension.local] (ocaml#9) 30ce67d Improve inclusion error messages for [@local_opt] (ocaml#10) f925a62 Remove some uneeded mode variables (ocaml#8) dec721c Local solver speedups (ocaml#7) e9afc49 Fix check_all_arches build 0b9b32a Fix i386 build a515093 Merge flambda-backend changes git-subtree-dir: ocaml git-subtree-split: da6ff04

Better alias errors

protz reviewed Feb 14, 2014

dra27 referenced this pull request in dra27/ocaml Mar 20, 2019

Merge pull request #10 from yallop/cstubs-dep

babc489

Switch from Cstubs.FOREIGN to Ctypes.FOREIGN to eliminate cstubs runtime dependency

stedolan pushed a commit to stedolan/ocaml that referenced this pull request Feb 20, 2020

Merge pull request ocaml#10 from ctk21/stw_minor_gc_non_overlap_stw_sections

1552c78

Make sure that stw_sections can never overlap

poechsel pushed a commit to poechsel/ocaml that referenced this pull request Jun 30, 2021

Replace tuple with record in Cextcall (ocaml#10)

83d7c46

poechsel pushed a commit to poechsel/ocaml that referenced this pull request Jul 2, 2021

Replace tuple with record in Cextcall (ocaml#10)

9305603

stedolan pushed a commit to stedolan/ocaml that referenced this pull request Dec 13, 2021

flambda-backend: Replace tuple with record in Cextcall (ocaml#10)

b670bcf

stedolan added a commit to stedolan/ocaml that referenced this pull request Mar 22, 2022

Improve inclusion error messages for [@local_opt] (ocaml#10)

30ce67d

andreas-schwab mentioned this pull request Nov 28, 2022

ocamlopt.opt crashes from stack overflow while building coccinelle on RISC-V #11765

Closed

turly221 pushed a commit to scantist-ossops-m2/ocaml that referenced this pull request Nov 30, 2024

Merge pull request ocaml#10 from janestreet/patch/better-alias-errors

1950d18

Better alias errors

Conversation

dbuenzli commented Feb 14, 2014

dbuenzli commented Feb 14, 2014

protz Feb 14, 2014

dbuenzli Feb 14, 2014

protz commented Feb 14, 2014

gasche commented Feb 14, 2014

tomjridge commented Feb 14, 2014

alainfrisch commented Feb 14, 2014

dbuenzli commented Feb 14, 2014

gasche commented Feb 14, 2014

dbuenzli commented Feb 14, 2014

gasche commented Feb 14, 2014

alainfrisch commented Feb 14, 2014

dbuenzli commented Feb 14, 2014

dbuenzli commented Feb 14, 2014

gasche commented Feb 14, 2014

alainfrisch commented Feb 14, 2014

dbuenzli commented Feb 14, 2014

dbuenzli commented Feb 14, 2014

dbuenzli commented Feb 15, 2014

ygrek commented Feb 15, 2014

dbuenzli commented Feb 15, 2014

Chris00 commented Feb 15, 2014

dbuenzli commented Feb 15, 2014

gasche commented Feb 15, 2014

ygrek commented Feb 15, 2014

gasche commented Feb 15, 2014

dbuenzli commented Feb 15, 2014

dbuenzli commented Feb 15, 2014

pw374 commented Feb 15, 2014

alainfrisch commented Feb 15, 2014

dbuenzli commented Feb 15, 2014

13 participants