Option-returning variants of stdlib functions#885
Conversation
The only exception is the rarely used Buffer.add_substitute, where the [Not_found] can really be interpreted as an error condition. Most new functions are implemented directly (instead of wrapping the raising version). This is for performance reasons and also to avoid destroying the stacktrace (if the function is used in an exception handler). One could instead implement the raising versions on top of the new functions, but there might be a small penalty. Since code duplication remains reasonable, I think the current state is ok. Sys.getenv is still implemented on top of the raising version (itself a runtime primitive).
…ctions. This adds option-returning variants for some stdlib functions currently raising Failure or Invalid_arg. This comments does not try to be exhaustive in providing non-raising versions. For instance, invalid ranges in arrays/strings are considered as programming errors (they can be checked explicitly by the caller in constant time). And so are errors related to applying binary iterators on containers with different sizes. (I'm not sure about List.hd/List.tl: one could argue that matching on the list is as simple as matching the result of hd_opt/tl_opt.) otherlibs/ is not covered by this commit.
|
nativeint.ml, int32.ml and int64.ml should all have the TODO comment for different primitives (as in sys.mlp and pervasives.ml) |
stdlib/list.mli
Outdated
| [Failure "tl"] if the list is empty. *) | ||
| [Failure "tl"] if the list is empty. *) | ||
|
|
||
| val tl_opt: 'a list -> 'a list option |
There was a problem hiding this comment.
I'm more strongly convinced that hd_opt and tl_opt are a bad idea than you were! hd and tl should only be used in the "get" scenario (i.e. where you know that the list is non-empty), otherwise you should be pattern matching the list (always), surely? Having hd_opt and tl_opt would seem to encourage poor constructs like:
match List.hd_opt l with
Some hd -> ...
| None -> ...or even worse:
List.hd_opt l |> Option.get |> ...rather than the obvious:
match l with
hd::_ -> ...
| [] -> ...Perhaps a change to the docs for hd and tl explaining that philosophy, rather than these two functions? Or is there a really good scenario for wanting to use the _opt versions which I'm not seeing?
There was a problem hiding this comment.
Agreed. I removed these functions.
Thanks, done. |
|
Some timings for comparing performances between options and exceptions: #869 (comment) |
|
I guess Num/Big_int isn't strictly part of stdlib, but here's a request to also do |
|
I know it's still work-in-progress, so just to get it on a list: the changes to pervasives.ml need to go to otherlibs/threads/pervasives.ml (blocking Travis) and the relevant testsuite tests for these functions should be duplicated to test the _opt variants too. |
I've added _opt variants for conversions function in the nums library. That said, the doc did not say explicitly which functions could raise (see final section in num.mli), and I did a very quick pass to figure that out. (I've added a TODO note to add some documentation to this section; it's not just error condition that are not specified.) |
I don't see any test for int_of_string (in particular its error condition) or for List.find. But point taken: all new functions will have to be tested. |
|
I was thinking things like |
|
I think it's worth pointing out that the average-case performance of allocation versus an exception trap installation can be misleading. The tails of that distribution are bad; you might trigger a GC and trash your cache. As such, for some applications, the exception-raising versions may be more satisfactory. I agree that the option-returning versions should be added in any case. The right "performance" thing to do, for functions that return options that cannot be inlined for whatever reason, is probably not to use the exception-returning versions but to allow non-inlined functions to return values of variant type without boxing (and then use the option-returning versions). We are working on that (we had a prototype a while back, but we're waiting until we've finished some structural changes to Flambda before polishing it up). |
|
I've added a test for most new functions (not covered: Ephemerons, *Labels modules, Bytes (but String is covered), and the nums library). |
|
Let's wait for a few days to give time to comment on that, but this has received support from two other core developers, so this is on good track for inclusion. If someone objects to the change, please speak up! |
|
It seems like a very good change. I have one minor issue though: in |
|
I would be very mildly in favor of having |
I'm not a big fan of the |
Point taken. In order to simplify the discussion, I suggest to leave that out for now, if you are ok? These functions can always be added later. |
|
Since there is rather strong support for this and no opposition, I think this can be merged. I've heard that the best practice is that the contributor does not click on "Merge" himself, so I'll let the honor to another core developer ( @dra27 ? ). |
|
Sure - just for paranoia's sake, I shall wait for Travis to be green (though I think it's only checking the Changes file!) |
|
It's green now, the Changes file was ok 😄 |
|
One further thought - do you want to fix up the commits or just squash the whole thing into one? |
|
I don't think there is any value in keeping separate commits. I'd suggest to click on "Squash and merge". |
|
Be very afraid 😉 |
Finds the first/last binding where the key satisfies a monotonic predicate, returning an option. Followup to ocaml#885.
Finds the first/last binding where the key satisfies a monotonic predicate, returning an option. Followup to ocaml#885.
Finds the first/last element that satisfies a monotonic predicate, returning an option. Followup to ocaml#885.
Finds the first/last binding where the key satisfies a monotonic predicate, returning an option. Followup to ocaml#885.
Finds the first/last element that satisfies a monotonic predicate, returning an option. Followup to ocaml#885.
Finds the first/last binding where the key satisfies a monotonic predicate, returning an option. Followup to ocaml#885.
Finds the first/last element that satisfies a monotonic predicate, returning an option. Followup to ocaml#885.
Finds the first/last binding where the key satisfies a monotonic predicate, returning an option. Followup to ocaml#885.
Finds the first/last element that satisfies a monotonic predicate, returning an option. Followup to ocaml#885.
Finds the first/last binding where the key satisfies a monotonic predicate, returning an option. Followup to ocaml#885.
Finds the first/last element that satisfies a monotonic predicate, returning an option. Followup to ocaml#885.
Provide an xxx_opt alternative for functions raising Not_found and many instances of Failure/Invalid_arg. The only exception is the rarely used Buffer.add_substitute, where the [Not_found] can really be interpreted as an error condition. Most new functions are implemented directly (instead of wrapping the raising version). This is for performance reasons and also to avoid destroying the stacktrace (if the function is used in an exception handler). One could instead implement the raising versions on top of the new functions, but there might be a small penalty.
Stdlib uses exceptions ([Not_found], [Failure], [Invalid_argument]) to report non-exceptional conditions, such failed conversions from strings to numbers, or failing lookups in datastructures. This is commonly thought as a design mistake (esp. because exceptions are not captured in types), and also has potentially bad performance consequences (allocating a [Some] block is usually faster than setting up an exception handler; and raising exception can be really slow when stacktraces are enabled). Also, using exceptions can destroy previous stack traces if those functions are used within exception handlers. Another arguments is that exceptions don't play very nicely with monadic abstraction libraries.
Considering how "brittle" exceptions are, it is best to have users explicitly use them if they want rather than have basic stdlib function raise them.
This PR adds option-returning variants of existing functions. It does not attempt to cover all functions, since in particular [Failure]/[Invalid_argument] have arguably more legitimate uses (see commit messages for some more details on that).