Expand option by bschommer · Pull Request #778 · ocaml/ocaml

bschommer · 2016-08-25T07:37:40Z

This PR adds a new spec for the Arg parser: Expand of (string -> string array) which takes a function as arguments and parses the returned string array recursively with the same set of specs.

This implements the approach mentioned by @xavierleroy in #748.

Expand expecets a function that takes a string and returns an array of string which then are parsed inplace of the string argument.

Comment string for the argumnet is added as well as a test for the new spec.

alainfrisch · 2016-08-25T07:44:58Z

It's actually a bit more general than @xavierleroy suggestion in #748, since with this PR the user is responsible for expanding the argument (which could amount to reading a response file, or something else e.g. some environment variable).

bschommer · 2016-08-25T07:51:45Z

However it does not enforce the user to choose some predefined response file style.

gasche · 2016-09-02T13:46:26Z

I personally like this way to separate the two aspects of the discussed feature and thing this one could be merged as soon as we want to -- but of course I won't consider MPR#7050 solved before we have at least one easy-to-use parsing function.

gasche · 2016-09-02T13:57:11Z

I think Expand should be a bit more carefully specified, as there are some slightly non-trivial semantics coming from the bracketing of buildup/teardown parts of the parse_argv_dynamic recursive call. For example, if -count <int> is an integer-taking action, it is not possible for the expanded arguments to end on -count with the expected count coming after that in the non-expanded argument list -- I'm not saying we have to support that. The implemented behaviour could arguably be understood from the documentation "[...] and recursively parse the returned array" (which is more a description of the implementation than a behavioral specification), but it's not clear from the name "Expand" (that to me suggests an expand-first-and-interpret after, as opposed to interpret-subarray-and-then-continue).

bschommer · 2016-09-03T15:49:37Z

You are right, there are some complications concerning the documentation and also the implementation.
Another problem is what happens to the count? The current implementation uses a new count for the recursive call since when the argument is parsed we don't parse arguments of the original array. What one could additionally implement is that either the function also returns an optional counter or add a an additional Expand_with_counter that contains this as argument.
Also is it allowed to modify the spec list in the recursively call and if further recursion is allowed/wanted.

gasche · 2016-09-03T16:30:47Z

Given the heavy skew of the current API towards heavily imperative left-to-right processing, I think that the following specification could be the simplest one:

if the remaining arguments to process are of the form ["-foo"; "arg"] @ rest where -foo is registered as Expand f, then the arguments f "arg" @ rest are processed.

This immediately suggests an implementation that actually concatenates the argument arrays, and makes no validity check on f "arg" as an argument subsequence. This is a very weak specification, but it feels "canonical" in a sense -- its a stable point in the design space. This dictates the way the current counter should evolve (namely, the same counter should be used after concatenation is performed).

Regarding recursion, it seems natural to me that nested expansions should be permitted and I think that it matters in practice: the idea should be that if a given command-line is too long, it is always possible to stuff it in a file and use -args <file> instead, and this guarantee would be broken by forbidding expansions inside expansions. Of course, that raises an issue of non-termination; given that expansion functions may have side-effects a simple cycle-detection strategy would not work, so the best option I think is to set a hard limit on the limit of successive expansions.

(I'll temporarily remove the "approved tag" during the design discussion to avoid confusion.)

bschommer · 2016-09-05T08:30:46Z

This immediately suggests an implementation that actually concatenates the argument arrays, and makes no validity check on f "arg" as an argument subsequence. This is a very weak specification, but it feels "canonical" in a sense -- its a stable point in the design space. This dictates the way the current counter should evolve (namely, the same counter should be used after concatenation is performed).

Makes sense. I will update the PR accordingly.

Regarding recursion, it seems natural to me that nested expansions should be permitted and I think that it matters in practice: the idea should be that if a given command-line is too long, it is always possible to stuff it in a file and use -args instead, and this guarantee would be broken by forbidding expansions inside expansions. Of course, that raises an issue of non-termination; given that expansion functions may have side-effects a simple cycle-detection strategy would not work, so the best option I think is to set a hard limit on the limit of successive expansions.

I would let the user decide on the recursion for himself.

Instead of introducing a new counter the recursive call takes the original counter. Furthermore the documentation is updated to reflect the actually semantics of Expand.

gasche · 2016-09-05T11:19:03Z

I would let the user decide on the recursion for himself.

Well there is a realistic risk that someone would put --args foo whithin foo, and just looping silently is not an excellent user interface I think. But in the interest of having this feature progress, we might defer that discussion until someone actually hits the issue and is brave enough to report it?

bschommer · 2016-09-05T11:43:33Z

How about an adding an additional bool to expand that states if recursion is allowed?

bschommer · 2016-09-05T12:03:32Z

I just recognized that when the current variable is used for the recursive call the whole function needs to be changed, since the current variable is used to get the arguments.

Reusing the current pointer only works if the arrays are concatenated.

bschommer · 2016-09-05T12:34:19Z

The arrays now are concatenated and the recursion is removed.

gasche · 2016-09-05T14:12:01Z

How about an adding an additional bool to expand that states if recursion is allowed?

No, I feel that this is not a meaningful choice to give to the users (because why would they not allow it?). But not having any termination-checking behavior is probably fine for a first PR.

Thanks for the new implementation that is about what I would expect. Note that it should be possible to implement it without actual array concatenation (which has a quadratic behavior) by keeping a stack (a list) of (currently processed array, offset of the first element to substract when indexing in it) pairs, but I think implementing that is probably premature optimization and a simpler, easier-to-understand concatenating implementation is fine.

I would replace the !l accesses with just Array.length !argv, which I suspect would make the code clearer and remove one way to get it wrong.

bschommer · 2016-09-05T14:18:45Z

My first implementation try did use a list, however the possibility to have arguments of the form -foo=bar made this far more complicated. I will replace l with the length.

This avoids forgetting to update the lenght of after the Array is updated.

gasche · 2016-09-05T14:53:10Z

No, I was thinking of a list of (array, offset) pairs, where offset is the value of !counter at the time the array was pushed in by an Expand node. The code to get the next array element would not be !argv.(!current) anymore, but something like (untested)

let rec elem idx = function
| [] -> assert false
| (argv, offset) :: rest ->
  if idx < offset then elem idx rest
  else argv.(idx - offset)
in
elem !current !argv_stack

but, again, I think this is premature optimization (also I'm not completely sure how reliably that works if/when user code modifies the current reference).

bschommer · 2016-09-06T13:36:48Z

Okay. However I think the concatenation is not that bad since ideally it should not happen to often.

ghost · 2016-09-09T12:33:06Z

BTW, at some point we might want to allow the user to get access to the expanded array. There are programs such as the toplevel or ocamldebug that are using the current reference as an index in Sys.argv and for this reason they can't use Expand arguments. To use Expand arguments in such programs we would need to pass argv as a reference as well

bschommer · 2016-09-12T06:57:41Z

Adding such a function should not be that complicated and makes sense. Any suggestions for a name?

The new function takes a reference to the argv array and modifies it if an Expand spec is encountered. Also added tests for this.

bschommer · 2016-09-13T12:33:00Z

I added a function parse_and_expand_argv_dynamic which takes the argv as reference arguments and modifies it.

ghost · 2016-09-15T07:26:08Z

Thanks, I reviewed the changes and they look good to me. @gasche are you happy as well with the current patch?

One thing I was wondering is whether parse_and_exnapdn_argv_dynamic should be the only one allowed to treat expand arguments. With the other ones the user might end up with a current reference pointing to an array they don't have access to which can be a source of errors. But maybe that's too inconvenient.

gasche · 2016-09-15T07:59:03Z

I would like most users to be able to take advantage of Expand easily, by just adding one such option in their specification list. This use-case suggests that while forbidding expand nodes with parse_argv{,_dynamic} functions is a possible choice, it should definitely be allowed with the more common parse{,_dynamic} functions.

One suggestion I considered is to have a global Arg.argv : string array ref reference that would be modified in line with Arg.current : int ref, such that upon return of one of the current-modifying function we have the invariant that !Arg.current is a valid index into !Arg.argv. However that is very ugly because to get the useful behavior with parse_argv one would need to either set !Arg.argv even on user-passed local current references (ugly) or do a physical equality test to set it only when current == Arg.current (also ugly).

I think I dislike the current, very stateful API of current : int ref parameters and global value in this module, and I'm rather unwilling to add more out-variables and global mutable states. Is there not a better API for the need solved by ?(current : int ref) today that we could use for both "current" and "argv" in the new expand-specific functions?

bschommer · 2016-09-15T08:06:20Z

One could use a slightly different approach to handle Expand options by adding a function which takes only care of expanding these options and nothing else. That way one could expand all options before an actual call to any of the parse_argv functions and we would not need worry about the problems from usages of current etc.

gasche · 2016-09-15T08:12:46Z

That doesn't really work if the expansion depends on state modified by the processing of previous options (for example one could think of -I dir/ options to add a directory to the set of included directories, and -args foo option that would look for a file named foo in the current directory or any included directory). Having this seems natural given the general semantics of Args processing (in particular with the _dynamic variants where the specification themselves can be modified by argument processing). That said, it does simplify the API quite a bit, as you suggest.

bschommer · 2016-09-15T08:40:15Z

Then one probably wants to forbid the usage of Expand in every function except for parse_and_expand_argv_dynamic. Otherwise someone the whole usage of current breaks if Expand is used.

ghost · 2016-09-15T10:21:42Z

About a better API for ?(current : int ref), I can think of two different usages of current:

to skip some arguments while parsing. For instance using this one could implement an option that takes two parameters (I think, I didn't check that it works)
to parse the remaining of the command line in a different way once a certain argument is reached

(1) doesn't seem worth supporting. (2) is what is more commonly used, by the toplevel or ocamldebug for instance.

For (2), we can have an API without out-variables:

val parse_blah
  :  int * string array
  -> (key * spec * doc) list ref
  -> anon_fun
  -> usage_msg
  -> int * string array

one issue is how to tell Arg to stop parsing. In the toplevel, the run_script functions is called directly inside the argument handler, so we never leave Arg.parse. ocamldebug raise a specific exception to stop the parsing. Both are inspecting Arg.current.

To use the toplevel method with an API without out-variables, we would need to pass current and the updated argv to every handler in Arg.spec. This is a big API change.

We could use the ocamldebug method and have a special exception to stop parsing. It's not super nice but we already have a few special exceptions (Arg.Help and Arg.Bad) and this doesn't require to change the types.

That said, the whole API of Arg rely on mutating global states. I'm wondering if it's worth trying to change this particular case. At least adding Arg.argv is in line with the rest of the current API

bschommer · 2016-09-16T07:57:02Z

My suggestions:

For the Expand option/this PR forbid the usage of Expand in all functions except for the parse_and_expand_argv_dynamic which also modifies the array to avoid problems if someone used current in combination with Expand.
For a new API without out-variables I would add a function:
```
val parse_next_arg: int * string array
-> (key * spec * doc) list ref
-> anon_fun 
-> usage_msg 
-> int * string array
```
which parses only one argument starting at the given index returns the (potentially) modified array and the index of the next argument. This would avoid the usage of current and any other global state. Also this would make it easier to abort at a certain index etc.

ghost · 2016-09-30T11:43:55Z

Sorry for the delay, yes these suggestions look right to me. For parse_next_arg I suppose that we don't even need to use a reference for the spec, given that we'll be calling at most one callback

Since Expand modifies the array and if the array is not passed as reference the usage of Expand would break lead to problems when current is used.

bschommer · 2016-10-05T08:03:55Z

I just noticed a bug in the current implementation of Arg. The following program:

open Arg

let _ =
   let test1 = ref 0
  and test2 = ref false in
  let test_int i = test1 := i
  and test_bool b = test2 := b
  and f_anon _ = () in
  let spec = ["-test", Tuple [Int test_int;Bool test_bool], "Blub"] in
  let args = [|"prog";"-test=10";"true"|] in
  parse_argv args spec f_anon "blub";
  assert (!test1 = 10);
  assert (!test2)

Fails with:

Fatal error: exception Arg.Bad("prog: wrong argument '10'; option '-test=10' expects a boolean.
blub
  -test Blub
  -help  Display this list of options
  --help  Display this list of options
")

which is due to the fact that follow ref is never unset.

ghost · 2016-10-07T10:04:30Z

stdlib/arg.ml


-let parse_argv_dynamic ?(current=current) argv speclist anonfun errmsg =
-  let l = Array.length argv in
+let parse_and_expand_argv_dynamic_aux expand current argv speclist anonfun errmsg =


I find expand not very descriptive, maybe expand_allowed?

It is now named expand_allowed.

ghost · 2016-10-07T10:10:17Z

stdlib/arg.ml

            done;
+        | Expand f ->
+            if not expand then
+              raise (Stop (Message ("Expand is not allowed")));


This is in fact a misuse of the API so it should raise Invalid_argument. I would put some informative message to help the programmer debug the problem: "Arg.Expand is only allowed with Arg.parse_and_expand_argv_dynamic"

Is replaced.

ghost · 2016-10-07T10:20:18Z

stdlib/arg.mli

    See {!Arg.parse_dynamic}.
 *)

+val parse_and_expand_argv_dynamic : ?current:int ref -> string array ref ->


Having ?current default to Arg.current doesn't seem right. We might as well make current a mandatory argument for this function

Current is now mandatory.

For the parse_and_expand_argv_dynamic function a current reference is now mandatory. As well as Invalid_arg is raised if Expand is used in any other function then parse_and_expand_argv_dynamic.

ghost · 2016-10-07T11:40:42Z

The current state looks good to me. I'm planning to merge this PR unless someone disagrees (/cc @gasche )

gasche · 2016-10-07T16:47:43Z

I haven't followed the last iterations but I trust @diml's opinion.

ghost · 2016-10-07T17:29:30Z

Alright, it's now merged. @bschommer thanks for your contribution

gasche · 2016-10-07T19:11:25Z

(Also, thanks everyone for the work!)

Add the `Arg.Expand` constructor to `Arg.spec`. This new specification allows the user to expand the `argv` array being parsed, for instance to implement responsefile support. A new function `Arg.parse_and_expand_argv_dynamic` is added. It takes both `current` and `argv` as references and as mandatory arguments. This function allows the user to access the `argv` array after expansion. To avoid confusion regarding the `?current` argument of the various parsing functions as well as with `Arg.current`, `Arg.Expand` is only allowed with the new function. Tests for this PR are added to the testsuite.

Comment on caml_domain_spawn also calling in install_backup_thread

25188da flambda-backend: Missed comment from PR802 (ocaml#887) 9469765 flambda-backend: Improve the semantics of asynchronous exceptions (new simpler version) (ocaml#802) d9e4dd0 flambda-backend: Fix `make runtest` on NixOS (ocaml#874) 4bbde7a flambda-backend: Simpler symbols (ocaml#753) ef37262 flambda-backend: Add opaqueness to Obj.magic under Flambda 2 (ocaml#862) a9616e9 flambda-backend: Add build system hooks for ocaml-jst (ocaml#869) 045ef67 flambda-backend: Allow the compiler to build with stock Dune (ocaml#868) 3cac5be flambda-backend: Simplify Makefile logic for natdynlinkops (ocaml#866) c5b12bf flambda-backend: Remove unnecessary install lines (ocaml#860) ff12bbe flambda-backend: Fix unused variable warning in st_stubs.c (ocaml#861) c84976c flambda-backend: Static check for noalloc: attributes (ocaml#825) ca56052 flambda-backend: Build system refactoring for ocaml-jst (ocaml#857) 39eb7f9 flambda-backend: Remove integer comparison involving nonconstant polymorphic variants (ocaml#854) c102688 flambda-backend: Fix soundness bug by using currying information from typing (ocaml#850) 6a96b61 flambda-backend: Add a primitive to enable/disable the tick thread (ocaml#852) f64370b flambda-backend: Make Obj.dup use a new primitive, %obj_dup (ocaml#843) 9b78eb2 flambda-backend: Add test for ocaml#820 (include functor soundness bug) (ocaml#841) 8f24346 flambda-backend: Add `-dtimings-precision` flag (ocaml#833) 65c2f22 flambda-backend: Add test for ocaml#829 (ocaml#831) 7b27a49 flambda-backend: Follow-up PR#829 (comballoc fixes for locals) (ocaml#830) ad7ec10 flambda-backend: Use a custom condition variable implementation (ocaml#787) 3ee650c flambda-backend: Fix soundness bug in include functor (ocaml#820) 2f57378 flambda-backend: Static check noalloc (ocaml#778) aaad625 flambda-backend: Emit begin/end region only when stack allocation is enabled (ocaml#812) 17c7173 flambda-backend: Fix .cmt for included signatures (ocaml#803) e119669 flambda-backend: Increase delays in tests/lib-threads/beat.ml (ocaml#800) ccc356d flambda-backend: Prevent dynamic loading of the same .cmxs twice in private mode, etc. (ocaml#784) 14eb572 flambda-backend: Make local extension point equivalent to local_ expression (ocaml#790) 487d11b flambda-backend: Fix tast_iterator and tast_mapper for include functor. (ocaml#795) a50a818 flambda-backend: Reduce closure allocation in List (ocaml#792) 96c9c60 flambda-backend: Merge ocaml-jst a775b88 flambda-backend: Fix ocaml/otherlibs/unix 32-bit build (ocaml#767) f7c2679 flambda-backend: Create object files internally to avoid invoking GAS (ocaml#757) c7a46bb flambda-backend: Bugfix for Cmmgen.expr_size with locals (ocaml#756) b337cb6 flambda-backend: Fix build_upstream for PR749 (ocaml#750) 8e7e81c flambda-backend: Differentiate is_int primitive between generic and variant-only versions (ocaml#749) git-subtree-dir: ocaml git-subtree-split: 25188da

bschommer added 2 commits August 12, 2016 13:32

Added expand spec type.

d41ebff

Expand expecets a function that takes a string and returns an array of string which then are parsed inplace of the string argument.

Added doc and tests.

9f81d3e

Comment string for the argumnet is added as well as a test for the new spec.

gasche added the approved label Sep 2, 2016

gasche removed the approved label Sep 3, 2016

bschommer added 2 commits September 5, 2016 10:55

Update recursive call to use original counter.

480b191

Instead of introducing a new counter the recursive call takes the original counter. Furthermore the documentation is updated to reflect the actually semantics of Expand.

Added missing " in comment.

c9b018e

Actually concatenate arrays.

6933f5e

Reusing the current pointer only works if the arrays are concatenated.

Replace !l with Array.length.

8c0b28c

This avoids forgetting to update the lenght of after the Array is updated.

Added missing Change entry.

2991fd6

Added parse_and_expand_argv_dynamic.

fee0a83

The new function takes a reference to the argv array and modifies it if an Expand spec is encountered. Also added tests for this.

bschommer mentioned this pull request Sep 29, 2016

[WIP] Implement responsefiles support PR7050 #748

Closed

Disallow Expand in several functions.

1f1e1cd

Since Expand modifies the array and if the array is not passed as reference the usage of Expand would break lead to problems when current is used.

ghost suggested changes Oct 7, 2016

View reviewed changes

ghost self-assigned this Oct 7, 2016

Make current mandatory and raise Invalid_arg.

edb3660

For the parse_and_expand_argv_dynamic function a current reference is now mandatory. As well as Invalid_arg is raised if Expand is used in any other function then parse_and_expand_argv_dynamic.

ghost approved these changes Oct 7, 2016

View reviewed changes

ghost merged commit 64ef11a into ocaml:trunk Oct 7, 2016

bschommer deleted the expand-option branch October 10, 2016 07:23

jberdine mentioned this pull request Aug 25, 2018

Avoid exhausting argument limit for windows cmd.exe ocaml/dune#1178

Closed

vicuna mentioned this pull request Mar 14, 2019

An "-args <path>" option to pass command-line flags in a file #7050

Closed

EduardoRFS pushed a commit to esy-ocaml/ocaml that referenced this pull request Dec 17, 2021

Merge pull request ocaml#778 from Engil/comment_on_bt_install

318b239

Comment on caml_domain_spawn also calling in install_backup_thread

stedolan pushed a commit to stedolan/ocaml that referenced this pull request Sep 21, 2022

Static check noalloc (ocaml#778)

c75a7f1

sadiqj pushed a commit to sadiqj/ocaml that referenced this pull request Feb 21, 2023

flambda-backend: Static check noalloc (ocaml#778)

2f57378

This pull request was closed.

Conversation

bschommer commented Aug 25, 2016

Uh oh!

alainfrisch commented Aug 25, 2016

Uh oh!

bschommer commented Aug 25, 2016

Uh oh!

gasche commented Sep 2, 2016

Uh oh!

gasche commented Sep 2, 2016

Uh oh!

bschommer commented Sep 3, 2016

Uh oh!

gasche commented Sep 3, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bschommer commented Sep 5, 2016

Uh oh!

gasche commented Sep 5, 2016

Uh oh!

bschommer commented Sep 5, 2016

Uh oh!

bschommer commented Sep 5, 2016

Uh oh!

bschommer commented Sep 5, 2016

Uh oh!

gasche commented Sep 5, 2016

Uh oh!

bschommer commented Sep 5, 2016

Uh oh!

gasche commented Sep 5, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

bschommer commented Sep 6, 2016

Uh oh!

ghost commented Sep 9, 2016

Uh oh!

bschommer commented Sep 12, 2016

Uh oh!

bschommer commented Sep 13, 2016

Uh oh!

ghost commented Sep 15, 2016

Uh oh!

gasche commented Sep 15, 2016

Uh oh!

bschommer commented Sep 15, 2016

Uh oh!

gasche commented Sep 15, 2016

Uh oh!

bschommer commented Sep 15, 2016

Uh oh!

ghost commented Sep 15, 2016

Uh oh!

bschommer commented Sep 16, 2016

Uh oh!

ghost commented Sep 30, 2016

Uh oh!

bschommer commented Oct 5, 2016 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ghost Oct 7, 2016 • edited by ghost Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bschommer Oct 7, 2016

Choose a reason for hiding this comment

Uh oh!

ghost Oct 7, 2016

Choose a reason for hiding this comment

Uh oh!

bschommer Oct 7, 2016

Choose a reason for hiding this comment

Uh oh!

ghost Oct 7, 2016

Choose a reason for hiding this comment

Uh oh!

bschommer Oct 7, 2016

Choose a reason for hiding this comment

Uh oh!

ghost commented Oct 7, 2016

gasche commented Sep 3, 2016 •

edited

Loading

gasche commented Sep 5, 2016 •

edited

Loading

bschommer commented Oct 5, 2016 •

edited

Loading

ghost Oct 7, 2016 •

edited by ghost

Loading