Skip to content

Rigorously handle -o and -c options in presence of multiple arguments#464

Closed
whitequark wants to merge 3 commits intoocaml:trunkfrom
whitequark:output-name
Closed

Rigorously handle -o and -c options in presence of multiple arguments#464
whitequark wants to merge 3 commits intoocaml:trunkfrom
whitequark:output-name

Conversation

@whitequark
Copy link
Member

Please see the individual (very elaborate) commit messages for motivation and justifications for this PR.

The behavior of ocamlc and ocamlopt drivers before this commit is
that the command-line options and arguments are processed exactly
sequentially; encountered options (e.g. "-o") modify the state of
the driver, and encountered arguments (e.g. "t.ml") compile
the corresponding file with whatever state the driver had at the time.

This can be quite confusing, because compiler drivers (e.g. gcc/g++,
clang/clang++, rustc, javac, go, ...) either parse the entire command
line before going on to compile files or reject options after
the first argument (only in the case of go). Thus the behavior
of ocamlc and ocamlopt is unexpected.

The following commit provides another reason for this change.
This addresses PR#6475.

In 4.02 the behavior of ocamlc/ocamlopt with regards to these options
was as follows:
  * options and arguments are parsed left-to-right in the exact order
    in which they are passed, with compilation taking into account
    only the options leftwards from it;
  * "foo.c" is compiled to "foo.o" in current directory;
  * when "-c" is not specified:
    * "foo.ml" is compiled to "foo.cmo"/"foo.cmxo"
      in current directory;
    * after all files have been compiled, if any .ml files are passed,
      all provided files are linked as:
      * when "-o" is not specified: "a.out" in current directory;
      * when "-o out" is specified: "out".
  * when "-c" is specified:
    * "foo.ml" is compiled to:
      * when "-o" is not specified: "foo.cmo"/"foo.cmxo"
        in current directory;
      * when "-o out" is specified: "out.cmo"/"out.cmxo";
        and then compilation proceeds as if the last "-o" option
        has disappeared.
    * no final link is performed.

The behavior where the build product of the C sources always ended up
in the current directory was problematic: it required buildsystem
hacks to move the file in its proper place and ultimately was racy,
as multiple files with the same basename in different directories
may well end up overwriting each other with e.g. ocamlbuild.

On top of that, the behavior was quite confusing, since it is not
only stateful and dependent on argument order, but also the mere act
of compilation changed state.

The commit 1d8e590 has attempted to rectify that by looking at
the "-o" option when compiling C files, too. After that commit,
the behavior of ocamlc/ocamlopt was as follows (only the handling
of C files was changed, but the entire chart is provided for
posterity):
  * options and arguments are parsed left-to-right in the exact order
    in which they are passed, with compilation taking into account
    only the options leftwards from it;
  * "foo.c" is compiled to:
    * when "-o" is not specified: "foo.o" in current directory;
    * when "-o out" is specified: "out".
  * when "-c" is not specified:
    * "foo.ml" is compiled to "foo.cmo"/"foo.cmxo"
      in current directory;
    * after all files have been compiled, if any .ml files are passed,
      all provided files are linked as:
      * when "-o" is not specified: "a.out" in current directory;
      * when "-o out" is specified: "out".
  * when "-c" is specified:
    * "foo.ml" is compiled to:
      * when "-o" is not specified: "foo.cmo"/"foo.cmxo"
        in current directory;
      * when "-o out" is specified: "out.cmo"/"out.cmxo";
        and then compilation proceeds as if the last "-o" option
        has disappeared.
    * no final link is performed.

There is a non-obvious bug here. Specifically, what happens if more
than one C source file is passed together with a "-o" option? Also,
what happens if a C source file is passed together with a "-o" option
and then a final link is performed? The answer is that
the intermediate build product gets silently overwritten, with quite
opaque errors as a result.

There is some code (and even buildsystems) in the wild that is relying
on the fact that the -o option does not affect compilation of C source
files, e.g. by running commands such as (from ocamlnet):
  ocamlc -custom -o t tend.c t.ml

It might seem that the solution would be to make the behavior of
the compiler drivers for C files match that for the OCaml files;
specifically, pretend that the "-o" option has disappeared once
the C compiler has written a build product to the specified place.

However, this would still break the existing code, and moreover
does not address the underlying problem: that the option parsing
of the OCaml compiler driver is confusing and prone to creating
latent bugs.

Instead, this commit finishes (after 1d8e590 and 55d2d42) overhauls
the way option parsing in ocamlc/ocamlopt works to behave as follows:
  * options are parsed left-to-right in the order they are specified;
  * after all options are parsed, arguments are parsed left-to-right
    in the order they were specified;
  * when "-o out" and "-c" are specified:
    * when more than one file is passed, an error message
      is displayed.
    * when one file is passed:
      * "foo.c" is compiled to "out";
      * "foo.ml" is compiled to "out.cmo"/"out.cmxo".
  * when "-o out" is not specified or "-c" is not specified:
    * "foo.c" is compiled to "foo.o" in current directory;
    * "foo.ml" is compiled to "foo.cmo"/"foo.cmxo"
      in current directory;
  * when "-c" is not specified:
    * after all files have been compiled, if any .ml files are passed,
      all provided files are linked as:
      * when "-o" is not specified: "a.out" in current directory;
      * when "-o out" is specified: "out".

In short, the combination of "-o", "-c" and a single source file
compiles that one file to the corresponding intermediate
build product. Otherwise, passing "-o" will either set the name of
the final build product only or error out.

This preserves compatibility with old code, makes the handling of
C and OCaml sources consistent, and overall makes the behavior
of the option parser more straightforward. However, this may break
code that relies on the fact that options are parsed in-order, e.g.
  ocamlc -o t a.ml -g b.ml
where debug info would be built only for "b.ml".

Some alternative implementation paths I have considered:
  * Collect the C sources and process them after OCaml sources,
    while paying attention to any "-o" or "-c" that may have
    been set. This doesn't work because compilation of C sources
    is also affected by many flags, e.g. "-I", and so this would
    have the same drawbacks but none of the benefits;
  * Compile C and OCaml sources in-order as usual, but error out
    when an improper combination of flags is encountered in
    the middle of a compilation. This is technically feasible,
    and is the option that maximally preserves compatibility, but
    it is very complex: it doubles the amount of implicitly mutated
    global state, and there's no guarantee I will get all edge
    cases right. Moreover, the option parsing remains confusing,
    and I strongly believe that the current behavior should not
    remain in place.

On top of that it is hard to imagine cases where setting new options
in the middle of compilation would actually be desirable, because
this mechanism is very inexpressive: it can only add new options and
option values, since there is no way to negate or clear most of
the driver's state. Most likely is that any code that does so,
does it in error and remains operational by pure chance.
@damiendoligez
Copy link
Member

Interpret all command-line options before compiling any files.

I agree this is the right way to do it, but we can't merge such an incompatible change this late into the release cycle.

@whitequark
Copy link
Member Author

Sigh. Well, what can I do? Revert it.

@damiendoligez damiendoligez added this to the 4.04-or-later milestone Feb 10, 2016
@damiendoligez
Copy link
Member

We'll merge into trunk it after branching 4.03. Please put a * in the Changes entry.

@damiendoligez
Copy link
Member

Reverted (commit 1c28b23).

@gasche there are thunks of 1d8e590 that were in ocamlbuild, I'll do a pull request against ocamlbuild.

(edit -- done: ocaml/ocamlbuild#51)

@lpw25
Copy link
Contributor

lpw25 commented Feb 10, 2016

The revert missed the ~output_name parameter of compile_file, so now trunk does not compile.

@whitequark
Copy link
Member Author

I'll add the * after branching, since it should probably go into a different changelog section and I'll rather rebase once (hashes will need updating).

@gasche
Copy link
Member

gasche commented Jun 11, 2016

Can we merge this now for 4.04? @damiendoligez ?

@damiendoligez
Copy link
Member

Can we have the * in Changes? Other than that, good to merge.

@gasche
Copy link
Member

gasche commented Jun 14, 2016

I'll do the rebase and merge, thanks.

@gasche gasche closed this Jun 14, 2016
@whitequark whitequark deleted the output-name branch June 15, 2016 02:54
@lefessan
Copy link
Contributor

cf #758

@gasche
Copy link
Member

gasche commented Aug 12, 2016

opam-builder reports a weird failure on bitstring.2.0.4 that seems related to this change:

http://opam.ocamlpro.com/builder/html/bitstring/bitstring.2.0.4/3f885c72f293b8a5250a43370efd568b

# ocamlfind ocamlc bitstring.cma -I +camlp4 dynlink.cma camlp4lib.cma \
#   -pp camlp4of -c pa_bitstring.ml -o pa_bitstring.cmo
# Options -c -o are incompatible with compiling multiple files
# Makefile:113: recipe for target 'pa_bitstring.cmo' failed

I have a very poor network connection and am thus unable to download the sources of bitstring 2.0.4 to investigate: https://code.google.com/archive/p/bitstring/downloads.

@gasche
Copy link
Member

gasche commented Aug 12, 2016

A related failure on dht, that is not a bug or not suspicious as the bitstring failure reported above. I report it here as it is representative of build failures caused by this change:

http://opam.ocamlpro.com/builder/html/dht/dht.0.2.0/94d71b07c2221b31dfcef74d0aeeab09

# ocamlfind ocamlc -safe-string -g -bin-annot -I lib/ -o lib/dht.cmo -c lib/dht.mli lib/dht.ml
# Options -c -o are incompatible with compiling multiple files
# Makefile:20: recipe for target 'lib/dht.cmo' failed

cc @nojb

@nojb
Copy link
Contributor

nojb commented Aug 12, 2016

Thanks for the ping @gasche, I can't take a look right now but will do it a little later.

readenv ppf Before_args;
Arg.parse Options.list anonymous usage;
if !output_name <> None && !compile_only &&
List.length !process_thunks > 1 then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@damiendoligez
Oups, this check will fail with the fix that we did yesterday, since -cclib will increase the length of process_thunks. But there was already a bug before, since ocamlc -c toto.ml -o toto.cmo toto.a will also be rejected since toto.a is also in process_thunks.

I think we should remove the functions in process_thunks, and use a type with multiple constructors, so that we can be more precise here.

@lefessan
Copy link
Contributor

Indeed, both errors are triggered by the new error message introduced by this PR, because the criteria List.length !process_thunks > 1 is not strict enough (it will fail when bitstring.cma and foo.ml are both provided, even if only foo.ml has to be compiled).

@lefessan
Copy link
Contributor

cf #761

@damiendoligez
Copy link
Member

Indeed, both errors are triggered by the new error message introduced by this PR, because the criteria List.length !process_thunks > 1 is not strict enough (it will fail when bitstring.cma and foo.ml are both provided, even if only foo.ml has to be compiled).

I don't think it's a bug per se : it doesn't make sense to give both -c and a library on the command line. OCaml 4.03 just ignores the library, while 4.04 with this patch signals an error. I think the new behavior makes more sense but of course it's not backward-compatible.

EduardoRFS pushed a commit to esy-ocaml/ocaml that referenced this pull request May 17, 2021
Gbury pushed a commit to Gbury/ocaml that referenced this pull request Jun 10, 2021
* Add a regression test

* Record correct use kind for switch actions

* Clear demoted trap actions in switch

* Update .depend

* Add new test dirs to CI tests
Gbury pushed a commit to Gbury/ocaml that referenced this pull request Jun 22, 2021
* Add a regression test

* Record correct use kind for switch actions

* Clear demoted trap actions in switch

* Update .depend

* Add new test dirs to CI tests
chambart pushed a commit to chambart/ocaml-1 that referenced this pull request Feb 1, 2022
Replaces some annoying faff with some less-annoying faff. Previously
you had to edit `middle_end/flambda2/parser/dune` if you wanted to
change the parser. Now you can just make changes and say
`make regen-flambda2-parser`. The Dune file has to go through the
diff/promotion cycle to accomplish this without making
`flambda_parser.ml` and friends targets (which forces them to be
built either always or never, depending on the mode).
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
* generalize hero component
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants