Skip to content

MPR#7037: fix erroneous Location.input_name setting in Pparse.file#862

Merged
gasche merged 1 commit intoocaml:trunkfrom
gasche:PR7037-input_name-in-Pparse
Mar 8, 2017
Merged

MPR#7037: fix erroneous Location.input_name setting in Pparse.file#862
gasche merged 1 commit intoocaml:trunkfrom
gasche:PR7037-input_name-in-Pparse

Conversation

@gasche
Copy link
Member

@gasche gasche commented Oct 17, 2016

This patch was rotting in MPR#7037, so I decided to make a pull-request for it. I might have merged it directly, but it is relatively invasive: it affects Pparse so there is a risk of other users and tool being affected. Cc: @alainfrisch

When reading a serialized-ast file, Pparse.file sets
Location.input_name to the filename embedded in the AST, and this is
correct. But before this patch it would also set Location.input_name
to the filename if this is a regular file, and this is wrong: this
filename may be a temporary file used for preprocessing (-pp option),
with a randomly-generated name, while Location.input_name is in fact
already correctly set to the user-provided source path.

I needed to fix two lines in ocamldoc/odoc_analyze.ml that
used !Location.input_file but actually required access to the
post-processing file (ocamldoc re-opens source files and rereads them
to detect documentation comments). This is not an invasive change as
the path to the post-processing file is available at this point in the
code (as the input_file variable).

This ocamldoc issue was caught thanks to Debian downstream work in bug
triaging ( mlpost breaks if it is not fixed, see
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=802347 ) and package
maintenance (this bug would not have been found if Debian maintainers
had not kept the mlpost documentation generator working even after
non-trivial ocamldoc changes).

When reading a serialized-ast file, Pparse.file sets
Location.input_name to the filename embedded in the AST, and this is
correct. But before this patch it would also set Location.input_name
to the filename if this is a regular file, and this is wrong: this
filename may be a temporary file used for preprocessing (-pp option),
with a randomly-generated name, while Location.input_name is in fact
already correctly set to the user-provided source path.

I needed to fix two lines in ocamldoc/odoc_analyze.ml that
used !Location.input_file but actually required access to the
post-processing file (ocamldoc re-opens source files and rereads them
to detect documentation comments). This is not an invasive change as
the path to the post-processing file is available at this point in the
code (as the [input_file] variable).

This ocamldoc issue was caught thanks to Debian downstream work in bug
triaging ( mlpost breaks if it is not fixed, see
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=802347 ) and package
maintenance (this bug would not have been found if Debian maintainers
had not kept the mlpost documentation generator working even after
non-trivial ocamldoc changes).
@gasche gasche added this to the 4.05-or-later milestone Oct 17, 2016
@gasche gasche changed the title [PR#7037](http://caml.inria.fr/mantis/view.php?id=7037): fix erroneous Location.input_name setting in Pparse.file MPR#7037: fix erroneous Location.input_name setting in Pparse.file Oct 17, 2016
@mshinwell
Copy link
Contributor

@alainfrisch Please could you comment?

@alainfrisch
Copy link
Contributor

Some observations:

  • In the compiler, input_name is used only to report "unit-wide" error messages (cf !input_name in Location).

  • It could be used in other tools such as ocamldoc (which you treated) or ppx rewriters (e.g. to insert the name of the source file somewhere in the Parsetree).

  • The suggested change impacts Pparse.{parse_implementation,parse_interface} (leaving Location.input_name equal to the original file name rather than some intermediate post-processed source file), and also Pparse.file (used by ocamldoc and ocamldep, but also perhaps by external tools), which leaves input_name untouched (i.e. possibly undefined).

It's hard to evaluate the risk of introducing problems in external tools with the change. The risk might be reduced a bit if Location.input_name is set in Location.preprocess (which is usually called explicitly before Pparse.file) rather than in Location.parse_file. Typically, this might avoid possible problems with error reporting in ocamldep with the current proposal.

Now, is the change a good idea in itself? I'd say "yes", although I can also see the argument that more local errors would be reported on a filename which is the temporary file (if a preprocessor outputs a source file -- without line directives -- not a marshaled Parsetree), and it might not be very coherent to treat global errors differently. So I'm not sure.

@gasche
Copy link
Member Author

gasche commented Jan 3, 2017

In the case of both ocamldep and ocamlopt, Location.input_name is explicitly set before any Pparse function is called. If we set it in preprocess, that means that we could override this value in a way that is wrong (for example if the input was copied to a temporary _build directory first?).

I am not sure how other tools handle this. I couldn't find any code calling Pparse.file on github.

(I am skeptical about the "more local errors" argument: in all cases I have interacted with, talking about intermediate files they don't know about (especially when they have random names, but _build/foo was also an issue) was perceived as a usability bug by users. If the tool use Pparse is done well, it should report its errors on the post-processed file using error positions that refer to the source code, so mentioning the source path is the way to go.)

@alainfrisch
Copy link
Contributor

If the tool use Pparse is done well, it should report its errors on the post-processed file using error positions that refer to the source code, so mentioning the source path is the way to go.)

How could it do that? If a preprocessor outputs a source file without line directives, the "driver" tools has no way to map an error in this generated file back to a location in the original unprocessed file. At least if we show the temporary file and keep this file around, one can see what the error actually is. (It could also be useful to show the original filename as well.)

@Drup
Copy link
Contributor

Drup commented Jan 3, 2017

If you are worry about tools calling the wrong functions, maybe they should be documented ?

In particular, I had no idea that half of Pparse existed (since it's not in parsing/), the last time I needed to read an ml file, I did it like that (and it took me some time to realize I needed to set Location.input_name to have decent error messages).

@alainfrisch
Copy link
Contributor

Yeah, Pparse has been nicely improved through the use of GADTs, but the API remains a bit of of under-documented mess. It would make sense to move its "modern" functionalities to a new, well-documented module under parsing/ and deprecate Pparse itself. Contributions are welcome!

@damiendoligez damiendoligez modified the milestone: 4.05-or-later Feb 15, 2017
@mshinwell
Copy link
Contributor

@gasche Can you try to bring this pull request to a conclusion? I've asked @diml to look at this too since he has a better idea of what might break. (Not that compiler-libs has any stability guarantees, IIUC.)

@gasche
Copy link
Member Author

gasche commented Mar 8, 2017

My gut feeling is that "making it easier to get reproducible builds of OCaml project" is useful and offsets the pain of asking unknown Pparse-using OCaml tools to set their input locations explicitly.

@gasche gasche merged commit 6e94a85 into ocaml:trunk Mar 8, 2017
stedolan pushed a commit to stedolan/ocaml that referenced this pull request Oct 25, 2022
sadiqj pushed a commit to sadiqj/ocaml that referenced this pull request Feb 21, 2023
stedolan pushed a commit to stedolan/ocaml that referenced this pull request Mar 21, 2023
25188da flambda-backend: Missed comment from PR802 (ocaml#887)
9469765 flambda-backend: Improve the semantics of asynchronous exceptions (new simpler version) (ocaml#802)
d9e4dd0 flambda-backend: Fix `make runtest` on NixOS (ocaml#874)
4bbde7a flambda-backend: Simpler symbols (ocaml#753)
ef37262 flambda-backend: Add opaqueness to Obj.magic under Flambda 2 (ocaml#862)
a9616e9 flambda-backend: Add build system hooks for ocaml-jst (ocaml#869)
045ef67 flambda-backend: Allow the compiler to build with stock Dune (ocaml#868)
3cac5be flambda-backend: Simplify Makefile logic for natdynlinkops (ocaml#866)
c5b12bf flambda-backend: Remove unnecessary install lines (ocaml#860)
ff12bbe flambda-backend: Fix unused variable warning in st_stubs.c (ocaml#861)
c84976c flambda-backend: Static check for noalloc: attributes (ocaml#825)
ca56052 flambda-backend: Build system refactoring for ocaml-jst (ocaml#857)
39eb7f9 flambda-backend: Remove integer comparison involving nonconstant polymorphic variants (ocaml#854)
c102688 flambda-backend: Fix soundness bug by using currying information from typing (ocaml#850)
6a96b61 flambda-backend: Add a primitive to enable/disable the tick thread (ocaml#852)
f64370b flambda-backend: Make Obj.dup use a new primitive, %obj_dup (ocaml#843)
9b78eb2 flambda-backend: Add test for ocaml#820 (include functor soundness bug) (ocaml#841)
8f24346 flambda-backend: Add `-dtimings-precision` flag (ocaml#833)
65c2f22 flambda-backend: Add test for ocaml#829 (ocaml#831)
7b27a49 flambda-backend: Follow-up PR#829 (comballoc fixes for locals) (ocaml#830)
ad7ec10 flambda-backend: Use a custom condition variable implementation (ocaml#787)
3ee650c flambda-backend: Fix soundness bug in include functor (ocaml#820)
2f57378 flambda-backend: Static check noalloc (ocaml#778)
aaad625 flambda-backend: Emit begin/end region only when stack allocation is enabled (ocaml#812)
17c7173 flambda-backend: Fix .cmt for included signatures (ocaml#803)
e119669 flambda-backend: Increase delays in tests/lib-threads/beat.ml (ocaml#800)
ccc356d flambda-backend: Prevent dynamic loading of the same .cmxs twice in private mode, etc. (ocaml#784)
14eb572 flambda-backend: Make local extension point equivalent to local_ expression (ocaml#790)
487d11b flambda-backend: Fix tast_iterator and tast_mapper for include functor. (ocaml#795)
a50a818 flambda-backend: Reduce closure allocation in List (ocaml#792)
96c9c60 flambda-backend: Merge ocaml-jst
a775b88 flambda-backend: Fix ocaml/otherlibs/unix 32-bit build (ocaml#767)
f7c2679 flambda-backend: Create object files internally to avoid invoking GAS (ocaml#757)
c7a46bb flambda-backend: Bugfix for Cmmgen.expr_size with locals (ocaml#756)
b337cb6 flambda-backend: Fix build_upstream for PR749 (ocaml#750)
8e7e81c flambda-backend: Differentiate is_int primitive between generic and variant-only versions (ocaml#749)

git-subtree-dir: ocaml
git-subtree-split: 25188da
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants