Skip to content

More robust determination of full path to current executable#795

Merged
gasche merged 3 commits intoocaml:trunkfrom
xavierleroy:fix-exe-name
Sep 15, 2016
Merged

More robust determination of full path to current executable#795
gasche merged 3 commits intoocaml:trunkfrom
xavierleroy:fix-exe-name

Conversation

@xavierleroy
Copy link
Contributor

The OCaml runtime system has some system-dependent code (Linux, Windows) to recover the full path to the current executable, in ways that are more reliable than searching argv[0] in $PATH.

Previously, this code worked only for paths of 256 characters or less, owing to the static allocation of a buffer.

Some OCaml users informed us that this limitation was cramping their style and that executables with very long names indeed are important to their business. "Surely you must be joking?" questions were answered with utter seriousness.

Always happy to help OCaml users do the weirdest things, I offer in this PR a reimplementation of these parts of the runtime where the buffer for the full path name is allocated dynamically and grows on demand.

As a bonus, special code to determine the path to the executable was added for Mac OS X as well, courtesy of informative StackOverflow posts.

The new code was tested under Linux, but not even compiled under Mac OS X nor under Windows. Caveat emptor.

…table

In several places, the old code was limiting this file name to 256 characters at most.  At least one user is bothered by this restriction.

- Change API of caml_executable_name() to return a string dynamically allocated with caml_stat_alloc.
- Update implementation of caml_executable_name() byterun/unix.c and byterun/win32.c to handle file names of (almost) arbitrary length.
- Add special code for MacOS X, not tested yet.

Fun fact of the day: under (some versions of) Linux, lstat() on "/proc/self/exe" returns 0 as length, so we can't use it to determine the size of the name in advance.
asmrun/startup.c Outdated
else
exe_name = caml_search_exe_in_path(exe_name);
caml_sys_init(exe_name, argv);
caml_stat_free(exe_name);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this (caml_stat_free) lead to a use-after-free error when exe_name is subsequently accessed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, caml_sys_init doesn't make a copy of exe_name, I missed that. Will fix later.

@gasche
Copy link
Member

gasche commented Sep 5, 2016

I used the CI machines to check that the compiler builds fine under MacOS, mingw and mscvc machines with the proposed changes. Could you maybe include a test that uses a super-duper-long executable name to actually test the new behavior?

There is a very suspicious CI failure on openbsd32, a segfault while running an innocuous typing test of the testsuite: see "Segmentation fault" in the logs https://ci.inria.fr/ocaml/view/precheck/job/precheck-openbsd-32/2/consoleText . I don't know where this comes from, and asked for a new build to see if it is reproducible but the results are not in yet.

Edit: the new openbsd32 build succeeded, so the observed segmentation fault (while testing basic-more/top_level_patterns.ml with ocamlopt) seems to be a transient error. Curiouser and Curiouser, but probably unrelated to the present PR.

@yminsky
Copy link

yminsky commented Sep 5, 2016

In case anyone is curious, the crazy organization that runs into this is Jane Street. The problem comes up when we take a large change and break it into a long sequence of small patches, each dependent on the last. This is represented in our code review system as nested directories, one per patch. We don't often cross the 256 character limit, but when someone does, it's annoying.

So, while we're not joking, it doesn't mean we don't appreciate that our request is a little absurd.

Anyway, thanks for the patch!

caml_sys_init does not copy its first argument and just keeps the pointer.

(Reported by Jeremy Yallop.)
@xavierleroy
Copy link
Contributor Author

Ping!

  • I fixed the use-after-free bug.
  • I really think the code is good to go.
  • I'd rather not merge my own pull requests.
  • I'd rather not maintain my pull request.

So: please merge in trunk. Thank you.

@gasche
Copy link
Member

gasche commented Sep 15, 2016

Thanks for the ping, merged.

@gasche gasche merged commit 70d880a into ocaml:trunk Sep 15, 2016
@alainfrisch
Copy link
Contributor

What about adding a non-regression test and a Changes entry?

@xavierleroy xavierleroy deleted the fix-exe-name branch September 15, 2016 12:31
@damiendoligez
Copy link
Member

I don't think a regression test is really necessary. I've added the Changes entry (commit 12bd764).

name = caml_stat_alloc(namelen + 1);
retcode = readlink("/proc/self/exe", name, namelen);
if (retcode == -1) { caml_stat_free(name); return NULL; }
if (retcode <= namelen) break;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this check right? If readlink has to truncate its response, won't retcode == namelen?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but it can very well happen that more bytes than necessary were allocated, so no truncation is necessary, in which case you can have retcode < namelen.

Copy link

@ianthehenry ianthehenry Aug 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but it seems like the only time that we want to break out of the loop is when retcode < namelen -- if retcode == namelen that's a sign that we want to try again with a longer name buffer.

Rephrase: if I'm understanding correctly, I think that this branch is always taken, and we never actually resize the buffer (readlink should never return retcode > namelen).

(The win32 implementation seems to have the right comparison)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sounds right. In fact, wouldn't it be even better to pass namelen + 1 as third parameter of readlink and leave the comparison as it is ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that is even better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed on trunk, commit e49e6ce, and cherry-picked to 4.05, commit 44c67eb.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But if we leave the comparison as it is and we break out of the loop it means that retcode <= namelen so that there is space for the terminating 0, right? Using retcode < namelen as exit test seems slightly less efficient because there could be some cases where retcode == namelen but not due to truncation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, you're both right. I think the clearer code is to have namelen as the size of the buffer and as the 3rd parameter to readlink, and to break when retcode < namelen, which indeed leaves room for the final zero. Pushing this improved fix in a minute...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commits 7a315bd on trunk and b5610e3.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah. That's a lot more clear.

camlspotter pushed a commit to camlspotter/ocaml that referenced this pull request Oct 17, 2017
More robust determination of full path to current executable
stedolan pushed a commit to stedolan/ocaml that referenced this pull request Sep 21, 2022
The initial implementation missed that we must iterate over the coercion inside
the new include_kinds
sadiqj pushed a commit to sadiqj/ocaml that referenced this pull request Feb 21, 2023
…r. (ocaml#795)

The initial implementation missed that we must iterate over the coercion inside
the new include_kinds
stedolan pushed a commit to stedolan/ocaml that referenced this pull request Mar 21, 2023
25188da flambda-backend: Missed comment from PR802 (ocaml#887)
9469765 flambda-backend: Improve the semantics of asynchronous exceptions (new simpler version) (ocaml#802)
d9e4dd0 flambda-backend: Fix `make runtest` on NixOS (ocaml#874)
4bbde7a flambda-backend: Simpler symbols (ocaml#753)
ef37262 flambda-backend: Add opaqueness to Obj.magic under Flambda 2 (ocaml#862)
a9616e9 flambda-backend: Add build system hooks for ocaml-jst (ocaml#869)
045ef67 flambda-backend: Allow the compiler to build with stock Dune (ocaml#868)
3cac5be flambda-backend: Simplify Makefile logic for natdynlinkops (ocaml#866)
c5b12bf flambda-backend: Remove unnecessary install lines (ocaml#860)
ff12bbe flambda-backend: Fix unused variable warning in st_stubs.c (ocaml#861)
c84976c flambda-backend: Static check for noalloc: attributes (ocaml#825)
ca56052 flambda-backend: Build system refactoring for ocaml-jst (ocaml#857)
39eb7f9 flambda-backend: Remove integer comparison involving nonconstant polymorphic variants (ocaml#854)
c102688 flambda-backend: Fix soundness bug by using currying information from typing (ocaml#850)
6a96b61 flambda-backend: Add a primitive to enable/disable the tick thread (ocaml#852)
f64370b flambda-backend: Make Obj.dup use a new primitive, %obj_dup (ocaml#843)
9b78eb2 flambda-backend: Add test for ocaml#820 (include functor soundness bug) (ocaml#841)
8f24346 flambda-backend: Add `-dtimings-precision` flag (ocaml#833)
65c2f22 flambda-backend: Add test for ocaml#829 (ocaml#831)
7b27a49 flambda-backend: Follow-up PR#829 (comballoc fixes for locals) (ocaml#830)
ad7ec10 flambda-backend: Use a custom condition variable implementation (ocaml#787)
3ee650c flambda-backend: Fix soundness bug in include functor (ocaml#820)
2f57378 flambda-backend: Static check noalloc (ocaml#778)
aaad625 flambda-backend: Emit begin/end region only when stack allocation is enabled (ocaml#812)
17c7173 flambda-backend: Fix .cmt for included signatures (ocaml#803)
e119669 flambda-backend: Increase delays in tests/lib-threads/beat.ml (ocaml#800)
ccc356d flambda-backend: Prevent dynamic loading of the same .cmxs twice in private mode, etc. (ocaml#784)
14eb572 flambda-backend: Make local extension point equivalent to local_ expression (ocaml#790)
487d11b flambda-backend: Fix tast_iterator and tast_mapper for include functor. (ocaml#795)
a50a818 flambda-backend: Reduce closure allocation in List (ocaml#792)
96c9c60 flambda-backend: Merge ocaml-jst
a775b88 flambda-backend: Fix ocaml/otherlibs/unix 32-bit build (ocaml#767)
f7c2679 flambda-backend: Create object files internally to avoid invoking GAS (ocaml#757)
c7a46bb flambda-backend: Bugfix for Cmmgen.expr_size with locals (ocaml#756)
b337cb6 flambda-backend: Fix build_upstream for PR749 (ocaml#750)
8e7e81c flambda-backend: Differentiate is_int primitive between generic and variant-only versions (ocaml#749)

git-subtree-dir: ocaml
git-subtree-split: 25188da
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
Very few grammar suggestions. Looks great!
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants