More robust determination of full path to current executable#795
More robust determination of full path to current executable#795gasche merged 3 commits intoocaml:trunkfrom
Conversation
…table In several places, the old code was limiting this file name to 256 characters at most. At least one user is bothered by this restriction. - Change API of caml_executable_name() to return a string dynamically allocated with caml_stat_alloc. - Update implementation of caml_executable_name() byterun/unix.c and byterun/win32.c to handle file names of (almost) arbitrary length. - Add special code for MacOS X, not tested yet. Fun fact of the day: under (some versions of) Linux, lstat() on "/proc/self/exe" returns 0 as length, so we can't use it to determine the size of the name in advance.
asmrun/startup.c
Outdated
| else | ||
| exe_name = caml_search_exe_in_path(exe_name); | ||
| caml_sys_init(exe_name, argv); | ||
| caml_stat_free(exe_name); |
There was a problem hiding this comment.
Won't this (caml_stat_free) lead to a use-after-free error when exe_name is subsequently accessed?
There was a problem hiding this comment.
Indeed, caml_sys_init doesn't make a copy of exe_name, I missed that. Will fix later.
|
I used the CI machines to check that the compiler builds fine under MacOS, mingw and mscvc machines with the proposed changes. Could you maybe include a test that uses a super-duper-long executable name to actually test the new behavior? There is a very suspicious CI failure on openbsd32, a segfault while running an innocuous typing test of the testsuite: see "Segmentation fault" in the logs https://ci.inria.fr/ocaml/view/precheck/job/precheck-openbsd-32/2/consoleText . I don't know where this comes from, and asked for a new build to see if it is reproducible but the results are not in yet. Edit: the new openbsd32 build succeeded, so the observed segmentation fault (while testing |
|
In case anyone is curious, the crazy organization that runs into this is Jane Street. The problem comes up when we take a large change and break it into a long sequence of small patches, each dependent on the last. This is represented in our code review system as nested directories, one per patch. We don't often cross the 256 character limit, but when someone does, it's annoying. So, while we're not joking, it doesn't mean we don't appreciate that our request is a little absurd. Anyway, thanks for the patch! |
caml_sys_init does not copy its first argument and just keeps the pointer. (Reported by Jeremy Yallop.)
|
Ping!
So: please merge in trunk. Thank you. |
|
Thanks for the ping, merged. |
|
What about adding a non-regression test and a Changes entry? |
|
I don't think a regression test is really necessary. I've added the Changes entry (commit 12bd764). |
| name = caml_stat_alloc(namelen + 1); | ||
| retcode = readlink("/proc/self/exe", name, namelen); | ||
| if (retcode == -1) { caml_stat_free(name); return NULL; } | ||
| if (retcode <= namelen) break; |
There was a problem hiding this comment.
Is this check right? If readlink has to truncate its response, won't retcode == namelen?
There was a problem hiding this comment.
Yes, but it can very well happen that more bytes than necessary were allocated, so no truncation is necessary, in which case you can have retcode < namelen.
There was a problem hiding this comment.
Right, but it seems like the only time that we want to break out of the loop is when retcode < namelen -- if retcode == namelen that's a sign that we want to try again with a longer name buffer.
Rephrase: if I'm understanding correctly, I think that this branch is always taken, and we never actually resize the buffer (readlink should never return retcode > namelen).
(The win32 implementation seems to have the right comparison)
There was a problem hiding this comment.
Yes, sounds right. In fact, wouldn't it be even better to pass namelen + 1 as third parameter of readlink and leave the comparison as it is ?
There was a problem hiding this comment.
But if we leave the comparison as it is and we break out of the loop it means that retcode <= namelen so that there is space for the terminating 0, right? Using retcode < namelen as exit test seems slightly less efficient because there could be some cases where retcode == namelen but not due to truncation.
There was a problem hiding this comment.
OK, you're both right. I think the clearer code is to have namelen as the size of the buffer and as the 3rd parameter to readlink, and to break when retcode < namelen, which indeed leaves room for the final zero. Pushing this improved fix in a minute...
More robust determination of full path to current executable
The initial implementation missed that we must iterate over the coercion inside the new include_kinds
…r. (ocaml#795) The initial implementation missed that we must iterate over the coercion inside the new include_kinds
25188da flambda-backend: Missed comment from PR802 (ocaml#887) 9469765 flambda-backend: Improve the semantics of asynchronous exceptions (new simpler version) (ocaml#802) d9e4dd0 flambda-backend: Fix `make runtest` on NixOS (ocaml#874) 4bbde7a flambda-backend: Simpler symbols (ocaml#753) ef37262 flambda-backend: Add opaqueness to Obj.magic under Flambda 2 (ocaml#862) a9616e9 flambda-backend: Add build system hooks for ocaml-jst (ocaml#869) 045ef67 flambda-backend: Allow the compiler to build with stock Dune (ocaml#868) 3cac5be flambda-backend: Simplify Makefile logic for natdynlinkops (ocaml#866) c5b12bf flambda-backend: Remove unnecessary install lines (ocaml#860) ff12bbe flambda-backend: Fix unused variable warning in st_stubs.c (ocaml#861) c84976c flambda-backend: Static check for noalloc: attributes (ocaml#825) ca56052 flambda-backend: Build system refactoring for ocaml-jst (ocaml#857) 39eb7f9 flambda-backend: Remove integer comparison involving nonconstant polymorphic variants (ocaml#854) c102688 flambda-backend: Fix soundness bug by using currying information from typing (ocaml#850) 6a96b61 flambda-backend: Add a primitive to enable/disable the tick thread (ocaml#852) f64370b flambda-backend: Make Obj.dup use a new primitive, %obj_dup (ocaml#843) 9b78eb2 flambda-backend: Add test for ocaml#820 (include functor soundness bug) (ocaml#841) 8f24346 flambda-backend: Add `-dtimings-precision` flag (ocaml#833) 65c2f22 flambda-backend: Add test for ocaml#829 (ocaml#831) 7b27a49 flambda-backend: Follow-up PR#829 (comballoc fixes for locals) (ocaml#830) ad7ec10 flambda-backend: Use a custom condition variable implementation (ocaml#787) 3ee650c flambda-backend: Fix soundness bug in include functor (ocaml#820) 2f57378 flambda-backend: Static check noalloc (ocaml#778) aaad625 flambda-backend: Emit begin/end region only when stack allocation is enabled (ocaml#812) 17c7173 flambda-backend: Fix .cmt for included signatures (ocaml#803) e119669 flambda-backend: Increase delays in tests/lib-threads/beat.ml (ocaml#800) ccc356d flambda-backend: Prevent dynamic loading of the same .cmxs twice in private mode, etc. (ocaml#784) 14eb572 flambda-backend: Make local extension point equivalent to local_ expression (ocaml#790) 487d11b flambda-backend: Fix tast_iterator and tast_mapper for include functor. (ocaml#795) a50a818 flambda-backend: Reduce closure allocation in List (ocaml#792) 96c9c60 flambda-backend: Merge ocaml-jst a775b88 flambda-backend: Fix ocaml/otherlibs/unix 32-bit build (ocaml#767) f7c2679 flambda-backend: Create object files internally to avoid invoking GAS (ocaml#757) c7a46bb flambda-backend: Bugfix for Cmmgen.expr_size with locals (ocaml#756) b337cb6 flambda-backend: Fix build_upstream for PR749 (ocaml#750) 8e7e81c flambda-backend: Differentiate is_int primitive between generic and variant-only versions (ocaml#749) git-subtree-dir: ocaml git-subtree-split: 25188da
Very few grammar suggestions. Looks great!
The OCaml runtime system has some system-dependent code (Linux, Windows) to recover the full path to the current executable, in ways that are more reliable than searching
argv[0]in$PATH.Previously, this code worked only for paths of 256 characters or less, owing to the static allocation of a buffer.
Some OCaml users informed us that this limitation was cramping their style and that executables with very long names indeed are important to their business. "Surely you must be joking?" questions were answered with utter seriousness.
Always happy to help OCaml users do the weirdest things, I offer in this PR a reimplementation of these parts of the runtime where the buffer for the full path name is allocated dynamically and grows on demand.
As a bonus, special code to determine the path to the executable was added for Mac OS X as well, courtesy of informative StackOverflow posts.
The new code was tested under Linux, but not even compiled under Mac OS X nor under Windows. Caveat emptor.