Skip to content

Allow linking empty .cmxa files on MSVC#9011

Merged
dra27 merged 3 commits intoocaml:trunkfrom
dra27:fix-msvc-empty-cmxa
Jun 24, 2020
Merged

Allow linking empty .cmxa files on MSVC#9011
dra27 merged 3 commits intoocaml:trunkfrom
dra27:fix-msvc-empty-cmxa

Conversation

@dra27
Copy link
Member

@dra27 dra27 commented Oct 3, 2019

MSVC .lib format doesn't support having no .obj files in the library, so ocamlopt -o foo.cmxa -a generates foo.cmxa but not foo.lib (there's no error from the Microsoft Linker). The resulting foo.cmxa is un-linkable, since OCaml passes foo.lib on to the linker. This doesn't affect cc-based toolchains since ar rather more logically allows empty files!

This patch relaxes the requirement for foo.lib if foo.cmxa contains no units.

This is part of a fix for ocaml/stdlib-shims#11

cc @db4

MSVC .lib format doesn't support having no .obj files in the library, so
ocamlopt -o foo.cmxa -a generates foo.cmxa but not foo.lib (there's no
error from the Microsoft Linker). The resulting foo.cmxa is unlinkable,
since OCaml passes foo.lib on to the linker.

This patch relaxes the requirement for foo.lib if the .cmxa contains no
units.
@dra27 dra27 requested a review from a user October 3, 2019 09:56
@dra27
Copy link
Member Author

dra27 commented Oct 3, 2019

I intentionally haven't done anything to stop the linker being called when the .cmxa is created - at present, lib silently does nothing. If in future Microsoft were to make it an error (unlikely) then that'd need patching, but if the less unlikely change of supporting empty .lib files were added, then it would magically work (and the existing code would be there for compatibility with older releases)

Copy link
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is something I don't understand with list reversals. Plus some suggestions below.

end else
reqd)
infos.lib_units tolink
in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For symmetry I suggest

   let tolink = ...
   and objfiles_rev = ...
   in (tolink, objfiles_rev)

The two computations are independent, as far as I can see.

Comment on lines 207 to 216
if Config.ccomp_type = "msvc"
&& infos.lib_units = []
&& not (Sys.file_exists (object_file_name obj_name)) then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would make sense to just test infos.lib_units = []. Even with non-MSVC toolchains, it makes sense to not pass an empty library to the linker, resulting in a shorter command line and less work for the linker.

As to testing that the file doesn't exist, this is to be commended as very prudent, but 1- it causes one extra system call per file to link, and 2- I can't imagine a scenario where object files would be added by hand to the library file that goes with an OCaml .cmxa file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no problem with doing that - the test I've put is just belt-and-braces to fix exactly the bug. The the .a file may have been subsequently altered (e.g. to add stub objects directly) - I've no knowledge of somewhere doing this, but do we definitely want to break some undefined behaviour while fixing this particular bug?

I did also consider whether it would be better to add a field to the .cmxa format which explicitly specifies the object to link - so on MSVC for an empty library, this field would explicitly be blank. In principle, .cmx format could do the same thing, although a .o and .obj file is always associated, even for an empty .ml file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I (potentially temporarily) left this as-is. This change alters this PR from a bugfix to a potentially breaking change (albeit by using largely unspecified behaviour of the linker).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I reading right that the third part of the test is here to protect users that were creating a .lib file by hand behind the back on the compiler on msvc? Do we really intend to support this use case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Octachron - it's more that I don't wish to intentionally break it in this PR. I believe there are some Mirage libraries which do this (although none which would hit this explicit case - i.e. an empty library with objects being added)

@dra27
Copy link
Member Author

dra27 commented Oct 6, 2019

Thanks, @xavierleroy! I'll push fixes later - what do you think of shifting the fix into .cmxa format?

@xavierleroy
Copy link
Contributor

what do you think of shifting the fix into .cmxa format?

I don't feel it would be a significant improvement. But it would be a bigger change. I like small changes :-)

@dra27
Copy link
Member Author

dra27 commented Oct 10, 2019

Code corrected (I hope).

@dra27 dra27 force-pushed the fix-msvc-empty-cmxa branch from b23aada to cef6c3d Compare October 15, 2019 09:05
@dra27 dra27 added this to the 4.10 milestone Oct 15, 2019
@damiendoligez
Copy link
Member

Note that there is a similar problem on Macos (#6550, #1094) which was fixed in a completely different way. Is there a way to merge the two fixes?

@dra27
Copy link
Member Author

dra27 commented Oct 15, 2019

No - the problem is not just that an empty .lib file isn't generated, it's also that the Microsoft Linker returns an error if you give it a .lib file containing no units (it's an error to try to remove the last one using the LIB tool as well)

@dra27 dra27 force-pushed the fix-msvc-empty-cmxa branch from cef6c3d to 21d4036 Compare October 19, 2019 15:26
@damiendoligez damiendoligez modified the milestones: 4.10, 4.11 Feb 20, 2020
@dra27 dra27 force-pushed the fix-msvc-empty-cmxa branch from 21d4036 to cfa0f06 Compare April 16, 2020 16:50
@xavierleroy
Copy link
Contributor

xavierleroy commented Apr 22, 2020

Note that there is a similar problem on Macos (#6550, #1094) which was fixed in a completely different way. Is there a way to merge the two fixes?

There is a way to simplify both fixes at the same time:

  • At creation time, if the library has zero members, just create an empty .a/.lib file (close_out (open_out f)), don't call ar/lib at all.
  • At link time, if the library has zero members, do not pass its .a/.lib file to the C linker.

That would be nicer than the current code + the current PR.

@dra27
Copy link
Member Author

dra27 commented Apr 22, 2020

@xavierleroy - I'm puzzled, this PR does exactly your second point, with the only difference that the test is made specific to MSVC, doesn't it? (see #9011 (comment) for why it's so specific). I don't at the moment see what I'm to change in this PR.

For #1094, we'd just be simplifying the three calls in Ccomp.macos_create_empty_archive.

However, a null foo.lib is a corrupt file - I'm not comfortable with the idea that we intentionally start generating files which the toolchain regards as corrupt, however weird the requirement of the toolchain may be. #1094 works around a bug where a toolchain doesn't generate a file it should be, which seems different.

Regardless, this PR should have at least one test, which I'm just doing...

@xavierleroy
Copy link
Contributor

  • I'm puzzled, this PR does exactly your second point, with the only difference that the test is made specific to MSVC, doesn't it?

Yes. But in order to simplify the macOS situation as well, linking of empty libraries should also be skipped on macOS. At this point, let's just skip it on all platforms.

What I'm aiming at is a platform-independent solution to both the MSVC and the macOS issues. Wouldn't this be nice?

@dra27
Copy link
Member Author

dra27 commented Apr 22, 2020

So I can simply remove the Config.ccomp_type = "msvc" in the test in asmlink L207 above?

It turns out an empty .a is corrupt too:

dra27@summer:~$ touch foo.a ; ar t foo.a
ar: foo.a: File format not recognised
dra27@summer:~$ ar rc bar.a ; ar t bar.a
dra27@summer:~$ cat bar.a
!<arch>

@dra27
Copy link
Member Author

dra27 commented Apr 22, 2020

OK, I think I've got hung on up on "simpler", but I think I follow where you're coming from. I certainly don't think we should be creating invalid files (so no zero-length files and no using other ways to create .lib files with no members).

MSVC forces us (and so build systems) to consider that foo.cmxa might not have foo.a/foo.lib so if we have MSVC not permitting empty archives and macOS unable to create them without trickery, then we could be entire platform independent - as you suggest - by not calling the archiver when there are no modules including on Linux.

@dra27
Copy link
Member Author

dra27 commented Apr 22, 2020

I have verified that the test fails on trunk built with msvc64

@alainfrisch
Copy link
Contributor

Not creating .a/.lib for empty libs seems natural enough, doesn't this complicate the life for build systems, who typically consider those .a/.lib files as dependencies for executables that link the libraries (and rightly so)?

@dra27
Copy link
Member Author

dra27 commented Apr 22, 2020

@alainfrisch - it does complicate it, yes, but if that build system wishes to support MSVC properly then it must complicate it. That said, a simple build system need only worry about the .cmxa, since the only time a simple build system has to interact with the .a/.lib file is to install it.

@dra27 dra27 force-pushed the fix-msvc-empty-cmxa branch from e9cd7ef to 6b0f113 Compare April 22, 2020 13:56
@alainfrisch
Copy link
Contributor

it does complicate it, yes, but if that build system wishes to support MSVC properly then it must complicate it.

Well, another option would be to make sure the .a/.lib is not really empty, by adding a dummy (but valid) .o/.obj file to it.

That said, a simple build system need only worry about the .cmxa, since the only time a simple build system has to interact with the .a/.lib file is to install it.

I don't understand. If you define a rule to build an executable from a set of modules and libraries, the build system has to be aware of which files are used to build the resulting executable, which include the .a/.lib file for each library, no?

@dra27
Copy link
Member Author

dra27 commented Apr 22, 2020

There's no need for a time-stamp based build system compiling an OCaml program ever to worry about .a and .o dependencies. Depending on .cmx and .cmxa is sufficient and ocamlopt hides all the details (cf. the venerable OCamlMakefile - executables depend on the .cmxa, it's only the install rules which pull in .$(EXT_LIB) files). Checksum based build systems are not simple 😉

My strong opinions on this are that we should allow empty .cmxa files and we should not generate invalid files in other formats (so no zero-byte .a or .lib files). I don't have a strong opinion as to which of the two possible ways to fix it should be done. The primary user of this feature is stdlib-shims, and Dune is already supports this (it was part of Dune 2.0). I lean more towards fixing a very small number of build systems once for something which should then never change vs a slightly dirtier hack (what you propose is basically what the macOS workaround does) which if it does affect a package will affect it in a weird way (conflict with the dummy name, or anything like that)

@xavierleroy
Copy link
Contributor

The problem is not with the build system but with the "install" Makefile entries, ocamlfind install, and the like. My suggestion to produce empty files with .a/.lib suffixes was so that these installers have something to copy. Of course no linker will ever be given those fake .a/.lib files as argument.

@dra27
Copy link
Member Author

dra27 commented Apr 22, 2020

Sure, but that's either one of two easy fixes in a Makefile, or an easy enough fix in ocamlfind for to understand that if foo.a can't be found and foo.cmxa is being installed then it's OK (and ocamlfind usually requires updating with every release of OCaml anyway). Is it just me who finds the idea of intentionally generating and installing a corrupt file distasteful? The Debian guys don't even like when we put valid extra bits in other people's file formats!

@alainfrisch
Copy link
Contributor

Checksum based build systems are not simple 😉

I was thinking about checksum based build systems indeed. The build system might not know whether a library is empty or not (in particular is not under the control of the same build system), and anyway, it might be non-trivial logic to add in order to support this case.

@dra27 dra27 force-pushed the fix-msvc-empty-cmxa branch from 6b0f113 to eab4c4f Compare June 23, 2020 15:55
@dra27
Copy link
Member Author

dra27 commented Jun 23, 2020

Rebased onto latest trunk. I'm indescribably keen for the actual bugfix in this PR to be in 4.11 (that's the ability to run echo 'print_endline "Hello, empty world!"' > bar.ml ; ocamlopt -a -o foo.cmxa ; ocamlopt -o bar foo.cmxa bar.ml on msvc64 without error which is what the first commit fixes (and the second commit adds a test for).

The final commit goes further and alters the behaviour of ocamlopt -a -o foo.cmxa so that foo.a is not generated either. This latter part seems to have stalled the PR. Please can I either have approval to merge this PR as-is, or split it into two, with the first part being approved soon?

Copy link
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure the files are now in the correct order, and I agree that not giving the linker archive files that were previously empty is a good thing to do.

I'm still a little worried about installation rules (in Makefiles or elsewhere) that systematically install a .a/.obj file for every library. These will break, but perhaps the "empty library" case occurs rarely enough that we don't care. I'm curious to know what Dune would do in this case, for example.

@dra27 dra27 force-pushed the fix-msvc-empty-cmxa branch from eab4c4f to e6ab329 Compare June 23, 2020 19:06
@dra27
Copy link
Member Author

dra27 commented Jun 23, 2020

Dune's been fixed since 2.0.0 - see ocaml/dune#2829

@dra27
Copy link
Member Author

dra27 commented Jun 23, 2020

Thanks @xavierleroy, I'll merge and cherry-pick once CI replies

@dra27
Copy link
Member Author

dra27 commented Jun 24, 2020

After a discussion with @Octachron, the plan is to merge this PR but only cherry-pick the first (bug-fix) commit to 4.11. The aforementioned Dune PR was written with the assumption that it was MSVC only, so it doesn't like Unix compilers not producing .a files. The fix for Dune is trivial (and I'm about to open the PR for it), but we think it's a little late in the release cycle for that change in 4.11.

So: 4.11.0 will be able to build stdlib-shims correctly on msvc and msvc64 and 4.12.0 gains the cleaned-up and simplified linker code.

@dra27 dra27 merged commit 7b6098a into ocaml:trunk Jun 24, 2020
dra27 added a commit that referenced this pull request Jun 24, 2020
Allow linking empty .cmxa files on MSVC
(cherry picked from commit 7b6098a)
dra27 added a commit that referenced this pull request Jun 24, 2020
@dra27 dra27 deleted the fix-msvc-empty-cmxa branch June 24, 2020 10:16
dbuenzli added a commit to dbuenzli/ocaml that referenced this pull request Sep 26, 2020
An analysis of the code should convince yourself that this function is
not needed and that its erroring paths are dead code.

The reasoning is the following: all the uses of [object_file_name] in
asmlink.ml are performed on [obj_infos] which went through [read_file]
before. The latter does exactly the same file name lookup and erroring
treatement, except not in a stringly manner like [object_file_name]
does.

This first commit removes use of the function in [scan_file] which was
introduced by PR ocaml#9011. The result of [object_file_name] here is equal
to [file_name] (follow [read_file]).
dbuenzli added a commit to dbuenzli/ocaml that referenced this pull request Sep 26, 2020
This function derives the same information as [object_file_name]
except it does it on the [file] datatype returned by [read_file].
Note that all the erroring code paths of [object_file_name] have been
handled by the [read_file] which derived the [file] value.

We integrate the logic added by PR ocaml#9011 for empty cmxa here.
dbuenzli added a commit to dbuenzli/ocaml that referenced this pull request Sep 26, 2020
This commit does the following four things (it's difficult
to them in separate commits that compile).

1) It removes the [read_file] from [scan_file]. Reading the files is
   done seperately before returning an [obj_infos] list of [file]
   values. This turns [scan_file] into a function that operates on
   values of type [file].

2) In [scan_file] it removes the separate list of [obj_files]
   introduced by ocaml#9011. We can derive the same list using the function
   [object_file_name_of_file] introduced in the previous commit on the
   list of [obj_infos]. Effectively we bring back [scan_file] to the
   state before ocaml#9011 modulo the [read_file] removal.

3) We derive the list of [obj_files] directly from the
   [obj_infos] list via [object_file_name_of_file]. Note that
   the logic introduced by ocaml#9011 is preserved by virtue of
   [object_file_name_of_file]'s optional result.

4) Deletes [object_file_name] which is no longer used.
dbuenzli added a commit to dbuenzli/ocaml that referenced this pull request Sep 27, 2020
An analysis of the code should convince yourself that the
[object_file_name] function is not needed and that its erroring paths
are dead code.

The reasoning is the following: all the uses of [object_file_name] in
asmlink.ml are performed on [obj_name]s which went through [read_file]
before. The latter does exactly the same file name lookup and erroring
treatement, except not in a stringly manner like [object_file_name]
does.

This new function derives the same information as [object_file_name]
except it does it on the [file] datatype returned by [read_file].
Note that all the erroring code paths of [object_file_name] have been
handled by the [read_file] which derived the [file] value.

We integrate the logic added by PR ocaml#9011 for empty cmxa here.
dbuenzli added a commit to dbuenzli/ocaml that referenced this pull request Sep 27, 2020
This commit does the following four things (it's difficult
to them in separate commits that compile).

1) It removes the [read_file] from [scan_file]. Reading the files is
   done seperately before returning an [obj_infos] list of [file]
   values. This turns [scan_file] into a function that operates on
   values of type [file].

2) In [scan_file] it removes the separate list of [obj_files]
   introduced by ocaml#9011. We can derive the same list using the function
   [object_file_name_of_file] introduced in the previous commit on the
   list of [obj_infos]. Effectively we bring back [scan_file] to the
   state before ocaml#9011 modulo the [read_file] removal.

3) We derive the list of [obj_files] directly from the
   [obj_infos] list via [object_file_name_of_file]. Note that
   the logic introduced by ocaml#9011 is preserved by virtue of
   [object_file_name_of_file]'s optional result.

4) Deletes [object_file_name] which is no longer used.
dbuenzli added a commit to dbuenzli/ocaml that referenced this pull request Mar 25, 2021
An analysis of the code should convince yourself that the
[object_file_name] function is not needed and that its erroring paths
are dead code.

The reasoning is the following: all the uses of [object_file_name] in
asmlink.ml are performed on [obj_name]s which went through [read_file]
before. The latter does exactly the same file name lookup and erroring
treatement, except not in a stringly manner like [object_file_name]
does.

This new function derives the same information as [object_file_name]
except it does it on the [file] datatype returned by [read_file].
Note that all the erroring code paths of [object_file_name] have been
handled by the [read_file] which derived the [file] value.

We integrate the logic added by PR ocaml#9011 for empty cmxa here.
dbuenzli added a commit to dbuenzli/ocaml that referenced this pull request Mar 25, 2021
This commit does the following four things (it's difficult
to them in separate commits that compile).

1) It removes the [read_file] from [scan_file]. Reading the files is
   done seperately before returning an [obj_infos] list of [file]
   values. This turns [scan_file] into a function that operates on
   values of type [file].

2) In [scan_file] it removes the separate list of [obj_files]
   introduced by ocaml#9011. We can derive the same list using the function
   [object_file_name_of_file] introduced in the previous commit on the
   list of [obj_infos]. Effectively we bring back [scan_file] to the
   state before ocaml#9011 modulo the [read_file] removal.

3) We derive the list of [obj_files] directly from the
   [obj_infos] list via [object_file_name_of_file]. Note that
   the logic introduced by ocaml#9011 is preserved by virtue of
   [object_file_name_of_file]'s optional result.

4) Deletes [object_file_name] which is no longer used.
stedolan pushed a commit to stedolan/ocaml that referenced this pull request May 24, 2022
…rt upstream PR#9943) (ocaml#557)

* Asmlink.object_file_name removal: add object_file_name_of_file

An analysis of the code should convince yourself that the
[object_file_name] function is not needed and that its erroring paths
are dead code.

The reasoning is the following: all the uses of [object_file_name] in
asmlink.ml are performed on [obj_name]s which went through [read_file]
before. The latter does exactly the same file name lookup and erroring
treatement, except not in a stringly manner like [object_file_name]
does.

This new function derives the same information as [object_file_name]
except it does it on the [file] datatype returned by [read_file].
Note that all the erroring code paths of [object_file_name] have been
handled by the [read_file] which derived the [file] value.

We integrate the logic added by PR ocaml#9011 for empty cmxa here.

* Asmlink.object_file_name removal: remove.

This commit does the following four things (it's difficult
to them in separate commits that compile).

1) It removes the [read_file] from [scan_file]. Reading the files is
   done seperately before returning an [obj_infos] list of [file]
   values. This turns [scan_file] into a function that operates on
   values of type [file].

2) In [scan_file] it removes the separate list of [obj_files]
   introduced by ocaml#9011. We can derive the same list using the function
   [object_file_name_of_file] introduced in the previous commit on the
   list of [obj_infos]. Effectively we bring back [scan_file] to the
   state before ocaml#9011 modulo the [read_file] removal.

3) We derive the list of [obj_files] directly from the
   [obj_infos] list via [object_file_name_of_file]. Note that
   the logic introduced by ocaml#9011 is preserved by virtue of
   [object_file_name_of_file]'s optional result.

4) Deletes [object_file_name] which is no longer used.

Co-authored-by: Daniel Bünzli <[email protected]>
lpw25 pushed a commit to lpw25/ocaml that referenced this pull request Jun 21, 2022
…s) in asmlink (port upstream PR#9943) (ocaml#557)

* Asmlink.object_file_name removal: add object_file_name_of_file

An analysis of the code should convince yourself that the
[object_file_name] function is not needed and that its erroring paths
are dead code.

The reasoning is the following: all the uses of [object_file_name] in
asmlink.ml are performed on [obj_name]s which went through [read_file]
before. The latter does exactly the same file name lookup and erroring
treatement, except not in a stringly manner like [object_file_name]
does.

This new function derives the same information as [object_file_name]
except it does it on the [file] datatype returned by [read_file].
Note that all the erroring code paths of [object_file_name] have been
handled by the [read_file] which derived the [file] value.

We integrate the logic added by PR ocaml#9011 for empty cmxa here.

* Asmlink.object_file_name removal: remove.

This commit does the following four things (it's difficult
to them in separate commits that compile).

1) It removes the [read_file] from [scan_file]. Reading the files is
   done seperately before returning an [obj_infos] list of [file]
   values. This turns [scan_file] into a function that operates on
   values of type [file].

2) In [scan_file] it removes the separate list of [obj_files]
   introduced by ocaml#9011. We can derive the same list using the function
   [object_file_name_of_file] introduced in the previous commit on the
   list of [obj_infos]. Effectively we bring back [scan_file] to the
   state before ocaml#9011 modulo the [read_file] removal.

3) We derive the list of [obj_files] directly from the
   [obj_infos] list via [object_file_name_of_file]. Note that
   the logic introduced by ocaml#9011 is preserved by virtue of
   [object_file_name_of_file]'s optional result.

4) Deletes [object_file_name] which is no longer used.

Co-authored-by: Daniel Bünzli <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants