segfault during substitution on x86_64-darwin #9640

@abathur

I don't know if this is tractable from the Nix side, but I figure it deserves a report since some users are encountering it when they invoke nix commands.

Describe the bug

Since the stdenv bump to LLVM 16 in Nixpkgs, at least some Intel Mac users have started seeing segfaults when Nix tries to print the size of missing store paths before substituting.

Note: If you're running into this, you can likely work around the crash by adding --option print-missing false.

Depending on shell used, these can manifest like:

$ nix-shell -p cowsay
...
Segmentation fault: 11

or:

$ nix-shell -p cowsay
...
[1]    43294 segmentation fault  nix-shell -p cowsay

When Nix is wrapped by something else like darwin-rebuild, these may also just be indicated by exit status (status 139 in one known case).
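
Status 139 is consistent with the common shell convention of reporting 128 plus the signal number when a process is killed by a signal: 139 - 128 = 11, which matches the "Segmentation fault: 11" message above. A minimal sketch of that arithmetic, where signalFromExitStatus is a hypothetical helper for illustration and not part of Nix or darwin-rebuild:

```cpp
// Hypothetical helper (not part of Nix): decode a shell-style exit status.
// Assumes the usual "128 + signal number" convention used by common shells.
// Returns the signal number, or -1 if the status does not look like a
// death-by-signal status.
int signalFromExitStatus(int status) {
    if (status > 128 && status < 256) {
        return status - 128;
    }
    return -1;
}
```

Under that assumption, signalFromExitStatus(139) yields 11, the signal number printed in the shell messages above.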

Reports so far indicate that this affects x86_64-darwin up to at least macOS 10.15.7. A similar segfault has been reported against Nixpkgs. I'm not 100% sure they share a root cause, but that report suggests this may affect up to macOS 11.x but not 12.x.

Steps To Reproduce

  1. Run an invocation that needs to realize something that could be substituted, such as nix-shell -p cowsay --dry-run

Additional context

Here's the most-relevant part of the crash dump (full copy: nix_2023-12-17-113156_b8793364.txt):

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libc++.1.0.dylib              	0x0000000108f1401e std::__1::istreambuf_iterator<char, std::__1::char_traits<char> > std::__1::num_get<char, std::__1::istreambuf_iterator<char, std::__1::char_traits<char> > >::__do_get_unsigned<unsigned long>(std::__1::istreambuf_iterator<char, std::__1::char_traits<char> >, std::__1::istreambuf_iterator<char, std::__1::char_traits<char> >, std::__1::ios_base&, unsigned int&, unsigned long&) const + 46
1   libc++.1.dylib                	0x00007fff66f45f4b std::__1::basic_ostream<char, std::__1::char_traits<char> >::operator<<(float) + 247
2   libnixmain.dylib              	0x00000001086e832d void boost::io::detail::put<char, std::__1::char_traits<char>, std::__1::allocator<char>, boost::io::detail::put_holder<char, std::__1::char_traits<char> > const&>(boost::io::detail::put_holder<char, std::__1::char_traits<char> > const&, boost::io::detail::format_item<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, boost::basic_format<char, std::__1::char_traits<char>, std::__1::allocator<char> >::string_type&, boost::basic_format<char, std::__1::char_traits<char>, std::__1::allocator<char> >::internal_streambuf_t&, std::__1::locale*) + 781
3   libnixmain.dylib              	0x00000001086e7f5e void boost::io::detail::distribute<char, std::__1::char_traits<char>, std::__1::allocator<char>, boost::io::detail::put_holder<char, std::__1::char_traits<char> > const&>(boost::basic_format<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, boost::io::detail::put_holder<char, std::__1::char_traits<char> > const&) + 190
4   libnixmain.dylib              	0x00000001086fcb37 void nix::formatHelper<boost::basic_format<char, std::__1::char_traits<char>, std::__1::allocator<char> >, float, float>(boost::basic_format<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, float const&, float const&) + 87
5   libnixmain.dylib              	0x00000001086f3626 nix::printMissing(nix::ref<nix::Store>, std::__1::set<nix::StorePath, std::__1::less<nix::StorePath>, std::__1::allocator<nix::StorePath> > const&, std::__1::set<nix::StorePath, std::__1::less<nix::StorePath>, std::__1::allocator<nix::StorePath> > const&, std::__1::set<nix::StorePath, std::__1::less<nix::StorePath>, std::__1::allocator<nix::StorePath> > const&, unsigned long long, unsigned long long, nix::Verbosity) + 1350

The segfaults happen when printMissing calls printMsg with the float download/NAR sizes here:

if (!willSubstitute.empty()) {
    const float downloadSizeMiB = downloadSize / (1024.f * 1024.f);
    const float narSizeMiB = narSize / (1024.f * 1024.f);
    if (willSubstitute.size() == 1) {
        printMsg(lvl, "this path will be fetched (%.2f MiB download, %.2f MiB unpacked):",
            downloadSizeMiB,
            narSizeMiB);
    } else {
        printMsg(lvl, "these %d paths will be fetched (%.2f MiB download, %.2f MiB unpacked):",
            willSubstitute.size(),
            downloadSizeMiB,
            narSizeMiB);
    }
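
To make the excerpt easier to follow in isolation, here is a rough standalone sketch of the same size conversion and message. It is not the Nix implementation and does not reproduce the crash: fetchSummary is a hypothetical name, and it uses std::snprintf rather than the boost::format machinery behind printMsg that appears in the stack trace.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the message built in the excerpt above.
// Assumption: std::snprintf replaces Nix's boost::format-based printMsg.
std::string fetchSummary(std::size_t pathCount,
                         unsigned long long downloadSize,
                         unsigned long long narSize) {
    // Same conversion as the excerpt: bytes to MiB as single-precision floats.
    const float downloadSizeMiB = downloadSize / (1024.f * 1024.f);
    const float narSizeMiB = narSize / (1024.f * 1024.f);

    char buf[160];
    if (pathCount == 1) {
        std::snprintf(buf, sizeof buf,
            "this path will be fetched (%.2f MiB download, %.2f MiB unpacked):",
            downloadSizeMiB, narSizeMiB);
    } else {
        std::snprintf(buf, sizeof buf,
            "these %zu paths will be fetched (%.2f MiB download, %.2f MiB unpacked):",
            pathCount, downloadSizeMiB, narSizeMiB);
    }
    return buf;
}
```

For example, fetchSummary(1, 1048576, 2097152) produces the single-path message with 1.00 MiB download and 2.00 MiB unpacked. In Nix itself the two floats are instead streamed through boost::format, which is where frames 1 through 4 of the trace sit.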

We can work around the crash with --option print-missing false because Nix only reaches this code path when print-missing is true (which is the default):

if (settings.printMissing)
    printMissing(ref<Store>(store), willBuild, willSubstitute, unknown, downloadSize, narSize);

Note: I don't recall exactly how the Nix bundled with the installer gets assembled, but I suspect we'd already have reports against this repo if this were affecting new Nix installs. Since the flake is targeting inputs.nixpkgs.url = "github:NixOS/nixpkgs/staging-23.05";, I imagine the binaries installed by the installer still use LLVM 11. If so, this is probably mostly biting people who used something like nix-darwin, home-manager, or nix update with a fairly recent nixpkgs.

In that case, this issue could affect more users and CI systems if the nixpkgs input is updated to a rev that includes the stdenv bump before there is a fix here, or perhaps a patch/bump in nixpkgs.

Related posts/reports:
