Skip to content

Comments

release: forbid use of lib.fileset in Nixpkgs#369694

Merged
infinisil merged 13 commits intoNixOS:masterfrom
emilazy:push-quvxmoouztpo
Jan 1, 2025
Merged

release: forbid use of lib.fileset in Nixpkgs#369694
infinisil merged 13 commits intoNixOS:masterfrom
emilazy:push-quvxmoouztpo

Conversation

@emilazy
Copy link
Member

@emilazy emilazy commented Dec 31, 2024

Due to Nix bug NixOS/nix#11503, builtins.filterSource and chroot stores interact in a confusing and broken way that breaks lib.fileset. This means that uses of the API inside Nixpkgs keep breaking the NixOS installer, blocking the channel. The resulting error messages are inscrutable (they look like “the installer test is trying to download curl…?” and eventually bottom out in a derivation that has the wrong outPath because of the chroot store causing an incorrect lib.fileset result).

Whenever this happens, someone (well, in practice K900 or I) has to bisect the change that introduced it and remove the use of lib.fileset. This has happened at least three times in the past four months (I believe I might actually be missing one here, but these are the ones I remember and could easily dig up):

The options I see here are:

  1. Forbid use of lib.fileset within Nixpkgs until the Nix bug is fixed. This is the approach taken here. External users of Nixpkgs can continue to use the API as normal, but using it from within something that affects any release jobset outPaths will cause an evaluation failure with a hopefully‐helpful error message.

  2. Forbid lib.fileset and also all of the other library APIs that use builtins.filterSource. I’m happy to do this, but so far none of those have broken the installer, so I decided to start small and worry about the others if they end up causing a problem in practice.

  3. Forbid builtins.filterSource directly. This is hard and would require more invasive builtins.scopedImport crimes to do at evaluation time. I think this would realistically have to be done in something like nixpkgs-vet instead and I didn’t have much luck shoehorning a check like this into that codebase when I tried.

  4. Fix the Nix bug. This would be great! But also it doesn’t seem to be happening any time soon, it seems difficult to fix in a way that doesn’t subtly break compatibility with the previous semantics, and arguably the fix would need backporting all the way back to 2.3 given our minimum version policy.

  5. Do nothing; have people continue to innocuously use lib.fileset throughout Nixpkgs, breaking the installer whenever one of them sneaks in to that closure, causing the channel to be blocked and requiring expensive bisections to narrow down the inscrutable test failure to the package using lib.fileset, which then needs moving back off it. This sucks for the people who keep having to track it down, holds back important channel bumps, and the criteria for when it’s okay to use lib.fileset are not realistically possible to teach to all contributors.

I'd be happy to work on (2) as an alternative; (3) would be difficult and seems like overkill, (4) is not really something I trust myself to do and wouldn’t address the immediate problem, and (5) isn’t sustainable. I think that the current approach here is the best trade‐off for now, as lib.fileset seems to be the only prominent user of the builtins.filterSource API that works with full store paths, exposing it to the Nix bug. It’s unfortunate to lose the nice API, but since we can’t rely on it to produce correct results and the channels keep getting blocked as a result, I don’t think we really have an alternative right now.


I have checked that nix-build ci -A eval.full can successfully evaluate the release jobset for all the Hydra platforms and that it catches all the existing uses that I’ve fixed in this PR. I have also checked that the packages I’ve touched here continue to build if they did already, though it’s of course possible I’ve done something wrong when migrating them away from lib.fileset, so reviews of those commits would be helpful.

Result of nixpkgs-review run on aarch64-linux 1

1 package blacklisted:
  • nixos-install-tools
12 packages built:
  • cudaPackages.saxpy
  • cudaPackages_11.saxpy
  • flattenReferencesGraph
  • flattenReferencesGraph.dist
  • nixpkgs-manual
  • purescm
  • python312Packages.waitress-django
  • python312Packages.waitress-django.dist
  • python313Packages.waitress-django
  • python313Packages.waitress-django.dist
  • shopify-cli
  • yarn2nix

nixpkgs-review result

Generated using nixpkgs-review.

Command: nixpkgs-review


x86_64-linux

⏩ 1 package blacklisted:
  • nixos-install-tools
✅ 12 packages built:
  • cudaPackages.saxpy
  • cudaPackages_11.saxpy
  • flattenReferencesGraph
  • flattenReferencesGraph.dist
  • nixpkgs-manual
  • purescm
  • python312Packages.waitress-django
  • python312Packages.waitress-django.dist
  • python313Packages.waitress-django
  • python313Packages.waitress-django.dist
  • shopify-cli
  • yarn2nix

nixpkgs-review result

Generated using nixpkgs-review.

Command: nixpkgs-review


aarch64-darwin

❌ 2 packages failed to build:
  • flattenReferencesGraph
  • flattenReferencesGraph.dist
✅ 8 packages built:
  • nixpkgs-manual
  • purescm
  • python312Packages.waitress-django
  • python312Packages.waitress-django.dist
  • python313Packages.waitress-django
  • python313Packages.waitress-django.dist
  • shopify-cli
  • yarn2nix

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 25.05 Release Notes (or backporting 24.11 and 25.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

I am not sure `nix-gitignore.gitignoreSource` fares any better
with chroot stores, but so far it’s always been `lib.fileset`
that has caused issues with the installer ISO, and a blanket ban on
`lib.fileset` seems like the simplest remedy unless it turns out that
one of the other source filtering helpers causes issues.

This could probably be omitted entirely as it’s just an aid for
development convenience, anyway.

This reverts commit ba5a5fa.
This reverts commit 9ae0f77.
I don’t love this, and I’m not convinced it doesn’t have the
same pitfalls as filesets, but see the `flattenReferencesGraph`
commit for reasoning.

It may be better to ban the relevant builtins entirely and just move
things into subdirectories when needed, but I didn’t want to do
more surgery to this part of the tree than necessary to solve the
immediate problem.
Due to Nix bug <NixOS/nix#11503>,
`builtins.filterSource` and chroot stores interact in a confusing
and broken way that breaks `lib.fileset`. This means that uses of
the API inside Nixpkgs keep breaking the NixOS installer, blocking
the channel. The resulting error messages are inscrutable (they look
like “the installer test is trying to download `curl`…?” and
eventually bottom out in a derivation that has the wrong `outPath`
because of the chroot store causing an incorrect `lib.fileset` result).

Whenever this happens, someone (well, in practice K900 or I)
has to bisect the change that introduced it and remove the use of
`lib.fileset`. This has happened at least three times in the past
four months (I believe I might actually be missing one here, but
these are the ones I remember and could easily dig up):

* <NixOS#340046>
* <NixOS#352491>
* <NixOS#369459>

The options I see here are:

1. Forbid use of `lib.fileset` within Nixpkgs until the Nix bug is
   fixed. This is the approach taken here. External users of Nixpkgs
   can continue to use the API as normal, but using it from within
   something that affects any release jobset `outPath`s will cause an
   evaluation failure with a hopefully‐helpful error message.

2. Forbid `lib.fileset` and also all of the other library APIs that use
   `builtins.filterSource`. I’m happy to do this, but so far none of
   those have broken the installer, so I decided to start small and
   worry about the others if they end up causing a problem in practice.

3. Forbid `builtins.filterSource` directly. This is hard and would
   require more invasive `builtins.scopedImport` crimes to do at
   evaluation time. I think this would realistically have to be done in
   something like nixpkgs-vet instead and I didn’t have much luck
   shoehorning a check like this into that codebase when I tried.

4. Fix the Nix bug. This would be great! But also it doesn’t seem to be
   happening any time soon, it seems difficult to fix in a way that
   doesn’t subtly break compatibility with the previous semantics, and
   arguably the fix would need backporting all the way back to 2.3
   given our minimum version policy.

5. Do nothing; have people continue to innocuously use `lib.fileset`
   throughout Nixpkgs, breaking the installer whenever one of them
   sneaks in to that closure, causing the channel to be blocked and
   requiring expensive bisections to narrow down the inscrutable test
   failure to the package using `lib.fileset`, which then needs moving
   back off it. This sucks for the people who keep having to track it
   down, holds back important channel bumps, and the criteria for when
   it’s okay to use `lib.fileset` are not realistically possible to
   teach to all contributors.

I'd be happy to work on (2) as an alternative; (3) would be difficult
and seems like overkill, (4) is not really something I trust myself
to do and wouldn’t address the immediate problem, and (5) isn’t
sustainable. I think that the current approach here is the best
trade‐off for now, as `lib.fileset` seems to be the only prominent
user of the `builtins.filterSource` API that works with full store
paths, exposing it to the Nix bug. It’s unfortunate to lose the
nice API, but since we can’t rely on it to produce correct results
and the channels keep getting blocked as a result, I don’t think
we really have an alternative right now.
@emilazy emilazy requested review from K900 and infinisil December 31, 2024 15:06
@github-actions github-actions bot added 6.topic: python Python is a high-level, general-purpose programming language. 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 6.topic: haskell General-purpose, statically typed, purely functional programming language 8.has: documentation This PR adds or changes documentation 6.topic: nodejs Node.js is a free, open-source, cross-platform JavaScript runtime environment 6.topic: testing Tooling for automated testing of packages and modules 6.topic: cuda Parallel computing platform and API labels Dec 31, 2024
@github-actions github-actions bot added the 10.rebuild-darwin: 11-100 This PR causes between 11 and 100 packages to rebuild on Darwin. label Dec 31, 2024
@github-actions github-actions bot added the 10.rebuild-linux: 11-100 This PR causes between 11 and 100 packages to rebuild on Linux. label Dec 31, 2024
@philiptaron
Copy link
Contributor

Hey, this bit Felix (@Stunkymonkey)! Many of the lines changed in this patch were introduced to fix #301014.

The bug explanation makes it now make a little more sense, but it's still crazy to me that it both has this symptom and is so action-at-a-distance.

@emilazy
Copy link
Member Author

emilazy commented Dec 31, 2024

Yeah, it’s really nasty to run into, and it keeps happening; someone uses a normal‐looking API in a random package and then days later we find out that the installer test is trying to download a bunch of stuff for no obvious reason. The problem is that Nix is really bad at keeping track of which chroot store paths should be mapped transparently to /nix/store/… and which should have the full on‐disk path, and that leaks into the builtins.filterSource interface. Non‐lib.fileset users of the API tend to get lucky by not looking at the full path, so the inconsistent prefix doesn’t affect them. It may be possible to work around this in the filesets API but @infinisil wasn’t optimistic about being able to do that without making correctness worse when I asked, so I think this is the best way to stop the bleeding for now.

(FWIW, the tell‐tale sign – other than just “someone is using lib.fileset” – is that with a chroot store (actually it’s a little more elaborate, I think you need to be doing -f '<nixpkgs>' where that points either inside or outside of your chroot store, one of which confuses Nix) the computed src ends up incorrectly being an empty directory.)

@philiptaron
Copy link
Contributor

I'd like to hear from you or @infinisil about the impact on lib.fileset for out-of-tree users. Based on the comment above, I think it'll be rare to run into this issue, since invoking Nix with -f '<nixpkgs>' isn't that common in workflows I've seen (most of which use flakes.)

This is the tracking issue where lib.fileset work happens, btw:

Copy link
Member

@infinisil infinisil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunate that this is necessary, but I agree that this is the best workaround for now while we don't have a fix for the underlying bug. Thanks for pushing this!

I'm not concerned about out-of-tree uses, it indeed shouldn't happen often.

@emilazy
Copy link
Member Author

emilazy commented Dec 31, 2024

The way this PR is done, it’ll only affect GHA / ofborg / Hydra, not external importers of Nixpkgs, since we stub out lib.fileset in the files that handle the release jobsets. In other words, downstream importers of Nixpkgs will get lib.fileset as normal, but our various CI builders will complain if they’re used in ways that could break anything we build. That’s also why some in‐tree stuff that we don’t build, like passthru tests of various derivations, continues to use lib.fileset – if any of that ever gets pulled into a release jobset outPath it’ll start complaining. (Of course it’ll continue to be subject to the Nix bug wherever you use it, but probably most people aren’t using chroot stores.)

This PR is definitely one I’d like to wait for the ofborg eval results on, just in case :) (and to see the perf stats, although it should only be one lib.extend call per Nixpkgs evaluation.)

@emilazy
Copy link
Member Author

emilazy commented Dec 31, 2024

Oh, actually ofborg just died, so maybe we should just merge this.

@wegank wegank added the 12.approvals: 2 This PR was reviewed and approved by two persons. label Jan 1, 2025
Copy link
Member

@winterqt winterqt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, this looks like the best solution for now (unfortunately).

Let's hope the Nix bug maybe gets fixed soon?

@infinisil infinisil merged commit 82d084b into NixOS:master Jan 1, 2025
23 checks passed
@nixpkgs-ci
Copy link
Contributor

nixpkgs-ci bot commented Jan 1, 2025

Backport failed for release-24.11, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin release-24.11
git worktree add -d .worktree/backport-369694-to-release-24.11 origin/release-24.11
cd .worktree/backport-369694-to-release-24.11
git switch --create backport-369694-to-release-24.11
git cherry-pick -x 3c65f3d84651fd1b120db0d26bc140f6275aafec 2eae7d63e29c74afb9bca17746528dc62ba5486d f6ce575a03505ef90dbb864009058ac8d1392c3f 6ff076b1b121970ed13a38ffee0f58b540d417e8 f6cc18e0e46a4d3c9cace01eb698ac9c26cbed78 77def225cb60ae7c35ae9c82034aababed016f50 44475aafe3c1f67f7a1f304632c9271c61589a02 ba42e09bf2dca362e84500f1169b921164bbb8de 4e30014c062cec5059552c0dfc4a4656d7d59dd4 34a6c331e79081db4d133fc6692105552bbeb52b bc419f5949c49e1779da9ffecb201e42fc819b71 a38901aa8539613589598f59f567b9001199badd 8725e466ef2bcc5be69106c16dbf69b9ef989273

@emilazy emilazy deleted the push-quvxmoouztpo branch January 1, 2025 13:28
@Mic92 Mic92 mentioned this pull request Jan 27, 2025
14 tasks
@RaitoBezarius
Copy link
Member

Lix is in the process of exploring a fix: https://gerrit.lix.systems/c/lix/+/2570 thanks to Lily Foster.

@Ericson2314
Copy link
Member

I want to be careful about any "bug" framing of this. The desired semantics of store redirection is a non-trivial, and has never been properly designed. IMO the correct approach is to make it so the store cannot be accessed on the root file system at all, which is a https://en.wikipedia.org/wiki/Model_theory vantage point on the fact that the "real store dir" may be any number of places.

If we put in hacks so /nix/store/.... path literals "just work", I would like to then see a deprecation cycle.

Similarly, the fact that builtins.filterSource passes in paths prefixed with the path to the filtered located, and not paths within the filtered location, was always a bad design. If we hadn't done that, we would not be facing these issues.

@Ericson2314
Copy link
Member

Ericson2314 commented Feb 19, 2025

...nevermind! See the rather long comment I wrote up in NixOS/nix#12512 (comment). Tl;dr;

  1. @edolstra convinced me the thing I wanted to do right away is in fact unworkable breakage for impure eval and thus for Nixpkgs. OK then! Can't do that; won't do that.
  2. We can, and are, still making progress on the thing I want, but just in the pure eval case. There is no new union FS (including all the union FS semantics subtleties) in the case. And, in fact, the implementation is more robust than before.

lf- pushed a commit to lix-project/lix that referenced this pull request Feb 27, 2025
Under chroot or diverted store setups, the filtering logic of
`builtins.filterSource` and `builtins.path` (which shares the same filtering
logic as `filterSource`) would incorrectly pass physical paths to the
filter function instead of logical store paths.

This caused actual breakage in nixpkgs when the `lib.fileset` library was
introduced. Due to this unresolved bug in Nix, the library was forbidden
from use: <NixOS/nixpkgs#369694>.

To the best of our knowledge, this bug has existed since CppNix 2.3.

The existing tests were strengthened to cover these cases, but
additional testing may be required, particularly regarding symlink
handling.

References: NixOS/nix#12512 (CppNix fix to the
problem using "union" abstractions).

Co-authored-by: Raito Bezarius <[email protected]>
Co-authored-by: Alois Wohlschlager <[email protected]>
Co-authored-by: eldritch horrors <[email protected]>
Signed-off-by: Raito Bezarius <[email protected]>
Change-Id: Iaf6ca8c506eeca145393ce100c64db12178daa62
jopejoe1 added a commit to jopejoe1/nixpkgs that referenced this pull request May 13, 2025
The usage of lib.fileset was forbidden in nixpkgs due to a bug in nix.
For more information, see 8725e46 and the PR it was merged in NixOS#369694
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

6.topic: cuda Parallel computing platform and API 6.topic: haskell General-purpose, statically typed, purely functional programming language 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 6.topic: nodejs Node.js is a free, open-source, cross-platform JavaScript runtime environment 6.topic: python Python is a high-level, general-purpose programming language. 6.topic: testing Tooling for automated testing of packages and modules 8.has: documentation This PR adds or changes documentation 10.rebuild-darwin: 11-100 This PR causes between 11 and 100 packages to rebuild on Darwin. 10.rebuild-linux: 11-100 This PR causes between 11 and 100 packages to rebuild on Linux. 12.approvals: 2 This PR was reviewed and approved by two persons.

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

8 participants