Skip to content

writeClosure: add paths pre-instantiation filtering and sorting#300722

Open
ShamrockLee wants to merge 1 commit intoNixOS:masterfrom
ShamrockLee:write-closure-preprocess
Open

writeClosure: add paths pre-instantiation filtering and sorting#300722
ShamrockLee wants to merge 1 commit intoNixOS:masterfrom
ShamrockLee:write-closure-preprocess

Conversation

@ShamrockLee
Copy link
Contributor

@ShamrockLee ShamrockLee commented Apr 1, 2024

Description of changes

Pre-process paths before feeding to exportReferencesGraph.

  • Remove null.
    • Specifying a package as null is a common approach to "cancel" a dependency from the override interface. However, exportReferencesGraph doesn't handle null correctly (with or without __structuredAttrs = true), hence the removal.
  • Manually string-interpolate for sorting.
    • See below.
  • Sort alphabetically to reduce unnecessary rebuilds.
    • There were once discussion about sorting buildInputs and nativeBuildInputs automatically for similar purposes, but such proposal faced the challenge of a few packages' relying on the order of dependency package to determine the order of PATHs. The paths attribute of writeClosure doesn't have such issue, as the closure is, by definition, independent to the permutation of paths.

Regarding the concern about manual string interpolation, the paths will eventually go through string interpolation before being serialized into a store derivation (/nix/store/*.drv) and form $NIX_ATTRS_JSON_FILE and $NIX_ATTRS_SH_FILE. Manual string interpolation doesn't change any of the drvPath or outPath, nor does it introduce extra dependencies or affects realization.

Proof:

The following Nix expression evaluates to true.

let
  pkgs = import <nixpkgs> { };
  dump = with pkgs; paths:
    runCommand "dump-exportReferencesGraph-attrs" {
      __structuredAttrs = true;
      exportReferencesGraph.graph = paths;
    } ''
      mkdir -p "$out"
      cp "$NIX_ATTRS_JSON_FILE" "$NIX_ATTRS_SH_FILE" "$out"
    '';
in
(dump [ pkgs.hello pkgs.cowsay ]) == (dump [ "${pkgs.hello}" "${pkgs.cowsay}" ])

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.05 Release Notes (or backporting 23.05 and 23.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

Pre-process paths before feeding to `exportReferencesGraph`.
- Remove null.
- Manually string-interpolate for sorting.
- Sort alphabetically to reduce unnecessary rebuilds.
@ofborg ofborg bot added 11.by: package-maintainer This PR was created by a maintainer of all the package it changes. 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. labels Apr 1, 2024
@ShamrockLee
Copy link
Contributor Author

@ofborg build tests.trivial-builders.references
@ofborg build tests.trivial-builders.writeClosure-union
@ofborg build apptainer.tests.image-hello-cowsay

@SomeoneSerge SomeoneSerge self-requested a review April 1, 2024 21:32
@SomeoneSerge
Copy link
Contributor

Regarding the #178717 (comment) about manual string interpolation, the paths will eventually go through string interpolation before

I guess the question I rather wanted to ask is (I'm not particularly familiar with lazy languages, this is probably very basic): does sort limit parallelism in any way? The worst case I imagine is when builds are scheduled e.g. two at a time and have to complete before sort decides which elements to compare next

@ShamrockLee
Copy link
Contributor Author

ShamrockLee commented Apr 2, 2024

Regarding the #178717 (comment) about manual string interpolation, the paths will eventually go through string interpolation before

I guess the question I rather wanted to ask is (I'm not particularly familiar with lazy languages, this is probably very basic): does sort limit parallelism in any way? The worst case I imagine is when builds are scheduled e.g. two at a time and have to complete before sort decides which elements to compare next

AFAIK, evaluation is independent from realization (build) as long as "import from derivation" (IFD) is not used.

For example, if we would like to build apptainer.tests.image-hello-cowsay, Nix will first evaluate apptainer.tests.image-hello-cowsay and all its dependencies, including apptainer, qemu, and writeClosure [ hello cowsay] (which involves tde evaluation of hello and cowsay). Whenever the built-in function derivation is called (by stdenv.mkDerivation), a store derivation (/nix/store/*.drv) is instantaneously written to the Nix Store.

After all the evaluation is finished, Nix reads the store derivation of apptainer.tests.image-hello-cowsay, and resolve all its build-time closure, and starts to realize them.

The only read/write things evaluation can affect, when IFD is not involved, is

  1. The instantiation of store derivations.
  2. The built-in fetchers.
  3. The built-in file writers.

Those are unfortunately done sequentially, as Nix evaluation doesn't seem to support paralellism. That's probably why built-in fetchers are not used often in Nixpkgs.

@SomeoneSerge
Copy link
Contributor

SomeoneSerge commented Aug 2, 2024

AFAIK, evaluation is independent from realization (build) as long as "import from derivation" (IFD) is not used.

Indeed. I guess I was just blabbering something without giving it any actually thought.

Looking back, sort and filter do introduce an extra synchronization point when computing e.g. outPath, but this is mostly irrelevant.

However, I still don't like the idea of sorting by hashes (squinting an eye, this should be equivalent to a random shuffle, when perturbing any inputs). I'd rather sort by names first and then by hashes, just to have stable layouts.

@ShamrockLee
Copy link
Contributor Author

However, I still don't like the idea of sorting by hashes (squinting an eye, this should be equivalent to a random shuffle, when perturbing any inputs). I'd rather sort by names first and then by hashes, just to have stable layouts.

It turns out that the order of the paths taken by exportReferencesGraph doesn't affect the output content, as exportReferencesGraph sorts the printed paths by hashes. However, it could affect the traceback log in terms of evaluation errors.

If we want to change the output order, we'll need to add custom sorting to the jq command pipe.

@wegank wegank added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Jan 2, 2025
@wegank wegank added the 2.status: merge conflict This PR has merge conflicts with the target branch label Apr 2, 2025
@SomeoneSerge
Copy link
Contributor

It turns out that the order of the paths taken by exportReferencesGraph doesn't affect the output content, as exportReferencesGraph sorts the printed paths by hashes.

Nice. It would change the IA outPath of writeClosure [...], however.

I guess the question I rather wanted to ask is (I'm not particularly familiar with lazy languages, this is probably very basic): does sort limit parallelism in any way? The worst case I imagine is when builds are scheduled e.g. two at a time and have to complete before sort decides which elements to compare next

Hello myself from a year back, you were lazy and also wrong about most things. In normal Nix, this would sort by IA hashes. Even in CA Nix this would sort by the unresolved outPaths, which functionally behave same as IA hashes. Sorting does is a synchronization point, but it hardly matters because (1) Nixpkgs already big, (2) Nix already slow. To conclude, I'm now in favour of sorting and deduping.

Out of aesthetic considerations, I would have preferred ORDER BY lib.getName, outPath, so paths are first "grouped" by their highly-non-random names, and then ties are resolved by their highly-random hashes. That would be, of course, just a waste of computation

@nixpkgs-ci nixpkgs-ci bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Sep 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

2.status: merge conflict This PR has merge conflicts with the target branch 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. 11.by: package-maintainer This PR was created by a maintainer of all the package it changes.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants