
fetchNodeModules: init #128749

Closed
zhaofengli wants to merge 1 commit into NixOS:master from zhaofengli:fetchnodemodules

Conversation

@zhaofengli
Member

@zhaofengli zhaofengli commented Jun 30, 2021

Motivation for this change

This PR adds a new fetcher, fetchNodeModules, which can download deterministic dependency sets from the public NPM registry according to the NPM lock file (package-lock.json). This is useful for ensuring that Node.js packages are able to use dependencies at the exact versions specified by upstream, avoiding silent breakages caused by updates to the shared nodePackages package set which does not respect semantic versioning. The fetcher will also greatly simplify the work required to package Node.js projects with complex dependencies without sacrificing reproducibility.

Background

Currently, adding a package that makes use of the NPM ecosystem to Nixpkgs is done by adding all NPM dependencies it requires to one of the following places, both of which come with their own downsides:

Shared `nodePackages` package set

We keep a shared list of NPM packages in pkgs/development/node-packages/node-packages.json. When people add packages to this list, a script (generate.sh) is run which fetches the latest versions of all packages in the list from NPM, writing the URLs to retrieve the packages along with their hashes into node-packages.nix. This workflow has several major downsides:

  • Unrelated changes always get introduced. When the generator script is run, all packages in the shared list are updated at once, regardless of whether the updates are necessary.
  • Semantic versioning is violated. The generator always fetches the latest version of the packages (sans several manual exceptions), without taking compatibility into consideration.
  • Merge conflicts are a frequent occurrence. PRs proposing changes to the shared package set (e.g., iosevka: 5.0.2 -> 7.0.4, iosevka-bin: 5.0.5 -> 7.0.4  #126664, bgpalerter: init at 1.28.1 #115474) are often met with merge conflicts as the generator script often introduces large, unrelated changes to the shared node-packages.nix.
  • The package update process is slow. Fetching all of the shared packages is often a slow process for maintainers.

All of these inevitably lead to silent breakages of unrelated packages. An example of silent breakage can be found in a proposed addition of the BGPAlerter package (#115474). A major update to js-yaml (3.14.1 -> 4.0.0) was introduced in a PR that updates the rust-analyzer package. Unfortunately, this release is not backwards-compatible (indicated by the increase of the major version number), and it removes several deprecated functions that were available in earlier versions (e.g., safeLoad). For BGPAlerter, the configuration wizard (run when there is no configuration) works without issue, while actually starting the monitoring daemon fails because of the removed safeLoad function. In other words, such breakages can be invisible even to those directly testing the packages.

Private dependency set

Alternatively, maintainers can also opt to separate their packages from the global shared package list. These private sets work similarly to the global list, except at a smaller scale, with changes affecting only the package itself. A tool like node2nix or yarn2nix is used to convert the NPM/Yarn dependencies (with version constraints and, in most cases, lock files) into Nix expressions containing a list of URLs and hashes corresponding to each dependency. Dependencies are downloaded according to the version constraints specified in the package's manifest (package.json), and in most cases a lock file (yarn.lock, package-lock.json) is used which specifies the exact version of each required dependency. Because of that, breakages do not occur for packages using their private dependency sets.

Due to the interconnected nature of the NPM ecosystem, the generated Nix expressions can be extremely long, often with line counts in the thousands or even tens of thousands. As a result, reviewers can be reluctant to accept new packages with their private dependency sets. The shared package set remains the preferred way for smaller leaf packages, and thus has to suffer from the issues listed above.

A list of packages that have their own dependency sets can be found by searching for `offline_cache =` and `generated by node2nix` in Nixpkgs.

fetchNodeModules

fetchNodeModules is similar to the fetchCargoTarball fetcher for Rust packages in that it relies on the native language-specific package manager (npm) to complete the entire process of resolving and downloading dependencies, resulting in a bundle of frozen packages. A fixed-output derivation is used to allow npm network access and the results are compared against a predetermined hash in the Nix expression.
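
The fixed-output contract can be sketched outside Nix: produce a tree, hash it in a content-only way, and fail unless the hash matches a pinned value. A minimal shell illustration (the "pinned" hash is computed from a first build here rather than hard-coded; GNU tar and coreutils assumed):

```shell
# Sketch of the fixed-output check: hash a produced tree with
# normalized metadata, then verify a rebuild reproduces the hash.
set -eu
build() {
  rm -rf "$1"
  mkdir -p "$1/node_modules/left-pad"
  printf 'module.exports = s => s\n' > "$1/node_modules/left-pad/index.js"
}
hash_tree() {
  # sort entries and pin owner/mtime so the hash depends only on content
  tar --sort=name --owner=0 --group=0 --numeric-owner --mtime='@1' \
      -cf - -C "$1" . | sha256sum | cut -d' ' -f1
}
build out1; expected=$(hash_tree out1)   # stands in for the pinned sha256
build out2; actual=$(hash_tree out2)     # stands in for the fetcher's rebuild
echo "$expected" > pinned.sha256
echo "$actual"   > rebuilt.sha256
[ "$expected" = "$actual" ] && echo "fixed-output hash matches"
```

In the real fetcher, the build step is `npm ci` running with network access inside the derivation, and the comparison is done by Nix against the `sha256` given in the expression.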

The process is reproducible because, like the yarn2nix and node2nix approaches mentioned above, it relies on a lock file which specifies the exact version of each required dependency along with hashes of their contents. The difference is that the dependency list is not converted to Nix; only the standard tools for the language are used. Reproducibility is enforced both by npm (lock file) and by Nix (hash of the final output).

It is important to note that the original lock file and the translated Nix expressions from the aforementioned tools contain the same level of information. fetchNodeModules does not work with projects without a lock file.
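
A guard for the lock-file requirement could look like the following sketch (hypothetical helper, not the PR's actual code; the real enforcement comes from `npm ci` itself, which refuses to run without a lock file):

```shell
# Fail early when a source tree has no package-lock.json (sketch).
set -eu
rm -rf src && mkdir src
check_lockfile() {
  if [ -f "$1/package-lock.json" ]; then
    echo "lock file found"
  else
    echo "error: fetchNodeModules requires a package-lock.json" >&2
    return 1
  fi
}
check_lockfile src || echo "refused: no lock file"   # src is empty here
touch src/package-lock.json                           # now the check passes
check_lockfile src
```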

Example

A tarball of all runtime dependencies of BGPAlerter can be created with the expression below:

pkgs.nodePackages.fetchNodeModules {
  src = fetchFromGitHub {
    owner = "nttgin";
    repo = "BGPalerter";
    rev = "v1.28.1";
    sha256 = "sha256-Y0atkJfZxjUOGPQ3goXS/YD5SsX9ZjpbM0Nc5IuaFP4=";
  };

  makeTarball = true;
  sha256 = "sha256-850tsQSYWECziz5qMoTUKRpIF7wKMuRj1e54VeWEOV4=";
}
See also
Things done
  • Tested using sandboxing (nix.useSandbox on NixOS, or option sandbox in nix.conf on non-NixOS linux)
  • Built on platform(s)
    • NixOS
    • macOS
    • other Linux distributions
  • Tested via one or more NixOS test(s) if existing and applicable for the change (look inside nixos/tests)
  • Tested compilation of all pkgs that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review wip"
  • Tested execution of all binary files (usually in ./result/bin/)
  • 21.11 Release Notes (or backporting 21.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

@github-actions github-actions bot added 6.topic: fetch Fetchers (e.g. fetchgit, fetchsvn, ...) 8.has: documentation This PR adds or changes documentation labels Jun 30, 2021
@zhaofengli zhaofengli mentioned this pull request Jun 30, 2021
@zhaofengli
Member Author

zhaofengli commented Jun 30, 2021

Note that this is just a draft implementation. One thing that is puzzling me right now is that with the following expression, the output directory simply disappears after build:

pkgs.nodePackages.fetchNodeModules {
  src = pkgs.fetchFromGitHub {
    owner = "nttgin";
    repo = "BGPalerter";
    rev = "v1.28.1";
    sha256 = "sha256-Y0atkJfZxjUOGPQ3goXS/YD5SsX9ZjpbM0Nc5IuaFP4=";
  };

  makeTarball = false;
  exposeBin = true;
  sha256 = "sha256-T9W96csyA+/fwFs1eNccslx5qcbVTpAtqLM1Xd32tk4=";
}

I'd like to know whether this problem is reproducible for others, and what the solution is.

@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 0 This PR does not cause any packages to rebuild on Linux. labels Jun 30, 2021
@ajs124
Member

ajs124 commented Jun 30, 2021

What does this do that npmlock2nix (or one of the other 20 attempts to do npm things with nix) doesn't?

@Mic92
Member

Mic92 commented Jun 30, 2021

What does this do that npmlock2nix (or one of the other 20 attempts to do npm things with nix) doesn't?

I think npmlock2nix is doing import from derivation, which this tool does not.

@Mic92
Member

Mic92 commented Jun 30, 2021

fetchNodeModules output should be tested on different platforms to see whether it produces different results, and whether it is stable on the same platform.

Member Author

@zhaofengli zhaofengli Jun 30, 2021
It does with the normal npm i, which will automatically generate a package-lock.json from yarn.lock, but not with the deterministic npm ci we are using, which refuses to perform any change to the lock files.

@zhaofengli
Member Author

What does this do that npmlock2nix (or one of the other 20 attempts to do npm things with nix) doesn't?

I think npmlock2nix is doing import from derivation, which this tool does not.

Also, this tool doesn't attempt to parse the package.json / package-lock.json or otherwise Nixify the dependencies at all; it simply delegates the fetching to the actual npm, since npm ci provides the reproducibility we need. This is why I compared it to fetchCargoTarball (used internally by buildRustPackage), which does the same thing but with cargo.

Member

Could we move the files under lib, share or opt?

Member Author

Sure, it makes sense to follow the conventions, since we also have a bin (if exposeBin is set). Could you also test if you can reproduce the problem that I'm seeing with exposeBin?

Member

I don't really have the time right now to fiddle with this.

@zhaofengli
Member Author

Pushed a fix to change the directory structure ($out/lib/node_modules) as well as the timestamps. The hashes in the examples have been updated.

Member

@roberth roberth left a comment

Reproducibility is enforced both in npm (lock file) and Nix (hash of final output).

Enforcement in Nix happens, but in a very narrow way. If you forget to update the hash, you will be using stale dependencies, unknowingly, until "hopefully" someone else without your store or binary cache tries to build it, failing the fetcher and detecting the problem. Considering that these will end up on cache.nixos.org, these problems can linger for a long time.
It's also worth considering that these store paths will duplicate a lot of files that are shared store paths in case of other solutions. (hard link optimization only applies to a local nix store; nix-copy-closure and other forms of distribution such as docker images are still affected)

The hashes in the examples have been updated.

We don't have a method for finding and updating these when a logic change inevitably requires this again. This is especially true when other projects call this function.

This might be alleviated somewhat by including a hash of relevant inputs in the name, forcing a new store path when those inputs change. However, the selection of those inputs is a trade-off between usability and correctness. Too many and you'll burden the user with unnecessary hash updates. Too few and we'll have reproducibility problems. This is not a choice we should have to make.

Specifically with npm we don't have to make this trade-off because it provides us with perfectly usable lock files that we can parse with Nix to fetch dependencies in a reliably reproducible way.

@Mic92
Member

Mic92 commented Jul 1, 2021

Enforcement in Nix happens, but in a very narrow way. If you forget to update the hash, you will be using stale dependencies, unknowingly, until "hopefully" someone else tries to build it without your store or binary cache failing the fetcher, detecting the problem. Considering that these will end up on cache.nixos.org, these problems can linger for a long time.

This problem can be solved by adding the lock file to the dependencies as well, and then diffing the lock file from the fixed-output derivation against the lock file delivered in the source.
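
That check could be sketched as follows (toy files stand in for the real package-lock.json; an actual implementation would keep the lock file inside the fixed-output result):

```shell
# Diff the lock file shipped in the source against the one kept in
# the fetched output; any difference means the pinned hash is stale.
set -eu
printf '{ "lockfileVersion": 2 }\n' > source-package-lock.json
cp source-package-lock.json output-package-lock.json
if diff -u source-package-lock.json output-package-lock.json >/dev/null; then
  echo "lock files match; fixed-output hash is current"
else
  echo "error: stale fixed-output hash (lock files differ)" >&2
  exit 1
fi
```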

It's also worth considering that these store paths will duplicate a lot of files that are shared store paths in case of other solutions. (hard link optimization only applies to a local nix store; nix-copy-closure and other forms of distribution such as docker images are still affected)

Having thousands of node derivations also comes with a cost: evaluation memory and nixpkgs download/unpack size.

The hashes in the examples have been updated.

We don't have a method for finding and updating these when a change logic inevitably requires this again. This is especially true when other projects call this function.

This might be alleviated somewhat by including a hash of relevant inputs in the name, forcing a new store path when those inputs change. However, the selection of those inputs is a trade-off between usability and correctness. Too many and you'll burden the user with unnecessary hash updates. Too few and we'll have reproducibility problems. This is not a choice we should have to make.

Specifically with npm we don't have to make this trade-off because it provides us with perfectly usable lock files that we can parse with Nix to fetch dependencies in a reliably reproducible way.

Does not really scale in nixpkgs. Dependency trees are so large for the average nodejs package that the added file size is an not acceptable overhead to store all lock files in nixpkgs. Otherwise we would have switched tools like https://github.com/nix-community/yarn2nix or similar. We cannot use import from derivation because this would increase the evaluation time.

@Mic92
Member

Mic92 commented Jul 11, 2021

To add another data point: PeerTube's yarn.lock file is 8000 lines. This is a huge amount for a single package, even without the full dependency tree. I think this fetcher here is the only sane approach for nixpkgs.

@roberth roberth self-requested a review July 14, 2021 14:45
@roberth
Member

roberth commented Jul 14, 2021

I expect this fetcher to be implementable as a non-fixed output recursive derivation that

  1. takes a fetchurled lock file as an input
  2. fetches the modules using fetchurl during the build (during the derivation build in case of (general) recursive Nix or after the current derivation in case of ret-cont RFC40 recursive Nix)
  3. creates a link farm in $out

Recursive Nix is implemented but experimental, so while we could try it today, we can't use it by default in Nixpkgs. Eelco has raised a concern

Probably we should also disallow instantiating/building fixed-output derivations (specifically, those that access the network[...])

However, I believe the impurity can be addressed without disallowing it. If for whatever reason that doesn't work out, this idea seems to be compatible with RFC40 ret-cont recursive Nix; a more restrictive form of recursion. Perhaps even RFC92, although I really can't confirm that now.

With this solution on the horizon, I believe we can use this PR in the interim without creating a structural problem for Nix's perceived reliability.

@SuperSandro2000
Member

takes a fetchurled lock file as an input

That won't work for everything because upstream does not always provide an (up-to-date) lock file.

fetches the modules using fetchurl during the build

npm can take anything as a downloaded file, and we are already fetching random git repos in nodePackages right now.

@roberth
Member

roberth commented Jul 16, 2021

takes a fetchurled lock file as an input

That won't work for everything because upstream does not always provide an (up-to-date) lock file.

While true, I do not think it's relevant to my comment. It is a more general problem but the solution is easy enough: make a PR to update the lock file, and fetch from the PR. Preferably tag it in the fork as well to avoid loss on rebase, but that's yet another topic.

fetches the modules using fetchurl during the build

npm can take anything as a downloaded file, and we are already fetching random git repos in nodePackages right now.

I intended fetchurl as an example. Any derivation can be used, including those of fetchgit.

@Mic92
Member

Mic92 commented Jul 17, 2021

I expect this fetcher to be implementable as a non-fixed output recursive derivation that

1. takes a `fetchurl`ed lock file as an input

2. fetches the modules using `fetchurl` during the build

3. creates a link farm in `$out`

The algorithm you are describing is effectively import-from-derivation. It is not a good idea to implement it this way because this would force a nix eval process to download all node packages just to evaluate nixpkgs. This is slow and takes a lot of disk space and moves computational work from builder to evaluator (bad for how Nix CIs currently work).
Correct me if I am wrong.

@roberth
Member

roberth commented Jul 17, 2021

The algorithm you are describing is effectively import-from-derivation.

It is not. Evaluation for the (large number of) fetcher calls happens inside the sandbox thanks to recursive Nix. I've clarified the original comment.

It is not a good idea to implement it this way because this would force a nix eval process to download all node packages just to evaluate nixpkgs.

To the rest of Nixpkgs, a recursive Nix fetchNodeModules appears and performs like any other single derivation. Thanks to recursive Nix, the expensive parts are moved into the sandbox, where it can run later, in parallel and remotely.

This is slow and takes a lot of disk space and moves computational work from builder to evaluator (bad for how Nix CIs currently work).
Correct me if I am wrong.

Hercules CI has solved the user experience around import-from-derivation. It doesn't have such an evaluation bottleneck and it's clear when its evaluator is waiting for IFD. IFD is still sequential as of today, but that's not a problem in practice, unless you're building at the scale of Nixpkgs.

@Mic92
Member

Mic92 commented Jul 18, 2021

To the rest of Nixpkgs, a recursive Nix fetchNodeModules appears and performs like any other single derivation. Thanks to recursive Nix, the expensive parts are moved into the sandbox, where it can run later, in parallel and remotely.

Ok. But it is not clear when we will be able to use this feature. So far the Nix release just keeps being pushed forward year after year while more experimental features are added. To me it seems we could add it with a checksum now, and when the time comes we could switch over to recursive nix by dropping the need for a checksum.

Hercules CI has solved the user experience around import-from-derivation. It doesn't have such an evaluation bottleneck and it's clear when its evaluator is waiting for IFD. IFD is still sequential as of today, but that's not a problem in practice, unless you're building at the scale of Nixpkgs.

That's nice, however Hercules evaluation is still slower than both plain nix and hydra (which uses multiple threads).

@roberth
Member

roberth commented Jul 18, 2021

not clear when we will be able to use this feature.

and when time comes we could switch over to recursive nix by dropping the need for a checksum.

So we agree.

hercules evaluation is still slower than both plain nix and hydra

Thanks for letting me know. I'll start to optimize eval soon.

@andir
Member

andir commented Jul 28, 2021

What does this do that npmlock2nix (or one of the other 20 attempts to do npm things with nix) doesn't?

I think npmlock2nix is doing import from derivation, which this tool does not.

Nope, it doesn't. It works in hydras restricted eval mode. You can use it with IFD if you don't want to copy the lock file into the repo.

Given that this PR proposes yet another big FOD, I am really not in favor of this. We already had enough trouble in the Go and Rust world due to these issues.

I was originally planning to propose npmlock2nix for nixpkgs but I want it to be more mature first.

@Mic92
Member

Mic92 commented Jul 28, 2021

What does this do that npmlock2nix (or one of the other 20 attempts to do npm things with nix) doesn't?

I think npmlock2nix is doing import from derivation, which this tool does not.

Nope, it doesn't. It works in hydras restricted eval mode. You can use it with IFD if you don't want to copy the lock file into the repo.

Given that this PR proposes yet another big FOD, I am really not in favor of this. We already had enough trouble in the Go and Rust world due to these issues.

I was originally planning to propose npmlock2nix for nixpkgs but I want it to be more mature first.

npmlock2nix is also not a solution; just look at how large a package-lock.json is: https://github.com/andig/evcc/blob/master/package-lock.json. That might work for projects that have the package-lock.json in the repo, but certainly not for nixpkgs. It will also still increase evaluation time due to the large dependency graph it needs to process. I don't see any other scalable solution but these fixed-output derivations, and maybe recursive nix in the future.

Contributor

you forgot to inherit exposeBin, or am I missing something?

Contributor

@happysalada happysalada Sep 14, 2021

same with the npmFlags?

Contributor

Last comment on this; I don't know what the best practice is here. But rather than inheriting variables that are unrelated to mkDerivation (I had to search the codebase to see if those variables were used elsewhere), could it make sense to use those variables directly where they are required? Something like

  • remove the inherit runScripts
  • inside the build phase have something like
''
if ! ${runScripts}; then
  NPM_FLAGS+=" --ignore-scripts"
fi
''

Member Author

Yes, I should add it there for clarity, even though it's pulled in with the other arguments in the last line with //.

Contributor

What I know about bash is that -n checks if a variable is non-empty.
What is a little confusing to me is that production is either true or false, but it's never empty.
Perhaps this is a behavior of mkDerivation that converts variables it inherits to empty strings if they are false. It seems a little unintuitive to me. I just want to confirm that this is the case.
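
This is indeed how mkDerivation behaves: derivation attributes are exported as environment variables, with `true` coerced to the string "1" and `false` to the empty string, which is exactly what `-n` distinguishes. A shell-side sketch of what the builder sees:

```shell
# Nix exports `production = false;` as an empty env var and
# `production = true;` as "1"; `-n` (non-empty) then branches correctly.
set -u
production=""   # builder's view of `production = false;`
[ -n "$production" ] && echo "production mode" || echo "dev mode"
production="1"  # builder's view of `production = true;`
[ -n "$production" ] && echo "production mode" || echo "dev mode"
```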

Contributor

one thing that is curious to me is that cargo by default makes a tarball; there is no way to not make one. (ref)
I wonder what motivated that decision.

Checking the go code, it does not make a tarball at all. (ref)

I wonder

  • why go took the opposite approach from rust
  • whether it would be good to have both

I'm guessing the tarball can help if you need to remote-build?

@happysalada
Contributor

today the new nix version was released on nixpkgs-unstable

❯ nix --version
nix (Nix) 2.4pre20210908_3c56f62

just saying that the same problem of the disappearing derivation happens with that new version as well.

Contributor

perhaps the default should be false here?
My impression is that fetchNodeModules will fetch the node_modules. There is therefore a high chance that people will use those dependencies to build something, and building will require dev dependencies. I don't see a context in which it would make sense to not have the dev dependencies in node_modules (since you won't be able to build the related project without them). Actually, if you want to optimize the closure size, you might want to first have the dev dependencies, build the project, then replace the node_modules folder with only production dependencies.
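
The dev-then-prune flow described above can be sketched with stand-in directories (no real npm involved; `webpack` plays the role of a dev-only dependency and `npm prune --production` is mimicked by a plain removal):

```shell
# Install everything, "build", then drop dev-only dependencies to
# shrink the final closure.
set -eu
rm -rf node_modules dist
mkdir -p node_modules/lodash node_modules/webpack  # runtime dep + dev dep
mkdir -p dist
echo 'bundled' > dist/app.js                       # build step that needed webpack
rm -rf node_modules/webpack                        # stands in for `npm prune --production`
ls node_modules                                    # only runtime deps remain
```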

Member

Yes, the documentation should encourage packagers to run npm prune --production after building.

Contributor

ok, then it's a matter of changing what the production flag does. At the moment it only installs dependencies and not devDependencies.

Contributor

just wondering here what the use case for this would be.
Would we want to expose binaries from the node_modules of a project?
Were you thinking of webpack and those build executables, for example?
I'm wondering if it's not better to provide the node_modules folder as-is and let the consumers of that folder handle the binary paths?
(This is the approach that node2nix takes, if I remember correctly.) Basically, if your consuming derivation wants to use a single binary, you can address it by path; if you want to make all of them available, you can do export PATH=node_modules/.bin:$PATH

Member Author

(Sorry for the late response)

Yes, exactly. It's mostly there for convenience since some build scripts attempt to run those commands directly. I can do just fine without them, but we still need to resolve the other reproducibility issues.

@happysalada happysalada mentioned this pull request Sep 17, 2021
@happysalada
Contributor

In the interest of discussion I went ahead and did an implementation for yarn:
#138233
(handles both yarn.lock and package-lock.json)
I would be happy to get feedback on it.
My thinking is that if it takes another 3 months before we get npmlock2nix into nixpkgs, but we can get a fetcher in next week, then it's that much earlier that we can start making a transition for the JS ecosystem.

@roberth
Member

roberth commented Sep 17, 2021

Have we considered the old haskellPackages-like workflow?

  • contributors submit changes to the config file only
  • update automation runs on a branch node-updates (like haskell-updates)
  • hydra builds it and runs the relevant tests
  • merge weekly

This is the old workflow though. Nowadays contributors do generate the exprs, because it's fast, but haskellPackages has done without that for a long time.
For this to be feasible on node, contributors would have to run with a minimal config file which is sufficient for their package only, so they can regenerate the expressions and test in a reasonable amount of time.

It does avoid the technical issues.

@zhaofengli
Member Author

zhaofengli commented Sep 17, 2021

To move this forward, I have removed the exposeBin functionality and rebased the PR. Additionally, the following changes were made:

  • The value of production is embedded in the output path, to alleviate the problem caused by forgetting to change the hash in this simple case so that it's guaranteed to cause a visible failure.
  • package.json and package-lock.json are now injected into the output, so the consumer can ensure that the packages match the source it's building. As @roberth has pointed out, changes like this will silently break existing users, so we must ensure that the default behavior can satisfy most usecases and be expected to remain unchanged for the foreseeable future.

For the missing output problem, an issue should be created for Nix, once we figure out a minimal repro example.


I looked a bit more into the tarball reproducibility issue that @happysalada and @yu-re-ka mentioned, and it appears that the only differences come down to the permissions of the symlinks in the tar. For example:

│ │ -lrwxrwxrwx   0        0        0        0 1970-01-01 00:00:01.000000 node_modules/restify/node_modules/.bin/semver -> ../semver/bin/semver.js
│ │ +lrwxr-xr-x   0        0        0        0 1970-01-01 00:00:01.000000 node_modules/restify/node_modules/.bin/semver -> ../semver/bin/semver.js

(- is what I got, and + is @happysalada's tarball)

For reference, I'm using ZFS as my filesystem.

@happysalada
Copy link
Contributor

happysalada commented Sep 21, 2021

Thank you for your work on this!

I've tested again and here is the expression that works for me:

{ nodePackages, fetchFromGitHub }: 

nodePackages.fetchNodeModules {
  name = "bgp-node-modules";
  src = fetchFromGitHub {
    owner = "nttgin";
    repo = "BGPalerter";
    rev = "v1.28.1";
    sha256 = "sha256-Y0atkJfZxjUOGPQ3goXS/YD5SsX9ZjpbM0Nc5IuaFP4=";
  };

  makeTarball = false;
  sha256 = "sha256-/x8XwHxIEYS3IxEuk1jp1wYyJIV11DdSXUU+DCkrQ1g=";
}

(just to confirm whether we still have a different hash on different platforms)
(the closure size seems to be 30 MB, so not that much)

The good news is that the bug of the disappearing path after building is gone with this new update.

I think the main point is to figure out if the hash is still different on different platforms. If it is, what to do about it.

There are a couple of additional nits

  • using mktemp instead of /tmp
  • perhaps defaulting production to false? I was thinking that most people wanting to use node_modules are going to use them to build something, and so including dev modules seems like the right default to me.

@roberth I like the idea. People could generate their own node-packages to test when adding a package. When they are confident it works, they just commit the result of the expression, and then weekly the version update can be run. There might still be some hiccups, but it could be a short term solution. I think it would just be a matter of documenting the workflow properly.

@roberth
Member

roberth commented Sep 21, 2021

@happysalada would you like to lead such an effort? The haskellPackages team has shown that a few committed members can have a great impact.
EDIT: Effort spent on testing will remain useful even after migrating to a fetching solution and having a focused team will ease the migration and improve maintenance afterwards.

@happysalada
Contributor

@roberth thank you for your idea. I am a little at capacity right now, but I have added testing this approach on my task list for as soon as I get a bit more time.

@stale

stale bot commented Apr 16, 2022

I marked this as stale due to inactivity. → More info

@stale stale bot added the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Apr 16, 2022
@RaitoBezarius
Member

Still important to me; can I do something to help, @happysalada?

@stale stale bot removed the 2.status: stale https://github.com/NixOS/nixpkgs/blob/master/.github/STALE-BOT.md label Sep 1, 2022
@happysalada
Contributor

What is your level of interest in this?
It needs to be tested with several packages to see if the hash difference appears on different platforms for different packages. If it does, I don't think it's a problem; it's just a matter of making nix expressions that take into account the different hashes for different platforms. (I think the vscode package has similar things.)

If you are interested in doing 3 tests with js packages you are interested in, and reporting the results, that would be amazing.
After that, we can try to make a decision whether to merge or not. I'm personally in favour.

@zhaofengli
Member Author

I've been using this downstream, but we should push this forward. I'll rebase the PR in a bit, and add the bgpalerter package in as well so it can be more easily tested.

@RaitoBezarius
Member

What is your level of interest in this? It needs to be tested with several packages to see if the hash difference appears on different platforms for different packages. If it does, I don't think it's a problem; it's just a matter of making nix expressions that take into account the different hashes for different platforms. (I think the vscode package has similar things.)

If you are interested in doing 3 tests with js packages you are interested in, and reporting the results, that would be amazing. After that, we can try to make a decision whether to merge or not. I'm personally in favour.

Between medium and high, I am not sure. Though alas I don't have different platforms (only x86_64-linux, but maybe I can ask for an aarch64-linux community builder to test that platform), I can give it a try on some packages I know.

@winterqt winterqt mentioned this pull request Sep 3, 2022
@winterqt
Member

winterqt commented Sep 3, 2022

Hi all.

Over the past week or so, I've been working on a solution to this issue. (I actually didn't even know this PR was a thing before starting work on mine, until someone told me about it.)

It populates a cache directory to avoid the issue of different hashes occurring across platforms, and because of its superior reproducibility, is most likely the better solution to push forward.

There are more details in #189539 if you're interested in taking a look -- I welcome any and all comments/suggestions.

@zhaofengli
Member Author

I haven't tested #189539 yet, but I think the approach there is less prone to breakages. Let's close this and focus our efforts on that.

@zhaofengli zhaofengli closed this Sep 4, 2022
@zhaofengli zhaofengli deleted the fetchnodemodules branch September 4, 2022 02:54