
stdenv: PURL fetcher introduction & feature flag #454333

Open
h0nIg wants to merge 1 commit into NixOS:master from h0nIg:purl-featureflag

@h0nIg
Contributor

@h0nIg h0nIg commented Oct 21, 2025

#421125 was merged and reverted later, because of regressions.

the background is described here: #421125 (comment)

@wolfgangwalther outlined the conditions and would like to enhance CI - over time. This is a continuous approach, which is in line with packages which have been found to be defunct and which need a fix. There may be more packages which have problems and we would like to prevent further fallout by a feature flag (prevents accessing + inheritance of drv.src / drv.srcs).

packages list: #453322 (comment)
list of real broken packages: #453322 (comment)
broken packages fix (deferrable PR: #457769)

With the old PR, the broken and platform check fix from #453291, and the feature flag, we enable maintainers to gather experience with PURL and set appropriate information (e.g. the jq example, where fetchurl is used instead of fetchFromGitHub).

nix-repl> xx = (import /my/nixpkgs {config={derivationPURLInheritance = true;};})
nix-repl> xx.python3Packages.boto3.meta.identifiers
{
  cpeParts = { ... };
  possibleCPEs = [ ... ];
  purl = "pkg:github/boto/boto3@<version>";
  purlParts = { ... };
  purls = [ ... ];
  v1 = { ... };
}

nix-repl> xx = (import /my/nixpkgs {})
nix-repl> xx.python3Packages.boto3.meta.identifiers
{
  cpeParts = { ... };
  possibleCPEs = [ ... ];
  purlParts = { ... };
  purls = [ ... ];
  v1 = { ... };
}
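
For readers unfamiliar with the identifier in the first REPL output, here is a minimal Python sketch of how a purl string of that shape splits into parts, following the `pkg:type/namespace/name@version` layout from the purl spec. The names `PurlParts` and `split_purl` and the version `1.2.3` are made up for illustration. This is not the code from the PR, and it ignores qualifiers (`?...`), subpaths (`#...`) and percent-encoding.

```python
from typing import NamedTuple, Optional


class PurlParts(NamedTuple):
    type: str
    namespace: Optional[str]
    name: str
    version: Optional[str]


def split_purl(purl: str) -> PurlParts:
    """Split a simple purl (no qualifiers or subpath) into its parts."""
    scheme, sep, rest = purl.partition(":")
    if scheme != "pkg" or not sep:
        raise ValueError(f"not a purl: {purl!r}")
    # The version, if present, follows the last "@".
    path, at, version = rest.rpartition("@")
    if not at:
        path, version = rest, None
    segments = path.strip("/").split("/")
    if len(segments) < 2:
        raise ValueError(f"purl needs at least a type and a name: {purl!r}")
    # First segment is the type, last is the name, anything between is namespace.
    purl_type, *namespace, name = segments
    return PurlParts(purl_type, "/".join(namespace) or None, name, version)


# Hypothetical version number, only for illustration.
print(split_purl("pkg:github/boto/boto3@1.2.3"))
```

A real consumer would more likely use an existing purl library than hand-roll the parsing; the sketch only shows which pieces of information the single string carries.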

Things done

  • Built on platform:
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • Tested, as applicable:
  • Ran nixpkgs-review on this PR. See nixpkgs-review usage.
  • Tested basic functionality of all binary files, usually in ./result/bin/.
  • Nixpkgs Release Notes
    • Package update: when the change is major or breaking.
  • NixOS Release Notes
    • Module addition: when adding a new NixOS module.
    • Module update: when the change is significant.
  • Fits CONTRIBUTING.md, pkgs/README.md, maintainers/README.md and other READMEs.

Add a 👍 reaction to pull requests you find important.

@h0nIg h0nIg mentioned this pull request Oct 21, 2025
13 tasks
@nixpkgs-ci nixpkgs-ci bot added 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. 10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. 10.rebuild-darwin: 1 This PR causes 1 package to rebuild on Darwin. 8.has: changelog This PR adds or changes release notes 6.topic: ruby A dynamic, open source programming language with a focus on simplicity and productivity. 6.topic: fetch Fetchers (e.g. fetchgit, fetchsvn, ...) 6.topic: stdenv Standard environment 8.has: documentation This PR adds or changes documentation labels Oct 21, 2025
@h0nIg
Contributor Author

h0nIg commented Oct 21, 2025

we need to squash all commits later, just for transparency let's keep the commits separated in order to understand what has been changed.

@h0nIg h0nIg marked this pull request as ready for review October 21, 2025 21:04
@h0nIg h0nIg marked this pull request as draft October 21, 2025 21:04
@h0nIg h0nIg marked this pull request as ready for review October 21, 2025 21:14
@wolfgangwalther
Contributor

> Since the transition to Github Actions we no longer have a CI which performs those checks so this needs to be done manually.

You can see the performance report in the PR summary page: https://github.com/NixOS/nixpkgs/actions/runs/18756048808?pr=454333#summary-53508514968

@nixpkgs-ci nixpkgs-ci bot removed the 2.status: merge conflict This PR has merge conflicts with the target branch label Nov 1, 2025
@adisbladis
Member

> there was a lengthy discussion about the performance and improvements which have been implemented - in #421125

Nowhere in there is the fundamental design discussed.
This looks poorly designed to me: It's going to create a lot of garbage.

@h0nIg
Contributor Author

h0nIg commented Nov 1, 2025

> This looks poorly designed to me: It's going to create a lot of garbage.

@adisbladis thank you for your (constructive) feedback, IMHO it does not make sense to create noise for the other readers and duplicate / repeat.

I don't think i can remove your initial doubts here, maybe we can have a short chat in matrix instead and we align there? I ping'ed you already

@YorikSar
Contributor

YorikSar commented Nov 3, 2025

@h0nIg Could you please post here a summary of your discussion in Matrix? I am interested to see if there's some context I'm missing that surfaced there.

@adisbladis Are you against just any new features in meta? With such a stance we'll end up stagnating on some very important aspects of Nixpkgs development, like security in these cases (#439074 (comment) and this PR). As I mentioned in that PR, we did a round of optimisation of this performance-critical code that should allow us to land some features without slowing us down overall.

@nixpkgs-ci nixpkgs-ci bot added the 2.status: merge conflict This PR has merge conflicts with the target branch label Nov 5, 2025
@h0nIg
Contributor Author

h0nIg commented Nov 5, 2025

> @h0nIg Could you please post here a summary of your discussion in Matrix? I am interested to see if there's some context I'm missing that surfaced there.
>
> @adisbladis Are you against just any new features in meta? With such a stance we'll end up stagnating on some very important aspects of Nixpkgs development, like security in these cases (#439074 (comment) and this PR). As I mentioned in that PR, we did a round of optimisation of this performance-critical code that should allow us to land some features without slowing us down overall.

doesn't look like he is interested in a conversation

image

@jerith666 @jficz @bivsk @oneingan @stigtsp @CathalMullan @eclairevoyant @tomberek @pbsds @arianvp @06kellyjac @pombredanne @ConnorBaker @nikstur and others:

sorry, no PURL - can someone help please?

I'm tired and frustrated that we still have bogus discussions about 8MB additional memory out of 2600MB of evaluation memory. Even with a feature flag we continue to have this discussion - wtf.

https://github.com/NixOS/nixpkgs/actions/runs/18996522700?pr=454333
(4695549 + 191270 + 2088348 + 2150061) / 1024 / 1024 = 8.7 MB diff
(590957759 + 83464891 + 1266555136 + 800003130) / 1024 / 1024 = 2614 MB absolute
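
The two sums above can be re-checked with a few lines of Python. This is only a sketch of the arithmetic: the variable names are made up here, the eight byte counts are copied from the lines above, and the relative share is derived from those same figures rather than read from the CI report.

```python
# Byte counts copied from the two lines above.
diff_bytes = [4695549, 191270, 2088348, 2150061]
total_bytes = [590957759, 83464891, 1266555136, 800003130]

MIB = 1024 * 1024

diff_mib = sum(diff_bytes) / MIB
total_mib = sum(total_bytes) / MIB
# Share of the additional memory relative to the absolute figure.
share = sum(diff_bytes) / sum(total_bytes)

print(f"diff:  {diff_mib:.1f} MiB")   # 8.7
print(f"total: {total_mib:.0f} MiB")  # 2614
print(f"share: {share:.2%}")          # 0.33%
```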

People would like to address security issues, but they cannot because the tracking information is not available in an appropriate way. Without the CPE PR and this pURL PR, people need to continue reading various channels / mailing lists / ... to have CVEs on their radar, instead of falling back to automated processes.
Requesting a CPE is even a formal process; not every software vendor / developer would like to go this way. The universal pURL identifier has benefits which may solve this "formal gap" problem, but that is a different story, and the two standards can be combined to achieve maximum coverage.

I reverted the old PR to avoid further fallout, fixed the issue and added a feature flag for further surprises.
@mweinelt do you want to reduce the work for the security team, by enabling others to track vulnerabilities? CVE schema with PURL information has been released: https://github.com/CVEProject/cve-schema/releases/tag/v5.2.0

@nixpkgs-ci nixpkgs-ci bot removed the 2.status: merge conflict This PR has merge conflicts with the target branch label Nov 5, 2025
@nix-owners nix-owners bot requested a review from adisbladis November 5, 2025 17:15
@jficz
Contributor

jficz commented Nov 5, 2025

Unfortunate but understandable. Having meta hard dependent on src is imho a no-go, for various reasons described elsewhere. On the other hand, some kind of SBOM is needed if we want Nix(OS) to be taken seriously by certain businesses. For me PURL is just a way to achieve SBOM, not necessarily the only way.

While I support having PURL as (imho) the most reasonable choice, I think the real issue is having a SBOM in some form at least, and soon ™️

For starters, I'd be happy with just pname+version available, just to bootstrap the thing (and create, for example, an osquery table for nix packages). PURL can be the next step - as it will apparently require more testing and possibly workarounds around some nixpkgs hacks.

On the other hand, having hacks in nixpkgs is... unfortunate on its own, so perhaps fight on both fronts? Work towards PURL in some reasonable form and at the same time make package maintainers fix their hacks or make them PURL-compatible. This could be done in the long(ish) term, let's say warnings about hacks in 26.05 and removal of (still) affected packages in 26.11.

I'm pretty sure there's no way for this to land in 25.11 in any form.

@wolfgangwalther
Contributor

@h0nIg You should take a step back and stop trying to push everyone to work according to your timeline. You are being incredibly pushy. If you want others to give input to your stuff, answer their questions and be patient. This is open source, so major changes need time. I know this is sometimes frustrating, we have all been there - but it's reality.

@mweinelt
Member

mweinelt commented Nov 5, 2025

> @mweinelt do you want to reduce the work for the security team, by enabling others to track vulnerabilities? CVE schema with PURL information has been released: https://github.com/CVEProject/cve-schema/releases/tag/v5.2.0

And please don't try to make this about me. Yes, I want those things, but we need to deal with the concerns other committers have.

@leona-ya
Member

leona-ya commented Nov 8, 2025

This is way too complicated a PR to rush before branch-off. We (@jopejoe1 and I) won't accept this for 25.11; we don't want more breakage. Please wait for branch-off at least.

@leona-ya leona-ya added the 2.status: wait for branch‐off Waiting for the next Nixpkgs branch‐off label Nov 8, 2025

<!-- To avoid merge conflicts, consider adding your item at an arbitrary place in the list instead. -->

- Metadata identifier purl (Package URL, https://github.com/package-url/purl-spec) has been added for the fetchgit, fetchPypi and fetchFromGitHub fetchers, and mkDerivation has been adjusted to reuse this information. Package URLs enable reliable identification and location of software packages. Maintainers of derivations using the adopted fetchers should rely on the `drv.src.meta.identifiers.v1.purl` default identifier and can extend their `drv.meta.identifiers.v1.purls` list if they want additional identifiers. Maintainers using fetchurl for `drv.src` are urged to adapt their `drv.meta.identifiers.purlParts` for proper identification. Maintainers should check that their `drv.src` / `drv.srcs` either evaluate properly or throw an UnsupportedPlatform error instead of a missing attribute error. The inheritance of `drv.src(s).meta.identifiers.purl(s)` into `drv.meta.identifiers.purl(s)` can be enabled via `config.derivationPURLInheritance`.
Member


Suggested change (renames the option to `config.allowSrcEvalForDrvMeta`):
- Metadata identifier purl (Package URL, https://github.com/package-url/purl-spec) has been added for the fetchgit, fetchPypi and fetchFromGitHub fetchers, and mkDerivation has been adjusted to reuse this information. Package URLs enable reliable identification and location of software packages. Maintainers of derivations using the adopted fetchers should rely on the `drv.src.meta.identifiers.v1.purl` default identifier and can extend their `drv.meta.identifiers.v1.purls` list if they want additional identifiers. Maintainers using fetchurl for `drv.src` are urged to adapt their `drv.meta.identifiers.purlParts` for proper identification. Maintainers should check that their `drv.src` / `drv.srcs` either evaluate properly or throw an UnsupportedPlatform error instead of a missing attribute error. The inheritance of `drv.src(s).meta.identifiers.purl(s)` into `drv.meta.identifiers.purl(s)` can be enabled via `config.allowSrcEvalForDrvMeta`.


### Package URL {#sec-meta-identifiers-purl}

[Package-URL](https://github.com/package-url/purl-spec) (PURL) is a specification to reliably identify and locate software packages. Identifying software packages also enables secondary use cases, e.g. cross-verifying software licenses against third-party databases or initial vulnerability response management. Package URLs default to those of `mkDerivation`'s `src`, as the originally consumed software package is the single source of truth. This default inheritance must be enabled explicitly through the nixpkgs config parameter `derivationPURLInheritance`.
Member

@SuperSandro2000 SuperSandro2000 Dec 1, 2025


Suggested change (fixes the "paramter" typo and renames the option to `allowSrcEvalForDrvMeta`):
[Package-URL](https://github.com/package-url/purl-spec) (PURL) is a specification to reliably identify and locate software packages. Identifying software packages also enables secondary use cases, e.g. cross-verifying software licenses against third-party databases or initial vulnerability response management. Package URLs default to those of `mkDerivation`'s `src`, as the originally consumed software package is the single source of truth. This default inheritance must be enabled explicitly through the nixpkgs config parameter `allowSrcEvalForDrvMeta`.

@nixpkgs-ci nixpkgs-ci bot added the 2.status: merge conflict This PR has merge conflicts with the target branch label Dec 9, 2025
@SuperSandro2000
Member

The 25.11 release is out of the door now. How do we continue here?

@h0nIg
Contributor Author

h0nIg commented Dec 20, 2025

@SuperSandro2000 I don't know, European companies will have to stop using nixpkgs for their commercial products because of the Cyber Resilience Act. I tried to compile a problem statement here: #472828

@samueldr
Member

> European companies will have to stop using nixpkgs for their commercial products because of the Cyber Resilience Act

For the record, this is FUD and inaccurate.

@arianvp
Member

arianvp commented Dec 21, 2025

I have very little interest in collaborating and moving this PR forward if you keep behaving the way you are.

Emotional blackmail is not welcome, and you've been warned twice now about this.

@arianvp
Member

arianvp commented Dec 21, 2025

Just to let you know how this looks from the outside:

  1. SuperSandro clearly wants to move this forward, resolve a stalemate, and restart work on this.
  2. Your first reaction to this is again an emotional appeal that immediately digs your heels in and draws a red line.

Instead of doing that, it would be helpful to suggest collaborating with Sandro to get this into a mergeable state again. He clearly is interested in this landing and seems eager to help. So use that to your advantage.

@raboof
Member

raboof commented Dec 21, 2025

With tools like https://github.com/tiiuae/sbomnix, https://github.com/nikstur/bombon, https://github.com/tweag/genealogos and others, while we might be 'behind the curve' on SBOMs in some respects, blanket statements such as "European companies will have to stop using nixpkgs for their commercial products because of the Cyber Resilience Act" are false and unhelpful.

I don't think anyone disagrees that those existing tools have their limitations, and that it would be great to improve in that respect - I do agree there is a lot of potential here, and if we get this right we can leapfrog other systems and have something that's actually much better. What is less clear (as mentioned before) is whether this PR is what's missing to make meaningful improvements. Back then I was in favor of merging the PR, to allow downstream experimentation and learn what changes would be needed.

I'm not so sure anymore, perhaps it would be better to keep this on a branch until we have a clear motivating PoC SBOM tool that actually does produce better output with these changes?

@raboof
Member

raboof commented Dec 24, 2025

I noticed we've been doing a lot of talking 'in the abstract' and felt the need to summarize the topic and get some concrete examples of what things look like today. I wrote something up at https://arnout.engelen.eu/blog/nix-state-of-the-sbom/ . I tried to introduce the topic so it'd be helpful for someone new to the topic to get spun up on it, so it might be a bit verbose for y'all already participating in this thread.

Nonetheless the 6 example SBOMs might be helpful to pore over. The post is still rather draft-y, feedback welcome - I do plan to keep it updated as my understanding/opinions and the tools improve.

Based on that, I get the impression that we may not initially need the 'inheritance' part of this PR, which seems to be the controversial part: bombon can show the 'inferred' information just fine (though it currently puts it into externalReferences rather than the purl), and while sbomnix doesn't yet, AFAICT it seems like relatively low-hanging fruit there (famous last words...).

That said, having the fields to manually include PURL metadata in nixpkgs packages for cases where SBOM tools cannot accurately/completely infer it would still be very valuable. Perhaps it would make sense to extract that part of this PR into a separate one, which might be noncontroversial? We can keep the 'inheritance' aspect on a branch to experiment with without committing to it by actually merging it into nixpkgs already.

@fricklerhandwerk
Contributor

Mostly-automatic annotation of sources with appropriate pURLs is very desirable. The strongest argument is that otherwise we don't have that structured data to work with downstream. I agree with @raboof that a smaller scope of just enabling (and surely also checking) the annotation would already get us a step forward. Whoever really needs to read those annotations will then be able to do so.

What I don't fully understand is how meta depending on src.meta is a big problem, except that there are packages where it's just broken. Those can be fixed, no? And having more checks during CI is always good.

I understand the evaluation time concern, but in a sense that is a deployment issue. There's nothing in principle speaking against distributing Nixpkgs sources just for derivations, with all the metadata expressions stripped and all the attributes packed into one file, and shipping the metadata with a database such as nix-index.

@h0nIg
Contributor Author

h0nIg commented Jan 29, 2026

> Mostly-automatic annotation of sources with appropriate pURLs is very desirable. The strongest argument is that otherwise we don't have that structured data to work with downstream. I agree with @raboof that a smaller scope of just enabling (and surely also checking) the annotation would already get us a step forward. Whoever really needs to read those annotations will then be able to do so.
>
> What I don't fully understand is how meta depending on src.meta is a big problem, except that there are packages where it's just broken. Those can be fixed, no? And having more checks during CI is always good.
>
> I understand the evaluation time concern, but in a sense that is a deployment issue. There's nothing in principle speaking against distributing Nixpkgs sources just for derivations, with all the metadata expressions stripped and all the attributes packed into one file, and shipping the metadata with a database such as nix-index.

@pombredanne asked me to compile a demo; I demonstrated some time ago which data can be extracted with this patch: #421125 (comment)

https://github.com/sap-contributions/nixpkgs-purl-demo/

Out of 10238 Python packages (12485 first- and n-level derivations), 10574 are identifiable out of the box. Nearly 17% have a homepage different from the source location.

Focusing on the Python derivations only, you can achieve a 97% purl match rate out of the box (302 out of 9648).
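
The percentages can be re-derived from the counts quoted above with a short Python sketch. Two readings are assumptions on my side and not stated in the comment: that the 10574 identifiable entries are counted against the 12485 derivations (they exceed the 10238 packages), and that 302 is the number of Python derivations without a purl match. Variable names are made up for illustration.

```python
# Counts copied from the comment above.
derivations_total = 12485       # first- and n-level derivations
derivations_identified = 10574
python_derivations = 9648
python_without_purl = 302       # assumption: 302 is the number of misses

# Assumption: identifiable entries are measured against all derivations.
identified_share = derivations_identified / derivations_total
python_match_rate = 1 - python_without_purl / python_derivations

print(f"identifiable:      {identified_share:.1%}")   # 84.7%
print(f"python match rate: {python_match_rate:.1%}")  # 96.9%
```

Under these assumptions the second figure rounds to the 97% quoted above.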

a rough list of packages and their purl: https://github.com/sap-contributions/nixpkgs-purl-demo/blob/main/data-name.txt


Labels

2.status: merge conflict This PR has merge conflicts with the target branch 2.status: wait for branch‐off Waiting for the next Nixpkgs branch‐off 6.topic: fetch Fetchers (e.g. fetchgit, fetchsvn, ...) 6.topic: ruby A dynamic, open source programming language with a focus on simplicity and productivity. 6.topic: stdenv Standard environment 8.has: changelog This PR adds or changes release notes 8.has: documentation This PR adds or changes documentation 10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. 10.rebuild-darwin: 1 This PR causes 1 package to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. 12.approvals: 3+ This PR was reviewed and approved by three or more persons.

Projects

Status: In Progress
