
Reapply "stdenv: Add CPE fields to meta" #439074

Merged
ConnorBaker merged 3 commits into NixOS:master from tweag:cpe on Sep 15, 2025

Conversation

@YorikSar (Contributor) commented Sep 1, 2025

This reverts commit de74f9c from #438527 that reverted #409797.
Added a change that removes all null values from meta.identifiers to satisfy nix-env's XML and JSON outputs.
Unfortunately, this means that our API will have "holes" for fields with unknown values.
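The null-stripping described here can be sketched with `lib.filterAttrs` (a minimal illustration; the attribute values are made up and the actual helper in the PR may differ):

```nix
# Illustrative sketch: drop every attribute whose value is null, so that
# nix-env's XML/JSON serializers never encounter a null field.
let
  lib = (import <nixpkgs> { }).lib;
  identifiers = {
    cpe = "cpe:2.3:a:gnu:hello:2.12.1:*:*:*:*:*:*:*";
    purl = null; # unknown value: becomes a "hole" in the output
  };
in
lib.filterAttrs (_name: value: value != null) identifiers
# evaluates to { cpe = "cpe:2.3:a:gnu:hello:2.12.1:*:*:*:*:*:*:*"; }
```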



@nixpkgs-ci bot added labels 10.rebuild-linux: 1-10, 10.rebuild-darwin: 1-10, 10.rebuild-darwin: 1, 6.topic: kernel, 6.topic: stdenv, 6.topic: lib, 6.topic: llvm/clang, and 8.has: documentation on Sep 1, 2025
@YorikSar (Contributor, Author) commented Sep 1, 2025

Here's the performance comparison between the old approach and the new one:

| metric | mean_before | mean_after | mean_diff | mean_%_change | p_value | t_stat |
|---|---:|---:|---:|---:|---:|---:|
| envs.bytes | 773000832.7273 | 773124234.9091 | 123402.1818 | 0.0182 | 0.0000 | 7.6144 |
| envs.elements | 57501262.8182 | 57508976.0000 | 7713.1818 | 0.0152 | 0.0000 | 7.6133 |
| envs.number | 39123841.2727 | 39131553.3636 | 7712.0909 | 0.0229 | 0.0000 | 7.6154 |
| gc.heapSize | 2898215656.7273 | 2890590021.8182 | -7625634.9091 | -0.2375 | 0.1762 | -1.4555 |
| gc.totalBytes | 4992288488.7273 | 4991888356.3636 | -400132.3636 | -0.0095 | 0.0001 | -6.1880 |
| list.bytes | 99379604.3636 | 99255051.6364 | -124552.7273 | -0.1640 | 0.0001 | -6.3162 |
| list.elements | 12422450.5455 | 12406881.4545 | -15569.0909 | -0.1640 | 0.0001 | -6.3162 |
| nrAvoided | 47293532.6364 | 47271846.9091 | -21685.7273 | -0.0547 | 0.0001 | -6.6385 |
| nrExprs | 1806066.0909 | 1806082.0909 | 16.0000 | 0.0010 | - | inf |
| nrFunctionCalls | 35081013.2727 | 35081566.4545 | 553.1818 | 0.0016 | 0.0003 | 5.4779 |
| nrLookups | 18419504.6364 | 18420242.1818 | 737.5455 | 0.0043 | 0.0067 | 3.4076 |
| nrOpUpdateValuesCopied | 98746514.4545 | 98739704.0909 | -6810.3636 | -0.0078 | 0.0000 | -7.1531 |
| nrOpUpdates | 3628500.0000 | 3628996.6364 | 496.6364 | 0.0132 | 0.0003 | 5.3353 |
| nrPrimOpCalls | 20697403.9091 | 20641017.0000 | -56386.9091 | -0.3190 | 0.0000 | -7.1727 |
| nrThunks | 49468843.7273 | 49469466.3636 | 622.6364 | 0.0013 | 0.0118 | 3.0716 |
| sets.bytes | 2341588157.0909 | 2341232168.7273 | -355988.3636 | -0.0183 | 0.0001 | -6.5781 |
| sets.elements | 137456535.4545 | 137433681.0000 | -22854.4545 | -0.0199 | 0.0001 | -6.7106 |
| sets.number | 8892724.3636 | 8893329.5455 | 605.1818 | 0.0070 | 0.0003 | 5.4113 |
| values.bytes | 1022322416.0000 | 1022332378.1818 | 9962.1818 | 0.0010 | 0.0118 | 3.0716 |
| values.number | 63895151.0000 | 63895773.6364 | 622.6364 | 0.0010 | 0.0118 | 3.0716 |

(x86_64-linux only; I've removed rows with no diff and the time-related rows.)

@vcunat (Member) commented Sep 1, 2025

A random quick thought: I suppose you did consider using e.g. the empty string instead of null, if placeholders are somehow preferable?

@YorikSar (Contributor, Author) commented Sep 1, 2025

@vcunat These parts can be empty, so an empty string would not work as a placeholder. Anything other than null would be either much slower or less intuitive than simply omitting them, I think.

@vcunat (Member) commented Sep 1, 2025

Ah, I didn't know they can be empty. Then it's clear, I guess.

@h0nIg (Contributor) commented Sep 5, 2025

@YorikSar do you want @infinisil to approve this PR, or what would be the next step?

@ConnorBaker (Contributor) left a comment

A few comments; otherwise looks good!

@ConnorBaker (Contributor) commented on lines 585 to 587:

Suggested change

- makeCPE =
-   cpeParts:
-   "cpe:2.3:${cpeParts.part}:${cpeParts.vendor}:${cpeParts.product}:${cpeParts.version}:${cpeParts.update}:${cpeParts.edition}:${cpeParts.sw_edition}:${cpeParts.target_sw}:${cpeParts.target_hw}:${cpeParts.language}:${cpeParts.other}";
+ makeCPE = flip pipe [
+   (attrVals [
+     "part"
+     "vendor"
+     "product"
+     "version"
+     "update"
+     "edition"
+     "sw_edition"
+     "target_sw"
+     "target_hw"
+     "language"
+     "other"
+   ])
+   (concatStringsSep ":")
+ ];
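As a side note, the two formulations as quoted aren't quite equivalent: the piped variant drops the literal `cpe:2.3:` prefix that the interpolated string carries. A self-contained sketch of a piped version that keeps the prefix (illustrative only, not the merged code):

```nix
let
  lib = (import <nixpkgs> { }).lib;
  inherit (lib) attrVals concatStringsSep flip pipe;
  # Extract the eleven CPE components in order, then join with ":",
  # prepending the fixed "cpe" and "2.3" fields of a CPE 2.3 name.
  makeCPE = flip pipe [
    (attrVals [
      "part" "vendor" "product" "version" "update" "edition"
      "sw_edition" "target_sw" "target_hw" "language" "other"
    ])
    (parts: concatStringsSep ":" ([ "cpe" "2.3" ] ++ parts))
  ];
in
makeCPE {
  part = "a"; vendor = "gnu"; product = "hello"; version = "2.12.1";
  update = "*"; edition = "*"; sw_edition = "*"; target_sw = "*";
  target_hw = "*"; language = "*"; other = "*";
}
# → "cpe:2.3:a:gnu:hello:2.12.1:*:*:*:*:*:*:*"
```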

@ConnorBaker (Contributor):

Instead of doing the length check above, have you considered having makeCPE unpack the function arguments? That would ensure that both the length is correct and that the names are correct.
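For concreteness, "unpacking the function arguments" could look like the following sketch (hypothetical, not necessarily the merged code): a set pattern makes the evaluator itself reject missing or misspelled field names, with no explicit length check:

```nix
# Hypothetical sketch: destructuring enforces that exactly these eleven
# fields are supplied; a missing or extra attribute is an evaluation error.
let
  makeCPE =
    {
      part, vendor, product, version, update, edition,
      sw_edition, target_sw, target_hw, language, other,
    }:
    "cpe:2.3:${part}:${vendor}:${product}:${version}:${update}:${edition}:${sw_edition}:${target_sw}:${target_hw}:${language}:${other}";
in
makeCPE {
  part = "a"; vendor = "gnu"; product = "hello"; version = "2.12.1";
  update = "*"; edition = "*"; sw_edition = "*"; target_sw = "*";
  target_hw = "*"; language = "*"; other = "*";
}
```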

@YorikSar (Contributor, Author) replied:


That's just too many allocations and operations to add to each package evaluation.

> having makeCPE unpack the function arguments

That's a nice idea. Will do. Although I would still check the length to avoid using exceptions as flow control.

@nixpkgs-ci bot added labels 12.approvals: 1 and 2.status: merge conflict on Sep 8, 2025
@nixpkgs-ci bot removed label 2.status: merge conflict on Sep 12, 2025
@nikstur (Contributor) left a comment

As far as I can tell, the original issue that led to the revert has been fixed. Good to go from my end.

Thank you for the work @YorikSar! I think we will have to build some more experience to get the versions right, but this can only be done iteratively.

@nixpkgs-ci bot added label 12.approvals: 2 and removed label 12.approvals: 1 on Sep 14, 2025
@ConnorBaker (Contributor) commented

Any blockers preventing merging this?

@h0nIg (Contributor) commented Sep 15, 2025

IMHO no

@github-project-automation bot moved this to Done in Stdenv on Sep 15, 2025
@nixpkgs-ci bot commented Sep 15, 2025

Backport failed for release-25.05, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin release-25.05
git worktree add -d .worktree/backport-439074-to-release-25.05 origin/release-25.05
cd .worktree/backport-439074-to-release-25.05
git switch --create backport-439074-to-release-25.05
git cherry-pick -x dd12290517ff6e57ab2ca384b101e0cdc617570c a178fd8c436915d233a1de37a2b211489aad5340 5e1eee582748866a51ca8f780829c3b2cf523fc3

@ConnorBaker (Contributor) commented

Any chance you’d be able to do the backport? If not I’ll try to get around to it this weekend.

@vcunat (Member) commented Sep 15, 2025

Isn't this too intrusive for backporting?

@h0nIg (Contributor) commented Sep 15, 2025

> Isn't this too intrusive for backporting?

The backport can be put on hold if you want to gather experience first. The benefit of CPEs and pURLs outweighs the low-to-medium risk.

The security tracker (https://discourse.nixos.org/t/nixpkgs-supply-chain-security-project/34345/30) can benefit as well: when software on master gets a routine bump without a CVE in mind, nobody determines that "stable needs a fix as well", so the fix for nixpkgs stable is not triggered automatically.

Example: in #409300, libarchive was bumped without any CVE reference (the PR description was only amended after a separate issue had been created); it was backported purely by coincidence, and would otherwise have remained unpatched for some time.

In addition, CVEs like #411881 were not detected properly, which is a further reason backporting would help today.

@YorikSar (Contributor, Author) commented

I've updated the previous backport in #438385 with new commits. Please take a look.
@vcunat This change only adds a field to meta, it doesn't change anything else about how packages are being handled, so it is safe to backport it to 25.05. The idea is to start using it with current release and gather as much feedback as possible.

@philiptaron (Contributor) commented Sep 19, 2025

> Isn't this too intrusive for backporting?

I agree that we should not backport this. Since the 25.11 release cycle is starting, let's gather experience with this over the coming months and have 25.11 be the release where this becomes a Nixpkgs staple.

@mdaniels5757 added the 8.has: port to stable label on Oct 4, 2025
@adisbladis (Member) commented

This feature seems very half baked:

  • It has no internal usage & is pure overhead at the moment
  • The design looks overly memory intensive (way too many attrsets all over the place)

I'm not opposed to the feature as such, but it needs to go back to the drawing board.
I want to revert this before it goes into 25.11.

@K900 (Contributor) commented Nov 1, 2025

I would also very much prefer to see this land after 25.11, given how the previous attempts have gone.

@YorikSar (Contributor, Author) commented Nov 3, 2025

@adisbladis Before reverting what was brewing and collecting feedback for a long time, could you please provide more specific suggestions here?

> It has no internal usage & is pure overhead at the moment

It has external uses and was really expected even in 25.05. See #354012 for more context (also cc @nikstur here).

> The design looks overly memory intensive (way too many attrsets all over the place)

Could you please be more specific here? While the final implementation of this PR increases the total amount of attrsets needed for evaluation, several spinoff PRs reduced all these numbers significantly, which more than pays for this. Of course, it would be nice to keep optimising our eval times, but I think we should do this gradually, without reverting any new feature that needs more resources. You can't expect every PR to be perfectly optimised and not require any additional resources before it lands. This is just not how the world works.

@K900 I'm not sure I follow your reasoning either.

> given how the previous attempts have gone

Do you mean that because one (and only one) previous attempt at landing this change broke evaluation, which was then fixed in this PR, the feature is doomed to stay in revert-reapply purgatory?

@adisbladis @K900 We need this feature to land in the release to start collecting data for vulnerability tracking initiatives. I find it very frustrating that you've decided that this needs to be reverted 2 months after it was merged with no proper review (apart from how "it looks") and with no time to address any feedback you might provide before the release.

@YorikSar removed the 8.has: port to stable and backport release-25.05 labels on Nov 5, 2025
@SuperSandro2000 (Member) commented

I think it is very important to land this extra meta information: it makes the security process easier and more automated, which hopefully reduces churn for the security team and makes the OS more secure for everyone.

@RossComputerGuy (Member) commented

bash fails to eval its CPE. Fixed in #459616.

@YorikSar (Contributor, Author) commented Nov 8, 2025

@RossComputerGuy it looks like #435033 by @infinisil broke Bash's CPE eval; it was fine after this PR.
I don't quite understand why CI didn't catch that before the merge.

@ConnorBaker (Contributor) commented

That is odd…

It’s off by default:

checkMeta = mkOption {
  type = types.bool;
  default = false;
  description = ''
    Whether to check that the `meta` attribute of derivations are correct during evaluation time.
  '';
};

but it looks like it is enabled for the CI eval:

checkMeta = true;
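For anyone reproducing this locally: meta checking is a standard Nixpkgs config flag, so an eval closer to what CI does can be obtained with the snippet below (the `bash` attribute is just an example; any package would do):

```nix
# Evaluate a package's meta with checks turned on, mirroring the CI eval.
let
  pkgs = import <nixpkgs> { config.checkMeta = true; };
in
pkgs.bash.meta
```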

@adisbladis (Member) commented

> @adisbladis Before reverting what was brewing and collecting feedback for a long time, could you please provide more specific suggestions here?
>
> > It has no internal usage & is pure overhead at the moment
>
> It has external uses and was really expected even in 25.05. See #354012 for more context (also cc @nikstur here).
>
> > The design looks overly memory intensive (way too many attrsets all over the place)
>
> Could you please be more specific here? While the final implementation of this PR increases the total amount of attrsets needed for evaluation, several spinoff PRs reduced all these numbers significantly, which more than pays for this. Of course, it would be nice to keep optimising our eval times, but I think we should do this gradually, without reverting any new feature that needs more resources. You can't expect every PR to be perfectly optimised and not require any additional resources before it lands. This is just not how the world works.

You can't just say that something unrelated makes up for adding evaluation overhead. That's not how this works at all. Using that logic I could go and introduce all sorts of performance regressions all over the place. Sounds great, eh?

> @adisbladis @K900 We need this feature to land in the release to start collecting data for vulnerability tracking initiatives. I find it very frustrating that you've decided that this needs to be reverted 2 months after it was merged with no proper review (apart from how "it looks") and with no time to address any feedback you might provide before the release.

I find your sense of urgency disturbing. This is at the very core of nixpkgs and will seep into the evaluation of every single derivation created. Additionally, this touches on API stability: the confidence that meta fields are correct needs to be very high, because we can't just go and break the API after the fact.

@adisbladis (Member) commented

I think the entire design looks problematic. But I'll mention a few specific instances of why this should be reverted.

tryCPEPatchVersionInUpdateWithVendor =
  vendor: version:
  let
    regex = "([0-9]+\\.[0-9]+)\\.([0-9]+)";
    # we have to call toString here in case version is an attrset with __toString attribute
    versionMatch = builtins.match regex (toString version);
  in
  if versionMatch == null then
    {
      success = false;
      error = "version ${version} doesn't match regex `${regex}`";
    }
  else
    {
      success = true;
      value = {
        inherit vendor;
        version = elemAt versionMatch 0;
        update = elemAt versionMatch 1;
      };
    };

This is going to cause a lot of intermediate attribute sets to be allocated and is going to be another instance of #269637.
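For reference, the helper's two outcomes look like this (definition copied from the quote above, shortened to `tryCPE` for illustration; the version strings are made up):

```nix
# Illustrative check of both branches: a "major.minor.patch" version is
# split so that patch lands in the CPE `update` slot; anything else fails.
let
  inherit (builtins) elemAt match;
  tryCPE =
    vendor: version:
    let
      regex = "([0-9]+\\.[0-9]+)\\.([0-9]+)";
      versionMatch = match regex (toString version);
    in
    if versionMatch == null then
      { success = false; error = "version ${version} doesn't match regex `${regex}`"; }
    else
      {
        success = true;
        value = {
          inherit vendor;
          version = elemAt versionMatch 0;
          update = elemAt versionMatch 1;
        };
      };
in
{
  ok = tryCPE "gnu" "5.2.37";
  # → { success = true; value = { vendor = "gnu"; version = "5.2"; update = "37"; }; }
  bad = tryCPE "gnu" "5.2";
  # → { success = false; error = "version 5.2 doesn't match regex `...`"; }
}
```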

defaultCPEParts = {
  part = "a";
  #vendor = null;
  ${if attrs ? pname then "product" else null} = attrs.pname;
  #version = null;
  #update = null;
  edition = "*";
  sw_edition = "*";
  target_sw = "*";
  target_hw = "*";
  language = "*";
  other = "*";
};

needs to be moved up in scope to not re-allocate for the same attrset on every invocation.

These design issues don't surface in benchmarks yet because this feature is barely used.

@YorikSar (Contributor, Author) commented

@adisbladis Sorry for the late reply.

> > @adisbladis Before reverting what was brewing and collecting feedback for a long time, could you please provide more specific suggestions here?
> >
> > > It has no internal usage & is pure overhead at the moment
> >
> > It has external uses and was really expected even in 25.05. See #354012 for more context (also cc @nikstur here).
> >
> > > The design looks overly memory intensive (way too many attrsets all over the place)
> >
> > Could you please be more specific here? While the final implementation of this PR increases the total amount of attrsets needed for evaluation, several spinoff PRs reduced all these numbers significantly, which more than pays for this. Of course, it would be nice to keep optimising our eval times, but I think we should do this gradually, without reverting any new feature that needs more resources. You can't expect every PR to be perfectly optimised and not require any additional resources before it lands. This is just not how the world works.
>
> You can't just say that something unrelated makes up for adding evaluation overhead. That's not how this works at all. Using that logic I could go and introduce all sorts of performance regressions all over the place. Sounds great, eh?

I'm sorry, I didn't convey the meaning correctly. We had that unoptimised code for quite some time, and nobody seemed to complain about it. While working on this PR, I found those optimisation opportunities and fixed them; I only landed them separately to get a clear picture of the performance impact of all the CPE-related code. Overall, the work on CPEs made Nixpkgs evaluation faster. That doesn't mean we can ignore potential issues, but it does mean we are within the bounds of the performance that was, and still is, tolerated.

In general, I find that trying to make things perfect before merging can only lead to PRs being stuck in review purgatory and features being abandoned. I'm not saying we should land completely unoptimised and harmful code, especially on critical evaluation paths; I'm saying we should land "good enough" code and optimise it afterwards, while letting users start using new features and making Nixpkgs better for everybody.

> > @adisbladis @K900 We need this feature to land in the release to start collecting data for vulnerability tracking initiatives. I find it very frustrating that you've decided that this needs to be reverted 2 months after it was merged with no proper review (apart from how "it looks") and with no time to address any feedback you might provide before the release.
>
> I find your sense of urgency disturbing. This is at the very core of nixpkgs and will seep into the evaluation of every single derivation created. Additionally, this touches on API stability: the confidence that meta fields are correct needs to be very high, because we can't just go and break the API after the fact.

The design of this feature includes versioning, which allows us to break the API safely in the future; if we find the current API too heavy, we can address that later.
As for urgency: this PR (well, the original one) was under review for about 3 months, following 6 months of discussion in the original issue. I don't believe that shows careless urgency; the feature is ready to be used, and the design was mostly settled. I tried to implement it to the best of my ability, and several people reviewed it.
What makes you not so confident in this process?


> I think the entire design looks problematic. But I'll mention a few specific instances of why this should be reverted.

tryCPEPatchVersionInUpdateWithVendor =
  vendor: version:
  let
    regex = "([0-9]+\\.[0-9]+)\\.([0-9]+)";
    # we have to call toString here in case version is an attrset with __toString attribute
    versionMatch = builtins.match regex (toString version);
  in
  if versionMatch == null then
    {
      success = false;
      error = "version ${version} doesn't match regex `${regex}`";
    }
  else
    {
      success = true;
      value = {
        inherit vendor;
        version = elemAt versionMatch 0;
        update = elemAt versionMatch 1;
      };
    };

> This is going to cause a lot of intermediate attribute sets to be allocated and is going to be another instance of #269637.

This code is only used when the product field is set but others aren't, which I expect to be a rare occurrence. It also doesn't affect the meta API, so we can change it to a less abstract version later if needed.

defaultCPEParts = {
  part = "a";
  #vendor = null;
  ${if attrs ? pname then "product" else null} = attrs.pname;
  #version = null;
  #update = null;
  edition = "*";
  sw_edition = "*";
  target_sw = "*";
  target_hw = "*";
  language = "*";
  other = "*";
};

> needs to be moved up in scope to not re-allocate for the same attrset on every invocation.

Note that this attrset depends on attrs, so it cannot simply be moved up. We could hoist the part that doesn't depend on attrs and only create a new attrset when pname is set (we'd ultimately have to create two attrsets to do so). I think this would produce more attrsets overall, because pname is set in the majority of cases.
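A sketch of the trade-off being weighed here (illustrative names, not the merged code): hoisting the attrs-independent part gives one shared allocation, at the cost of one extra `//` merge per call whenever pname is set:

```nix
# Hypothetical restructuring: `staticCPEParts` is allocated once at outer
# scope; each call then adds at most one extra attrset via `//`.
let
  staticCPEParts = {
    part = "a";
    edition = "*"; sw_edition = "*"; target_sw = "*";
    target_hw = "*"; language = "*"; other = "*";
  };
  defaultCPEPartsFor =
    attrs:
    if attrs ? pname
    then staticCPEParts // { product = attrs.pname; }
    else staticCPEParts;
in
defaultCPEPartsFor { pname = "hello"; }
```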

> These design issues don't surface in benchmarks yet because this feature is barely used.

Please note that these are not design issues but potential performance pitfalls. We can address them if they actually become problematic, once we can measure real usage and impact; otherwise any of these optimisations could turn out badly for the common use case in the long run. Premature, unmeasured optimisation can easily become its own enemy.


I'd like to reiterate that when the performance of some piece of code is bad, I find it better to fix the performance issues than to revert features. Please let me know if you find any other issues with this code; if there is a way to optimise them, we can discuss and implement it.
