Skip to content

lib/attrsets: add mapAttrsToListRecursive(Cond) function#395160

Merged
wolfgangwalther merged 1 commit intoNixOS:masterfrom
illdefined:mapattrstolistrecursive
Sep 8, 2025
Merged

lib/attrsets: add mapAttrsToListRecursive(Cond) function#395160
wolfgangwalther merged 1 commit intoNixOS:masterfrom
illdefined:mapattrstolistrecursive

Conversation

@illdefined
Copy link
Contributor

I have been needing functionality to recursively transform an attribute set to a list quite often lately (for example in #391329), and I believe that having a generic implementation in lib might be useful.

Similar constructions are used in some of the generators in lib.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 25.05 Release Notes (or backporting 24.11 and 25.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@github-actions github-actions bot added the 6.topic: lib The Nixpkgs function library label Apr 1, 2025
@illdefined illdefined force-pushed the mapattrstolistrecursive branch from 833a840 to de8e584 Compare April 1, 2025 12:33
@github-actions github-actions bot added 10.rebuild-darwin: 1 This PR causes 1 package to rebuild on Darwin. 10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. labels Apr 1, 2025
@illdefined illdefined marked this pull request as ready for review April 1, 2025 12:38
@nix-owners nix-owners bot requested review from hsjobeki and infinisil April 1, 2025 12:39
Copy link
Contributor

@hsjobeki hsjobeki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, code looks very lean.
I think your flatten approach might have a performance benefit vs a recursive fold as well.

This also related to #237776

Lgtm change overall. But needs testing and documentation.

I would give it a bit of time to let @roberth and @infinisil think about this as well.

@illdefined illdefined force-pushed the mapattrstolistrecursive branch from de8e584 to 74980c6 Compare April 1, 2025 13:49
@illdefined illdefined requested a review from hsjobeki April 1, 2025 13:53
roberth
roberth previously requested changes Apr 1, 2025
@illdefined illdefined force-pushed the mapattrstolistrecursive branch from 74980c6 to 80156fd Compare April 1, 2025 19:52
@ofborg ofborg bot added the 2.status: merge conflict This PR has merge conflicts with the target branch label Apr 1, 2025
@illdefined illdefined force-pushed the mapattrstolistrecursive branch 2 times, most recently from 8c1bf9f to d552daa Compare April 1, 2025 20:04
@ofborg ofborg bot removed the 2.status: merge conflict This PR has merge conflicts with the target branch label Apr 1, 2025
@illdefined illdefined force-pushed the mapattrstolistrecursive branch 3 times, most recently from fafb6cb to 8d1713b Compare April 1, 2025 20:48
@illdefined illdefined requested a review from roberth April 2, 2025 10:06
@illdefined illdefined force-pushed the mapattrstolistrecursive branch 2 times, most recently from 1732abb to 5a9e090 Compare April 2, 2025 17:35
@illdefined
Copy link
Contributor Author

illdefined commented Apr 2, 2025

I aligned the argument names in the code with those in the documentation and added assertions to make the functions more user‐friendly.

@hsjobeki hsjobeki mentioned this pull request Apr 4, 2025
13 tasks
Copy link
Contributor

@hsjobeki hsjobeki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like #342330 is bit of a duplicate of this.
This PR also has some f to apply to the returned value.
And there is a slight difference in the recursion flow.

@illdefined
Copy link
Contributor Author

illdefined commented Apr 4, 2025

Seems like #342330 is bit of a duplicate of this. This PR also has some f to apply to the returned value. And there is a slight difference in the recursion flow.

If I understand correctly, one could implement the proposed collect' function based on mapAttrsToListRecursiveCond:

collect' = pred: set: mapAttrsToListRecursiveCond (path: set: !(pred path set)) (path: value: { inherit path value; }) set;

Using concatMap instead of a combination of concatLists + mapAttrsToList (which uses map internally) might however be more efficient.

@illdefined illdefined force-pushed the mapattrstolistrecursive branch from 5a9e090 to b169156 Compare April 4, 2025 15:10
@illdefined
Copy link
Contributor Author

I re‐implemented the function using concatMap instead of concatLists + mapAttrsToList, which should be more efficient according to the Nix reference manual.

To avoid repeating path ++ [ name ] and set.${name} I introduced a second function mapRecursive into the recursion, I am however not sure if that is considered good style. let bindings for the new path and value would be an alternative if call stack depth is of concern.

@h7x4
Copy link
Member

h7x4 commented Apr 11, 2025

Hello, I'm the author of #342330. I don't really mind which one we go with, but it feels a bit wasteful to reach for an noop function to use collect' (although implementing mapAttrsToListRecursiveCond in terms of collect' is probably much more wasteful with regards to list allocation). I'd be happy to have this one merged first though, and then rather optimize later.

[...] which should be more efficient according to the Nix reference manual.

Just in case you didn't know, you can verify this by configuring nix with envvars such as NIX_SHOW_STATS=1 and NIX_COUNT_CALLS=1

@illdefined
Copy link
Contributor Author

illdefined commented Apr 12, 2025

Hello, I'm the author of #342330. I don't really mind which one we go with, but it feels a bit wasteful to reach for an noop function to use collect' (although implementing mapAttrsToListRecursiveCond in terms of collect' is probably much more wasteful with regards to list allocation). I'd be happy to have this one merged first though, and then rather optimize later.

Agreed. I also expect an implementation of collect' based on mapAttrsToListRecursiveCond to be more efficient than the other way around.

Perhaps we can just measure if an independent implementation of collect' is worth the few extra lines of code.

[...] which should be more efficient according to the Nix reference manual.

Just in case you didn't know, you can verify this by configuring nix with envvars such as NIX_SHOW_STATS=1 and NIX_COUNT_CALLS=1

Thank you. I didn’t know about the NIX_COUNT_CALLS env var.

@h7x4 h7x4 requested a review from hsjobeki June 1, 2025 00:10
@illdefined illdefined added the 9.needs: community feedback This needs feedback from more community members. label Jul 28, 2025
@wolfgangwalther wolfgangwalther dismissed stale reviews from roberth and hsjobeki August 31, 2025 09:56

These were addressed.

Copy link
Contributor

@wolfgangwalther wolfgangwalther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the nits mentioned above, this LGTM.

@hsjobeki @roberth do you want to have another look?

@nixpkgs-ci nixpkgs-ci bot added the 12.approvals: 1 This PR was reviewed and approved by one person. label Aug 31, 2025
@wolfgangwalther wolfgangwalther removed the 9.needs: community feedback This needs feedback from more community members. label Aug 31, 2025
@h7x4
Copy link
Member

h7x4 commented Sep 8, 2025

@hsjobeki sorry for the ping, do you think you could take another look at this at some point?

@wolfgangwalther
Copy link
Contributor

We can always improve docs and tests later, so let's do it!

@wolfgangwalther wolfgangwalther merged commit 399136b into NixOS:master Sep 8, 2025
30 of 31 checks passed
@h7x4
Copy link
Member

h7x4 commented Sep 9, 2025

While testing out the new function, trying to use it as I had imagined using lib.collect', I found out they did actually work a bit differently. With lib.mapAttrsToListRecursiveCond does not seem possible to avoid catching the leaf attrs of the sets where the recursion did not stop early.

Like, imagine you wanted to do the following:

# input:
{
  a.b.c.i-want-this-one = { x = 1; y = 2; };
  x.y.z.i-want-this-one-too = { x = 1; y = 2; };
  m.n.o.i-dont-want-this-one = { a = 3; b = 4; };
}

# output:
[
  { path = [ "a" "b" "c" "i-want-this-one" ]; value = { x = 1; y = 2; }; }
  { path = [ "x" "y" "z" "i-want-this-one-too" ]; value = { x = 1; y = 2; }; }
]

You might try something like the following:

lib.mapAttrsToListRecursiveCond
  (path: _value: !(lib.elem (lib.last path) [ "i-want-this-one" "i-want-this-one-too" ]))
  (path: value: { inherit path value; })
  input

# output: 
[
  { path = [ "a" "b" "c" "i-want-this-one" ]; value = { x = 1; y = 2; }; }
  { path = [ "x" "y" "z" "i-want-this-one-too" ]; value = { x = 1; y = 2; }; }
  { path = [ "m" "n" "o" "i-dont-want-this-one" "a" ]; value = 3; }
  { path = [ "m" "n" "o" "i-dont-want-this-one" "b" ]; value = 4; }
]

It's not possible to rewrite the condition to exclude the extra leaf nodes, because the current implementation with unconditionally include everything that is not an attrset in the result.

I'm wondering if it would be better to change f : AttrSet -> Bool to f : a -> Bool, and then testing the leaf nodes for inclusion. Then you should be able to add !(isAttrs value) || to the condition to achieve the same result as what is done currently. Since it's merged so recently, maybe we could redo the implementation quickly before anyone starts using it? Although, I wonder if the map naming would be accurate anymore?

@wolfgangwalther
Copy link
Contributor

I'm wondering if it would be better to change f : AttrSet -> Bool to f : a -> Bool, and then testing the leaf nodes for inclusion. Then you should be able to add !(isAttrs value) || to the condition to achieve the same result as what is done currently. Since it's merged so recently, maybe we could redo the implementation quickly before anyone starts using it? Although, I wonder if the map naming would be accurate anymore?

I don't think we should change that, because this is how all the other xxxCond functions behave as well, right? This condition is just to decide whether to recurse into an attrset or not, by definition.

@h7x4
Copy link
Member

h7x4 commented Sep 9, 2025

I don't think we should change that, because this is how all the other xxxCond functions behave as well, right? This condition is just to decide whether to recurse into an attrset or not, by definition.

Yes, that's why I'm wondering about renaming. I was under the impression that this PR would superseed #342330, but I'm not able to achieve what I'm trying to do with this implementation. Considering that making the change would still let you be able to achieve the normal map cond, I was hoping we could change it. Alternatively, implement mapAttrsToListRecursiveCond in terms of collect'.

@wolfgangwalther
Copy link
Contributor

I'd not change anything about this function, because it aligns with other functions that work the same way. If that means it does not replace #342330, then that PR should be continued.

Alternatively, implement mapAttrsToListRecursiveCond in terms of collect'.

This could be done later, if it turns out to be efficient enough, I'd say.

@hsjobeki
Copy link
Contributor

hsjobeki commented Sep 11, 2025

Yes, that's why I'm wondering about renaming. I was under the impression that this PR would superseed #342330, but I'm not able to achieve what I'm trying to do with this implementation. Considering that making the change would still let you be able to achieve the normal map cond, I was hoping we could change it.

You should be able to do:

lib.mapAttrsToListRecursiveCond
  (path: _value: !(lib.elem (lib.last path) [ "i-want-this-one" "i-want-this-one-too" "i-dont-want-this-one" ]))
  (path: value: { inherit path value; })
  input

Then do a second pass with filter, which is probably okay depending on the data.

This function seems to map any primitive leafs, independently of the pred.
Maybe we can still push collect' as an inverse of this function and try to unify them via #342330

Alternatively, implement mapAttrsToListRecursiveCond in terms of collect'.

This shouldn't be possible currently right? Because collect' doesn't have 2 independent functions for collecting and mapping and its pred doesn't take the path as argument. We should try to make them equal first.

While i think map is mostly for performance, so we don't need to make a second pass over the list.
We could also add another argument later filter that independently controls whether to include the leaf element, but its a tradeoff of having an extra argument, that might not be needed. (depends on your data probably)

With the status quo, after @wolfgangwalther did the merge, i would try to use it as is.
Though i am also not super happy with the name and would probably rather like to rename this in the future. That's always possible.

@h7x4 Could you probably try what is necessary to implement this in terms of collect' ?
I think that collect' (v: !isAttrs v || !pred v) attrs should be equivalent to mapAttrsToListRecursiveCond pred id attrs (without path) (correct me if i am wrong)

@hsjobeki
Copy link
Contributor

hsjobeki commented Sep 11, 2025

Going forward i would try to enhance #342330 and try to make it a superset of this function. I am not sure how both compare in performance (they need to be kind of comparable first) should we take a look at the stats and continue the discussion in the other PR? I would take an extra look at list allocations and then decide if its worth to unify one in the terms of another?

@hsjobeki
Copy link
Contributor

hsjobeki commented Sep 11, 2025

De Morgan style:
if isAttrs && pred then recurse else collect
:=
if ! isAttrs || ! pred then collect else recurse

So we can spot that we always collect if the value is not an attribute set, independently from the predicate.
Which can be problematic in non-homogenous attribute sets.

we might accept this as consistent to mapAttrsRecursiveCond which uses the same logic, and implement a less opinionated logic in #342330,
The problem is the static isAttrs which is needed for recursing, but also affect the opposite branch which is when to collect

According to the docs of mapAttrsRecursiveCond the following can be used to collect derivations but this would also collect any primitve value (or lists), due to this exact problem.

nix-repl> lib.mapAttrsRecursiveCond (as: ! lib.isDerivation as) (p: lib.id) { hello = pkgs.hello; a = 1; }
{
  a = 1;
  hello = «derivation /nix/store/5g60vyp4cbgwl12pav5apyi571smp62s-hello-2.12.2.drv»;
}

(Sorry for splitting my thoughts, i hope you could follow :) )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

6.topic: lib The Nixpkgs function library 10.rebuild-darwin: 1-10 This PR causes between 1 and 10 packages to rebuild on Darwin. 10.rebuild-darwin: 1 This PR causes 1 package to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. 12.approvals: 1 This PR was reviewed and approved by one person.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants