Conversation
|
Thanks to @infinisil @tilpner @srhb @toonn @Myrl for helping with various parts of it. |
|
Very cool! One request from my side would be to ensure the doc comments are formatted in the loose convention understood by nixdoc. This means using multi-line comments for "public" functions (those which should be included in the docs), having a We don't include |
lib/sources.nix
Outdated
There was a problem hiding this comment.
Not sure how I feel about requiring a name, it makes the function more annoying to use, and I personally don't care much about being able to identify source derivations, so I'd be fine with some constant name = "filtered-src". Having to specify a name for a src and the derivation itself is a bit weird too.
There was a problem hiding this comment.
I like how eg. fetchpatch has an optional name. imo a default constant is great though.
There was a problem hiding this comment.
OK; I think it's very useful to have good names in nix-store -qR or to identify slow downloads, but given that it being optional seems to be popular, so I will change it accordingly.
Luckily fetchpatch users have started to use name more aggressively; hopefully it will also be the case here.
lib/sources.nix
Outdated
There was a problem hiding this comment.
I don't get this function. Shouldn't isPrefixOfLeafPath [a d] == true? Because isPrefixOfLeafPath [a] == true and [a] is a prefix of [a d].
There was a problem hiding this comment.
Sorry, isPrefixOfLeafPath [a] == false; that was a typo.
lib/sources.nix
Outdated
There was a problem hiding this comment.
Maybe debugTraceEnable, aligning it with debugTracePrefix
lib/sources.nix
Outdated
There was a problem hiding this comment.
All these builtins (except builtins.trace) are also reexported by lib, so you can drop builtins.
There was a problem hiding this comment.
That doesn't work, as with lib; is only enabled on a per-function level and isPrefixOfLeafPath doesn't have it, so it seems that it's needed.
I could add a with lib; to it, but is there a drawback/benefit of using lib vs builtins? It appeared to me that it's nice to only use builtins when possible, so that it's clear that you could easily copy out the function if needed / that it has not dependencies.
There was a problem hiding this comment.
It's preferable to avoid with, so Nix can perform scope checking and improve lookup performance of performance-sensitive code like this.
Not yet, currently bisecting some kernel issues :D |
|
Thank you for your contributions.
|
|
I subscribed to this issue because I ran into confusion with the currently available methods so I'd definitely still like to see progress. |
This has the intention to address various issues with the existing
`builtins.filterSource` and surrounding library functions:
* The library functions existing so far are not straightforward to use
correctly and are hard to remember.
Consequently only few people use them.
* Most existing functions are about *excluding* unwanted files,
making various hardcodes (such as `.o` and `.so` in `cleanSourceFilter`)
that typically still forget some other unwanted files.
The added approach is more about explicitly listing wanted files,
thus being more reliable.
* This very frequently results in non-reproducible builds when working
from project working trees (as opposed to nixpkgs `pkgs` builds, which
always operate from clean checkouts), with "build", "gen" or similar
directories being accidentally included as sources.
It is also a common sources of the dreaded
warning: dumping very large path (> 256 MiB); this may run out of memory
message (and as indicated, bad for performance).
* The most common antipattern regarding the above is `src = ./.`.
To prevent it, we need to provide functionality that's straightforward
to understand and convenient.
* `builtins.filterSource f ./.` can result in build impurities
depending on the basename of the parent directory
(see NixOS/nix#1305).
A solution to that is `builtins.path`, or its use in `cleanSourceWith`
(see NixOS#67996).
The added function makes its use very convenient.
* The existing functions make it difficult to debug what files are included
and why. The added function provides an argument `debugTraceEnable`,
that makes debugging easy.
After spending tens of hours of tracking down irreproducible builds resulting
from bad `src =` assignments in various projects, I hope that this function
will put an end to it.
d20e5d2 to
d3dcd71
Compare
nh2
left a comment
There was a problem hiding this comment.
I've addressed the feedback, and force-pushed a rebase on top of master.
I've also slightly improved the alignment in the debug output ("symlink" is a new file type that can also occur).
So I think this is quite ready from my side now.
roberth
left a comment
There was a problem hiding this comment.
This function implementation does too much by itself. Its implementation must be refactored by pulling out functions that perform the various tasks. Otherwise we'll end up maintaining multiple implementations of the same operations side by side.
I don't think a function with many parameters is a good replacement for something that can be expressed as set-like operations, because the structure of such an expression matters to the meaning. To illustrate (with bool ops instead), (a && b) || c is generally not equal to a && (b || c), so the meaning of an expression f { inherit a b c; } is ambiguous without knowing the intricate details of f that aren't easily expressed in natural language docs.
We can only judge the value of explicitSource by knowing how users use the currently missing or undocumented functions. Recommending it may or may not be a mistake in this new situation. If the implementation of explicitSource is small enough, perhaps we can take the risk.
The library functions existing so far are not straightforward to use correctly and are hard to remember. Consequently only few people use them.
We seem to lack the functions that make this easy. This PR has the right ideas for what those functions could be.
The added approach is more about explicitly listing wanted files,
thus being more reliable.
👍 this has been missing.
The most common antipattern regarding the above is src = ./..
To prevent it, we need to provide functionality that's straightforward to understand and convenient.
We need to document the source functions in the manual and update our examples.
A composition of well-named filters source operations will be even easier to understand than a set of parameters that don't exhibit their priorities at the call site.
The functions
The parameters are a good guide of what's missing.
-
Instead of
includeDirs, we should haveunionSourcesfrom list of sources to a single source starting at the longest common subpath. -
includeFilesshould be a function[path] -> source -> sourcethe returnstruefor the paths and their ancestry without consulting the original source.includeFilescan be implemented withunionSources. -
excludeHiddenshould be a functionsource -> source. -
pathComponentExcludesshould be a function[string] -> source -> source. Large part of its added value is that it only applies the excludes beloworigSrc. -
debugTraceEnableshould be turned into a functionsource -> source. Similarly, we could havetraceSourceWith : string -> source -> sourceortraceSourceWith : (path -> type -> string) -> source -> source -
namecan be turned into a function too, to make the possibility of renaming more visible.cleanSourceWith { inherit name src; }will do the trick, or we could refactorcleanSourceWithto call the new renaming function.
Functions like these will make explicitFilterSource either redundant, or easy to understand and maintain.
|
I really want to see something like this to succeed, the missing best practices around A composable design with mostly straightforward semantics would be amazing to have, yet for some reason it seems like it’s very hard to create. |
Appearances can be deceiving. I'm thinking a combination of learned helplessness and lack of urgency.
|
|
@roberth Splitting the function into reusable components sounds good, but it's a bigger task than I have time to tackle currently. Would you mind giving a shot at implementing your proposed ideas? |
I'm currently prioritizing reviews and maintenance over code contributions for the same reason. |
|
I marked this as stale due to inactivity. → More info |
| if tree == null | ||
| then true | ||
| else | ||
| if path == [] |
| component = builtins.head path; | ||
| restPath = builtins.tail path; | ||
| in | ||
| if !(builtins.hasAttr component tree) |
There was a problem hiding this comment.
Revert the logic here. if not is a bad, less readable pattern.
|
I recently developed a more composable, safer and easier-to-use abstraction for source filtering in a draft PR to Nixpkgs, please take a look, try it out, and give feedback! https://discourse.nixos.org/t/easy-source-filtering-with-file-sets/29117 |
|
I will close this as abandoned. You are free to reopen. |

Motivation for this change
Edit: PR description slightly outdated, please see the commit message for the current one (which has feedback addressed).
This has the intention to address various issues with the existing
builtins.filterSourceand surrounding library functions:The library functions existing so far are not straightforward to use correctly and are hard to remember. Consequently only few people use them.
Most existing functions are about excluding unwanted files, making various hardcodes (such as
.oand.soincleanSourceFilter) that typically still forget some other unwanted files. The added approach is more about explicitly listing wanted files,thus being more reliable.
This very frequently results in non-reproducible builds when working from project working trees (as opposed to nixpkgs
pkgsbuilds, which always operate from clean checkouts), with "build", "gen" or similar directories being accidentally included as sources.It is also a common sources of the dreaded
warning: dumping very large path (> 256 MiB); this may run out of memory
message (and as indicated, bad for performance).
The most common antipattern regarding the above is
src = ./..To prevent it, we need to provide functionality that's straightforward to understand and convenient.
builtins.filterSource f ./.can result in build impurities depending on the basename of the parent directory (see Basename of current directory goes into derivation for filterSource on ./. nix#1305).A solution to that is
builtins.path. The added function makes its use very convenient.The existing functions make it difficult to debug what files are included and why.
The added function provides an argument
debugEnableTracing, that makes debugging easy.Output looks like:
After spending tens of hours of tracking down irreproducible builds resulting from bad
src =assignments in various projects, I hope that this function will put an end to it.Things done
sandboxinnix.confon non-NixOS)nix-shell -p nox --run "nox-review wip"./result/bin/)nix path-info -Sbefore and after)lib.explicitFilterSource "lib" ./lib { srcDirs = [ ./lib ]; pathComponentExcludes = ["tests"]; }TODO
nixpkgsmanual, adding a section on using nix packages in project working trees, explaining how build files are often accidentally included, and recommendingexplicitFilterSourceas a solution.nixmanual, mentioning in the sections aboutbuiltins.filterSourceandbuiltins.path'sfilterthatnixpkgshas convenience functions to handle common situations.