Description
Cleanup to be done once #256324 is merged.
Note
I use the term "multiplexed package" to refer to a package which has multiple versions present in a single instance of the cudaPackages package set.
Examples of such "multiplexed packages" include cuDNN and TensorRT, as there are generally several versions of these packages available in each instance of the cudaPackages package set.
A non-example would be NVCC, as there is exactly one in each instance of the cudaPackages package set.
Cleaner abstraction to support overrides
Per a conversation with @SomeoneSerge (thank you for generously giving so much of your time to me so I could throw a bunch of Nix at you!), it isn't clear how one should override core attributes of the package set. By core attributes I mean the "backbone" attribute sets which are instantiated outside of the package set fixed point and manually threaded through the extensions to prevent infinite recursion (which occurs because the existence of some attributes in the package set depends on these values): cudaVersion, gpus, flags, and nvccCompatibilities.
Serge suggested making use of the module system to solve this issue:
Instead of evaluating modules inside the generic multiplex builder and each extension.nix, we do it once in pkgs/top-level/cuda-packages.nix. We expose two additional attributes in the package set: a list of additional modules to be evaluated alongside the ones which exist in-tree (allowing users to extend what we provide), and the resulting evaluated configuration. The builders would then take some subtree of that configuration instead of performing the module evaluation themselves.
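A minimal sketch of what that single evaluation might look like, to make the suggestion concrete. Everything here is hypothetical: the `extraModules` argument, the shape of the `cudaVersion` option, and the placeholder default `"12.2"` are assumptions for illustration, not the existing contents of pkgs/top-level/cuda-packages.nix. Only `lib.evalModules`, `lib.mkOption`, `lib.types.str`, and `lib.mkDefault` are real Nixpkgs library functions.

```nix
# Hypothetical sketch: evaluate the in-tree modules and any user-supplied
# modules exactly once, then expose both the module list and the result.
{ lib, extraModules ? [ ] }:
let
  # Stand-in for the in-tree modules declaring the "backbone" options.
  baseModule = {
    options.cudaVersion = lib.mkOption {
      type = lib.types.str;
      description = "CUDA version the package set is built for.";
    };
    # mkDefault keeps this value easy to override from extraModules.
    # "12.2" is a placeholder, not a statement about the real default.
    config.cudaVersion = lib.mkDefault "12.2";
  };

  evaluated = lib.evalModules {
    modules = [ baseModule ] ++ extraModules;
  };
in
{
  # The additional modules supplied by the user.
  inherit extraModules;
  # The evaluated configuration; builders would read a subtree of this
  # instead of calling evalModules themselves.
  inherit (evaluated) config;
}
```

Under these assumptions, passing `extraModules = [ { cudaVersion = "11.8"; } ]` would make `config.cudaVersion` evaluate to `"11.8"`, because a plain definition takes precedence over one wrapped in `lib.mkDefault`.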
Fragile implementation for multiplexed packages
Multiplexed packages rely on the existence of several paths which depend on the pname of the package. As an example, the expression which produces cudnn packages requires that all of pkgs/development/cuda-modules/cudnn/{fixup,shims,releases}.nix exist, and accesses them by name.
This is fragile and generally unwanted. The improvements suggested under "Cleaner abstraction to support overrides" may provide a cleaner way to re-implement this logic.
Source-available packages and source priorities
Nixpkgs prefers source builds when possible. Some packages, like CUDA CCCL, are available both pre-packaged from NVIDIA and from source: https://github.com/NVIDIA/cccl.
It's unclear at the moment how to handle builds when multiple sources are available. However, this does seem like something the module system would excel at handling, given we are able to assign priorities to different values.
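As a rough illustration of the priority idea, the sketch below lets two modules each propose a source for CCCL and has the module system pick one. The option name `cccl.source` and the values `"redist"` and `"github"` are made up for this example; `lib.mkOverride` and `lib.mkDefault` are the real priority helpers, where a lower number means a higher priority and `mkDefault` is equivalent to `mkOverride 1000`.

```nix
# Hypothetical sketch: choose between multiple available sources by priority.
{ lib }:
let
  evaluated = lib.evalModules {
    modules = [
      {
        options.cccl.source = lib.mkOption {
          type = lib.types.enum [ "redist" "github" ];
          description = "Which source to build CCCL from (illustrative only).";
        };
      }
      # The pre-packaged NVIDIA redistributable as a low-priority fallback.
      { cccl.source = lib.mkOverride 1500 "redist"; }
      # The source build is preferred: priority 1000 beats priority 1500.
      { cccl.source = lib.mkDefault "github"; }
    ];
  };
in
evaluated.config.cccl.source
```

With these definitions the expression evaluates to `"github"`. A user who wants the redistributable could add a module with a plain `cccl.source = "redist";` definition, which outranks both of the definitions above.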
Laundry list
- Find a way to abstract away CuTensor's `extension.nix`. Likely solved by Cleaner abstraction to support overrides.
- Update logic to handle creation of overrides using the new `dependencies` field present in the feature manifests generated by `cuda-redist-find-features`.
- Document this particular section of NVIDIA's documentation for other CUDA maintainers: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#host-compiler-support-policy
  - "We only support libstdc++ (GCC’s implementation) for all the supported host compilers for the platforms listed above."
  - That makes it sound like, even when using Clang, we need to use GCC's C++ standard library.
- Figure out what `cuda_compat` is/does: Use cuda_compat drivers when available #267247
- Figure out how to extend option sets with additional options. Likely solved by Cleaner abstraction to support overrides.
- Investigate behavior of overrides of `cudaVersion` on the `cudaPackages` package set through `override` and `overrideAttrs`. Likely solved by Cleaner abstraction to support overrides.
- Move `cuda-redist-find-features` to `nix-community` (cudaPackages: support multiple platforms #256324 (comment))
@NixOS/cuda-maintainers @SomeoneSerge @samuela thoughts?