Next steps for Nvidia compiler instructions #1563

@fmahebert

Description

In PR #1542 (merged into release/1.9.0, not develop) I updated and reformatted the Nvidia-specific instructions.

There is now a template, configs/templates/jedi-mpas-nvidia-dev/spack.yaml, which describes a subset of the jedi-mpas-env virtual environment, along with some notes about what was cut out and why.
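For orientation, a Spack environment template of this kind generally has the shape below. This is a hypothetical sketch: the spec names and settings are illustrative placeholders, not the actual contents of configs/templates/jedi-mpas-nvidia-dev/spack.yaml.

```yaml
# Hypothetical sketch of a trimmed-down Spack environment template.
# Spec names are placeholders, not the real file's contents.
spack:
  concretizer:
    unify: true
  specs:
  # Subset of the full jedi-mpas-env; heavier or problematic
  # dependencies are cut, per the notes in the template.
  - ecbuild
  - eckit
  - fckit
  - netcdf-fortran
  view: false
```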

There is also a site config, configs/sites/tier2/ubuntu2404-nvhpc, which indicates what to install from apt and points Spack at those site packages.
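As a sketch of the mechanism, a site config of this kind typically registers the apt-installed tools as Spack externals so they are reused rather than rebuilt. The specs and prefixes below are hypothetical, not the contents of the actual site config.

```yaml
# Illustrative packages.yaml fragment for an Ubuntu 24.04 site:
# apt-provided packages are declared as externals (specs and
# prefixes here are hypothetical placeholders).
packages:
  cmake:
    externals:
    - spec: cmake@3.28.3
      prefix: /usr
    buildable: false
  git:
    externals:
    - spec: git@2.43.0
      prefix: /usr
    buildable: false
```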

I believe the next steps would be (up for discussion):

  • Figure out why JEDI compiled with this environment fails to load cublas at runtime (@l90lpa will open an issue on this with details).
  • [Perhaps related to the above] See whether we can point Spack at cublas and cufft instead of installing openblas and fftw.
  • Make better use of gcc as a fallback compiler instead of resorting to the OS package manager; this would enable a more portable configuration.
    • My attempts to set up gcc as a fallback compiler led to many concretization issues, but presumably this is user error on my part.
    • Note that we must use the Nvidia compiler for all Fortran packages (technically, for all packages that provide modules imported by JEDI: Fortran has no ABI compatibility, so module files are not portable across compilers), and probably for all packages using MPI as well. This impacts certain components that may or may not build with Nvidia (gsibec, ...), so it may not be possible to build the entire jedi-mpas-env even with ideal use of fallback compilers.
  • And surely any number of cleanups from a spack(-stack) expert's point of view!
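For the cublas/cufft bullet, Spack's virtual-provider mechanism is the likely hook: the nvhpc package can provide blas/lapack from its bundled NVIDIA libraries, so a preference like the one below might replace openblas. Whether cufft can stand in for fftw-api is less clear and may need a custom provider. This is an untested, hypothetical sketch.

```yaml
# Hypothetical packages.yaml preferences: prefer the NVIDIA-provided
# math libraries over building openblas from source.
packages:
  all:
    providers:
      blas: [nvhpc]
      lapack: [nvhpc]
      # There is no stock nvhpc/cufft provider for fftw-api; this line
      # is commented out to mark the open question from the bullet above.
      # fftw-api: [nvhpc]
```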
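For the gcc-fallback bullet, one concretization-friendly pattern is to default everything to gcc and then require nvhpc only where Fortran module files or MPI force it. The package names below are illustrative and this sketch has not been concretized against jedi-mpas-env.

```yaml
# Hypothetical mixed-toolchain packages.yaml: gcc by default,
# nvhpc pinned where Fortran modules or MPI demand ABI consistency.
# Package names are illustrative examples only.
packages:
  all:
    require: "%gcc"
  openmpi:
    require: "%nvhpc"
  netcdf-fortran:
    require: "%nvhpc"
  fckit:
    require: "%nvhpc"
```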
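For the first bullet, a quick triage step is to check whether the dynamic loader can resolve libcublas for the built executable. The default executable path below is a hypothetical placeholder; pass the real JEDI binary as the first argument.

```shell
#!/bin/sh
# Report whether a binary's libcublas dependency resolves at load time.
# The default path is a hypothetical placeholder, not a real JEDI target.
check_cublas() {
    exe="$1"
    if ldd "$exe" 2>/dev/null | grep -qi 'libcublas.*not found'; then
        echo "libcublas needed but NOT resolved"
    elif ldd "$exe" 2>/dev/null | grep -qi libcublas; then
        echo "libcublas resolved"
    else
        echo "no libcublas dependency found (or not a dynamic executable)"
    fi
}

check_cublas "${1:-./mpas_variational.x}"
```

If libcublas shows as "not found", comparing `LD_LIBRARY_PATH` (or the binary's RPATH via `readelf -d`) against the CUDA install location is the usual next step.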

I'm opening this issue to give @stiggy87 some pointers on current state, and some ideas of where we could take this next.

Metadata

Labels

INFRAJEDI Infrastructure
