Feature/update nvhpc config#1580

Merged
stiggy87 merged 17 commits into JCSDA:develop from stiggy87:feature/update-nvhpc-config
Mar 28, 2025

Conversation

@stiggy87
Contributor

Summary

Update to the site and template for the jedi-mpas-nvidia-dev feature.

Testing

This has been done on an AWS GPU instance (g5.4xlarge) following instructions from @l90lpa.

Applications affected

This will only affect the JEDI MPAS Nvidia compiler work that @fmahebert completed.

Dependencies

N/A

Issue(s) addressed

Fixes #1563

Checklist

  • This PR addresses one issue/problem/enhancement, or has a very good reason for not doing so.
  • These changes have been tested on the affected systems and applications.
  • All dependency PRs/issues have been resolved and this PR can be merged.

@stiggy87 stiggy87 added INFRA JEDI Infrastructure ignore (testing) Debugging CI or other web hook labels Mar 25, 2025
@stiggy87 stiggy87 self-assigned this Mar 25, 2025
@stiggy87 stiggy87 requested review from climbfuji, eap and fmahebert March 26, 2025 19:19
@stiggy87 stiggy87 removed the ignore (testing) Debugging CI or other web hook label Mar 26, 2025
Collaborator

@climbfuji climbfuji left a comment

It looks like you are only installing package ectrans?

@stiggy87
Contributor Author

It looks like you are only installing package ectrans?

@climbfuji Good question. I've asked, and @fmahebert told me the majority of the work around this compiler is focused on ectrans. I am all for just adding it to the list and leaving the remaining packages there.

I've started a dialog with @fmahebert and @l90lpa to get some more clarification.

@stiggy87
Contributor Author

I pulled too much out for testing! I've added the original packages back into the spec and added ectrans to it.

@eap
Collaborator

eap commented Mar 27, 2025

This is more of a question for @fmahebert on requirements: should we consider adding jedi-fv3-env, soca-env and ewok-env? It would be nice to be able to run skylab small experiments with this, although (1) I don't know if that's outside the scope of the needs here and (2) maybe that would be better added in a follow-up change?

Thoughts? At minimum it would be nice to get ewok-env +ecflow ~cylc in this change since we would immediately have ecflow and could run some simple experiments (probably?)

@fmahebert
Contributor

fmahebert commented Mar 27, 2025

@eap I (selfishly) view this PR's goal to be fixing issues in the current nvhpc spack-stack environment that are blocking our short-term code demonstration goals. Thus, I would recommend against the scope expansion of adding any new packages to the build as part of this PR.

To expand a bit — the demonstrations we're targeting will not involve skylab, workflows, or any of the soca/fv3 related code. Thus, adding the packages you've suggested is not immediately necessary, and risks introducing delays if we discover anything that doesn't build cleanly with nvhpc. In the longer term, it would certainly be nice to have more support for nvhpc across the system by working towards what you suggest. But I view that as a long-term project of building support (as and when warranted by JCSDA/partner needs), and beyond the scope of this PR.

Does that sound reasonable?

Collaborator

@eap eap left a comment

@fmahebert > Absolutely reasonable, I just wanted to ask the question. Adding in ewok and other skylab dependencies would be a good followup but it looks like this is working as intended for the demo.

@climbfuji climbfuji requested a review from fmahebert March 27, 2025 19:18
Contributor

@fmahebert fmahebert left a comment

Thanks for your help on this @stiggy87 !

@stiggy87 stiggy87 merged commit c7e2475 into JCSDA:develop Mar 28, 2025
9 checks passed
@stiggy87 stiggy87 deleted the feature/update-nvhpc-config branch April 1, 2025 20:41

Labels

INFRA JEDI Infrastructure

Development

Successfully merging this pull request may close these issues.

Next steps for Nvidia compiler instructions

4 participants