Make CUDA and ROCm architecture conditional by alalazo · Pull Request #27185 · spack/spack

alalazo · 2021-11-03T07:57:17Z

Modifications:

The variants that specify which architecture to use for CUDA and ROCm are now conditional on +cuda and
+rocm, respectively.
Modify cp2k to make all CUDA-related variants conditional on +cuda

dev-zero

LGTM. Thanks!

alalazo · 2021-11-03T20:17:48Z

@scottwittenburg Do you have any idea why some specs are failing in pipelines? For instance, I can't make sense of this error

scottwittenburg · 2021-11-03T20:36:02Z

It seems spack install is still exiting 0 in a lot of cases when the configure stage fails, and pipelines use that exit code to decide whether to create a buildcache. So the error at the bottom of the trace is really a red herring in those cases, and you have to scroll way up to find the real problem. In the case you linked above, this error message appears:

==> [2021-11-03-09:18:21.167201] No patches needed for hpx
==> [2021-11-03-09:18:21.202240] hpx: Executing phase: 'cmake'
==> [2021-11-03-09:18:21.306025] Error: AttributeError: 'tuple' object has no attribute 'values'
/builds/spack/spack/var/spack/repos/builtin/packages/hpx/package.py:166, in instrumentation_args:
        165    def instrumentation_args(self):
  >>    166        for value in self.variants['instrumentation'].values:
        167            if value == 'none':
        168                continue
        169
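The trace indicates that `self.variants['instrumentation']` now evaluates to a tuple rather than to an object with a `.values` attribute. The following is a minimal sketch of that failure mode, not Spack code: the `Variant` class, the example values, and the assumption that the tuple pairs a variant with its condition are all hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical stand-in for a variant definition (not Spack's class)."""
    name: str
    values: tuple


# Assumption: the mapping stores (variant, condition) pairs instead of
# bare variant objects. The values listed here are illustrative only.
variants = {
    "instrumentation": (Variant("instrumentation", ("none", "apex", "papi")), None),
}

entry = variants["instrumentation"]

try:
    entry.values  # same access pattern as the failing line in the trace
except AttributeError as err:
    message = str(err)

# Unpacking the pair first reaches the object that carries `.values`.
variant, condition = entry
enabled = [value for value in variant.values if value != "none"]
```

Under that assumption, a recipe that indexes the mapping directly has to unpack the entry before reading `.values`.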

dev-zero · 2021-11-04T10:49:25Z

@alalazo does this mean I can finally set the default of variants depending on +cuda to True, but they get ignored unless +cuda is enabled?

alalazo · 2021-11-04T11:07:25Z

@dev-zero I think so, but to be clear:

We can't set the default value of a variant based on some other variant
We can make a variant conditional

For instance, once this PR is merged, a spec satisfying cp2k~cuda will have no cuda_fft variant at all. In that sense you can change the default of cuda_fft to True, since the variant will exist only when +cuda is in the spec.
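The distinction above can be sketched with a toy model in plain Python. This is not Spack's implementation: the `CP2K_VARIANTS` table and the `active_variants` helper are hypothetical, and only the variant names come from the discussion. In a real recipe the condition would be expressed through the `when=` argument of the `variant` directive.

```python
# Hypothetical table: variant name -> (default value, controlling variant).
# A controlling variant of None means the variant is unconditional.
CP2K_VARIANTS = {
    "cuda": (False, None),
    "cuda_fft": (True, "cuda"),  # default True, but exists only with +cuda
}


def active_variants(definitions, requested):
    """Return the variants that exist for a spec, with their values."""
    result = {}
    # Unconditional variants always exist.
    for name, (default, condition) in definitions.items():
        if condition is None:
            result[name] = requested.get(name, default)
    # Conditional variants exist only when the controlling variant is on.
    for name, (default, condition) in definitions.items():
        if condition is not None and result.get(condition):
            result[name] = requested.get(name, default)
    return result


without_cuda = active_variants(CP2K_VARIANTS, {"cuda": False})
with_cuda = active_variants(CP2K_VARIANTS, {"cuda": True})
```

In this model `without_cuda` has no `cuda_fft` key at all, while `with_cuda` picks up `cuda_fft` with its default of `True`. The default itself never depends on another variant; only the existence of the variant does.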

alalazo · 2021-11-04T11:39:23Z

@sethrj FYI, not sure this can make it to v0.17.0

alalazo · 2021-11-04T20:24:31Z

@spackbot run pipeline

spackbot-app · 2021-11-04T20:24:33Z

I've started that pipeline for you!

alalazo · 2021-11-04T21:12:48Z

@spackbot run pipeline

spackbot-app · 2021-11-04T21:12:50Z

I've started that pipeline for you!

haampie · 2021-11-05T08:39:32Z

Are we going to merge this before 0.17? (I would be in favor :D)

sethrj

I like it!

sethrj · 2021-11-22T12:53:54Z

var/spack/repos/builtin/packages/cp2k/package.py

+        variant('cuda_arch_35_k20x', default=False,
+                description=('CP2K (resp. DBCSR) has specific parameter sets for'
+                             ' different GPU models. Enable this when building'
+                             ' with cuda_arch=35 for a K20x instead of a K40'))


Not relevant to your changes, but damn this is ugly 😂

dev-zero · 2021-11-23

Well, luckily Nvidia fixed that afterwards, and many codes can build for multiple models in parallel, so they don't have that problem in the first place.

Since spack#27185, the cuda_arch variant values are conditional on +cuda. This means that for -cuda specs, the installation fails with:

```
==> acts: Executing phase: 'cmake'
==> Error: KeyError: 'cuda_arch'

/home/wdconinc/git/spack/var/spack/repos/builtin/packages/acts/package.py:222, in cmake_args:
        219        log_failure_threshold = spec.variants['log_failure_threshold'].value
        220        args.append("-DACTS_LOG_FAILURE_THRESHOLD={0}".format(log_failure_threshold))
        221
  >>    222        cuda_arch = spec.variants['cuda_arch'].value
        223        if cuda_arch != 'none':
        224            args.append('-DCUDA_FLAGS=-arch=sm_{0}'.format(cuda_arch[0]))
        225
```
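One way to avoid this kind of KeyError is to read `cuda_arch` only when `+cuda` is set, which is what the title of the follow-up PR describes. The sketch below is a toy under stated assumptions: a plain dict stands in for `spec.variants`, and this `cmake_args` is a hypothetical free function, not the acts recipe. In an actual recipe the guard would typically be written as `if '+cuda' in spec:`.

```python
def cmake_args(variants):
    """Build CMake arguments from a dict standing in for spec.variants."""
    args = []
    # Guard first: with ~cuda the conditional variant does not exist,
    # so looking up "cuda_arch" unconditionally would raise KeyError.
    if variants.get("cuda"):
        cuda_arch = variants["cuda_arch"]  # modeled as a tuple of strings
        if cuda_arch[0] != "none":
            args.append("-DCUDA_FLAGS=-arch=sm_{0}".format(cuda_arch[0]))
    return args


no_cuda_args = cmake_args({"cuda": False})
cuda_args = cmake_args({"cuda": True, "cuda_arch": ("70",)})
```

Here `no_cuda_args` is empty because the lookup is never attempted, and `cuda_args` carries the architecture flag for the first selected value.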

adamjstewart · 2022-02-02T17:11:18Z

It seems like cuda_arch still has a none option. Do we want to disallow this? That would significantly reduce error checking in a lot of packages. I feel like we only added cuda_arch=none as an option so that we could support ~cuda before this variant was conditional.

sethrj · 2022-02-02T17:37:38Z

I'm in favor of it. Technically you can have CUDA enabled but not have any device-specific code, but it's really only useful for testing. (You can for example build against CUDA on a system without a working CUDA card, and you can still call CUDA runtime APIs such as cudaMalloc even if you don't have any device-specific code.)

alalazo · 2022-02-03T08:53:07Z

I guess the question would be what should be the default.

sethrj · 2022-02-03T09:29:54Z

It would be really cool if we could interrogate the system to find the default, but that could be bad when build and deploy systems are different. Probably better for there to be a way to force the user to select the CUDA architecture.
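As a rough illustration of forcing an explicit choice, the check below rejects a spec that enables CUDA without naming an architecture. It is a sketch only: the `require_cuda_arch` helper and the dict-based spec are hypothetical, and nothing here reflects how Spack does or would implement this.

```python
def require_cuda_arch(variants):
    """Raise ValueError if CUDA is enabled but no architecture was chosen."""
    if not variants.get("cuda"):
        return  # cuda_arch does not exist for this spec; nothing to check
    arch = variants.get("cuda_arch", ("none",))
    if not arch or "none" in arch:
        raise ValueError("+cuda requires an explicit cuda_arch, e.g. cuda_arch=70")


# A spec without CUDA, and one with an explicit architecture, both pass.
require_cuda_arch({"cuda": False})
require_cuda_arch({"cuda": True, "cuda_arch": ("80",)})
```

Failing early like this avoids guessing a default from the build host, which is the concern raised above for sites where build and deploy systems differ.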

spackbot-app bot added build-systems new-variant update-package labels Nov 3, 2021

spackbot-app bot requested a review from dev-zero November 3, 2021 07:57

dev-zero previously approved these changes Nov 3, 2021

becker33 previously approved these changes Nov 3, 2021

alalazo dismissed stale reviews from becker33 and dev-zero via e1d94b6 November 3, 2021 21:40

alalazo force-pushed the mixims/cuda_rocm_conditional_variants branch from 16c8554 to 22f98e0 Compare November 3, 2021 22:12

tldahlgren assigned dev-zero and becker33 Nov 3, 2021

haampie reviewed Nov 3, 2021

var/spack/repos/builtin/packages/hpx/package.py

albestro mentioned this pull request Nov 4, 2021

hpx package.py: self.variants['instrumentation'].values: 'tuple' object has no attribute 'values' #27213

Closed

alalazo marked this pull request as ready for review November 4, 2021 06:34

alalazo mentioned this pull request Nov 4, 2021

mfem, hpx: fix recipes after conditional variants #27215

Merged

alalazo requested a review from becker33 November 4, 2021 09:58

becker33 reviewed Nov 4, 2021

var/spack/repos/builtin/packages/hpx/package.py

alalazo added 2 commits November 4, 2021 20:08

Make CUDA and ROCm architecture conditional

18e08c3

fixes spack#14337 The variant to specify which architecture to use for CUDA and ROCm are now conditional on +cuda and +rocm respectively.

cp2k: make all CUDA related variants conditional on +cuda

771919e

alalazo force-pushed the mixims/cuda_rocm_conditional_variants branch from 22f98e0 to 771919e Compare November 4, 2021 19:09

sethrj approved these changes Nov 22, 2021

sethrj merged commit 5eba5dc into spack:develop Nov 22, 2021

alalazo deleted the mixims/cuda_rocm_conditional_variants branch November 22, 2021 12:55

wdconinc mentioned this pull request Dec 5, 2021

[acts] use variants['cuda_arch'] only when +cuda #27813

Merged
