Skip to content

Fix ROCm constraints for ginkgo@glu_experimental in HiOp#32499

Merged
eugeneswalker merged 10 commits intospack:developfrom
cameronrutherford:ginkgo-hip-fixes
Oct 13, 2022
Merged

Fix ROCm constraints for ginkgo@glu_experimental in HiOp#32499
eugeneswalker merged 10 commits intospack:developfrom
cameronrutherford:ginkgo-hip-fixes

Conversation

@cameronrutherford
Copy link
Copy Markdown
Contributor

Not 100% sure if this syntax is correct, but this was what I needed to add to avoid conflicts and build successfully on ROCm 5.2.0.

Perhaps this is a case where expanding the spack core syntax would be beneficial, or at least including some useful documentation as I struggled to do this at first.

@cameronrutherford
Copy link
Copy Markdown
Contributor Author

Can this be merged? I don't see any reason why this is still blocked

Copy link
Copy Markdown
Contributor

@tcojean tcojean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, LGTM, but here is a patch which achieves the same thing without a need for special if or other guard.

diff --git a/var/spack/repos/builtin/packages/ginkgo/package.py b/var/spack/repos/builtin/packages/ginkgo/package.py
index aa88add9..306812ef 100644
--- a/var/spack/repos/builtin/packages/ginkgo/package.py
+++ b/var/spack/repos/builtin/packages/ginkgo/package.py
@@ -22,8 +22,8 @@ class Ginkgo(CMakePackage, CudaPackage, ROCmPackage):

     version("develop", branch="develop")
     version("master", branch="master")
-    version("glu", branch="glu")
-    version("glu_experimental", branch="glu_experimental")
+    version("1.5.0.glu", branch="glu")
+    version("1.5.0.glu_experimental", branch="glu_experimental")
     version("1.4.0", commit="f811917c1def4d0fcd8db3fe5c948ce13409e28e")  # v1.4.0
     version("1.3.0", commit="4678668c66f634169def81620a85c9a20b7cec78")  # v1.3.0
     version("1.2.0", commit="b4be2be961fd5db45c3d02b5e004d73550722e31")  # v1.2.0
diff --git a/var/spack/repos/builtin/packages/hiop/package.py b/var/spack/repos/builtin/packages/hiop/package.py
index b770a786..7b7e08c4 100644
--- a/var/spack/repos/builtin/packages/hiop/package.py
+++ b/var/spack/repos/builtin/packages/hiop/package.py
@@ -113,7 +113,7 @@ class Hiop(CMakePackage, CudaPackage, ROCmPackage):
     depends_on("coinhsl+blas", when="+sparse")
     depends_on("metis", when="+sparse")

-    depends_on("ginkgo@glu_experimental", when="+ginkgo")
+    depends_on("[email protected]_experimental", when="+ginkgo")

     conflicts(
         "+shared",

tcojean
tcojean previously approved these changes Sep 13, 2022
@cameronrutherford
Copy link
Copy Markdown
Contributor Author

@tcojean I have applied your suggestion and the changes seem to function well, thanks!

@alalazo alalazo self-assigned this Sep 14, 2022
@cameronrutherford cameronrutherford changed the title Remove ROCm constraints for ginkgo@glu_experimental. Fix ROCm constraints for ginkgo@glu_experimental and hotfix camp HIP config Sep 14, 2022
@cameronrutherford
Copy link
Copy Markdown
Contributor Author

@alalazo I addressed your changes - I forgot to fix that after applying @tcojean's patch.

#32469 introduced a breaking change in camp for HIP_CLANG_INCLUDE_PATH that I noticed when building RAJA/Umpire, and so I modified the camp package to fix this. cc @trws @davidbeckingsale from original PR

@cameronrutherford cameronrutherford changed the title Fix ROCm constraints for ginkgo@glu_experimental and hotfix camp HIP config Fix ROCm constraints for ginkgo@glu_experimental in HiOp Sep 15, 2022
@cameronrutherford
Copy link
Copy Markdown
Contributor Author

This is one of my smaller spack PRs, but the review has been very insightful. Thank you all who helped!

I think this is finally ready for merge.

alalazo
alalazo previously approved these changes Sep 20, 2022
@eugeneswalker
Copy link
Copy Markdown
Contributor

It looks like this PR makes @1.5.0.glu_experimental the default -- is that the intention?

I see the only failing CI check is due to [email protected]_experimental +rocm build failing in the E4S environment. The logs are incomplete in GitLab -- I'm replicating the build locally to find out why it is failing now. Will update here as I find out more.

@cameronrutherford
Copy link
Copy Markdown
Contributor Author

It looks like this PR makes @1.5.0.glu_experimental the default -- is that the intention?

I see the only failing CI check is due to [email protected]_experimental +rocm build failing in the E4S environment. The logs are incomplete in GitLab -- I'm replicating the build locally to find out why it is failing now. Will update here as I find out more.

That's not the intention - perhaps we should tag develop as the default. We resorted to the change in version name as we were having issues with the Ginkgo version constraints for ROCm.

If changing the default is sufficient that would work

@eugeneswalker
Copy link
Copy Markdown
Contributor

The CI check is failing because [email protected]_experimental +rocm can't find rocprim header. See log below.

[email protected] +rocm is OK though.

==> Installing ginkgo-1.5.0.glu_experimental-wvvqk7charnz364cztyfnycmsswomdag
==> No binary for ginkgo-1.5.0.glu_experimental-wvvqk7charnz364cztyfnycmsswomdag found: installing from source
==> No patches needed for ginkgo
==> ginkgo: Executing phase: 'cmake'
==> ginkgo: Executing phase: 'build'
==> Error: ProcessError: Command exited with status 2:
    'make' '-j16'

5 errors found in build log:
     1238    In file included from /spack/opt/spack/linux-ubuntu20.04-x86_64/gcc-11.1.0/rocthrust-5.2.1-nt26ed72pglkidikl67cyc7vgbqbhpnv/include/thrust/detail/copy_if.inl:20:
     1239    In file included from /spack/opt/spack/linux-ubuntu20.04-x86_64/gcc-11.1.0/rocthrust-5.2.1-nt26ed72pglkidikl67cyc7vgbqbhpnv/include/thrust/system/detail/generic/copy_if.h:62:
     1240    In file included from /spack/opt/spack/linux-ubuntu20.04-x86_64/gcc-11.1.0/rocthrust-5.2.1-nt26ed72pglkidikl67cyc7vgbqbhpnv/include/thrust/system/detail/generic/copy_if.inl:31:
     1241    In file included from /spack/opt/spack/linux-ubuntu20.04-x86_64/gcc-11.1.0/rocthrust-5.2.1-nt26ed72pglkidikl67cyc7vgbqbhpnv/include/thrust/scan.h:1594:
     1242    In file included from /spack/opt/spack/linux-ubuntu20.04-x86_64/gcc-11.1.0/rocthrust-5.2.1-nt26ed72pglkidikl67cyc7vgbqbhpnv/include/thrust/detail/scan.inl:28:
     1243    In file included from /spack/opt/spack/linux-ubuntu20.04-x86_64/gcc-11.1.0/rocthrust-5.2.1-nt26ed72pglkidikl67cyc7vgbqbhpnv/include/thrust/system/detail/adl/scan.h:44:
  >> 1244    /spack/opt/spack/linux-ubuntu20.04-x86_64/gcc-11.1.0/rocthrust-5.2.1-nt26ed72pglkidikl67cyc7vgbqbhpnv/include/thrust/system/hip/detail/scan.h:48:10: fatal error: 'rocprim/rocprim.h
             pp' file not found
     1245    #include <rocprim/rocprim.hpp>
     1246             ^~~~~~~~~~~~~~~~~~~~~
     1247    1 error generated when compiling for host.
  >> 1248    CMake Error at ginkgo_hip_generated_device_matrix_data_kernels.hip.cpp.o.cmake:146 (message):
     1249      Error generating
     1250      /tmp/root/spack-stage/spack-stage-ginkgo-1.5.0.glu_experimental-wvvqk7charnz364cztyfnycmsswomdag/spack-build-wvvqk7c/hip/CMakeFiles/ginkgo_hip.dir/base/./ginkgo_hip_generated_dev
             ice_matrix_data_kernels.hip.cpp.o
     1251
     1252
  >> 1253    make[2]: *** [hip/CMakeFiles/ginkgo_hip.dir/build.make:80: hip/CMakeFiles/ginkgo_hip.dir/base/ginkgo_hip_generated_device_matrix_data_kernels.hip.cpp.o] Error 1
     1254    make[2]: *** Waiting for unfinished jobs....
     1255    [ 45%] Building CXX object reference/CMakeFiles/ginkgo_reference.dir/solver/gmres_kernels.cpp.o
     1256    cd /tmp/root/spack-stage/spack-stage-ginkgo-1.5.0.glu_experimental-wvvqk7charnz364cztyfnycmsswomdag/spack-build-wvvqk7c/reference && /spack/lib/spack/env/gcc/g++ -Dginkgo_reference
             _EXPORTS -I/tmp/root/spack-stage/spack-stage-ginkgo-1.5.0.glu_experimental-wvvqk7charnz364cztyfnycmsswomdag/spack-build-wvvqk7c/include -I/tmp/root/spack-stage/spack-stage-ginkgo-1
             .5.0.glu_experimental-wvvqk7charnz364cztyfnycmsswomdag/spack-src/include -I/tmp/root/spack-stage/spack-stage-ginkgo-1.5.0.glu_experimental-wvvqk7charnz364cztyfnycmsswomdag/spack-sr
             c -O3 -DNDEBUG -fPIC -Wpedantic -MD -MT reference/CMakeFiles/ginkgo_reference.dir/solver/gmres_kernels.cpp.o -MF CMakeFiles/ginkgo_reference.dir/solver/gmres_kernels.cpp.o.d -o CMa
             keFiles/ginkgo_reference.dir/solver/gmres_kernels.cpp.o -c /tmp/root/spack-stage/spack-stage-ginkgo-1.5.0.glu_experimental-wvvqk7charnz364cztyfnycmsswomdag/spack-src/reference/solv
             er/gmres_kernels.cpp
     1257    -- Removing /tmp/root/spack-stage/spack-stage-ginkgo-1.5.0.glu_experimental-wvvqk7charnz364cztyfnycmsswomdag/spack-build-wvvqk7c/hip/CMakeFiles/ginkgo_hip.dir/base/./ginkgo_hip_gen
             erated_exception.hip.cpp.o
     1258    /spack/opt/spack/linux-ubuntu20.04-x86_64/gcc-11.1.0/cmake-3.23.3-yhy7aje54ibee2f52ixmfoblb74lrdho/bin/cmake -E remove /tmp/root/spack-stage/spack-stage-ginkgo-1.5.0.glu_experiment
             al-wvvqk7charnz364cztyfnycmsswomdag/spack-build-wvvqk7c/hip/CMakeFiles/ginkgo_hip.dir/base/./ginkgo_hip_generated_exception.hip.cpp.o
     1259    -- Generating dependency file: /tmp/root/spack-stage/spack-stage-ginkgo-1.5.0.glu_experimental-wvvqk7charnz364cztyfnycmsswomdag/spack-build-wvvqk7c/hip/CMakeFiles/ginkgo_hip.dir/ba
             se/ginkgo_hip_generated_exception.hip.cpp.o.depend.pre

     ...

     1459    /spack/opt/spack/linux-ubuntu20.04-x86_64/gcc-11.1.0/hipblas-5.2.1-zgpvdlm3fdkkust7b2e5xveifraxpngl/include/hipblas.h:16:9: warning: : warning : This file is deprecated. Use the he
             ader file from /spack/opt/spack/linux-ubuntu20.04-x86_64/gcc-11.1.0/hipblas-5.2.1-zgpvdlm3fdkkust7b2e5xveifraxpngl/include/hipblas/hipblas.h by using #include <hipblas/hipblas.h> [
             -W#pragma-messages]
     1460    #pragma message(": warning : This file is deprecated. Use the header file from /spack/opt/spack/linux-ubuntu20.04-x86_64/gcc-11.1.0/hipblas-5.2.1-zgpvdlm3fdkkust7b2e5xveifraxpngl/i
             nclude/hipblas/hipblas.h by using #include <hipblas/hipblas.h>")
     1461            ^
     1462    1 warning generated when compiling for host.
     1463    Generated /tmp/root/spack-stage/spack-stage-ginkgo-1.5.0.glu_experimental-wvvqk7charnz364cztyfnycmsswomdag/spack-build-wvvqk7c/hip/CMakeFiles/ginkgo_hip.dir/__/common/unified/matri
             x/./ginkgo_hip_generated_dense_kernels.cpp.o successfully.
     1464    make[2]: Leaving directory '/tmp/root/spack-stage/spack-stage-ginkgo-1.5.0.glu_experimental-wvvqk7charnz364cztyfnycmsswomdag/spack-build-wvvqk7c'
  >> 1465    make[1]: *** [CMakeFiles/Makefile2:792: hip/CMakeFiles/ginkgo_hip.dir/all] Error 2
     1466    make[1]: Leaving directory '/tmp/root/spack-stage/spack-stage-ginkgo-1.5.0.glu_experimental-wvvqk7charnz364cztyfnycmsswomdag/spack-build-wvvqk7c'
  >> 1467    make: *** [Makefile:159: all] Error 2

@trws
Copy link
Copy Markdown
Contributor

trws commented Oct 6, 2022

rocprim got moved out of the hip package recently, it's in a new separate rocprim package. We've had several packages break as a result.

@cameronrutherford
Copy link
Copy Markdown
Contributor Author

Should I just change the default for ginkgo for the moment while this branch is experimental, or do we want to keep this PR open to debug the rocprim issue?

@eugeneswalker
Copy link
Copy Markdown
Contributor

Should I just change the default for ginkgo for the moment while this branch is experimental, or do we want to keep this PR open to debug the rocprim issue?

I am perhaps not the best person to comment on this, but my intuition is that, if this is an experimental branch, it should not be the default.

@tcojean
Copy link
Copy Markdown
Contributor

tcojean commented Oct 12, 2022

Should I just change the default for ginkgo for the moment while this branch is experimental, or do we want to keep this PR open to debug the rocprim issue?

Sorry for the delay. I will look at the rocprim issue separately, but yes this branch should not be made default as it is not an actual release. 1.4.0 should still be the default release for now, or master.

auto-merge was automatically disabled October 13, 2022 01:31

Head branch was pushed to by a user without write access

@cameronrutherford
Copy link
Copy Markdown
Contributor Author

Set preferred=True for version 1.4.0. Hopefully, that fixes pipelines.

@eugeneswalker
Copy link
Copy Markdown
Contributor

Looks like everything is passing. I'll defer merge decision to you Cameron or anyone else, in case you want additional eyes from other reviewers before merging. Thanks!

Copy link
Copy Markdown
Contributor

@tcojean tcojean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@cameronrutherford
Copy link
Copy Markdown
Contributor Author

I think this is good to go

@eugeneswalker eugeneswalker merged commit e20d45f into spack:develop Oct 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants