Skip to content

Bugfixes for the HIP package on new versions of ROCm#21822

Closed
bvanessen wants to merge 2 commits intospack:developfrom
bvanessen:bugfix_hip_package
Closed

Bugfixes for the HIP package on new versions of ROCm#21822
bvanessen wants to merge 2 commits intospack:developfrom
bvanessen:bugfix_hip_package

Conversation

@bvanessen
Copy link
Copy Markdown
Contributor

Fixed the rocm-dev-libs path when HIP is an external package for newer
versions of HIP. Fixed rocm-path environment variable to point to the
ROCm path rather than the rocm-dev-libs variable. Added a definition
of the HIP_CLANG_INCLUDE_PATH, which is not properly identified
because Spack wraps the compiler up in a Spack specific path, which
prevents proper detection of the Clang compiler headers.

versions of HIP.  Fixed rocm-path environment variable to point to the
ROCm path rather than the rocm-dev-libs variable.  Added a definition
of the HIP_CLANG_INCLUDE_PATH, which is not properly identified
because Spack wraps the compiler up in a Spack specific path, which
prevents proper detection of the Clang compiler headers.
@bvanessen
Copy link
Copy Markdown
Contributor Author

@becker33 @scheibelp @haampie @balos1 Here are some fixes that enable the HIP package to work for my on LC.

@adamjstewart
Copy link
Copy Markdown
Member

@srekolam @arjun-raj-kuppala can you review?

@haampie
Copy link
Copy Markdown
Member

haampie commented Feb 21, 2021

Will take a look tomorrow!

@srekolam
Copy link
Copy Markdown
Contributor

i am trying out a few packages with this change ..right now i built [email protected],@3.8.0.. will update abt the progress in an hour or so.

@haampie
Copy link
Copy Markdown
Member

haampie commented Feb 22, 2021

@bvanessen can you share the output of the following without this change to HIP_CLANG_INCLUDE_PATH:

$ spack load hip
$ HIPCC_VERBOSE=2 hipcc --version

I'm curious why some of your variables are not set correctly with the external build

@srekolam
Copy link
Copy Markdown
Contributor

this is breaking @3.7.0 for rocblas, rocrand packages..

20 Run Build Command(s):/usr/bin/gmake cmTC_60d2b/fast && /usr/bin/gmake -f CMakeFiles/cmTC_60d2b.dir/build.make CMakeFi
les/cmTC_60d2b.dir/build
21 gmake[1]: Entering directory '/tmp/root/spack-stage/spack-stage-rocblas-3.7.0-ubamgh57ba7ufnt4amy5xvez3k5b7sni/spack-b
uild-ubamgh5/CMakeFiles/CMakeTmp'
22 Building CXX object CMakeFiles/cmTC_60d2b.dir/testCXXCompiler.cxx.o
23 /spack/opt/spack/linux-centos8-skylake_avx512/gcc-8.3.1/hip-3.7.0-itxcp66ohqd44ocz3wj7l5qjwcvvesjm/bin/hipcc -o CMa
keFiles/cmTC_60d2b.dir/testCXXCompiler.cxx.o -c /tmp/root/spack-stage/spack-stage-rocblas-3.7.0-ubamgh57ba7ufnt4amy5xvez3k
5b7sni/spack-build-ubamgh5/CMakeFiles/CMakeTmp/testCXXCompiler.cxx

24 clang-11: error: unsupported option '--rocm-path=/spack/opt/spack/linux-centos8-skylake_avx512/gcc-8.3.1/hip-3.7.0-itxcp66
ohqd44ocz3wj7l5qjwcvvesjm'
25 gmake[1]: *** [CMakeFiles/cmTC_60d2b.dir/build.make:85: CMakeFiles/cmTC_60d2b.dir/testCXXCompiler.cxx.o] Error 1
26 gmake[1]: Leaving directory '/tmp/root/spack-stage/spack-stage-rocblas-3.7.0-ubamgh57ba7ufnt4amy5xvez3k5b7sni/spack-bu
ild-ubamgh5/CMakeFiles/CMakeTmp'
27 gmake: *** [Makefile:140: cmTC_60d2b/fast] Error 2

@haampie
Copy link
Copy Markdown
Member

haampie commented Feb 22, 2021

@srekolam that must have been the case already since --rocm-path was around since before this pr.

You might want to review #21852 instead which fixes your issue and includes this PR except for the HIP_CLANG_INCLUDE_PATH variable.

@bvanessen
Copy link
Copy Markdown
Contributor Author

@bvanessen can you share the output of the following without this change to HIP_CLANG_INCLUDE_PATH:

$ spack load hip
$ HIPCC_VERBOSE=2 hipcc --version

I'm curious why some of your variables are not set correctly with the external build

Here is what I get

lbann-test-hip-zen2] (2) vanessen@corona211 [lbann_bvanessen.git] %> HIPCC_VERBOSE=2 hipcc --version
HIP_PATH=/opt/rocm-4.0.0
HIP_PLATFORM=hcc
HIP_COMPILER=clang
HIP_RUNTIME=ROCclr
ROCM_PATH=/opt/rocm-4.0.0
HIP_ROCCLR_HOME=/opt/rocm-4.0.0
HIP_CLANG_PATH=/opt/rocm-4.0.0/llvm/bin
HIP_CLANG_INCLUDE_PATH=/opt/rocm-4.0.0/llvm/lib/clang/12.0.0/include
HIP_INCLUDE_PATH=/opt/rocm-4.0.0/include
HIP_LIB_PATH=/opt/rocm-4.0.0/lib
DEVICE_LIB_PATH=/opt/rocm-4.0.0/amdgcn/bitcode
HIP version: 4.0.20496-4f163c68
clang version 12.0.0 (/data/jenkins_workspace/centos_pipeline_job_4.0/rocm-rel-4.0/rocm-4.0-23-20201214/7.7/external/llvm-project/clang dac2bfceaa8d4a90257dc8a6d58f268e172ce00e)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/rocm-4.0.0/llvm/bin

However, I also see the following build errors:

  >> 395    CMake Error in tests/CMakeLists.txt:
     396      Target "lbann" contains relative path in its INTERFACE_INCLUDE_DIRECTORIES:
     397    
     398        "HIP_CLANG_INCLUDE_PATH-NOTFOUND/.."

So I am not sure what is happening, but explicitly setting the environment variable allowed it to work for me.

@bvanessen
Copy link
Copy Markdown
Contributor Author

@srekolam that must have been the case already since --rocm-path was around since before this pr.

You might want to review #21852 instead which fixes your issue and includes this PR except for the HIP_CLANG_INCLUDE_PATH variable.

I was just able to test the current PR 21822 without manually setting the environment variable and it doesn't work for me. I can try the new PR 21852 later, but I don't expect it to work without that variable.

@haampie
Copy link
Copy Markdown
Member

haampie commented Feb 22, 2021

Ah, this issue is caused by find_package(hip) which is bad at locating clang.

Can you try the following:

'-DHIP_CXX_COMPILER={0}'.format(self.spec['hip'].hipcc)

@haampie
Copy link
Copy Markdown
Member

haampie commented Feb 22, 2021

The hip cmake script tries to link to clang's builtins lib even for host code (the hip::host target).

In spack there is a patch to drop that behavior, since it was intended to only add half precision support to device code (the hip::device target). Obviously if you use an external hip install you don't have that patch.

There's an open PR to drop this in hip itself, so you can show your support: ROCm/hip#2219

@tldahlgren tldahlgren added the bugfix Something wasn't working, here's a fix label Feb 22, 2021
@bvanessen
Copy link
Copy Markdown
Contributor Author

Ah, this issue is caused by find_package(hip) which is bad at locating clang.

Can you try the following:

'-DHIP_CXX_COMPILER={0}'.format(self.spec['hip'].hipcc)

Yes if I specifically set the HIP CXX compiler it works without defining the CLANG include path in the HIP package.

@haampie
Copy link
Copy Markdown
Member

haampie commented Feb 22, 2021

Closing this in favor of #21852

@haampie haampie closed this Feb 22, 2021
@bvanessen bvanessen deleted the bugfix_hip_package branch February 22, 2021 22:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AMD bugfix Something wasn't working, here's a fix update-package

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants