Support ROCm 6 #8319

Merged
kmaehashi merged 10 commits into cupy:main from Azusachan:rocm-6.0 on Sep 18, 2024

Conversation

@Azusachan
Contributor

ROCm 6 introduced several changes to its API. In particular:

  • Removal of gcnarch from the hipDeviceProp_t structure
  • Renaming of ‘memoryType’ to ‘type’ in the hipPointerAttribute_t structure

This patch adds support for ROCm 6.0.0 and above.

@Azusachan closed this May 2, 2024
@Azusachan reopened this May 2, 2024
@kmaehashi self-assigned this May 7, 2024
@kmaehashi added the cat:enhancement, to-be-backported, and prio:medium labels May 7, 2024
@littlewu2508
Contributor

littlewu2508 commented May 23, 2024

cupy_backends/cuda/libs/_cnvrtc.pxi also needs to be updated to avoid an nvrtc.getVersion error (thanks to @Berrysoft):

From 05233251a78e86bd269f79272561de22991843a1 Mon Sep 17 00:00:00 2001
From: Yiyang Wu <[email protected]>
Date: Thu, 23 May 2024 20:41:14 +0800
Subject: [PATCH] Add ROCm 6 in runtime_version

---
 cupy_backends/cuda/libs/_cnvrtc.pxi | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/cupy_backends/cuda/libs/_cnvrtc.pxi b/cupy_backends/cuda/libs/_cnvrtc.pxi
index 9f02b5522..b2b06aa4f 100644
--- a/cupy_backends/cuda/libs/_cnvrtc.pxi
+++ b/cupy_backends/cuda/libs/_cnvrtc.pxi
@@ -139,5 +139,8 @@ cdef SoftLink _get_softlink():
         elif runtime_version < 6_00_00000:
             # ROCm 5.x
             libname = 'libamdhip64.so.5'
+        elif runtime_version < 7_00_00000:
+            # ROCm 6.x
+            libname = 'libamdhip64.so.6'
 
     return SoftLink(libname, prefix, mandatory=True)
-- 
2.44.0
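
The selection that this diff extends can be sketched in plain Python. This is an illustration only: the function name is hypothetical, the explicit lower bound for ROCm 5.x is an assumption (the original relies on earlier branches that are outside the diff context), and any other version is treated as unhandled rather than guessed at.

```python
def hip_soname(runtime_version: int) -> str:
    """Map a HIP runtime version integer to a libamdhip64 soname (sketch)."""
    if 5_00_00000 <= runtime_version < 6_00_00000:
        # ROCm 5.x
        return 'libamdhip64.so.5'
    if 6_00_00000 <= runtime_version < 7_00_00000:
        # ROCm 6.x (the branch added by the patch)
        return 'libamdhip64.so.6'
    # Other versions belong to branches not shown in the patch context.
    raise ValueError(f'unhandled HIP runtime version: {runtime_version}')
```

With only the ROCm 5.x branch present, a 6.x runtime would match no branch at all, which is consistent with the error described above.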

@Berrysoft
Contributor

@littlewu2508 the comment should be # ROCm 6.x :)

@littlewu2508
Contributor

the comment should be # ROCm 6.x :)

Thanks, I've edited the patch

Comment thread cupy_backends/hip/cupy_hip_runtime.h Outdated
ROCm 6 introduced various changes to its API. In particular,
* Removal of gcnarch from the hipDeviceProp_t structure
* Renaming of ‘memoryType’ in the hipPointerAttribute_t structure to ‘type’

This patch allows CuPy to be built on this version.
@Azusachan
Contributor Author

Rebased on 13.2.0.

@littlewu2508 mentioned this pull request Aug 6, 2024

@ax3l left a comment

Thank you!

I am trying to install CuPy on a ROCm 6.1 machine (early access for El Capitan at LLNL), and this patch addresses the configure and compile errors I encountered. Can this patch be merged? :)

@ax3l

ax3l commented Aug 24, 2024

cc @takagi @jglaser

@kmaehashi
Member

Hi @Azusachan, thank you so much for the contribution, and sorry for keeping you waiting! I have verified the build succeeds with this PR, and of course, happy to merge this one to support ROCm 6.x in CuPy.

A roadblock I faced when testing this PR was that I couldn't launch a kernel in my environment with ROCm 6.2. Has anyone experienced or resolved this kind of issue?

>>> cupy.arange(10)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/maehashi/Development/cupy/cupy/_creation/ranges.py", line 60, in arange
    _arange_ufunc(typ(start), typ(step), ret, dtype=dtype)
  File "cupy/_core/_kernel.pyx", line 1375, in cupy._core._kernel.ufunc.__call__
    kern = self._get_ufunc_kernel(dev_id, op, arginfos, has_where)
  File "cupy/_core/_kernel.pyx", line 1402, in cupy._core._kernel.ufunc._get_ufunc_kernel
    kern = _get_ufunc_kernel(
  File "cupy/_core/_kernel.pyx", line 1082, in cupy._core._kernel._get_ufunc_kernel
    return _get_simple_elementwise_kernel(
  File "cupy/_core/_kernel.pyx", line 94, in cupy._core._kernel._get_simple_elementwise_kernel
    return _get_simple_elementwise_kernel_from_code(name, code, options)
  File "cupy/_core/_kernel.pyx", line 82, in cupy._core._kernel._get_simple_elementwise_kernel_from_code
    module = compile_with_cache(code, options)
  File "cupy/_core/core.pyx", line 2258, in cupy._core.core.compile_with_cache
    return cuda.compiler._compile_module_with_cache(
  File "/home/maehashi/Development/cupy/cupy/cuda/compiler.py", line 480, in _compile_module_with_cache
    return _compile_with_cache_hip(
  File "/home/maehashi/Development/cupy/cupy/cuda/compiler.py", line 930, in _compile_with_cache_hip
    mod.load(binary)
  File "cupy/cuda/function.pyx", line 263, in cupy.cuda.function.Module.load
    cpdef load(self, bytes cubin):
  File "cupy/cuda/function.pyx", line 264, in cupy.cuda.function.Module.load
    runtime._ensure_context()
  File "cupy_backends/cuda/api/runtime.pyx", line 1022, in cupy_backends.cuda.api.runtime._ensure_context
    memGetInfo()
  File "cupy_backends/cuda/api/runtime.pyx", line 593, in cupy_backends.cuda.api.runtime.memGetInfo
    check_status(status)
  File "cupy_backends/cuda/api/runtime.pyx", line 146, in cupy_backends.cuda.api.runtime.check_status
    raise CUDARuntimeError(status)
cupy_backends.cuda.api.runtime.CUDARuntimeError: hipErrorInvalidValue: invalid argument

Also cc-ing AMD people: @AdrianAbeyta @pnunna93 @lcskrishna @bmedishe @shbiswas834

Comment thread cupy_backends/cuda/api/runtime.pyx Outdated
Comment thread docs/source/install.rst Outdated
@kmaehashi
Member

A roadblock I faced when testing this PR was that I couldn't launch a kernel in my environment with ROCm 6.2. Has anyone experienced or resolved this kind of issue?

OK, my GPU was too old to run ROCm 6.0. The problem disappeared with gfx908.

@kmaehashi
Member

/test mini

kmaehashi previously approved these changes Sep 17, 2024
@kmaehashi
Member

/test mini

Labels

cat:enhancement, prio:medium, to-be-backported

Projects

None yet

5 participants