[ROCm] enable unit tests and other changes #10266

iotamudelta · 2018-08-06T18:06:12Z

This PR for the ROCm target does the following:

* enable some unit tests on ROCm
* fix a missing static_cast that breaks BatchNorm call on ROCm
* fix BatchNorm to work on ROCm w/ ROCm warp sizes etc
* improve the pyhipify script by introducing kernel scope to some transpilations and other improvements
* fix a linking issue on ROCm
* for more unit test sets: mark currently broken tests broken (to be fixed)
* enable THINLTO (phase one) to parallelize linking
* address the first failing of the elementwise kernel by removing non-working ROCm specialization

.jenkins/pytorch/build.sh

 pip install -r requirements.txt || true

 if [[ "$BUILD_ENVIRONMENT" == *rocm* ]]; then
+  # This is necessary in order to cross compile (or else we'll have missing GPU device).


aten/src/ATen/native/cuda/Loops.cuh

 __launch_bounds__(nt, 4)
-#ifdef __HIP_PLATFORM_HCC__
-__global__ void elementwise_kernel(int N, const func_t& f) {
-#else


caffe2/CMakeLists.txt

-     HIP_INCLUDE_DIRECTORIES(${Caffe2_HIP_INCLUDES})
-  ENDIF()
+  if(BUILD_ATEN)
+    # Get Compile Definitions from the directory (FindHIP.CMake bug)


test/test_autograd.py

        with self.assertRaises(RuntimeError):
            b.add_(5)

+    @unittest.skipIf(TEST_WITH_ROCM, "test doesn't currently work on the ROCm stack")


tools/amd_build/pyHIPIFY/hipify-python.py

+    """Generalization for finding a balancing closure group
+
+    e.g. if group = ["(", ")"], then finds the first balanced parentheses.
+         if group = ["{", "}"], then finds the first balanced bracket.
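
The balanced-group search described in this docstring can be sketched as follows. This is a minimal illustration, not the script's actual code: the function name `find_balanced_group`, its signature, and the `(None, None)` failure value are assumptions, and it only handles single-character delimiters.

```python
def find_balanced_group(text, start, group):
    """Return (open_index, close_index) of the first balanced group at or
    after `start`, or (None, None) if no balanced group is found.

    `group` is a two-element sequence such as ["(", ")"] or ["{", "}"].
    Hypothetical helper for illustration; single-character delimiters only.
    """
    opener, closer = group
    depth = 0
    open_index = None
    for i in range(start, len(text)):
        ch = text[i]
        if ch == opener:
            if depth == 0:
                open_index = i  # remember where the outermost group starts
            depth += 1
        elif ch == closer and depth > 0:
            depth -= 1
            if depth == 0:
                return open_index, i  # outermost group is now balanced
    return None, None


source = "foo(a, (b)) { x }"
paren_span = find_balanced_group(source, 0, ["(", ")"])
brace_span = find_balanced_group(source, 0, ["{", "}"])
```

A depth counter is enough here because only one delimiter pair is tracked at a time; a closing delimiter seen before any opener is ignored rather than treated as an error.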


tools/amd_build/pyHIPIFY/hipify-python.py

+    """If the file makes kernel builtin calls and does not include the cuda_runtime.h header,
+    then automatically add an #include to match the "magic" includes provided by NVCC.
+    TODO:
+        Update logic to ignore cases where the cuda_runtime.h is included by another file.
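
A rough sketch of the pre-include behavior this docstring describes is shown below. The builtin list, the function name `add_runtime_include`, and the exact header line that gets inserted are assumptions made for illustration; the real script may detect builtins differently or emit a HIP header instead.

```python
import re

# Assumption: a small, non-exhaustive set of names treated as kernel builtins.
KERNEL_BUILTINS = ("threadIdx", "blockIdx", "blockDim", "gridDim", "__syncthreads")

# Matches an existing include of cuda_runtime.h in either <> or "" form.
INCLUDE_RE = re.compile(
    r'^\s*#\s*include\s*[<"]cuda_runtime\.h[>"]', re.MULTILINE
)


def add_runtime_include(source):
    """Prepend an include when kernel builtins are used without the header.

    Hypothetical helper: like the TODO in the docstring, it does not notice
    when the header is already pulled in through another file.
    """
    uses_builtin = any(
        re.search(r"\b" + re.escape(name) + r"\b", source)
        for name in KERNEL_BUILTINS
    )
    if uses_builtin and not INCLUDE_RE.search(source):
        return "#include <cuda_runtime.h>\n" + source
    return source


kernel_src = "__global__ void k(float* x) { x[threadIdx.x] = 0; }"
patched_src = add_runtime_include(kernel_src)
```

Running the function a second time on its own output leaves the text unchanged, since the include is then detected.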


tools/build_pytorch_libs.sh

 else
-    LDFLAGS="$LDFLAGS -Wl,-rpath,\$ORIGIN"
+    if [[ $USE_ROCM -eq 1 ]]; then
+        LDFLAGS="$LDFLAGS -Wl,-rpath,\\\\\\\$ORIGIN"


ezyang

Some nits but I don't see any show stoppers.

facebook-github-bot

ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Summary: This PR for the ROCm target does the following:

* enable some unit tests on ROCm
* fix a missing static_cast that breaks BatchNorm call on ROCm
* fix BatchNorm to work on ROCm w/ ROCm warp sizes etc
* improve the pyhipify script by introducing kernel scope to some transpilations and other improvements
* fix a linking issue on ROCm
* for more unit test sets: mark currently broken tests broken (to be fixed)
* enable THINLTO (phase one) to parallelize linking
* address the first failing of the elementwise kernel by removing non-working ROCm specialization

Pull Request resolved: pytorch/pytorch#10266
Differential Revision: D9184178
Pulled By: ezyang
fbshipit-source-id: 03bcd1fe4ca4dd3241f09634dbd42b6a4c350297

test/test_autograd.py

-    ('gesv', (2, 3, S, S), ((2, 3, S, S),), 'batched_dims', NO_ARGS, [skipIfNoLapack]),
-    ('gesv', (2, 2, S, S), ((1, S, S),), 'batched_broadcast_A', NO_ARGS, [skipIfNoLapack]),
-    ('gesv', (1, S, S), ((2, 2, S, S),), 'batched_broadcast_b', NO_ARGS, [skipIfNoLapack]),
+    ('gesv', (S, S, S), ((S, S, S),), 'batched', NO_ARGS, [skipIfNoLapack, skipIfRocm]),


iotamudelta and others added 30 commits July 2, 2018 13:46

Merge pull request #12 from iotamudelta/master

5645923

first round of changes to update PR

Merge remote-tracking branch 'upstream/master'

91738d9

Merge pull request #14 from iotamudelta/master

dc103d4

merge from upstream

Fix spelling of parenthesis.

c4fa8af

Get rid of flush/fsync.

79b1f45

Fix enum error that pep8 revealed.

c1d40f3

Line length fix.

864dbe4

Revert "Line length fix."

aa799ef

This reverts commit 864dbe4.

Send progress info to stderr by review comment.

4ad2f9e

Change to detail? as per review.

0e14afd

Document what the macros stand for.

5dac66e

raise exception if no kernel end found.

f2ca907

Convert to enum from magic numbers for function disable mode.

9eb80c9

Merge pull request #15 from iotamudelta/master

a333147

next round of fixes to address comments

Merge remote-tracking branch 'upstream/master'

88134e7

Merge pull request #16 from iotamudelta/master

c49615d

merge from upstream

After discussion in review, disable flake8 on pyHIPIFY for now.

f3eded3

Merge pull request #17 from iotamudelta/master

2f769c4

After discussion in review, disable flake8 on pyHIPIFY for now.

Merge remote-tracking branch 'upstream/master'

d28c837

Merge pull request #18 from iotamudelta/master

f737380

Merge from pytorch upstream

Name f, not lines.

6dfcd25

Signal error with sys.exit()

acaadd9

Always bail out if the output directory already exists.

6f6152e

This will hopefully save some grief in the future with overriding code.

Strip trailing slash.

00ea583

Fix static_cast logic in hipify script for better coverage. Remove ob…

8548d75

…solete code for output directory not existing

Merge pull request #19 from iotamudelta/master

0cd2a61

Address more review comments

Remove second argument that wasn't actually required.

0298849

Don't require the root of the project to be specified.

5a4e719

We already had a fallback.

Python-ify as per review.

abfe666

Do not close file explicitly.

c8e68cd

iotamudelta and others added 3 commits August 6, 2018 12:55

Merge pull request #85 from Jorghi12/transpile_device_math

38ad0c3

Automatically handle transpilations inside device code only

Merge pull request #62 from Jorghi12/dynamic

18d0351

Automatically pre-include CUDA headers just like NVCC.

Merge branch 'master' into enableunittests

46c121c

iotamudelta requested review from apaszke, colesbury, ezyang, gchanan, soumith and zdevito as code owners August 6, 2018 18:06

jithunnair-amd and others added 2 commits August 6, 2018 13:53

Minor changes to pass flake8 tests

789cf1a

Merge pull request #100 from jithunnair-amd/enable_unit_tests_for_rocm

dd0d208

Minor changes to pass flake8 tests

ezyang reviewed Aug 6, 2018

aten/src/ATen/native/cuda/Loops.cuh

 __launch_bounds__(nt, 4)
-#ifdef __HIP_PLATFORM_HCC__
-__global__ void elementwise_kernel(int N, const func_t& f) {
-#else

ezyang reviewed Aug 6, 2018

caffe2/CMakeLists.txt

-     HIP_INCLUDE_DIRECTORIES(${Caffe2_HIP_INCLUDES})
-  ENDIF()
+  if(BUILD_ATEN)
+    # Get Compile Definitions from the directory (FindHIP.CMake bug)

ezyang reviewed Aug 6, 2018

tools/build_pytorch_libs.sh

 else
-    LDFLAGS="$LDFLAGS -Wl,-rpath,\$ORIGIN"
+    if [[ $USE_ROCM -eq 1 ]]; then
+        LDFLAGS="$LDFLAGS -Wl,-rpath,\\\\\\\$ORIGIN"

ezyang approved these changes Aug 6, 2018

facebook-github-bot reviewed Aug 6, 2018

facebook-github-bot closed this in a38b572 Aug 6, 2018

iotamudelta added a commit to iotamudelta/pytorch that referenced this pull request Aug 7, 2018

Merge remote-tracking branch 'rocm_upstream/enableunittests' into roc…

e9c047e

…RAND_PR While there, add the remaining changes requested in upstream PR pytorch#10266

ssnl reviewed Aug 10, 2018

ezyang added the open source and merged labels Jun 24, 2019

jithunnair-amd deleted the enableunittests branch September 25, 2025 16:32

8 participants