Port bmm and baddbmm from TH to ATen #42553
Conversation
💊 CI failures summary (Dr. CI): as of commit 909b366, 1 job failed on ci.pytorch.org.
zasdfgbnm left a comment:
Not finished yet. Will post more comments later.
Per @gchanan's request, ports from TH to ATen should also beef up test coverage (in particular, various discontiguity patterns on input/output, and proper runtime errors for arguments on different devices).
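A minimal sketch of the kind of coverage meant here (illustrative only, not the PR's actual tests):

```python
import torch

# Discontiguous input: a strided slice is not contiguous, but the result
# must match the contiguous computation.
a = torch.randn(4, 3, 10)[:, :, ::2]   # shape (4, 3, 5), non-contiguous
b = torch.randn(4, 5, 2)
assert not a.is_contiguous()
torch.testing.assert_close(torch.bmm(a, b), torch.bmm(a.contiguous(), b))

# Arguments on different devices must raise a clear RuntimeError.
if torch.cuda.is_available():
    try:
        torch.bmm(a.cuda(), b)          # one CUDA argument, one CPU argument
    except RuntimeError:
        pass
    else:
        raise AssertionError("expected a device-mismatch RuntimeError")
```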
@anjali411 Could you please rebase? Looks like there are lots of flaky tests.
test/test_torch.py (Outdated)
```python
@skipCUDAIf(torch.version.cuda == "10.1", "flaky on CUDA 10.1")
@onlyOnCPUAndCUDA
@dtypes(*torch.testing.get_all_fp_dtypes(), *torch.testing.get_all_complex_dtypes())
@dtypesIfCUDA(*(torch.testing.get_all_fp_dtypes(include_half=True, include_bfloat16=AMPERE_OR_ROCM) +
```
Please don't do so. We test all dtypes on purpose: if a dtype is supported, it should run correctly; if it is not supported, it should raise an error.
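A hedged sketch of that policy (not the PR's test code): every dtype either runs correctly or raises.

```python
import torch

# Every dtype must either compute a correct result or raise; it must
# never silently misbehave.
for dtype in (torch.float32, torch.float64, torch.complex64,
              torch.complex128, torch.float16, torch.bfloat16):
    a = torch.randn(2, 3, 4).to(dtype)
    b = torch.randn(2, 4, 5).to(dtype)
    try:
        out = torch.bmm(a, b)
        assert out.shape == (2, 3, 5) and out.dtype == dtype
    except RuntimeError:
        pass  # an unsupported dtype should fail loudly, not return garbage
```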
cc: @ngimel
Synced offline: the CPU bmm and baddbmm have multiple code paths; some of them support bfloat16 and float16, some don't. So depending on the input, half and bfloat16 may or may not be supported. https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/LinearAlgebra.cpp#L498
So @zasdfgbnm, @ngimel, and I agreed to add full support for torch.float16 and torch.bfloat16 in a follow-up PR and leave this one as is.
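A hedged probe sketch of the point above (`bmm_supported` is a hypothetical helper, not PyTorch API): support can only be answered per concrete call.

```python
import torch

# Because CPU bmm dispatches to different code paths, half/bfloat16
# support may depend on the concrete inputs; this probe only reports
# whether one particular call succeeds.
def bmm_supported(dtype, batch=2, m=3, k=4, n=5):
    a = torch.ones(batch, m, k, dtype=dtype)
    b = torch.ones(batch, k, n, dtype=dtype)
    try:
        torch.bmm(a, b)
        return True
    except RuntimeError:
        return False

print(bmm_supported(torch.float16), bmm_supported(torch.bfloat16))
```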
zasdfgbnm left a comment:
LGTM! Thanks for working on this!
Codecov Report

```diff
@@                 Coverage Diff                 @@
##           gh/anjali411/46/base   #42553   +/- ##
===================================================
  Coverage               81.22%    81.22%
===================================================
  Files                    1837      1837
  Lines                  198087    198087
===================================================
+ Hits                   160893    160897       +4
+ Misses                  37194     37190       -4
```
@anjali411 merged this pull request in e1ee3bf.
Summary: Now that #42553 is merged, we can delete a bit of code from the tests and enable some of the skipped complex tests. Unfortunately, `test_pinverse_complex_xfailed` and `test_symeig_complex_xfailed` had bugs, and it wasn't caught automatically that these tests xpass. We need to be careful next time with `unittest.expectedFailure`.
Pull Request resolved: #47910
Reviewed By: zhangguanheng66
Differential Revision: D25052130
Pulled By: mruberry
fbshipit-source-id: 29512995c024b882f9cb78b7bede77733d5762d0
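For context on the `unittest.expectedFailure` pitfall mentioned in the summary, a minimal sketch:

```python
import unittest

class Demo(unittest.TestCase):
    @unittest.expectedFailure
    def test_already_fixed(self):
        self.assertEqual(1 + 1, 2)  # passes -> an "unexpected success"

# Depending on how the runner reports unexpected successes, a fixed bug
# can leave a stale expectedFailure marker behind unnoticed.
if __name__ == "__main__":
    unittest.main()
```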
```cpp
/* LEVEL 3 BLAS FUNCTIONS */

#ifndef __HIP_PLATFORM_HCC__
#if defined(CUDA_VERSION) && CUDA_VERSION >= 11200
```
Is this macro `CUDA_VERSION >= 11200` intended? If you mean CUDA 11.2, it should be 11020. I'm not sure CUDA 11.2 was a thing back in November 2020. 😅
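For reference, `CUDA_VERSION` encodes `major.minor` as `major * 1000 + minor * 10`; a quick sanity check (a sketch, not PyTorch code):

```python
# CUDA_VERSION packs major.minor as major * 1000 + minor * 10,
# so CUDA 11.2 is 11020; 11200 would mean CUDA 11.20.
def cuda_version_macro(major: int, minor: int) -> int:
    return major * 1000 + minor * 10

assert cuda_version_macro(11, 2) == 11020
assert cuda_version_macro(10, 1) == 10010
```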
No harm done, workaround is good.
@xwang233 No, my bad! We should fix that to avoid confusion in the future.
Stack from ghstack:
Ports `torch.bmm` and `torch.baddbmm` from TH to ATen and adds support for complex dtypes. Also removes dead TH code for Level 2 functions.

Closes #24539
Differential Revision: D24893511
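A minimal usage sketch of the ported ops with a complex dtype (illustrative, not taken from the PR):

```python
import torch

b1 = torch.randn(10, 3, 4, dtype=torch.complex64)
b2 = torch.randn(10, 4, 5, dtype=torch.complex64)

out = torch.bmm(b1, b2)  # batched matmul -> shape (10, 3, 5)

# baddbmm(input, batch1, batch2): beta * input + alpha * bmm(batch1, batch2)
base = torch.randn(10, 3, 5, dtype=torch.complex64)
res = torch.baddbmm(base, b1, b2, beta=0.5, alpha=2.0)
torch.testing.assert_close(res, 0.5 * base + 2.0 * torch.bmm(b1, b2))
```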