Added CUDA support for complex input for torch.inverse #45034

IvanYashchuk · 2020-09-19T19:19:05Z

torch.inverse now works for complex inputs on GPU.
Test cases with complex matrices are xfailed for now. For example, batched matmul does not work with complex yet.

Ref. #33152

dr-ci · 2020-09-19T19:40:34Z

💊 CI failures summary and remediations

As of commit 55ad7fc (more details on the Dr. CI page):

2/8 failures possibly* introduced in this PR
- 2/2 non-CircleCI failure(s)
6/8 broken upstream at merge base 9e0102c since Nov 08

1 failure confirmed as flaky and can be ignored:

pytorch_linux_xenial_py3_clang7_onnx_ort_test2

🚧 6 ongoing upstream failures:

These were probably caused by upstream breakages that are not fixed yet:

pytorch_linux_xenial_py3_6_gcc5_4_test since Nov 08
- 🔁 rerun
binary_linux_libtorch_3_7m_cpu_devtoolset7_shared-with-deps_build since Nov 06
- 🔁 rerun
pytorch_linux_bionic_py3_8_gcc9_coverage_test1 since Nov 08
- 🔁 rerun
pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test1 since Nov 08
- 🔁 rerun
pytorch_linux_xenial_py3_clang5_asan_test1 since Nov 08
- 🔁 rerun
pytorch_linux_bionic_py3_6_clang9_test since Nov 08
- 🔁 rerun

Extra GitHub checks: 2 failed

Failed: GitHub Actions - flake8-py3
Failed: GitHub Actions - flake8-py3

This comment was automatically generated by Dr. CI (expand for details).

Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group.

See how this bot performed.

This comment has been revised 76 times.

test/test_torch.py

xwang233 · 2020-09-22T20:45:53Z

test/test_torch.py

I think currently single-inverse is on cusolver, but batched inverse is mostly on MAGMA. cublas batched inverse performance is too bad, and is not used.

Different implementations shouldn't give these large differences. Can you try running tests only in double precision, and see if the numerical difference is still this large?

I think currently single-inverse is on cusolver, but batched inverse is mostly on MAGMA.

Indeed, I mixed them up.

I will check the tolerances.

Double precision is fine and tests pass for low tolerance (atol=1e-11 in my case).
But for floats, individual entries can vary significantly:

Tensors failed to compare as equal! With rtol=0 and atol=0.1, found 2553 element(s) (out of 393216) whose difference(s) exceeded the margin of error (including 0 nan comparisons). The greatest difference was 0.37847900390625 (820.0460815429688 vs. 819.6676025390625), which occurred at index (1, 1, 254, 71).

I think MAGMA and cuSOLVER can give these large differences as long as the identity == inv(A) A tests pass

self.assertEqual(identity.expand_as(matrix), torch.matmul(matrix, matrix_inverse)) self.assertEqual(identity.expand_as(matrix), torch.matmul(matrix_inverse, matrix))

xwang233

Thanks for the work! The PR generally looks good, other than the tolerance in tests.

xwang233

Thanks! The cusolver implementations LGTM. Can you rebase and make sure all relevant tests passed?

IvanYashchuk · 2020-09-25T09:13:35Z

So MAGMA vs cuSOLVER (batched vs single inverse) fails on Windows with

Tensors failed to compare as equal! With rtol=0 and atol=0.1, found 19138 element(s) (out of 393216) whose difference(s) exceeded the margin of error (including 0 nan comparisons). The greatest difference was 1.11126708984375 (819.3917236328125 vs. 818.2804565429688)

I wonder what should we do with this check.

xwang233 · 2020-09-25T23:28:06Z

I think it's mostly due to precisions? When the original matrix is in the range of randn, which is mostly small numbers like -1~1. Its inverse could contain large numbers (such as 819 you are seeing) and cause numerical inaccuracy in float32 precision.

very slow

for fp32 batched inverse

codecov · 2020-10-02T10:12:48Z

Codecov Report

Merging #45034 into master will increase coverage by 46.17%.
The diff coverage is 100.00%.

@@             Coverage Diff             @@
##           master   #45034       +/-   ##
===========================================
+ Coverage   35.28%   81.45%   +46.17%     
===========================================
  Files         443     1798     +1355     
  Lines       56512   188195   +131683     
===========================================
+ Hits        19938   153297   +133359     
+ Misses      36574    34898     -1676

anjali411

lgtm thanks @IvanYashchuk

IvanYashchuk · 2020-10-27T18:29:20Z

I've fixed merge conflicts.

IvanYashchuk · 2020-10-29T08:18:24Z

@mruberry, @anjali411 could you import this pull request?

facebook-github-bot · 2020-10-30T17:25:38Z

Hi @IvanYashchuk!

Thank you for your pull request. We require contributors to sign our Contributor License Agreement, and yours needs attention.

You currently have a record in our system, but we do not have a signature on file.

In order for us to review and merge your code, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

IvanYashchuk · 2020-11-02T09:56:20Z

I've fixed new merge conflicts.
@mruberry, @anjali411 this PR should be ready now for merging.

facebook-github-bot

@anjali411 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

anjali411 · 2020-11-05T15:41:06Z

Hi @IvanYashchuk can you rebase and resolve the merge conflict?

facebook-github-bot

@anjali411 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot · 2020-11-06T01:13:26Z

@anjali411 merged this pull request in 33acbed.

ezyang · 2020-11-06T15:13:23Z

It looks like this is failing slow tests: https://app.circleci.com/pipelines/github/pytorch/pytorch/235866/workflows/ce4b1d8f-f73d-4853-b153-3d842c6bfe15/jobs/8743650

IvanYashchuk · 2020-11-06T17:17:34Z

Failing tests should be fixed now with a770132.

anjali411 · 2020-11-06T19:21:51Z

@IvanYashchuk can you create a new PR for reland?

Summary: `torch.inverse` now works for complex inputs on GPU. Opening a new PR here. The previous PR was merged and reverted due to a bug in tests marked with `slowTest`. Previous PR #45034 Ref. #33152 Pull Request resolved: #47595 Reviewed By: navahgar Differential Revision: D24840955 Pulled By: anjali411 fbshipit-source-id: ec49fffdc4b3cb4ae7507270fa24e127be14f59b

IvanYashchuk added module: complex Related to complex number support in PyTorch module: linear algebra Issues related to specialized linear algebra operations in PyTorch; includes matrix multiply matmul labels Sep 19, 2020

IvanYashchuk mentioned this pull request Sep 19, 2020

Complex Numbers Support #33152

Closed

pytorchbot added the open source label Sep 19, 2020

ailzhang requested review from anjali411 and vishwakftw September 22, 2020 20:08

ailzhang added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Sep 22, 2020

vishwakftw requested a review from xwang233 September 22, 2020 20:27

xwang233 reviewed Sep 22, 2020

View reviewed changes

test/test_torch.py Outdated Show resolved Hide resolved

xwang233 reviewed Sep 22, 2020

View reviewed changes

xwang233 approved these changes Sep 25, 2020

View reviewed changes

IvanYashchuk force-pushed the inverse-complex-cuda branch from a997594 to 1aa60da Compare September 28, 2020 17:14

IvanYashchuk added 6 commits September 29, 2020 01:14

Added complex inverse, magma version

d383c08

Added complex support for cusolver version of inverse

d1ec23e

Modified test_inverse for float32. Complex dtypes are xfailed.

bb99e82

Using SVD on CPU and .to(device) as per xwang233's comment SVD on GPU is

b490652

very slow

Updated comment about batched inverse check

d3056c5

Use higher rtol to account for differences in MAGMA and cuSOLVER

a298f93

for fp32 batched inverse

IvanYashchuk force-pushed the inverse-complex-cuda branch from 1aa60da to a298f93 Compare September 29, 2020 06:15

Bump tolerances once again

5a6650f

mruberry self-requested a review October 5, 2020 06:42

mruberry mentioned this pull request Oct 5, 2020

torch.linalg in PyTorch 1.10 tracker #42666

Closed

4 tasks

This was referenced Oct 14, 2020

Added linalg.cholesky #46083

Closed

Added linalg.tensorinv #45969

Closed

Merge remote-tracking branch 'upstream/master' into inverse-complex-cuda

2932662

IvanYashchuk added 2 commits October 26, 2020 05:52

Changed 'cuda' in device -> self.device_type == 'cuda'

291f8e6

Merge remote-tracking branch 'upstream/master' into inverse-complex-cuda

c6da06b

anjali411 approved these changes Oct 26, 2020

View reviewed changes

Merge remote-tracking branch 'upstream/master' into inverse-complex-cuda

f48d9f2

facebook-github-bot added the cla signed label Oct 31, 2020

Merge remote-tracking branch 'upstream/master' into inverse-complex-cuda

77c3c40

Merge branch 'master' into inverse-complex-cuda

a2c77cf

facebook-github-bot reviewed Nov 4, 2020

View reviewed changes

Merge remote-tracking branch 'upstream/master' into inverse-complex-cuda

042c54d

facebook-github-bot reviewed Nov 5, 2020

View reviewed changes

facebook-github-bot closed this in 33acbed Nov 6, 2020

facebook-github-bot added the Merged label Nov 6, 2020

IvanYashchuk mentioned this pull request Nov 6, 2020

Added computing matrix condition numbers (linalg.cond) #45832

Closed

IvanYashchuk added 2 commits November 6, 2020 11:14

Merge remote-tracking branch 'upstream/master' into inverse-complex-cuda

d5efc0c

Fixed atol=0 for test_inverse_many_batches

a770132

IvanYashchuk reopened this Nov 6, 2020

Merge remote-tracking branch 'upstream/master' into inverse-complex-cuda

55ad7fc

IvanYashchuk closed this Nov 9, 2020

IvanYashchuk mentioned this pull request Nov 9, 2020

Added CUDA support for complex input for torch.inverse #2 #47595

Closed

Added CUDA support for complex input for torch.inverse #45034

Added CUDA support for complex input for torch.inverse #45034

Uh oh!

Conversation

IvanYashchuk commented Sep 19, 2020

Uh oh!

dr-ci bot commented Sep 19, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

💊 CI failures summary and remediations

🚧 6 ongoing upstream failures:

Extra GitHub checks: 2 failed

Uh oh!

Uh oh!

xwang233 Sep 22, 2020

Choose a reason for hiding this comment

Uh oh!

IvanYashchuk Sep 23, 2020

Choose a reason for hiding this comment

Uh oh!

IvanYashchuk Sep 24, 2020

Choose a reason for hiding this comment

Uh oh!

xwang233 left a comment

Choose a reason for hiding this comment

Uh oh!

xwang233 left a comment

Choose a reason for hiding this comment

Uh oh!

IvanYashchuk commented Sep 25, 2020

Uh oh!

xwang233 commented Sep 25, 2020

Uh oh!

codecov bot commented Oct 2, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

anjali411 left a comment

Choose a reason for hiding this comment

Uh oh!

IvanYashchuk commented Oct 27, 2020

Uh oh!

IvanYashchuk commented Oct 29, 2020

Uh oh!

facebook-github-bot commented Oct 30, 2020

Uh oh!

IvanYashchuk commented Nov 2, 2020

Uh oh!

facebook-github-bot left a comment

Choose a reason for hiding this comment

Uh oh!

anjali411 commented Nov 5, 2020

Uh oh!

facebook-github-bot left a comment

Choose a reason for hiding this comment

Uh oh!

facebook-github-bot commented Nov 6, 2020

Uh oh!

ezyang commented Nov 6, 2020

Uh oh!

IvanYashchuk commented Nov 6, 2020

Uh oh!

anjali411 commented Nov 6, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

9 participants

dr-ci bot commented Sep 19, 2020 •

edited

Loading

codecov bot commented Oct 2, 2020 •

edited

Loading