Conversation

@skrah
Contributor

@skrah skrah commented May 28, 2019

Fixes #18275.

@skrah
Contributor Author

skrah commented May 29, 2019

@pytorchbot retest this please.

@skrah skrah changed the title from "[WIP] Add a 'dim' argument to nuclear norm" to "Add a 'dim' argument to nuclear norm" May 30, 2019
@skrah
Contributor Author

skrah commented May 30, 2019

@soumith @vishwakftw I've implemented the 'dim' argument for nuclear norm (#18275).
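For illustration, a minimal usage sketch of what the new argument enables (the tensor shapes here are made up):

    import torch

    x = torch.randn(4, 5, 3)
    # nuclear norm of each 5x3 slice: dims 1 and 2 are reduced, leaving shape (4,)
    per_batch = torch.norm(x, p='nuc', dim=(1, 2))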

As @vishwakftw has mentioned, this requires a batch_svd() implementation. I've added a simple one that has about the same performance characteristics as np.linalg.svd().

There are some open issues about optimized batch_svd() implementations. Perhaps it would be easier to proceed with those after at::svd() has been ported to ATen. So I left the batch_svd() implementation as a static function exclusively for nuclear_norm().
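For reference, a hypothetical Python-level sketch of what a batched nuclear norm computes; the PR's actual batch_svd() is a static C++ function in ATen, and the name naive_nuclear_norm below is made up:

    import torch

    def naive_nuclear_norm(x, dim=(-2, -1)):
        # move the two reduced dims to the end, flatten the batch dims,
        # then sum the singular values of each matrix
        d0, d1 = (d % x.dim() for d in dim)
        batch_dims = [i for i in range(x.dim()) if i not in (d0, d1)]
        mats = x.permute(*batch_dims, d0, d1).reshape(-1, x.size(d0), x.size(d1))
        sums = torch.stack([torch.svd(m, compute_uv=False)[1].sum() for m in mats])
        return sums.reshape([x.size(i) for i in batch_dims])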

@skrah skrah requested a review from soumith May 30, 2019 11:58
@vishwakftw
Contributor

vishwakftw commented May 30, 2019

I’ll start working on implementing batch SVD in a week or two. It would be safe to mark the batch_svd function as internal (with an underscore preceding the name), but I don’t think it’s exposed in Python in this PR anyway.

@ezyang
Contributor

ezyang commented Jun 6, 2019

@vishwakftw If you think this is good to go, I can merge it.

@ezyang ezyang added the triaged label (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Jun 6, 2019
@vishwakftw
Contributor

@ezyang I should be able to take a look at this tomorrow (it’s a bit late back here). Hope that’s fine.

Contributor

@vishwakftw vishwakftw left a comment

Looks good so far. I noticed that there are no CUDA tests available; could you add them too?

@skrah
Contributor Author

skrah commented Jun 8, 2019

@pytorchbot retest this please.

1 similar comment
@skrah
Contributor Author

skrah commented Jun 9, 2019

@pytorchbot retest this please.

Contributor

@vishwakftw vishwakftw left a comment

Looks good to me except a few nits. Once those are addressed, this should be good to land.

    perm.push_back(i);
  }

  TORCH_CHECK(perm.size() == ndim,
Contributor

Neat check!

@pytorchbot pytorchbot added the module: cuda label (Related to torch.cuda, and CUDA support in general) Jun 9, 2019
def _test_nuclear_norm_axes(self, device='cpu'):
    def check_single_nuclear_norm(x, axes):
        if x.is_cuda and randrange(100) < 95:
            return  # too many cpu <==> gpu copies
Contributor

@vishwakftw vishwakftw Jun 10, 2019

What is the purpose of randrange?

If you are concerned about too many copies, you can create a tensor on the CPU, pass it to this function, and then explicitly create a copy of the tensor on the device you are testing.

Something like this:

x_device = x.to(device=device, copy=True) if device != 'cpu' else x  # x is a CPU tensor invariably

Contributor Author

The tests are too slow, so some are skipped (they all pass here on my machine without randrange).

Contributor Author

So I think that the CUDA tests spend the majority of the time copying from GPU ==> CPU so that numpy arrays can be created and the results can be compared.

Contributor Author

Currently there are three copies when x is already on the GPU:

  1. For creating the numpy array.

  2. For copying the first result to CPU for allclose().

  3. For copying the second result to CPU for allclose().

One can reduce this to two copies by copying expected to the GPU, but it would probably not be much better.

self.assertTrue(np.allclose(ans.cpu(), expected, rtol=1e-04, atol=1e-04))
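A hypothetical sketch of that two-copy variant (expected_t is a made-up name), comparing on the device instead of calling ans.cpu():

    # copy `expected` across once and compare on-device
    expected_t = torch.from_numpy(expected).to(device=ans.device, dtype=ans.dtype)
    self.assertTrue(torch.allclose(ans, expected_t, rtol=1e-04, atol=1e-04))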

for n in range(1, 5):
    for m in range(1, 5):
Contributor

@vishwakftw vishwakftw Jun 10, 2019

I think these ranges create too many test cases, which is slowing down the tests. In my opinion, three kinds of test cases should suffice:

  • square matrices
  • tall matrices
  • fat matrices

Square matrices could be of dimensions (3, 3), fat matrices could be of dimensions (3, 5) and tall matrices could be of dimensions (5, 3). In addition to these, we could have at most 2 batch dimensions (5, *, *) and (7, 5, *, *). This adds up to 9 test cases. In your case there are (2 + 4) * (4 * 4) = 96 test cases (just considering the sizes of the tensors).
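A short sketch of that parametrization (the variable names are made up):

    # 3 matrix shapes x 3 batch layouts (none, one, two batch dims) = 9 cases
    matrix_shapes = [(3, 3), (5, 3), (3, 5)]   # square, tall, fat
    batch_prefixes = [(), (5,), (7, 5)]
    test_shapes = [b + m for b in batch_prefixes for m in matrix_shapes]
    assert len(test_shapes) == 9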

Contributor Author

With a release build, probably including pytest test collection:

CPU

$ /home/stefan/rel2/bin/python3 -m pytest test_torch.py -k test_sum_dim -v
...
1 passed, 494 deselected in 19.41 seconds
$ /home/stefan/rel2/bin/python3 -m pytest test_torch.py -k test_nuclear_norm -v
...
2 passed, 493 deselected in 2.05 seconds

CUDA

$ /home/stefan/rel2/bin/python3 -m pytest test_cuda.py -k test_det_logdet_slogdet -v
...
1 passed, 169 deselected in 6.37 seconds

95% random skip (with 98% random skip it is 4.74s):

$ /home/stefan/rel2/bin/python3 -m pytest test_cuda.py -k nuclear_norm -v
...
2 passed, 168 deselected in 7.07 seconds 

So CPU is still relatively fast. On CUDA it is definitely one of the slower tests, but not a complete outlier. The benefit of brute force is that it catches issues like #20452.

Are random skips not allowed in the tests? In the CPython test suite they are allowed, but practices differ, of course.

Contributor

I don’t think random skips are allowed. However, there is a @slowTest decorator available, introduced in #18231.
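For reference, a sketch of how that decorator is applied (import path as it was in the 2019 test suite; the class and test names are made up):

    from common_utils import TestCase, slowTest  # test/common_utils.py at the time

    class TestNuclearNorm(TestCase):
        @slowTest  # skipped unless PYTORCH_TEST_WITH_SLOW=1 is set
        def test_nuclear_norm_exhaustive(self):
            ...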

Contributor

@ezyang do you have any suggestions?

Contributor Author

With 58c000c the CPU tests take 0.7s and the CUDA tests (with 95% skips) 2.4s.

Contributor

@vishwakftw vishwakftw left a comment

Everything looks neat, thank you for this. I’ll wait for @ezyang to give his thoughts about randomly skipping tests.

@ezyang
Contributor

ezyang commented Jun 10, 2019 via email

@ezyang
Contributor

ezyang commented Jun 10, 2019

I'm OK with landing this straight up

Contributor

@facebook-github-bot facebook-github-bot left a comment

@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@skrah
Contributor Author

skrah commented Jun 10, 2019

> Yes, please use slowTest :)

Fine, next time. :)

Thanks @vishwakftw for reviewing and @ezyang for landing this!

zdevito pushed a commit to zdevito/ATen that referenced this pull request Jun 10, 2019
Summary:
Addresses #18275.
Pull Request resolved: pytorch/pytorch#21022

Differential Revision: D15743515

Pulled By: ezyang

fbshipit-source-id: e4aaea0bd7f863a2abad45c4322d6a9fb02a88e3
@facebook-github-bot
Contributor

@ezyang merged this pull request in 8b9b215.

Labels

• Merged
• module: cuda (Related to torch.cuda, and CUDA support in general)
• open source
• triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Development

Successfully merging this pull request may close these issues:

torch.norm does not work when p="nuc" (#18275)

6 participants