Port torch.copysign method_tests() to OpInfo #54945
Conversation
💊 CI failures summary and remediations

As of commit 27eb2e1 (more details on the Dr. CI page):

💚 💚 Looks good so far! There are no failures yet. 💚 💚

This comment was automatically generated by Dr. CI.
Codecov Report
@@            Coverage Diff             @@
##           master   #54945      +/-   ##
==========================================
- Coverage   77.44%   77.23%   -0.21%
==========================================
  Files        1893     1893
  Lines      186472   186477       +5
==========================================
- Hits       144404   144033     -371
- Misses      42068    42444     +376
- func: copysign.Scalar_out(Tensor self, Scalar other, *, Tensor(a!) out) -> Tensor(a!)
  dispatch:
    CPU, CUDA: copysign_out
The implementation is

Tensor& copysign_out(const Tensor& self, const Scalar& other, Tensor& result) {
  return at::copysign_out(result, self, wrapped_scalar_tensor(other));
}

which is fully dispatched, so CPU, CUDA is overly conservative; CompositeImplicitAutograd would be OK. Even better, though, would be to make the kernel structured (not in this PR, though; see https://github.com/pytorch/rfcs/blob/rfc-0005/RFC-0005-structured-kernel-definitions.md). If you are interested in making it structured, no action is necessary here; structured will fix this up later.
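The redispatch pattern described above can be sketched in plain Python (a hypothetical analogue using math.copysign; copysign_tensor and copysign_scalar are illustrative names, not PyTorch APIs): the Scalar overload simply wraps the scalar and forwards to the tensor-tensor entry point, so it never needs a device-specific kernel of its own.

```python
import math

def copysign_tensor(xs, ys):
    # stand-in for the tensor-tensor kernel: elementwise copysign
    return [math.copysign(x, y) for x, y in zip(xs, ys)]

def copysign_scalar(xs, other):
    # stand-in for wrapped_scalar_tensor: broadcast the scalar, then
    # forward to the tensor-tensor implementation (the "redispatch")
    return copysign_tensor(xs, [other] * len(xs))

print(copysign_scalar([1.0, -2.0, 3.0], -0.0))  # [-1.0, -2.0, -3.0]
```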
Thank you so much for reviewing this PR; I'm really interested in the structured kernel.
A new PR #55040 has been created for porting torch.copysign to structured; please take a look.
        low=None, high=None,
        requires_grad=requires_grad)
else:
    return case
There isn't really any non-tuple case for you to hit in the examples below, right?
@mruberry is there a more well known function that's supposed to be used in this case?
@ezyang The non-tuple cases are hit when constructing the args of SampleInput; please refer to https://github.com/pytorch/pytorch/pull/54945/files#r605009953.
I think this branch of checking for a tuple causes the ambiguity.
It's nearly midnight in my timezone and it took some hours to build PyTorch from source, so I will try improving the readability of this PR tomorrow.
Overall this PR looks good, but I am not an OpInfo expert, so I would like @mruberry to take a look.
return [SampleInput(_make_case(lhs), args=(_make_case(rhs),))
        for lhs, rhs in cases]
Besides using _make_case(lhs) to generate the input tensor of SampleInput, _make_case(rhs) is used for args as well.
The rhs corresponds to the second item of each tuple element.
Hence its values are 3.14, 0.0, and -0.0 for the corresponding scalar cases.
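As a side note on why both 0.0 and -0.0 appear as scalar cases: copysign takes only the sign bit of its second argument, and signed zeros differ in exactly that bit. A minimal illustration with the stdlib math.copysign (used here as an analogue for torch.copysign):

```python
import math

# The result keeps the magnitude of the first argument and takes the sign
# bit of the second; 0.0 and -0.0 differ only in that sign bit.
pos = math.copysign(3.14, 0.0)
neg = math.copysign(3.14, -0.0)
print(pos, neg)  # 3.14 -3.14
```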
This PR has been updated with the following changes.
I think this PR is ready for another look.
Summary: Related #54945. This PR ports `copysign` to structured, and the `copysign.Scalar` overloads are re-dispatched to the structured kernel.
Pull Request resolved: #55040
Reviewed By: glaringlee
Differential Revision: D27465501
Pulled By: ezyang
fbshipit-source-id: 5cbabfeaaaa7ca184ae0b701b9692a918a90b117
mruberry left a comment
LGTM! Nice simplification, @RockingJavaBean.
@mruberry has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
# broadcast rhs
(_make_tensor(S, S, S), _make_tensor(S, S)),
# broadcast lhs
(_make_tensor(S, S), _make_tensor(S, S, S)),
@RockingJavaBean @mruberry Any idea how this is working? AFAIK this shouldn't work. Reference: #50747
Surprisingly, I haven't seen any related failed test on master.
But trying to run locally, I get the error as expected.
============================================================================= FAILURES =============================================================================
________________________________________________ TestCommonCPU.test_variant_consistency_eager_copysign_cpu_float32 _________________________________________________
Traceback (most recent call last):
File "/home/kshiteej/Pytorch/pytorch_inplace_broadcast_test/test/test_ops.py", line 306, in test_variant_consistency_eager
_test_consistency_helper(inplace_samples, inplace_variants)
File "/home/kshiteej/Pytorch/pytorch_inplace_broadcast_test/test/test_ops.py", line 290, in _test_consistency_helper
variant_forward = variant(cloned,
RuntimeError: output with shape [5, 5] doesn't match the broadcast shape [5, 5, 5]
_______________________________________________ TestCommonCUDA.test_variant_consistency_eager_copysign_cuda_float32 ________________________________________________
Traceback (most recent call last):
File "/home/kshiteej/Pytorch/pytorch_inplace_broadcast_test/test/test_ops.py", line 306, in test_variant_consistency_eager
_test_consistency_helper(inplace_samples, inplace_variants)
File "/home/kshiteej/Pytorch/pytorch_inplace_broadcast_test/test/test_ops.py", line 290, in _test_consistency_helper
variant_forward = variant(cloned,
RuntimeError: output with shape [5, 5] doesn't match the broadcast shape [5, 5, 5]
========================================================================= warnings summary =========================================================================
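The shape error above is the general rule for in-place ops: the output buffer must already have the broadcast shape. A NumPy analogue (an assumption for illustration — the actual test exercises torch tensors) shows the same asymmetry between the out-of-place and in-place variants:

```python
import numpy as np

lhs = np.ones((5, 5))         # smaller operand (the in-place target)
rhs = -np.ones((5, 5, 5))     # broadcasting gives a (5, 5, 5) result

# Out-of-place is fine: the result simply takes the broadcast shape.
out = np.copysign(lhs, rhs)
print(out.shape)  # (5, 5, 5)

# Writing the broadcast result back into lhs cannot work: (5, 5) != (5, 5, 5).
try:
    np.copysign(lhs, rhs, out=lhs)
    inplace_ok = True
except ValueError:
    inplace_ok = False
print(inplace_ok)  # False
```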
While trying to see the CI build log, I noticed something odd.
Looking at the CI run for pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test1, there is no result for test_variant_consistency_eager_copysign_cpu_float32.
Relevant lines from the log are linked below.
Details
Apr 01 17:21:54 test_variant_consistency_eager_broadcast_to_cpu_complex64 (__main__.TestCommonCPU) ... ok (0.004s)
Apr 01 17:21:54 test_variant_consistency_eager_broadcast_to_cpu_float32 (__main__.TestCommonCPU) ... ok (0.004s)
Apr 01 17:21:54 test_variant_consistency_eager_ceil_cpu_float32 (__main__.TestCommonCPU) ... ok (0.004s)
Apr 01 17:21:54 test_variant_consistency_eager_cholesky_cpu_complex64 (__main__.TestCommonCPU) ... ok (0.006s)
Apr 01 17:21:54 test_variant_consistency_eager_cholesky_cpu_float32 (__main__.TestCommonCPU) ... ok (0.005s)
Apr 01 17:21:54 test_variant_consistency_eager_cholesky_inverse_cpu_complex64 (__main__.TestCommonCPU) ... ok (0.003s)
Apr 01 17:21:54 test_variant_consistency_eager_cholesky_inverse_cpu_float32 (__main__.TestCommonCPU) ... ok (0.005s)
Apr 01 17:21:54 test_variant_consistency_eager_clamp_cpu_float32 (__main__.TestCommonCPU) ... ok (0.006s)
Apr 01 17:21:54 test_variant_consistency_eager_conj_cpu_complex64 (__main__.TestCommonCPU) ... ok (0.004s)
Apr 01 17:21:54 test_variant_consistency_eager_conj_cpu_float32 (__main__.TestCommonCPU) ... ok (0.003s)
Apr 01 17:21:54 test_variant_consistency_eager_cos_cpu_complex64 (__main__.TestCommonCPU) ... ok (0.004s)
Apr 01 17:21:54 test_variant_consistency_eager_cos_cpu_float32 (__main__.TestCommonCPU) ... ok (0.004s)
Apr 01 17:21:54 test_variant_consistency_eager_cosh_cpu_complex64 (__main__.TestCommonCPU) ... ok (0.004s)
Apr 01 17:21:54 test_variant_consistency_eager_cosh_cpu_float32 (__main__.TestCommonCPU) ... ok (0.004s)
Apr 01 17:21:54 test_variant_consistency_eager_cummax_cpu_float32 (__main__.TestCommonCPU) ... ok (0.003s)
Apr 01 17:21:54 test_variant_consistency_eager_cummin_cpu_float32 (__main__.TestCommonCPU) ... ok (0.003s)
Apr 01 17:21:54 test_variant_consistency_eager_cumprod_cpu_complex64 (__main__.TestCommonCPU) ... ok (0.007s)
Apr 01 17:21:54 test_variant_consistency_eager_cumprod_cpu_float32 (__main__.TestCommonCPU) ... ok (0.006s)
Apr 01 17:21:54 test_variant_consistency_eager_cumsum_cpu_complex64 (__main__.TestCommonCPU) ... ok (0.005s)
Apr 01 17:21:54 test_variant_consistency_eager_cumsum_cpu_float32 (__main__.TestCommonCPU) ... ok (0.004s)
Apr 01 17:21:54 test_variant_consistency_eager_deg2rad_cpu_float32 (__main__.TestCommonCPU) ... ok (0.004s)
Apr 01 17:21:54 test_variant_consistency_eager_diag_cpu_complex64 (__main__.TestCommonCPU) ... ok (0.005s)
Apr 01 17:21:54 test_variant_consistency_eager_diag_cpu_float32 (__main__.TestCommonCPU) ... ok (0.004s)
Apr 01 17:21:54 test_variant_consistency_eager_diff_cpu_complex64 (__main__.TestCommonCPU) ... ok (0.005s)
Apr 01 17:21:54 test_variant_consistency_eager_diff_cpu_float32 (__main__.TestCommonCPU) ... ok (0.005s)
Apr 01 17:21:54 test_variant_consistency_eager_digamma_cpu_float32 (__main__.TestCommonCPU) ... ok (0.004s)
@mruberry Did I miss something?
This line causes the issue.
Line 238 in c821b83

variants = (v for v in (method, inplace) + aliases if v is not None)

variants is a generator, so it is exhausted in the first run (refer to the Python snippet below).
Thus the following loop does not run except for the first sample.
Lines 268 to 286 in c821b83
# Test eager consistency
for variant in variants:
    # Skips inplace ops
    if variant in inplace_ops and skip_inplace:
        continue
    # Compares variant's forward
    # Note: copies the to-be-modified input when testing the inplace variant
    tensor.grad = None
    cloned = clone_input_helper(sample.input) if variant in inplace_ops else sample.input
    variant_forward = variant(cloned,
                              *sample.args,
                              **sample.kwargs)
    self.assertEqual(expected_forward, variant_forward)
    # Compares variant's backward
    if expected_grad is not None and (variant not in inplace_ops or op.supports_inplace_autograd):
        variant_forward.sum().backward()
        self.assertEqual(expected_grad, tensor.grad)
Example
>>> def print_iterable(iterable):
... for x in iterable:
... print(x)
...
>>> l = [x for x in range(3)] # list comprehension
>>> print_iterable(l)
0
1
2
>>> print_iterable(l)
0
1
2
>>> l = (x for x in range(3)) # generator expression
>>> print_iterable(l)
0
1
2
>>> print_iterable(l)
>>>
> While trying to see the CI build log, I noticed something odd.
> Looking at the CI run for pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test1, there is no result for test_variant_consistency_eager_copysign_cpu_float32. Relevant lines from the log linked below.
> @mruberry Did I miss something?
@kshitij12345 Thank you so much for pointing this out.
According to pytorch_linux_xenial_cuda10_2_cudnn7_py3_gcc7_test1, triggered by the last commit of this PR, 27eb2e1, test_variant_consistency_eager_copysign_cpu did run and pass.
Apr 01 06:48:33 test_variant_consistency_eager_conj_cpu_float32 (__main__.TestCommonCPU) ... ok (0.003s)
Apr 01 06:48:33 test_variant_consistency_eager_copysign_cpu_float32 (__main__.TestCommonCPU) ... ok (0.004s)
Apr 01 06:48:33 test_variant_consistency_eager_cos_cpu_complex64 (__main__.TestCommonCPU) ... ok (0.004s)
And after checking test_variant_consistency_eager in test_ops.py, I agree with you that the exhausted generator for variants leads to the skipping of SampleInputs, and that is why the cases for a broadcasted self tensor pass without resolving issue #50747.
I really appreciate your PR #53014 for fixing this issue; I think it will enable testing a broadcasted self tensor with OpInfo.
@RockingJavaBean
Right. My bad. I wonder why the log I checked did not have it.
But I can confirm it is running.
Apologies for the false alarm.
Thanks for looking into it.
But this exposes a horrible problem with the use of generator expressions like this in sample inputs; many tests are "running" but not actually performing the test properly.
@kshitij12345, @vfdev-5, what's the best way to fix this and validate the fix so the behavior isn't regressed later? One simple approach might be to validate that len(sample_inputs) > 0 in all the tests that enumerate them for now?
We can't call len on generators. So I think right now we can just do variants = tuple(generator expression). But how to detect this in the general case is worth some thought 🤔
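A small sketch of the tuple fix (run_variants and the names below are illustrative, not the real test code): materializing the generator lets it be re-iterated for every sample, and it also makes the len() sanity check possible.

```python
def run_variants(variants, samples):
    # mirrors the test's per-sample iteration over variants
    pairs = []
    for sample in samples:
        for variant in variants:
            pairs.append((variant, sample))
    return pairs

method, inplace, aliases = "method", "inplace", (None,)

# Buggy: a generator is exhausted after the first sample.
gen = (v for v in (method, inplace) + aliases if v is not None)
print(len(run_variants(gen, samples=[0, 1, 2])))       # 2 — later samples see nothing

# Fixed: a tuple can be re-iterated, and len() now works as a sanity check.
variants = tuple(v for v in (method, inplace) + aliases if v is not None)
assert len(variants) > 0
print(len(run_variants(variants, samples=[0, 1, 2])))  # 6 — every sample tested
```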
Yeah... maybe we'll just have to be vigilant to detect it. But it'd be nice if we could not only detect that a test ran but that it actually did something.
I think looking at the answers to this question on Stack Overflow might give us a nice idea.
Related #54261
This PR ports the method_tests() entries of torch.copysign to OpInfo. While porting the tests, the test_out cases from test_ops.py would fail, as the out variant of torch.copysign does not support scalar inputs. This PR fixes the tests by adding an overload native_functions entry and re-dispatching scalar inputs to the existing copysign_out function.