
Conversation

@srossross (Collaborator) commented Sep 24, 2022

@pytorch-bot (bot) commented Sep 24, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/85583

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 Failures

As of commit 0b103b0:

The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

srossross added a commit that referenced this pull request Sep 24, 2022
@srossross srossross changed the title WPI: changing as_strided_scatter to deterministic inputs WIP: changing as_strided_scatter to deterministic inputs Sep 24, 2022
srossross added a commit that referenced this pull request Sep 26, 2022
srossross added a commit that referenced this pull request Sep 27, 2022
@srossross (Collaborator, Author) commented:

@lezcano @amjames

Usually, in OpInfos, we take care not to generate these non-deterministic inputs, because these functions are not differentiable on them and the gradtests would fail. So my question is: are all the functions whose skips come from non-determinism non-deterministic for all inputs, or are some of them non-deterministic only for certain inputs? If it's the latter, the function may have a deeper issue, since the gradtests would be failing for it.

When I update the inputs to be deterministic, I would expect the test_fn_grad test to pass; is that a reasonable assumption?

I still get this error with the input: torch.as_strided_scatter(make_tensor([3,3]), make_tensor([2, 2]), [2, 2], [2, 1])

__________________________________________________________ TestGradientsCPU.test_fn_grad_as_strided_scatter_cpu_float64 __________________________________________________________
Traceback (most recent call last):
  File "/home/srossross/pytorch/test/test_ops_gradients.py", line 156, in test_fn_grad
    self._grad_test_helper(device, dtype, op, op.get_op())
  File "/home/srossross/pytorch/test/test_ops_gradients.py", line 139, in _grad_test_helper
    return self._check_helper(device, dtype, op, variant, 'gradcheck', check_forward_ad=check_forward_ad,
  File "/home/srossross/pytorch/test/test_ops_gradients.py", line 108, in _check_helper
    self.assertTrue(gradcheck(fn, gradcheck_args,
  File "/home/srossross/pytorch/torch/testing/_internal/common_utils.py", line 3309, in gradcheck
    return torch.autograd.gradcheck(fn, inputs, **kwargs)
  File "/home/srossross/pytorch/torch/autograd/gradcheck.py", line 1418, in gradcheck
    return _gradcheck_helper(**args)
  File "/home/srossross/pytorch/torch/autograd/gradcheck.py", line 1432, in _gradcheck_helper
    _gradcheck_real_imag(gradcheck_fn, func, func_out, tupled_inputs, outputs, eps,
  File "/home/srossross/pytorch/torch/autograd/gradcheck.py", line 1075, in _gradcheck_real_imag
    gradcheck_fn(func, func_out, tupled_inputs, outputs, eps,
  File "/home/srossross/pytorch/torch/autograd/gradcheck.py", line 1309, in _fast_gradcheck
    analytical_vJu = _get_analytical_vJu_backward_mode(inputs, outputs, nondet_tol, check_grad_dtypes, all_v, all_u_dense)
  File "/home/srossross/pytorch/torch/autograd/gradcheck.py", line 560, in _get_analytical_vJu_backward_mode
    all_vJ = _check_analytical_jacobian_attributes(inputs, output, nondet_tol, check_grad_dtypes,
  File "/home/srossross/pytorch/torch/autograd/gradcheck.py", line 549, in _check_analytical_jacobian_attributes
    raise GradcheckError('Backward is not reentrant, i.e., running backward with '
torch.autograd.gradcheck.GradcheckError: Backward is not reentrant, i.e., running backward with same input and grad_output multiple times gives different values, although analytical gradient matches numerical gradient. The tolerance for nondeterminism was 0.0.

NOTE: If your op relies on non-deterministic operations i.e., it is listed here:
https://pytorch.org/docs/stable/generated/torch.use_deterministic_algorithms.html
this failure might be expected.

If you are adding a new operator, please file an issue and then use one of the
workarounds. The workaround depends on how your test invokes gradcheck/gradgradcheck.
If the test
- manually invokes gradcheck/gradgradcheck, then call gradcheck/gradgradcheck
  with `nondet_tol=<tol>` as a keyword argument.
- is OpInfo-based (e.g., in test_ops_gradients.py), then modify the OpInfo for the test
  to have `gradcheck_nondet_tol=<tol>`.
- is a Module test (e.g., in common_nn.py), then modify the corresponding
  module_test entry to have `gradcheck_nondet_tol=<tol>`

I'm going to continue investigating this.
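
For reference, a minimal sketch of the first workaround described in the error text above, using the failing sample I mentioned (the shapes, strides, and the nondet_tol value here are illustrative assumptions, not something this PR adds):

import torch
from torch.testing import make_tensor

# Hypothetical reproduction of the failing sample above, in double precision
# so that gradcheck can be run on it.
inp = make_tensor((3, 3), dtype=torch.float64, device="cpu", requires_grad=True)
src = make_tensor((2, 2), dtype=torch.float64, device="cpu", requires_grad=True)

fn = lambda inp, src: torch.as_strided_scatter(inp, src, (2, 2), (2, 1))

# Workaround for manual invocations: tolerate bounded nondeterminism.
torch.autograd.gradcheck(fn, (inp, src), nondet_tol=1e-12)

# For OpInfo-based tests, the equivalent is setting `gradcheck_nondet_tol=<tol>`
# on the OpInfo entry, as the error message says.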

@lezcano (Collaborator) commented Sep 27, 2022

Is the gradient of this function deterministic for all the inputs? (it should be if the function is differentiable)
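
One direct way to probe that question is to run backward twice with identical inputs and grad_outputs and compare the results, which is essentially what the "Backward is not reentrant" check does. A minimal sketch, reusing the sample shapes from above (not part of the PR):

import torch

inp = torch.randn(3, 3, dtype=torch.float64, requires_grad=True)
src = torch.randn(2, 2, dtype=torch.float64, requires_grad=True)
grad_out = torch.randn(3, 3, dtype=torch.float64)

def grads():
    out = torch.as_strided_scatter(inp, src, (2, 2), (2, 1))
    # Same input, same grad_output: a deterministic backward must return
    # bit-identical gradients every time.
    return torch.autograd.grad(out, (inp, src), grad_out)

g1, g2 = grads(), grads()
print(all(torch.equal(a, b) for a, b in zip(g1, g2)))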

srossross added a commit that referenced this pull request Sep 27, 2022
@facebook-github-bot (Contributor) commented:

/easycla

As part of the transition to the PyTorch Foundation, this project now requires contributions be covered under the new CLA. See #85559 for additional details.

This comment will trigger a new check of this PR. If you are already covered, you will simply see a new "EasyCLA" check that passes. If you are not covered, a bot will leave a new comment with a link to sign.

@linux-foundation-easycla (bot) commented Oct 4, 2022

CLA Signed

The committers listed above are authorized under a signed CLA.

srossross added a commit that referenced this pull request Oct 4, 2022
srossross added a commit that referenced this pull request Oct 5, 2022
srossross added a commit that referenced this pull request Oct 5, 2022
@srossross srossross changed the title WIP: changing as_strided_scatter to deterministic inputs Changing as_strided_scatter to deterministic inputs Nov 1, 2022
srossross added a commit that referenced this pull request Nov 1, 2022
@srossross srossross marked this pull request as ready for review November 1, 2022 18:44
def error_inputs_as_strided_scatter(op_info, device, **kwargs):
    make_arg = partial(make_tensor, device=device, dtype=torch.float32, requires_grad=False)

    input_t = make_arg([4, 4])
Reviewer (Collaborator):

Add a comment describing what error condition is being tested

Author (Collaborator):

added

((16,), (2, 2, 2, 2), (1, 1, 1, 1), 0),
((16,), (2, 1, 1, 2), (1, 7, 7, 1), 0),
((3, 3), (2, 2), (2, 1), 0),
((16,), (2, 2, 2, 2), (8, 4, 2, 1), 0),
Reviewer (Collaborator):

What's the change here? Should the test cases have a comment describing the property they assume?

Author (Collaborator):

Added a comment above; the removed code was causing non-deterministic behaviour, so I modified the strides.
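
For context, a minimal sketch of the property at stake, assuming the tuples above are read as (input shape, view size, view stride, storage offset): scattering into an as_strided view is only deterministic if no two view elements alias the same storage location. The helper below is illustrative only, not part of the PR:

import itertools

def view_is_non_overlapping(size, stride, offset=0):
    # Linear storage index of every element of the as_strided view;
    # the scatter is deterministic only if all of them are distinct.
    idx = [offset + sum(i * s for i, s in zip(coords, stride))
           for coords in itertools.product(*(range(d) for d in size))]
    return len(idx) == len(set(idx))

print(view_is_non_overlapping((2, 2, 2, 2), (8, 4, 2, 1)))  # True: all 16 indices are unique
print(view_is_non_overlapping((2, 1, 1, 2), (1, 7, 7, 1)))  # False: two elements alias index 1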

DecorateInfo(unittest.skip('Only fails for LAZY, passes on everything else'), 'TestCompositeCompliance', 'test_backward'), # noqa: B950
DecorateInfo(unittest.skip('Passes on complex64 and float32 only'), 'TestJit', 'test_variant_consistency_jit'),
DecorateInfo(unittest.skip('Fails on cuda + rocm'), 'TestCommon', 'test_complex_half_reference_testing'),
# https://github.com/pytorch/pytorch/issues/88105
Reviewer (Collaborator):

Nice pointer to the issue

@mruberry (Collaborator) left a comment:

Overall looks good @srossross, but I left some suggestions for comments to help future developers understand what's happening.

srossross added a commit that referenced this pull request Nov 2, 2022
@srossross srossross requested a review from mruberry November 2, 2022 17:40
@srossross srossross added the ciflow/periodic label Nov 2, 2022
@mruberry (Collaborator) left a comment:

Cool; cc @kurtamohler

@ngimel (Collaborator) commented Nov 7, 2022

I believe with #88342 the backward errors should be fixed

@srossross (Collaborator, Author) commented:

I'm not sure; could something still be wrong? The test_fn_gradgrad tests now report an unexpected success, but test_fn_grad_as_strided_scatter_cpu_float64 still fails with:

  File "/home/srossross/pytorch/torch/autograd/gradcheck.py", line 1283, in _check_analytical_numerical_equal
    raise GradcheckError(_get_notallclose_msg(a, n, j, i, complex_indices, test_imag, is_forward_ad) + jacobians_str)
torch.autograd.gradcheck.GradcheckError: Jacobian mismatch for output 0 with respect to input 0,
numerical:tensor(0., device='cuda:0', dtype=torch.float64)
analytical:tensor(1., device='cuda:0', dtype=torch.float64)

The above quantities relating the numerical and analytical jacobians are computed 
in fast mode. See: https://github.com/pytorch/pytorch/issues/53876 for more background 
about fast mode. Below, we recompute numerical and analytical jacobians in slow mode:

Numerical:
 tensor([[0.]], device='cuda:0', dtype=torch.float64)
Analytical:
tensor([[1.]], device='cuda:0', dtype=torch.float64)

The max per-element difference (slow mode) is: 1.0.
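
A minimal sketch of how one might reproduce this outside the OpInfo machinery, rerunning gradcheck in slow mode as the message suggests (shapes and strides are assumed from the earlier sample; this is not part of the PR):

import torch

inp = torch.randn(3, 3, dtype=torch.float64, requires_grad=True)
src = torch.randn(2, 2, dtype=torch.float64, requires_grad=True)

# fast_mode=False recomputes the full numerical and analytical Jacobians,
# which makes it easier to see which entries disagree.
torch.autograd.gradcheck(
    lambda inp, src: torch.as_strided_scatter(inp, src, (2, 2), (2, 1)),
    (inp, src),
    fast_mode=False,
)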

@ngimel (Collaborator) commented Nov 8, 2022

cc @bdhirsh @soulitzer

srossross added a commit that referenced this pull request Nov 8, 2022
@lezcano (Collaborator) commented Nov 9, 2022

I think it's fine to merge this one as is; then we can follow up with a different PR fixing that issue.
@bdhirsh, @soulitzer, or I can have a look at it.

@lezcano (Collaborator) commented Nov 9, 2022

@pytorchbot merge -f "unrelated failures"

@pytorchmergebot (Collaborator):

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).



Labels: ciflow/periodic, cla signed, Merged, open source, release notes: python_frontend
