[namedtensor] fix autograd support for torch.mean(tensor, dimname) #29199
Conversation
Previously, we called `native::mean_cpu_gpu` inside `mean(Tensor, Dimname)`; `native::mean_cpu_gpu` is not supported by autograd. This PR replaces `native::mean_cpu_gpu` with `at::mean(Tensor, int)` so that the dimname overload can piggyback off of autograd support for `at::mean(Tensor, int)`. Also added tests (these didn't exist before) for autograd support for named tensor reduction functions.

Test Plan:
- `python test/test_namedtensor.py -v`
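The fix works by having the name-based overload resolve its dimname to a positional index and then forward to the positional overload, which is the one autograd knows about. A minimal sketch of that delegation pattern in plain Python; the names `resolve_dim`, `mean_positional`, and `mean_by_dimname` are illustrative stand-ins, not the actual ATen API:

```python
# Illustrative sketch of the dispatch pattern: the dimname overload
# resolves the name to a positional index, then delegates to the
# positional overload (the stand-in for at::mean(Tensor, int)).

def resolve_dim(names, dimname):
    """Map a dimension name like 'C' to its positional index."""
    try:
        return names.index(dimname)
    except ValueError:
        raise RuntimeError(f"dimension {dimname!r} not found in {names}")

def mean_positional(values, dim):
    # Stand-in for the positional overload: reduce a 2-D nested list
    # along dim 0 (rows) or dim 1 (columns).
    if dim == 0:
        return [sum(col) / len(col) for col in zip(*values)]
    return [sum(row) / len(row) for row in values]

def mean_by_dimname(values, names, dimname):
    # The dimname overload does no reduction itself; it just forwards.
    return mean_positional(values, resolve_dim(names, dimname))

data = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
print(mean_by_dimname(data, ('N', 'C'), 'N'))  # → [2.0, 3.0, 4.0]
```

Because the dimname overload is a thin wrapper, any autograd support (or future fixes) on the positional overload is inherited for free.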
test/test_namedtensor.py (outdated)

```python
def test_autograd_supports_dimname_overload(op, device):
    t = torch.empty(2, 3, 5, names=('N', 'C', 'L'), device=device, requires_grad=True)
    sum_all_outputs(op(t, 'C')).backward()
    self.assertTrue(t.grad is not None)
```
can we do anything to validate the value of the gradient? would it be helpful to validate the grad against the equivalent op on an unnamed tensor?
nit: btw I think there's a self.assertIsNotNone()
> can we do anything to validate the value of the gradient?

We can, but see the next point.

> would it be helpful to validate the grad against the equivalent op on an unnamed tensor?

Because autograd ignores all names on tensors, we don't need to validate the grad as long as we test that autograd ignores all names on tensors.

> nit: btw I think there's a self.assertIsNotNone()

Sure, I'll use that.
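For reference, the gradient value being discussed here is simple to state: for a mean over a dimension of size n, each input element receives gradient 1/n, and tensor names play no role in that value. A quick finite-difference check of this fact in plain Python (no torch involved, purely illustrative):

```python
# Finite-difference check that d(mean)/dx_i = 1/n for a mean over n
# elements. Names on the tensor would not change this value, which is
# why testing "autograd ignores names" suffices here.
def mean(xs):
    return sum(xs) / len(xs)

xs = [1.0, 2.0, 3.0, 4.0]
eps = 1e-6
n = len(xs)
for i in range(n):
    bumped = list(xs)
    bumped[i] += eps
    grad_i = (mean(bumped) - mean(xs)) / eps
    assert abs(grad_i - 1.0 / n) < 1e-5
print("each element's gradient is 1/n =", 1.0 / n)
```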
> autograd ignores all names on tensors

Will that ever not be the case for any kinds of gradients?

I think you had also mentioned an issue with the correctness of the mean gradient that we were also fixing, though maybe that's (going to be) a separate PR?
> autograd ignores all names on tensors
>
> will that ever not be the case for any kinds of gradients?

In the future we may design and implement name-aware autograd. That is not going to happen soon; when it does, we can revisit the testing plan for this.

> I think you had also mentioned an issue with the correctness of the mean gradient that we were also fixing, though maybe that's (going to be) a separate PR?

Yeah, that's orthogonal to this PR. It is this one: #29137 (comment)
This PR fixes #28286
thanks for the clarification
Stack from ghstack:

Previously, we called `native::mean_cpu_gpu` inside `mean(Tensor, Dimname)`; `native::mean_cpu_gpu` is not supported by autograd. This PR replaces `native::mean_cpu_gpu` with `at::mean(Tensor, int)` so that the dimname overload can piggyback off of autograd support for `at::mean(Tensor, int)`.

Also added tests (these didn't exist before) for autograd support for named tensor reduction functions.

Test Plan:
- `python test/test_namedtensor.py -v`

Differential Revision: D18334617