
Add the __torch_function__ API override mechanism#30730

Closed
ngoldbaum wants to merge 1 commit into pytorch:master from ngoldbaum:torch_function_redo

Conversation

@ngoldbaum
Contributor

This is a re-do of #27064, which was reverted (b8792c0). This was landed at the same time as other work that added new operators to the torch namespace so the check for whether the torch namespace is exhaustively checked for overridability was triggering test failures.

I've temporarily disabled that check and added an explanatory comment that the check will be re-enabled in a future PR that will be merged during a time when the commit velocity on PyTorch is lower.

@ngoldbaum ngoldbaum requested a review from ezyang December 4, 2019 16:10
@ngoldbaum ngoldbaum force-pushed the torch_function_redo branch from 1b9ea84 to 0b52a11 Compare December 4, 2019 16:45
# smaller chance that enabling this test will land simultaneously
# with a new operator being added to the torch namespace.

# assert len(untested_funcs) == 0, msg.format(pprint.pformat(untested_funcs))
Contributor Author

The commented out test and explanatory comment are here.
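For context, a rough sketch of what such an exhaustiveness check amounts to (names here are illustrative, not the actual test code): enumerate the public callables of a namespace and assert that each appears in the set of functions known to be tested for overridability.

```python
# Illustrative sketch of an exhaustiveness check like the one being
# disabled: any public callable missing from the "tested" set is flagged.
def untested_functions(namespace_dict, tested):
    """Return names of public callables in `namespace_dict` absent from `tested`."""
    funcs = set()
    for name, obj in namespace_dict.items():
        if name.startswith("_"):
            continue  # skip private names
        if callable(obj) and not isinstance(obj, type):
            funcs.add(name)
    return sorted(funcs - tested)


# Toy namespace standing in for the real `torch` module dict.
fake_namespace = {"split": lambda x: x, "cat": lambda x: x, "_private": id}
assert untested_functions(fake_namespace, {"split", "cat"}) == []
assert untested_functions(fake_namespace, {"split"}) == ["cat"]
```

With a check like this, landing a new operator without a corresponding override test makes the assertion fail, which is exactly why it collided with concurrent operator additions.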

Contributor

@facebook-github-bot facebook-github-bot left a comment

@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@shoyer

shoyer commented Dec 5, 2019

The torch/_overrides.py module should probably still have a NumPy copyright notice, right? Large sections of that module are loosely adapted from https://github.com/numpy/numpy/blob/v1.16.0/numpy/core/overrides.py

@rgommers
Collaborator

rgommers commented Dec 5, 2019

@shoyer I didn't consider that, because I'm quite used to code moving around between NumPy, SciPy and other projects, and we normally don't bother with formal copyright notices. The _overrides.py file is indeed for the most part a copy of NumPy code, happy to add a copyright notice - I'll open a PR.

And a note for completeness, due credit was given of course, this is the note at the top of _overrides.py:

NOTE: heavily inspired by NumPy's ``__array_function__`` (see:
https://github.com/pytorch/pytorch/issues/24015 and
https://www.numpy.org/neps/nep-0018-array-function-protocol.html)

@shoyer

shoyer commented Dec 5, 2019

No worries, and I did notice those credits :)

I wasn't sure what license pytorch uses, but given that it appears to be 3 clause BSD like NumPy this is probably fine.

(I've been trained to be a little overly sensitive on this stuff at Google, where we're asked to add copyright notices to every file in our own open source projects.)

@rgommers
Collaborator

rgommers commented Dec 5, 2019

add copyright notices to every file in our own open source projects

Yeah, that never makes much sense to me. Files change over time, and usually any copyright notice put in at some point in time becomes outdated. The only way to figure it out is by looking at the per-line git history after some time of evolution.

I wasn't sure what license pytorch uses, but given that it appears to be 3 clause BSD like NumPy this is probably fine.

Yes, same license as NumPy. Okay thanks, I take that as no PR needed.

Although I did just check https://github.com/pytorch/pytorch/blob/master/LICENSE#L33 which says

All other contributions:
Copyright(c) 2015, 2016 the respective contributors
All rights reserved.

so it's a few years out of date; I may update that anyway, at least :)

@ezyang
Contributor

ezyang commented Dec 5, 2019

Yeah, our LICENSE file is super out of date. Updates more than welcome.

@hl475
Contributor

hl475 commented Dec 5, 2019

Hi @ngoldbaum, we found a performance regression introduced by your PR. According to our benchmark:

================================================================================
Before the change, Program Output:
================================================================================
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M8_N8_parts2_cpu
# Input: M: 8, N: 8, parts: 2, device: cpu
Forward Execution Time (us) : 6.546

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M256_N512_parts2_cpu
# Input: M: 256, N: 512, parts: 2, device: cpu
Forward Execution Time (us) : 6.443

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M512_N512_parts2_cpu
# Input: M: 512, N: 512, parts: 2, device: cpu
Forward Execution Time (us) : 6.437

================================================================================
After the change, Program Output:
================================================================================
# ----------------------------------------
# PyTorch/Caffe2 Operator Micro-benchmarks
# ----------------------------------------
# Tag : short

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M8_N8_parts2_cpu
# Input: M: 8, N: 8, parts: 2, device: cpu
Forward Execution Time (us) : 4.252

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M256_N512_parts2_cpu
# Input: M: 256, N: 512, parts: 2, device: cpu
Forward Execution Time (us) : 4.142

# Benchmarking PyTorch: split
# Mode: Eager
# Name: split_M512_N512_parts2_cpu
# Input: M: 512, N: 512, parts: 2, device: cpu
Forward Execution Time (us) : 4.210

================================================================================

where "before" means the current codebase (including this PR), and "after" means the codebase with this PR's change removed.

To reproduce the benchmark, please read https://github.com/pytorch/pytorch/tree/master/benchmarks/operator_benchmark and run `python -m pt.split_test`.

cc @ezyang @mingzhe09088

@hl475 hl475 reopened this Dec 5, 2019
@ezyang
Contributor

ezyang commented Dec 5, 2019

@ngoldbaum This is probably related to the decorator hooks. Maybe this function needs to get moved to C++?

@ngoldbaum
Contributor Author

I believe that regression is expected, given that `split` is implemented in Python. Back in November, @ezyang found persuasive @rgommers's argument that if performance is a hard requirement, then things in `torch.functional` should be rewritten in C++:

#27064 (comment)
#27064 (comment)

If we don't want to have overhead for the operators that are implemented in Python, then I think we would need to rewrite the `torch_function_dispatch` decorator. However, I don't know whether there's a way to avoid all overhead, since we'd still need to inspect the types of the arguments at runtime, something that happens automatically for operators written in C++.
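A minimal sketch (illustrative only, not PyTorch's actual implementation) of the kind of Python-level decorator being discussed: each call scans the argument types for a `__torch_function__` handler, which is precisely the per-call work that cannot be eliminated entirely from pure Python.

```python
import functools


def torch_function_dispatch(func):
    """Toy dispatch decorator: hand control to any argument whose type
    defines __torch_function__, otherwise call the plain implementation."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        for arg in args:
            # Runtime type inspection on every call is the overhead
            # being discussed above.
            handler = getattr(type(arg), "__torch_function__", None)
            if handler is not None:
                return handler(func, args, kwargs)
        return func(*args, **kwargs)
    return wrapper


@torch_function_dispatch
def add(a, b):
    return a + b


class MyTensor:
    @staticmethod
    def __torch_function__(func, args, kwargs):
        return "overridden"


assert add(1, 2) == 3
assert add(MyTensor(), 2) == "overridden"
```

For operators implemented in C++, the argument parser already inspects types, so the equivalent check comes essentially for free there.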

@ezyang
Contributor

ezyang commented Dec 5, 2019

@ngoldbaum I think I made a good argument that we could get it pretty fast with proper decorator design: just a few direct type() comparisons in straightline bytecode. @hl475 is the operator above the only regressing one or are there others you caught in your test suite? Let's either (1) remove the decorator from them or (2) port them to C++. If doing (2) turns out to be difficult in some cases, let's reopen the question of a fancier decorator mechanism.
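The fast path described above can be sketched like this (toy stand-ins, not PyTorch code): the common all-`Tensor` case is guarded by direct `type()` identity comparisons in straight-line code, so the generic dispatch machinery is only reached for unusual argument types.

```python
class Tensor:
    """Toy stand-in for torch.Tensor."""
    def __init__(self, value):
        self.value = value


def generic_dispatch(func, args):
    # Placeholder for full __torch_function__ handling (slow path).
    return func(*args)


def _split_impl(a, parts):
    return [Tensor(a.value) for _ in range(parts)]


def split(a, parts):
    # Fast path: a couple of direct type() identity checks, no loops
    # and no per-argument attribute lookups for the common case.
    if type(a) is Tensor and type(parts) is int:
        return _split_impl(a, parts)
    return generic_dispatch(_split_impl, (a, parts))


assert len(split(Tensor(1.0), 2)) == 2
```

The trade-off is that subclasses of `Tensor` fall through to the slow path, which is acceptable when exact-type arguments dominate.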

@hl475
Contributor

hl475 commented Dec 5, 2019

@ezyang The PyTorch/Caffe2 Operator Micro-benchmark suite has a `split` op benchmark, so we were able to catch it. The other ops changed in this PR (`broadcast`, `einsum`, `isinf`, etc.) are not covered by the operator_benchmark suite yet.

@rgommers
Collaborator

rgommers commented Dec 5, 2019

A 2.3 us hit is unexpected either way; that's far more than a single use of `torch_function_dispatch` should incur. I think this needs profiling before jumping to the conclusion that it's that decorator.
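One quick way to sanity-check that intuition before profiling in earnest: time a trivial Python-level dispatch check against the bare function and look at the per-call delta. This is an illustrative micro-benchmark, not the operator_benchmark suite.

```python
import timeit


def plain(a, b):
    return a + b


def dispatched(a, b):
    # Stand-in for a dispatch wrapper: one attribute probe per argument,
    # roughly what a Python-level __torch_function__ check costs.
    for arg in (a, b):
        if getattr(type(arg), "__torch_function__", None) is not None:
            raise NotImplementedError
    return plain(a, b)


N = 100_000
t_plain = timeit.timeit("plain(1, 2)", globals=globals(), number=N)
t_disp = timeit.timeit("dispatched(1, 2)", globals=globals(), number=N)
overhead_us = (t_disp - t_plain) / N * 1e6

# A pure-Python check like this costs well under a microsecond per call
# on a modern machine, i.e. noticeably less than the observed 2.3 us,
# which supports profiling before blaming the decorator alone.
assert overhead_us < 10
```

If the measured delta is much smaller than the regression, the remaining time has to be coming from somewhere else in the changed code path.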

@rgommers
Collaborator

rgommers commented Dec 5, 2019

hl475 reopened this 27 minutes ago

I'm not sure what happened here. It looks like the PR is open now, but the merge was not reverted. Should we close this again and open a new issue about this performance regression?

@ezyang
Contributor

ezyang commented Dec 5, 2019

I'm not sure what happened here. It looks like the PR is open now, but the merge was not reverted. Should we close this again and open a new issue about this performance regression?

Yes, unless you think we should revert the entire diff. (but if it's just decorator related, I'd prefer if we just delete the decorators)

@rgommers
Collaborator

rgommers commented Dec 5, 2019

Yeah don't think a full revert is going to be helpful. Let's just try and figure this out in the next couple of hours first. I'll close this and open a new issue, will Cc you all.

@shoyer

shoyer commented Dec 6, 2019

@rgommers -- while I have you here, is there a discussion anywhere of why the protocol for __torch_function__ omits the types argument from __array_function__? I would be curious to understand the reasoning behind this, and especially whether it is also relevant to NumPy.

wuhuikx pushed a commit to wuhuikx/pytorch that referenced this pull request Jan 30, 2020
Summary:
This is a re-do of pytorch#27064, which was reverted (pytorch@b8792c0). This was landed at the same time as other work that added new operators to the `torch` namespace so the check for whether the `torch` namespace is exhaustively checked for overridability was triggering test failures.

I've temporarily disabled that check and added an explanatory comment that the check will be re-enabled in a future PR that will be merged during a time when the commit velocity on PyTorch is lower.
Pull Request resolved: pytorch#30730

Differential Revision: D18813270

Pulled By: ezyang

fbshipit-source-id: 70477c4656dca8fea6e7bc59259555041fcfbf68
@rgommers
Collaborator

is there a discussion anywhere of why the protocol for __torch_function__ omits the types argument from __array_function__? I would be curious to understand the reasoning behind this, and especially whether it is also relevant to NumPy.

@shoyer sorry for the long delay in answering this. I forgot the answer, and never got around to digging through the history of when I did this around Sep'19.

Now the discussion at pytorch/rfcs#3 reminded me: I believe I removed (or never added) types because it was extra complexity to handle subclasses, and Tensor subclasses aren't really a thing today.

Also note that in master we've now deviated much further from the NumPy design, see gh-32194. This has yielded a much reduced overhead, at the limited cost of not having signature verification. This was possible because of the C++ machinery in PyTorch that does signature parsing in a way that NumPy doesn't have.

In pytorch/rfcs#3 there'll be some more follow-up that may result in changes.

@shoyer

shoyer commented Feb 7, 2020

@rgommers thanks for following up here! It has been really interesting to follow along the work your team has been doing in supporting overrides in PyTorch. I'm impressed by how you've been able to eliminate almost all of the overhead!

I asked because my motivation for the types argument in __array_function__ was never about subclasses (which I agree are generally awful). I was mostly concerned with making it as easy and idiomatic as possible for implementers of __array_function__ to defer to an unknown array library. This is generally a good thing for the ecosystem, but people writing special methods tend not to bother, at least for builtin numeric protocols like __add__ :).
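For readers following along, the role of `types` can be sketched like this (class names are illustrative, and this follows NumPy's NEP 18 protocol rather than PyTorch's): the `types` tuple lets an implementation bail out with `NotImplemented` when an unrecognized array type participates, so deferring to another library is one line rather than a hand-rolled argument scan.

```python
class MyArray:
    """Toy array type implementing the NEP 18 __array_function__ protocol."""

    def __array_function__(self, func, types, args, kwargs):
        # Defer if any participating type is one we don't recognize:
        # returning NotImplemented lets the caller try the other
        # arguments' implementations instead of raising.
        if not all(issubclass(t, MyArray) for t in types):
            return NotImplemented
        return "handled by MyArray"


class UnknownArray:
    """Some other library's array type, unknown to MyArray."""


def some_numpy_func():
    pass  # hypothetical function being overridden


a = MyArray()
assert a.__array_function__(some_numpy_func, (MyArray,), (), {}) == "handled by MyArray"
assert a.__array_function__(some_numpy_func, (MyArray, UnknownArray), (), {}) is NotImplemented
```

Without `types`, an implementer who wants the same politeness has to walk `args` and `kwargs` and extract the relevant types themselves.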

facebook-github-bot pushed a commit that referenced this pull request Feb 21, 2020
…onal (#32799)

Summary:
This adds `__torch_function__` support for all functions in `torch.functional` and `torch.nn.functional`.

The changes to C++ code and codegen scripts are to facilitate adding `__torch_function__` support for the native functions in `torch._C._nn`. Note that I moved the `handle_torch_function` C++ function to a header that both `python_torch_functions.cpp` and `python_nn_functions.cpp` include. The changes to `python_nn_functions.cpp` mirror the changes I made to `python_torch_functions.cpp` when `__torch_function__` support was first added in #27064. Due to the somewhat different way the `torch._C` and `torch._C._nn` namespaces are initialized I needed to create a new static reference to the `torch._C._nn` namespace (`THPNNVariableFunctions`). I'm not sure if that is the best way to do this. In principle I could import these namespaces in each kernel and avoid the global variable but that would have a runtime cost.

I added `__torch_function__` support to the Python functions in `torch.nn.functional` following the approach in #32194.

I re-enabled the test that checks if all functions in the `torch` namespace are explicitly tested for `__torch_function__` support. I also generalized the check to work for `torch.functional` and `torch.nn.functional` as well. This test was explicitly disabled in #30730 and I'm happy to disable it again if you think that's appropriate. I figured now was as good a time as any to try to re-enable it.

Finally I adjusted the existing torch API tests to suppress deprecation warnings and add keyword arguments used by some of the code in `torch.nn.functional` that were missed when I originally added the tests in #27064.
Pull Request resolved: #32799

Differential Revision: D19956809

Pulled By: ezyang

fbshipit-source-id: 40d34e0109cc4b9f3ef62f409d2d35a1d84e3d22
ttumiel pushed a commit to ttumiel/pytorch that referenced this pull request Mar 4, 2020
facebook-github-bot pushed a commit that referenced this pull request Aug 6, 2020
Summary:
According to pytorch/rfcs#3

From the goals in the RFC:

1. Support subclassing `torch.Tensor` in Python (done here)
2. Preserve `torch.Tensor` subclasses when calling `torch` functions on them (done here)
3. Use the PyTorch API with `torch.Tensor`-like objects that are _not_ `torch.Tensor`
   subclasses (done in #30730)
4. Preserve `torch.Tensor` subclasses when calling `torch.Tensor` methods. (done here)
5. Propagating subclass instances correctly also with operators, using
   views/slices/indexing/etc. (done here)
6. Preserve subclass attributes when using methods or views/slices/indexing. (done here)
7. A way to insert code that operates on both functions and methods uniformly
   (so we can write a single function that overrides all operators). (done here)
8. The ability to give external libraries a way to also define
   functions/methods that follow the `__torch_function__` protocol. (will be addressed in a separate PR)

This PR makes the following changes:

1. Adds the `self` argument to the arg parser.
2. Dispatches on `self` as well if `self` is not `nullptr`.
3. Adds a `torch._C.DisableTorchFunction` context manager to disable `__torch_function__`.
4. Adds a `torch::torch_function_enabled()` and `torch._C._torch_function_enabled()` to check the state of `__torch_function__`.
5. Dispatches all `torch._C.TensorBase` and `torch.Tensor` methods via `__torch_function__`.

TODO:

- [x] Sequence Methods
- [x] Docs
- [x] Tests

Closes #28361

Benchmarks in #37091 (comment)

Pull Request resolved: #37091

Reviewed By: ngimel

Differential Revision: D22765678

Pulled By: ezyang

fbshipit-source-id: 53f8aa17ddb8b1108c0997f6a7aa13cb5be73de0