
Conversation

@zasdfgbnm
Collaborator

@zasdfgbnm zasdfgbnm commented Mar 30, 2019

Stack from ghstack:

  • Rename _unique2 to unique.
  • Add an optional dim argument so the signature matches Python's torch.unique (see the usage sketch after this list).
  • Inside torch.unique, use unique and get rid of unique_dim.
  • Completely unbind unique_dim from Python at codegen.
  • Add an OSS ONNX test for unique.
  • Add a JIT test for unique.
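
For context, here is a minimal usage sketch of the user-facing call pattern that ends up routed through the single `unique` op (tensor values are illustrative, not taken from this PR's tests):

```python
import torch

x = torch.tensor([1, 3, 2, 3, 1])

# Flattened unique values plus per-value counts (the dim=None path)
values, counts = torch.unique(x, sorted=True, return_counts=True)
print(values, counts)    # tensor([1, 2, 3]) tensor([2, 1, 2])

# Unique values plus the inverse mapping from input positions to unique values
values2, inverse = torch.unique(x, return_inverse=True)
print(values2, inverse)  # tensor([1, 2, 3]) tensor([0, 2, 1, 2, 0])
```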

This was previously tried in #17097 and caused an internal failure; not sure about this time.

cc: @wanchaol

Differential Revision: D15034051

facebook-github-bot pushed a commit that referenced this pull request Apr 3, 2019
Step 1: Secretly add return_counts to unique, and refactor unique_dim for performance (#18648)

Summary:
Pull Request resolved: #18648
ghimport-source-id: 1cf4a8f

Stack from [ghstack](https://github.com/ezyang/ghstack):
* #18661 Step 7: remove _unique
* #18655 Step 6: Rename _unique2 to unique and add int? dim
* #18654 Step 5: remove _unique_dim in favor of unique_dim
* #18651 Step 4: add support for unique with dim=None
* #18650 Step 3: Add support for return_counts to torch.unique for dim not None
* #18649 Step 2: Rename _unique_dim2_temporary_will_remove_soon to unique_dim
* **#18648 Step 1: Secretly add return_counts to unique, and refactor unique_dim for performance**

`unique` is fragile. I previously tried to change it in #18391 and #17097; both passed OSS tests but were eventually reverted due to internal failures. My refactoring work in #18459 is based on #18391, and after #18391 was reverted I could not continue with #18459. To keep making progress on #18459, #18391, and #17097 without worrying about internal failures, I am suggesting the following steps for improving `unique` and `unique_dim`. soumith Please take this; there is no need to put #18391 back.

The motivation is to move forward as much as possible without causing any internal failures, so I will divide the work into steps and sort them from low to high probability of internal failure. (I don't know what the internal failure is, so I have to guess.) Let's merge this PR stack one by one until we encounter an internal failure.

Step 1: Create two new ATen operators, `_unique2_temporary_will_remove_soon` and `_unique_dim2_temporary_will_remove_soon`, and keep `_unique` and `_unique_dim` unchanged. The backends of these two functions and of `_unique` and `_unique_dim` are all the same; the only difference is that the temporary ones support `return_counts` while `_unique` and `_unique_dim` do not. Step 1 is mostly #18391 + #18459. The CUDA 8 errors have been fixed. At this point there is no user-visible API change, so no docs are updated. `torch.unique` does not support `return_counts` yet, and `return_counts` is tested through the newly added temporary operators. This step only adds two new ATen operators, so there shouldn't be any internal failure.

Step 2: Rename `_unique_dim2_temporary_will_remove_soon` to `unique_dim`. This should not cause an internal failure either, because it makes no change to existing operators. The only thing to take care of is deleting `unique_dim` from the Python side, because we don't want users to use it. At this point, C++ users have `return_counts` support for `unique_dim`.

Step 3: Update the docs of `torch.unique` and use `unique_dim` inside `torch.unique` to support `return_counts`. In the docs, we should say that `torch.unique` with `dim=None` does not support `return_counts` yet. This might cause an internal failure.

Step 4: Rename `_unique2_temporary_will_remove_soon` to `_unique2` and use `_unique2` inside `torch.unique` to support `return_counts`. Update the docs to say that `torch.unique` with `dim=None` now supports `return_counts`. This might cause an internal failure.

Step 5: Remove `_unique_dim`. This might cause an internal failure.

Step 6: Rename `_unique2` to `unique` and add an optional `dim` argument so the signature matches Python's `torch.unique`. Inside `torch.unique`, use `unique` and get rid of `unique_dim`. Completely unbind `unique_dim` from Python at codegen. This is likely to cause an internal failure.

Step 7: Remove `_unique`. This is very likely to cause internal failure.
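
As a side note on the `int? dim` added in Step 6, here is a minimal illustration of the difference between the flattened and the dimension-wise behavior (example values are assumptions, not taken from the tests):

```python
import torch

x = torch.tensor([[1, 1],
                  [2, 3],
                  [1, 1]])

# dim=None (default): flatten the tensor and return unique scalar values
print(torch.unique(x))         # tensor([1, 2, 3])

# dim=0: treat slices along dim 0 (rows) as the elements to deduplicate
print(torch.unique(x, dim=0))  # tensor([[1, 1],
                               #         [2, 3]])
```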

This PR
======

This PR is for Step 1. It creates two new ATen operators, `_unique2_temporary_will_remove_soon` and `_unique_dim2_temporary_will_remove_soon`, implements `return_counts` inside them, and refactors for performance improvements.

Please review, ngimel and VitalyFedyunin. The changes are mostly copied from #18391 and #18459, so the review should be easy.

Below is a benchmark on a tensor of shape `torch.Size([15320, 2])`:

Before
---------

```python
print(torch.__version__)
%timeit a.unique(dim=0, sorted=True, return_inverse=False); torch.cuda.synchronize()
%timeit a.unique(dim=0, sorted=True, return_inverse=True); torch.cuda.synchronize()
```

```
1.0.1
192 µs ± 1.61 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
548 ms ± 3.39 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```

```python
print(torch.__version__)
%timeit a.unique(sorted=True, return_inverse=False); torch.cuda.synchronize()
%timeit a.unique(sorted=True, return_inverse=True); torch.cuda.synchronize()
```

```
1.0.1
226 µs ± 929 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
302 µs ± 7.06 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```

After
-------

```python
print(torch.__version__)
%timeit a.unique(dim=0, sorted=True, return_inverse=False); torch.cuda.synchronize()
%timeit a.unique(dim=0, sorted=True, return_inverse=True); torch.cuda.synchronize()
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0, sorted=True, return_inverse=False, return_counts=True); torch.cuda.synchronize()
%timeit torch._unique_dim2_temporary_will_remove_soon(a, dim=0, sorted=True, return_inverse=True, return_counts=True); torch.cuda.synchronize()
```

```
1.1.0a0+83ab8ac
190 µs ± 2.14 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
237 µs ± 1.23 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
219 µs ± 2.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
263 µs ± 1.15 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```

```python
print(torch.__version__)
%timeit a.unique(sorted=True, return_inverse=False); torch.cuda.synchronize()
%timeit a.unique(sorted=True, return_inverse=True); torch.cuda.synchronize()
%timeit torch._unique2_temporary_will_remove_soon(a, sorted=True, return_inverse=False, return_counts=True); torch.cuda.synchronize()
%timeit torch._unique2_temporary_will_remove_soon(a, sorted=True, return_inverse=True, return_counts=True); torch.cuda.synchronize()
```

```
1.1.0a0+83ab8ac
232 µs ± 2.21 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
301 µs ± 1.65 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
264 µs ± 7.67 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
339 µs ± 9.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```
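
For completeness, here is a rough way to reproduce these timings outside IPython. This is only a sketch: it assumes a CUDA device and an integer stand-in tensor of the same shape, mirroring the `%timeit` plus `torch.cuda.synchronize()` pattern above:

```python
import time
import torch

# Assumed stand-in for the benchmark tensor of shape (15320, 2)
a = torch.randint(0, 100, (15320, 2), device='cuda')

def bench(fn, iters=1000):
    fn()                         # warm up before timing
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters * 1e6  # microseconds per call

print(bench(lambda: a.unique(dim=0, sorted=True, return_inverse=True)))
print(bench(lambda: a.unique(sorted=True, return_counts=True)))
```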

Differential Revision: D14730905

fbshipit-source-id: 10026b4b98628a8565cc28a13317d29adf1225cc
@zasdfgbnm
Collaborator Author

alright... I did something wrong when rebasing... Will fix later

@zasdfgbnm
Collaborator Author

@VitalyFedyunin This should be ready as well

@dzhulgakov
Collaborator

@VitalyFedyunin - Just sanity-checking - is this change backward-compatible in terms of serialized JIT IR?

@VitalyFedyunin
Contributor

It might not be. I have one more PR with a potentially similar problem under review now. I'm planning to write instructions on how to add test coverage for it.

@zasdfgbnm
Collaborator Author

@VitalyFedyunin I am closing this and #18661 since they have been open for so long. Rebasing would be harder than rewriting it, I guess.

@zasdfgbnm zasdfgbnm closed this Nov 1, 2019
@VitalyFedyunin
Contributor

Oh, damn sorry. Thanks for cleaning up.

@zasdfgbnm
Collaborator Author

@VitalyFedyunin Not a problem. Changing the API of ATen operators is always hard.

@facebook-github-bot facebook-github-bot deleted the gh/zasdfgbnm/6/head branch December 2, 2019 15:16