Add return_counts to torch.unique #15495
Conversation
torch/functional.py (outdated)
```diff
- def unique(input, sorted=False, return_inverse=False, dim=None):
+ def unique(input, sorted=False, return_inverse=False, return_counts=False, dim=None):
```
Documentation below needs updating
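For reference, a minimal sketch of how the wrapper in torch/functional.py and its docstring might be extended; the docstring wording here is illustrative, not the text merged in this PR:

```python
def unique(input, sorted=False, return_inverse=False, return_counts=False, dim=None):
    r"""Returns the unique elements of the input tensor.

    Arguments:
        input (Tensor): the input tensor
        sorted (bool): whether to sort the unique elements before returning them
        return_inverse (bool): whether to also return the indices mapping each
            element of the input to its position in the unique output
        return_counts (bool): whether to also return the number of occurrences
            of each unique element
        dim (int, optional): the dimension to apply unique over

    Returns:
        (Tensor, Tensor (optional), Tensor (optional)): the unique elements,
        optionally followed by the inverse indices and the counts
    """
    ...
```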
```diff
  Tensor inverse_indices = at::empty({0}, self.options().dtype(kLong));
- if (return_inverse) {
+ Tensor counts = at::empty({0}, self.options().dtype(kLong));
+ if (return_inverse || return_counts) {
```
This looks a bit questionable. You're computing both the inverse and the counts if either of them is requested. But if only one is requested, you're doing wasted work for the other.
Thanks for the comment! Yeah, I thought the same, but decided to keep it this way for code clarity, since counts is computed in basically the same loop as inverse_indices and I wanted to avoid code duplication.
EDIT: I was wrong on this point. I'm using the inverse map for the counts so that the counts correspond to the sorted/unsorted unique tensors. What do you think?
```cpp
if (return_counts) {
  counts.resize_(output.sizes()).fill_(0);
  Tensor val = at::ones({num_inp}, self.options().dtype(kLong));
  counts.scatter_add_(0, inverse_indices.view(-1), val);
```
Hmm, a broadcasting scatter_add_ would have been very useful here :)
Yeah, you mean such that we can just pass a single 1 as the value?
Is there a more efficient version of this code?
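At the Python level, the counting logic in this hunk corresponds to roughly the following sketch; `torch.bincount` is mentioned only as a possible alternative formulation, not as what the PR implements:

```python
import torch

x = torch.tensor([1, 3, 2, 3, 1, 1])
output, inverse_indices = torch.unique(x, sorted=True, return_inverse=True)

# Scatter-add a 1 into the slot of each element's unique value,
# mirroring the ATen code above.
counts = torch.zeros(output.numel(), dtype=torch.long)
counts.scatter_add_(0, inverse_indices.view(-1),
                    torch.ones(x.numel(), dtype=torch.long))
# counts -> tensor([3, 1, 2]) for output -> tensor([1, 2, 3])

# bincount over the inverse indices gives the same result in one call;
# whether it is faster (especially on CUDA) is a separate question.
assert torch.equal(counts, torch.bincount(inverse_indices.view(-1)))
```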
PR looks basically reasonable (even if the CUDA implementation is a little inefficient), but docs must be updated before we can merge this.

pushed mergefix
facebook-github-bot left a comment:
@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
This appears to be BC-breaking in ONNX, in a way that is not caught by current OSS tests. In internal tests we see a failure; I think the problem is probably that we need to adjust the symbolic definition for unique to handle having three args.

@ezyang Could you give me a hint on adjusting the symbolic definition, or is it being worked on in another PR?

Unfortunately I don't think anyone is taking point on this. The steps to resolve would be to first reproduce the error (probably it should be sufficient to just try exporting unique in the first place), and then adjust the symbolic to handle the new keyword argument. I can take a look (eventually) if you don't have the bandwidth for it.
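A rough sketch of the kind of change described here, assuming the ATen-fallback export path used for ops without a native ONNX equivalent; the exact module, helper, and op names have shifted between releases, so treat this as illustrative only:

```python
from torch.onnx.symbolic_helper import parse_args

@parse_args('v', 'i', 'i', 'i')
def unique(g, input, sorted, return_inverse, return_counts):
    # Forward the new flag and declare three outputs
    # (values, inverse indices, counts).
    return g.op("ATen", input, operator_s="unique", sorted_i=sorted,
                return_inverse_i=return_inverse,
                return_counts_i=return_counts, outputs=3)
```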
@ezyang

Thanks!

@ezyang

I'd suggest merging with master (there are some conflicts) and getting another run of CI.
```cpp
  output.resize_(num_out);

  Tensor counts = at::empty({0}, self.options().dtype(kLong));
  if (return_counts) {
```
You'd need to condition the computation of inverse_indices on return_counts too; otherwise, if return_counts is true and return_inverse is false, this won't work.
Alternatively, you can do thrust::unique_by_key with a counting iterator as the key, and thrust::adjacent_difference to get the counts. That will probably have better worst-case performance when there are many equal elements, which would otherwise cause a lot of atomic collisions in scatter_add_, and you also won't compute inverse_indices that you might not need.
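The Thrust-based approach lives in the CUDA kernel; the run-length idea behind it can be sketched at the Python level like this (illustrative only, not the proposed Thrust implementation):

```python
import torch

x = torch.tensor([3, 1, 2, 1, 3, 3])
sorted_x, _ = torch.sort(x.flatten())

# Mark positions where a new run of equal values begins
# (the role of unique_by_key with a counting iterator as the key).
is_run_start = torch.ones(sorted_x.numel(), dtype=torch.bool)
is_run_start[1:] = sorted_x[1:] != sorted_x[:-1]
run_starts = is_run_start.nonzero().flatten()

unique_vals = sorted_x[run_starts]

# Differences between consecutive run-start positions are the counts
# (the role of adjacent_difference).
ends = torch.cat([run_starts[1:], torch.tensor([sorted_x.numel()])])
counts = ends - run_starts
# unique_vals -> tensor([1, 2, 3]), counts -> tensor([2, 1, 3])
```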
ezyang left a comment:
Requesting changes as per @ngimel's comment
Summary: Fixes: #12598. This PR was originally authored by ptrblck in #15495, but since there was no update for months after the requested changes, I cloned that branch and resolved the code reviews here. Hope everything is good now. In particular, the implementation of counts is changed from ptrblck's original algorithm to the one ngimel suggested, i.e. using `unique_by_key` and `adjacent_difference`. The current implementation of `_unique_dim` is VERY slow for computing the inverse index and counts, see #18405. I will refactor `_unique_dim` in a later PR. For this PR, please allow me to keep the implementation as is. cc: ptrblck ezyang ngimel colesbury Pull Request resolved: #18391 Reviewed By: soumith Differential Revision: D14605905 Pulled By: VitalyFedyunin fbshipit-source-id: 555f5a12a8e28c38b10dfccf1b6bb16c030bfdce
Add `return_counts` argument to `torch.unique`. Original feature request: #12598

The performance is approx. the same as using the `dim` argument here.

In addition to the added tests in `test_torch.py`, I've used these tests to make sure all combinations yield the same result as numpy's implementation.

Sorry again for the delay of this PR, and thanks @t-vi for providing me access to his server and GPU while I was building mine.

CC @ezyang @colesbury

Best,
ptrblck
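The cross-check against numpy mentioned above would look roughly like this (a sketch; the actual tests are the ones added to test_torch.py):

```python
import numpy as np
import torch

x = torch.randint(0, 5, (100,))

t_vals, t_inv, t_counts = torch.unique(
    x, sorted=True, return_inverse=True, return_counts=True)
n_vals, n_inv, n_counts = np.unique(
    x.numpy(), return_inverse=True, return_counts=True)

assert np.array_equal(t_vals.numpy(), n_vals)
assert np.array_equal(t_inv.numpy(), n_inv)
assert np.array_equal(t_counts.numpy(), n_counts)
```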