Optimize CPU version performance of the nonzero function. #15190
VitalyFedyunin wants to merge 1 commit into pytorch:master from
Related to #14848
facebook-github-bot
left a comment
@VitalyFedyunin has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@gchanan bug fixed.
gchanan
left a comment
A test for the non-contiguous case would be a nice addition.
This looks good to go, though. Feel free to either commit as is, or address the comments and then commit.
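The non-contiguous case mentioned in the review can be illustrated without PyTorch. The sketch below is a hypothetical pure-Python reference, not code from this PR: `strided_nonzero` and its parameters are names invented for the example. It walks a flat storage through a shape and strides, so the same storage can be read as a contiguous matrix or as its transposed, non-contiguous view. In an actual PyTorch test the equivalent check would presumably compare `torch.nonzero(x.t())` against `torch.nonzero(x.t().contiguous())`.

```python
from itertools import product


def strided_nonzero(storage, shape, strides, offset=0):
    """Return coordinates of nonzero elements of a strided view, in row-major order.

    Hypothetical reference helper for illustration; not the PR's implementation.
    """
    result = []
    for index in product(*(range(n) for n in shape)):
        # Map the logical index to a position in the flat storage.
        position = offset + sum(i * s for i, s in zip(index, strides))
        if storage[position] != 0:
            result.append(list(index))
    return result


# The 2x3 row-major matrix [[0, 5, 0], [7, 0, 9]] stored flat.
storage = [0, 5, 0, 7, 0, 9]

# Contiguous view: shape (2, 3), strides (3, 1).
contiguous = strided_nonzero(storage, (2, 3), (3, 1))

# Transposed, non-contiguous view of the same storage: shape (3, 2), strides (1, 3).
transposed = strided_nonzero(storage, (3, 2), (1, 3))

# Copying the transpose into fresh contiguous storage must give the same answer.
transposed_copy = [storage[c * 1 + r * 3] for c in range(3) for r in range(2)]
expected = strided_nonzero(transposed_copy, (3, 2), (2, 1))

print(transposed == expected)
```

The point of such a test is that the strided view and its contiguous copy describe the same logical tensor, so an implementation that silently assumes contiguous memory would produce different coordinates for the two.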
Summary: Optimized CPU version of the nonzero. Now 2x faster (in avg.) than numpy. Can be further optimized for 1D tensors and boolean tensors.
Pull Request resolved: pytorch#15190
Differential Revision: D13468570
fbshipit-source-id: 31e155c5ef247a8983b4c1c12f25b0aafb315e43
c859e9d to 9cbfe30
facebook-github-bot
left a comment
@VitalyFedyunin is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: Optimized CPU version of the nonzero. Now 2x faster (in avg.) than numpy. Can be further optimized for 1D tensors and boolean tensors.
Pull Request resolved: pytorch/pytorch#15190
Differential Revision: D13468570
Pulled By: VitalyFedyunin
fbshipit-source-id: e55ce54d60626a42d9a10a02e407856458b8055e
Hi @gchanan also using
@botcs are you talking about the CPU or the GPU implementation?
@botcs please file an issue.
@VitalyFedyunin how can this be optimized for boolean tensors?
To be honest, the first step here should be migrating from TH to ATen code, as that might help with vectorization.
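The thread does not say what a boolean-specific optimization would look like. As a rough illustration only, and assuming each boolean element occupies one byte, the sketch below shows the kind of work-skipping that vectorized code makes possible: test a whole word of elements with a single comparison and scan element by element only when the word contains a nonzero byte. `nonzero_bytes` and `word_size` are hypothetical names, and this is not the approach taken in the PR or in ATen.

```python
def nonzero_bytes(mask, word_size=8):
    """Indices of nonzero bytes in a 1-D bytes-like mask, skipping all-zero words.

    Illustrative sketch only; assumes one byte per boolean element.
    """
    data = bytes(mask)
    indices = []
    for start in range(0, len(data), word_size):
        chunk = data[start:start + word_size]
        # One integer comparison rules out up to word_size elements at once.
        if int.from_bytes(chunk, "little") == 0:
            continue
        # Only words that contain a nonzero byte are scanned element-wise.
        for offset, value in enumerate(chunk):
            if value:
                indices.append(start + offset)
    return indices


# A mostly-zero mask of 33 elements with three set entries.
mask = bytes([0] * 16 + [1, 0, 0, 1] + [0] * 12 + [1])
hits = nonzero_bytes(mask)
print(hits)
```

The benefit of this kind of skipping depends on sparsity: for masks that are mostly zero, most words are rejected by the single comparison, while dense masks still pay for the element-wise scan.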

Optimized the CPU version of nonzero. It is now 2x faster (on average) than NumPy.
It can be further optimized for 1D tensors and boolean tensors.
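The description does not explain why 1D tensors leave room for further optimization. One plausible reason, offered here as an assumption rather than the author's stated plan, is that nonzero returns one coordinate per dimension for every hit, so an N-dimensional input needs each linear index converted into N coordinates, while a 1D input can emit the linear index directly. The pure-Python sketch below uses hypothetical helper names (`unravel`, `nonzero_contiguous`) to show that difference for a contiguous row-major buffer.

```python
def unravel(linear_index, shape):
    """Convert a row-major linear index into per-dimension coordinates."""
    coords = []
    for size in reversed(shape):
        linear_index, coord = divmod(linear_index, size)
        coords.append(coord)
    return coords[::-1]


def nonzero_contiguous(flat, shape):
    """Coordinates of nonzero elements of a contiguous row-major buffer.

    Illustrative sketch only; not the implementation from this PR.
    """
    if len(shape) == 1:
        # 1-D fast path: the linear index already is the coordinate.
        return [[i] for i, value in enumerate(flat) if value != 0]
    # General path: one divmod per dimension for every nonzero element.
    return [unravel(i, shape) for i, value in enumerate(flat) if value != 0]


flat = [0, 5, 0, 7, 0, 9]
as_matrix = nonzero_contiguous(flat, (2, 3))
as_vector = nonzero_contiguous(flat, (6,))
print(as_matrix, as_vector)
```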