Made `a += b` for lists do an in-place add #21896
Conversation
```cpp
// Get the appropriate builtin op for this augmented assignment
// If the RHS is a tensor, return the corresponding ATen in-place op
// If it's a list of scalars, then return the corresponding list augment op
Symbol getAugOp(const AugAssign& stmt, bool isTensor) {
```
not sure what you're doing here, this seems unrelated. Can we have this in a separate PR?
This is what's needed to make `a += b` do the in-place add instead of desugaring it into `a = a + b`.
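For context, the distinction being preserved here is observable in plain Python: `a += b` on a list mutates the existing object (via `__iadd__`, equivalent to `extend`), while the desugared `a = a + b` allocates a new list and rebinds the name. A minimal sketch of the difference:

```python
# In-place augmented assignment: `a += b` calls list.__iadd__,
# which mutates the existing list object (like a.extend(b)).
a = [1, 2]
alias = a            # second reference to the same list
a += [3, 4]
print(alias)         # [1, 2, 3, 4] -- the alias sees the mutation
print(a is alias)    # True -- same object, no copy was made

# Desugared form: `a = a + b` calls list.__add__, which allocates
# a brand-new list and rebinds `a`, leaving the old object untouched.
b = [1, 2]
alias_b = b
b = b + [3, 4]
print(alias_b)       # [1, 2] -- the alias still points at the old list
print(b is alias_b)  # False -- a new list was allocated
```

Besides matching eager-mode aliasing semantics, avoiding the copy is what makes the performance difference in the benchmark below.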
smessmer
left a comment
looks good now. Thanks.
facebook-github-bot
left a comment
@Chillee is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: In talks with smessmer, we decided that it'd be better to put the logic in `list`, as optimal behavior requires knowing `.capacity()`. Results on my CPU (for the benchmark here: https://twitter.com/VahidK/status/1138674536679821312) now look like this:

```
Pytorch batch_gather took 0.018311 seconds.
Pytorch batch_gather jit took 0.013921 seconds.
Pytorch vectorized batch_gather took 0.001384 seconds.
```

Previously, `batch_gather jit` took 3x as long as `batch_gather`. Some logic taken from pytorch/pytorch#21690. Note that these two PRs are somewhat orthogonal: that PR handles this benchmark by looking at the alias analysis, while this PR specializes for `+=`. Note that we can't jit the vectorized version as we think `torch.arange` returns a float tensor.

Pull Request resolved: pytorch/pytorch#21896

Differential Revision: D15998628

Pulled By: Chillee

fbshipit-source-id: b0085960da4613578b94deb98ac62c0a4532a8c3
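The summary's point about needing `.capacity()` comes down to the usual amortized-growth argument: `a = a + b` copies the entire list on every iteration (quadratic total work), while an in-place `+=` only moves elements when the backing buffer must grow, which geometric capacity growth makes amortized O(1) per append. The cost model below is an illustrative sketch with an assumed doubling growth policy, not the actual `c10::List` implementation:

```python
# Rough cost model for building a list of n elements one at a time.
n = 10_000

# Desugared `a = a + [x]`: every iteration copies all existing
# elements into a fresh list -> O(n^2) element copies in total.
copies = 0
size = 0
for _ in range(n):
    copies += size      # a + [x] copies all `size` existing elements
    size += 1

# In-place `a += [x]` with geometric capacity growth: elements are
# only moved when the backing buffer is reallocated (capacity doubles).
moves = 0
cap = 0
size2 = 0
for _ in range(n):
    if size2 == cap:    # out of capacity: grow geometrically
        cap = max(1, cap * 2)
        moves += size2  # reallocation moves the existing elements
    size2 += 1

print(copies)  # n*(n-1)/2 element copies for the desugared form
print(moves)   # fewer than 2n element moves: amortized O(1) append
```

This is why the benchmark's `batch_gather jit` time drops once `+=` stops being desugared into a full copy per loop iteration.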