@zou3519 zou3519 commented Sep 19, 2019

Stack from ghstack:

We should allocate an empty tensor as the result tensor when performing
binary ops. Currently some ops use `empty_like(self)` as the initial
result tensor before passing it into `TensorIterator`. This is
inefficient because `TensorIterator` may resize the tensor due to
broadcasting, which triggers a second memory allocation. By starting
from an empty result tensor, we only need to allocate/resize memory
once instead of twice.
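
To make the allocation argument concrete, here is a minimal sketch in plain Python. It is a toy model, not PyTorch's implementation: `ToyTensor`, `broadcast_shape`, and `count_result_allocations` are hypothetical names invented for illustration, and the model assumes that a resize to a larger element count always triggers a fresh allocation.

```python
from itertools import zip_longest
from math import prod


class ToyTensor:
    """Hypothetical stand-in for a tensor that counts buffer allocations."""

    allocations = 0  # class-wide counter, reset by the helper below

    def __init__(self, shape):
        self.shape = tuple(shape)
        self.numel = 0
        self._allocate(prod(self.shape))

    def _allocate(self, numel):
        # A zero-element tensor needs no buffer, so it costs no allocation.
        if numel > 0:
            ToyTensor.allocations += 1
        self.numel = numel

    def resize_(self, shape):
        # Assumption: growing the buffer always means a new allocation.
        new_numel = prod(shape)
        if new_numel > self.numel:
            self._allocate(new_numel)
        self.shape = tuple(shape)
        return self


def broadcast_shape(a, b):
    """Combine two shapes with the usual right-aligned broadcasting rule."""
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and 1 not in (x, y):
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
        out.append(y if x == 1 else x)
    return tuple(reversed(out))


def count_result_allocations(make_result, self_shape, other_shape):
    """Return (allocations made for the result, final result shape)."""
    self_t, other = ToyTensor(self_shape), ToyTensor(other_shape)
    ToyTensor.allocations = 0  # count only the result tensor from here on
    result = make_result(self_t)
    # Stand-in for the iterator resizing the output to the broadcast shape.
    result.resize_(broadcast_shape(self_t.shape, other.shape))
    return ToyTensor.allocations, result.shape


# Old pattern: result starts with the shape of `self`, then gets resized.
old = count_result_allocations(lambda t: ToyTensor(t.shape), (3, 1), (1, 4))
# New pattern: result starts empty and is sized once.
new = count_result_allocations(lambda t: ToyTensor((0,)), (3, 1), (1, 4))
print(old, new)
```

In this model the saving appears only when broadcasting enlarges the result beyond the shape of `self`; when `self` already has the broadcast shape, both patterns allocate once.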

Also fixes #26495. The bug
there is that the implementation of `pow` is missing a resize in one
case.

Test Plan:

  • new test
  • run tests

Differential Revision: D17500025

zou3519 added a commit that referenced this pull request Sep 19, 2019

ghstack-source-id: 0705c58
Pull Request resolved: #26498
zdevito pushed a commit to zdevito/ATen that referenced this pull request Sep 20, 2019
…26498)

Summary:
Pull Request resolved: pytorch/pytorch#26498

Differential Revision: D17500025

Pulled By: zou3519

fbshipit-source-id: bff4949af5e75541c04669b961bcf2e1ec456faf
@facebook-github-bot

@zou3519 merged this pull request in e2515a4.

@facebook-github-bot facebook-github-bot deleted the gh/zou3519/180/head branch October 28, 2019 22:24