ezyang (Contributor) commented Jan 22, 2021

Stack from ghstack:

Previously, we unconditionally called resize_output, even when it was
known (via the op.will_resize boolean) that the output already had exactly
the right size. Thanks to Natalia Gimelshein for diagnosing the problem
and Scott Wolchok for convincing me to fix it this way (as opposed
to speeding up resize_output).

Signed-off-by: Edward Z. Yang [email protected]
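
For readers unfamiliar with the change, here is a rough toy model of the idea in Python. It is not the actual TensorIterator code: only the names resize_output and will_resize come from the description above, and ToyTensor, ToyOperand, allocate_outputs, and the skip_when_unneeded switch are hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ToyTensor:
    """Hypothetical stand-in for an output tensor; tracks resize calls."""
    shape: Tuple[int, ...]
    resize_calls: int = 0


def resize_output(t: ToyTensor, shape: Tuple[int, ...]) -> None:
    """Toy resize: even a no-op call costs something (counted here)."""
    t.resize_calls += 1
    if t.shape != shape:
        t.shape = shape


@dataclass
class ToyOperand:
    """Hypothetical operand record carrying a will_resize flag."""
    tensor: ToyTensor
    will_resize: bool


def allocate_outputs(operands: List[ToyOperand],
                     shape: Tuple[int, ...],
                     skip_when_unneeded: bool) -> None:
    """Resize outputs; optionally skip those already known to fit."""
    for op in operands:
        if skip_when_unneeded and not op.will_resize:
            continue  # size is already right, so avoid the call entirely
        resize_output(op.tensor, shape)


# Old behavior: the call happens even though nothing needs to change.
old = ToyOperand(ToyTensor((3, 4)), will_resize=False)
allocate_outputs([old], (3, 4), skip_when_unneeded=False)

# New behavior: the call is skipped when will_resize is False.
new = ToyOperand(ToyTensor((3, 4)), will_resize=False)
allocate_outputs([new], (3, 4), skip_when_unneeded=True)

print(old.tensor.resize_calls, new.tensor.resize_calls)
```

In this sketch the saving is exactly one function call per correctly sized output, which is why the gain depends on how expensive a no-op resize_output is on the path being measured.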

facebook-github-bot (Contributor) commented Jan 22, 2021

💊 CI failures summary and remediations

As of commit 7eb65a7 (more details on the Dr. CI page):


💚 💚 Looks good so far! There are no failures yet. 💚 💚



@ezyang ezyang requested review from ngimel and swolchok January 22, 2021 21:23
ezyang added a commit that referenced this pull request Jan 22, 2021
ghstack-source-id: 5f9089d
Pull Request resolved: #50958
ezyang (Contributor Author) commented Jan 25, 2021

This didn't improve benchmarks as much as I was hoping.

Benchmarks:

Control test (no change expected):

import torch
from torch.utils.benchmark import Timer

c = Timer(
    stmt='a + a',
    setup='a = torch.rand(3, 4)'
).collect_callgrind(number=100, collect_baseline=False)
print(c.counts(denoise=True))

Before: 825172
After: 826016

Example from benchmark suite (expected to improve): a += 1

Before: 1735152
After: 1734484
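
To put the numbers above in proportion, a small sketch that computes the relative change for both cases. It assumes the figures are the denoised instruction counts printed by the script above; percent_change is a helper name made up for this example.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# (before, after) counts copied from the results above.
counts = {
    "a + a (control)": (825172, 826016),
    "a += 1": (1735152, 1734484),
}

for name, (before, after) in counts.items():
    delta = after - before
    print(f"{name}: {delta:+d} instructions "
          f"({percent_change(before, after):+.2f}%)")
```

Both differences are around a tenth of a percent or less, and the change in the case expected to improve is no larger in magnitude than the change in the control.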

ngimel (Collaborator) commented Jan 25, 2021

I guess the problem is specific to what Bert was benchmarking, at::add_out(x, x, y). x += 1 was taking an "inplace" path that was already doing the right thing even before this change.

@ezyang ezyang closed this Feb 9, 2021
@facebook-github-bot facebook-github-bot deleted the gh/ezyang/902/head branch March 12, 2021 15:16