Don't make factory methods create a tensor and then immediately copy it #17565
Conversation
zou3519 left a comment
Looks fine to me, but I'm not familiar with what we're doing with autograd meta. @yf225 how does this look to you?
I can believe the numbers, because it looks like we were previously copying the `TensorImpl`.
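For context, here is a rough sketch of what "copying the `TensorImpl`" means on the old path. The types and the helper `wrap_by_copying` are illustrative stand-ins, not the real ATen/autograd classes (which use `c10::intrusive_ptr`, autograd metadata, and so on):

```cpp
#include <memory>

// Illustrative stand-ins only -- not the real ATen/autograd classes.
struct TensorImpl {
  // sizes, strides, storage pointer, type metadata, ...
};

struct Tensor {
  std::shared_ptr<TensorImpl> impl;
};

// Old path, conceptually: every tensor returned by a factory like empty()
// was shallow-copied into a fresh TensorImpl before being wrapped as a
// variable, even though the input was a temporary nobody else referenced.
Tensor wrap_by_copying(const Tensor& data) {
  Tensor variable;
  variable.impl = std::make_shared<TensorImpl>(*data.impl);  // extra allocation + copy
  return variable;
}
```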
  inline Tensor as_variable(Tensor tensor) {
-   return make_variable(std::move(tensor), /*requires_grad=*/false);
+   return make_variable(tensor, /*requires_grad=*/false);
If the original code is doing `std::move(tensor)` already, why do we need to change it back to the slow path?
@yf225 `std::move` was a no-op here before I added the rvalue reference overload of `make_variable`, so I am retaining the old behavior. I didn't want to mess with stuff unrelated to what I was changing.
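To illustrate the overload-resolution point with hypothetical, simplified signatures (not the real `make_variable`): while only a `const&` overload exists, `std::move` at the call site changes nothing; once an rvalue-reference overload is added, the same call silently switches to it, which is why this call site now passes an lvalue to keep the old behavior.

```cpp
#include <cstdio>
#include <utility>

struct Tensor {};

// Hypothetical, simplified overloads.
void make_variable(const Tensor&, bool /*requires_grad*/) { std::puts("copying overload"); }
void make_variable(Tensor&&, bool /*requires_grad*/)      { std::puts("moving overload"); }

int main() {
  Tensor tensor;
  // With only the const& overload defined, this call binds to it and the
  // std::move is a no-op.  Once the Tensor&& overload above is added, the
  // exact same call picks the moving overload instead.
  make_variable(std::move(tensor), /*requires_grad=*/false);
  // Passing the lvalue keeps selecting the copying overload either way.
  make_variable(tensor, /*requires_grad=*/false);
}
```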
torch/csrc/autograd/variable.h (outdated)

  }
+ // Moves out of the tensor instead of copying it.
+ // Can save ~10μs of overhead.
I think this comment should go on the function declaration above instead, to be in line with the other factory functions. Also, we probably shouldn't mention the actual overhead savings, because the number depends on the test environment and is subject to change as the implementation changes.
Thanks, that's a good point.
facebook-github-bot left a comment
@umanwizard has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
actually, now that I think about it a bit more, the semantics of this are a bit confusing and not really the same as normal …
facebook-github-bot left a comment
@umanwizard has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Running …
Create a `make_variable` overload that moves out of a tensor instead of going through `shallow_copy_and_detach`. Call this overload from factory methods like `empty` that create a brand-new tensor, do nothing with it, and then copy it into a variable.

Will update this with actual numbers, but it seems to get rid of around 20-40% of the overhead of calling `torch.empty(0)`.
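As a rough sketch of the idea (the names `empty_variable` and the simplified types are hypothetical; the real signatures take sizes, options, and autograd metadata): the moving `make_variable` overload lets a factory wrapper hand over the tensor it just created instead of copying it.

```cpp
#include <memory>
#include <utility>

// Simplified stand-ins for the real types (illustrative only).
struct TensorImpl {};
struct Tensor { std::shared_ptr<TensorImpl> impl; };

// Copying overload: old behavior, a second TensorImpl is allocated.
Tensor make_variable(const Tensor& data, bool /*requires_grad*/) {
  return Tensor{std::make_shared<TensorImpl>(*data.impl)};
}

// Moving overload (in the spirit of the one added here, signature simplified):
// steals the freshly created TensorImpl, no extra allocation.
Tensor make_variable(Tensor&& data, bool /*requires_grad*/) {
  return Tensor{std::move(data.impl)};
}

// Hypothetical factory wrapper in the spirit of torch.empty: the tensor it
// just created is a temporary nobody else references, so moving it into the
// variable is safe and skips the copy.
Tensor empty_variable() {
  Tensor fresh{std::make_shared<TensorImpl>()};
  return make_variable(std::move(fresh), /*requires_grad=*/false);
}
```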