
Conversation

@umanwizard
Contributor

Create a `make_variable` override that moves out of a tensor instead of going through `shallow_copy_and_detach`. Call this override from factory methods like `empty` that create a brand-new tensor, do nothing else with it, and then copy it into a variable.

I will update this with actual numbers, but it seems to eliminate around 20–40% of the overhead of calling `torch.empty(0)`.
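
For illustration, here is a minimal self-contained sketch of the technique, using stand-in `Tensor`, `TensorImpl`, and `Variable` types rather than PyTorch's real internals: an rvalue-reference overload lets a factory hand its freshly created tensor straight to the variable, instead of duplicating the `TensorImpl` the way a `shallow_copy_and_detach`-style path would.

```cpp
#include <memory>
#include <utility>

// Stand-ins for PyTorch's real types; everything below is illustrative only.
struct TensorImpl { /* storage, sizes, strides, ... */ };

struct Tensor {
  std::shared_ptr<TensorImpl> impl;
};

struct Variable {
  Tensor data;
};

// Copying path: models a shallow_copy_and_detach()-style conversion,
// which allocates and populates a second TensorImpl.
inline Variable make_variable(const Tensor& t, bool /*requires_grad*/) {
  return Variable{Tensor{std::make_shared<TensorImpl>(*t.impl)}};
}

// Moving path: steals the impl from a tensor nobody else holds,
// skipping the copy entirely. This is the kind of overload the PR adds.
inline Variable make_variable(Tensor&& t, bool /*requires_grad*/) {
  return Variable{Tensor{std::move(t.impl)}};
}

// A factory like empty() creates a brand-new tensor, does nothing else
// with it, and can therefore safely move out of it.
Variable empty_factory() {
  Tensor fresh{std::make_shared<TensorImpl>()};
  return make_variable(std::move(fresh), /*requires_grad=*/false);
}

int main() {
  Variable v = empty_factory();
  (void)v;
}
```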

@umanwizard umanwizard requested a review from zou3519 February 28, 2019 04:28
@zou3519 zou3519 requested a review from yf225 February 28, 2019 15:13
Contributor

@zou3519 zou3519 left a comment


Looks fine to me, but I'm not familiar with what we're doing with autograd meta. @yf225 how does this look to you?

I can believe the numbers, because it looks like we were previously copying the `TensorImpl`.


```diff
 inline Tensor as_variable(Tensor tensor) {
-  return make_variable(std::move(tensor), /*requires_grad=*/false);
+  return make_variable(tensor, /*requires_grad=*/false);
```
Contributor


If the original code was already doing `std::move(tensor)`, why do we need to change it back to the slow path?

Contributor Author


@yf225 `std::move` was a no-op here before I added the rvalue-reference overload of `make_variable`, so I am retaining the old behavior. I didn't want to mess with stuff unrelated to what I was changing.
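
To illustrate the point with a generic, self-contained C++ sketch (not PyTorch code): `std::move` is just a cast to an rvalue, so unless an overload taking an rvalue reference actually exists, overload resolution still picks the copying version and the cast has no effect.

```cpp
#include <iostream>
#include <utility>

struct Payload {};

// Only this overload exists at first: with no rvalue-reference overload in
// scope, consume(std::move(p)) binds here anyway, so std::move is a no-op.
void consume(const Payload&) { std::cout << "copy path\n"; }

// Uncommenting this overload is what makes std::move at the call site
// start selecting a genuinely different (moving) path:
// void consume(Payload&&) { std::cout << "move path\n"; }

int main() {
  Payload p;
  consume(std::move(p));  // prints "copy path" until the && overload exists
}
```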

Brennan Vincent added 2 commits February 28, 2019 10:32
```cpp
}

// Moves out of the tensor instead of copying it.
// Can save ~10μs of overhead.
```
Contributor


I think this comment should go on the function declaration above instead, to be in line with the other factory functions. Also, we probably shouldn't mention the actual overhead savings: the number depends on the test environment and is subject to change as the implementation evolves.

Contributor Author


Thanks, that's a good point.

@umanwizard umanwizard requested a review from yf225 February 28, 2019 20:26
Contributor

@facebook-github-bot facebook-github-bot left a comment


@umanwizard has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@umanwizard
Contributor Author

Actually, now that I think about it a bit more, the semantics of this are a bit confusing and not really the same as a normal `std::move`, so I will rename the involved function to make it clearer.

@umanwizard umanwizard closed this Feb 28, 2019
@umanwizard umanwizard reopened this Feb 28, 2019
Contributor

@facebook-github-bot facebook-github-bot left a comment


@umanwizard has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@umanwizard
Contributor Author

`%timeit torch.empty()`
before: 1.33, after: 1.13

Running `torch.empty()` with the cache fully cleared before each trial, and taking the min value:
before: 20.29, after: 18.42 (min of 10,000 trials)
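
For reference, a sketch of how the min-of-many-trials measurement might be reproduced with the standard `timeit` module (the cache-clearing step used above is not shown, and `torch.empty(0)` from the PR description is substituted as the timed statement):

```python
import timeit

import torch

# Run the statement once per trial, many trials, and take the minimum:
# the usual way to suppress scheduling noise in micro-benchmarks.
trials = 10_000
times = timeit.repeat(
    "torch.empty(0)",
    globals={"torch": torch},
    repeat=trials,
    number=1,
)
print(f"min over {trials} trials: {min(times) * 1e6:.2f} us")
```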

