Factor backend routing logic out of convolution forward #67790
💊 CI failures summary (Dr. CI): As of commit a8c7cc8, 💚 looks good so far! There are no failures yet.
aten/src/ATen/native/ConvUtils.h
Moved here from Convolution.cpp
aten/src/ATen/native/ConvUtils.h
There are two overloads right now, which isn't great. The one exposed to Python purposefully matches the signature of `convolution`.
torch/csrc/Module.cpp
Exposes `ConvBackend` and `_select_conv_backend` to Python.
Force-pushed from 06571a1 to 6c777d6.
albanD left a comment:
Sounds good. Only some comments and a spurious public API.
Force-pushed from 5dd4a97 to e53eff3.
I just started reading through this, but what is the plan for the untested backends? Is it fine to ignore them in this PR, or do they need to be handled here?
aten/src/ATen/native/ConvUtils.h
nit: You don't need to fix this because this is a move, but this looks like a bug: groups seems like it should be int64_t?
Yeah I agree with you - seems like a bug. Think it's worth opening an issue for?
test/test_nn.py
There appears to be torch.backends.xnnpack.enabled, not sure if that helps...
True, looks like I can get a test in for that backend by disabling cuDNN / mkldnn and checking `torch.backends.xnnpack.enabled`. My plan was to punt on mobile testing for this PR, but I'll open an issue suggesting this.
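For context, a rough sketch of what such a check could look like (hypothetical test shape, not the PR's final test; whether XNNPACK is actually selected depends on the build, and the flags below are the standard `torch.backends` context managers):

```python
import torch
import torch.nn.functional as F

def maybe_exercise_xnnpack_path():
    # torch.backends.xnnpack.enabled only reports whether XNNPACK support was
    # compiled in; it is read-only, so a test can merely skip when it is False.
    if not torch.backends.xnnpack.enabled:
        print("XNNPACK not available; skipping")
        return
    # Disable the other candidate backends so backend selection can fall
    # through toward the mobile path (the context managers restore the
    # previous flag values on exit).
    with torch.backends.cudnn.flags(enabled=False), \
         torch.backends.mkldnn.flags(enabled=False):
        inp = torch.randn(1, 4, 8, 8)
        weight = torch.randn(4, 4, 3, 3)
        out = F.conv2d(inp, weight, padding=1)
        print(out.shape)
```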
zou3519 left a comment:
seems reasonable to me
albanD left a comment:
SGTM, only Richard's nit.
Current plan is to ignore them for this PR but add a note that they should be handled at some point. I was able to get miopen tested, so that just leaves the mobile backends. I can open a mobile issue indicating that they're not tested and leave it to those folks to deal with it.
@jbschlosser has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Force-pushed from 61973c8 to c4a9816.
@jbschlosser has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: This PR introduces a new function `_select_conv_backend` that returns a `ConvBackend` enum representing the selected backend for a given set of convolution inputs and params. The function and enum are exposed to Python for testing purposes through `torch/csrc/Module.cpp` (please let me know if there's a better place to do this).

A new set of tests validates that the correct backend is selected for several sets of inputs + params. Some backends aren't tested yet:
* nnpack (for mobile)
* xnnpack (for mobile)
* winograd 3x3 (for mobile)

Some flowcharts for reference: [flowchart images]

Pull Request resolved: pytorch#67790
Reviewed By: zou3519
Differential Revision: D32280878
Pulled By: jbschlosser
fbshipit-source-id: 4b9946f09411993b1c1d2f6bca8b485be823aebe
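For illustration, a minimal sketch of how the exposed binding might be exercised from a test. The binding name `torch._C._select_conv_backend` and the positional argument order (mirroring `aten::convolution`) are assumptions based on the description above, not a verified signature:

```python
import torch

def show_selected_backend():
    inp = torch.randn(1, 3, 32, 32)
    weight = torch.randn(8, 3, 3, 3)
    bias = torch.randn(8)

    # Assumed to mirror aten::convolution:
    # (input, weight, bias, stride, padding, dilation,
    #  transposed, output_padding, groups)
    backend = torch._C._select_conv_backend(
        inp, weight, bias,
        [1, 1],   # stride
        [0, 0],   # padding
        [1, 1],   # dilation
        False,    # transposed
        [0, 0],   # output_padding
        1,        # groups
    )
    # backend should be a ConvBackend enum value; a test would compare it
    # against the expected member for these inputs and params.
    print(backend)
```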
This pull request was exported from Phabricator. Differential Revision: D32280878
Force-pushed from c4a9816 to a8c7cc8.
@jbschlosser merged this pull request in 9a2db6f.
… linalg functions on GPU (#67980)

Summary: Per title. This PR introduces a global flag that lets PyTorch prefer one of the many backend implementations while calling linear algebra functions on GPU.

Usage:

```python
torch.backends.cuda.preferred_linalg_library('cusolver')
```

Available options (str): `'default'`, `'cusolver'`, `'magma'`.

Issue #63992 inspired me to write this PR. No heuristic is perfect on all devices, library versions, matrix shapes, workloads, etc. We can obtain better performance if we can conveniently switch linear algebra backends at runtime.

Performance of linear algebra operators after this PR should be no worse than before. The flag is set to **`'default'`** by default, which makes everything the same as before this PR.

The implementation of this PR is basically following that of #67790.

Pull Request resolved: #67980
Reviewed By: mruberry
Differential Revision: D32849457
Pulled By: ngimel
fbshipit-source-id: 679fee7744a03af057995aef06316306073010a6
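For reference, a small usage sketch of the flag described above. It assumes the API lands as described (`torch.backends.cuda.preferred_linalg_library` taking one of `'default'`, `'cusolver'`, `'magma'`); `torch.linalg.inv` is just an arbitrary example workload:

```python
import torch

def compare_linalg_backends(a):
    # Try each backend preference and time a representative linalg call;
    # 'default' restores the built-in heuristic.
    for backend in ('cusolver', 'magma', 'default'):
        torch.backends.cuda.preferred_linalg_library(backend)
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        torch.cuda.synchronize()
        start.record()
        torch.linalg.inv(a)
        end.record()
        torch.cuda.synchronize()
        print(backend, start.elapsed_time(end), "ms")

# Example (requires a CUDA device):
# compare_linalg_backends(torch.randn(64, 128, 128, device='cuda'))
```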