[Mobile] Disable ProfilingGraphExecutorImpl for mobile #30067
Closed
Conversation
### Summary
The mobile build from master was broken for two weeks due to a runtime error:
```shell
libc++abi.dylib: terminating with uncaught exception of type torch::jit::script::ErrorReport:
Unknown builtin op: aten::_adaptive_avg_pool2d_backward.
Could not find any similar ops to aten::_adaptive_avg_pool2d_backward. This op may not exist or may not be currently supported in TorchScript.
:
at <string>:9:28
                grad_self = grad.expand(self.size()) / (self_size[-1] * self_size[-2])
            else:
                grad_self = torch._adaptive_avg_pool2d_backward(grad, self)
                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
            return grad_self
```
Since we've disabled autograd for the open-source mobile build, the `backward` ops don't get registered with JIT.
When `forward` runs, `ProfilingGraphExecutorImpl::getPlanFor(Stack& stack)` gets called, which tries to invoke `differentiate` depending on the value of `needs_gradient`.
```cpp
// TODO: insert grad propagation
bool needs_gradient = getProfilingMode()
    ? needsGradientInProfilingMode(copy->block())
    : true;
if (needs_gradient) {
  // for Simple Executor skip creating autodiff graphs
  // and let autograd handle backward for us
  if (getProfilingMode()) {
    auto diff_nodes = CreateAutodiffSubgraphs(
        copy,
        getAutodiffSubgraphInlining() ? autodiffSubgraphNodeThreshold : 1);
    for (Node *dnode : diff_nodes) {
      auto diff_graph = std::move(dnode->g(attr::Subgraph));
      Gradient gradient = differentiate(diff_graph);
      runOptimization(gradient.f);
      // run non diff optimization on the forward graph
      runNondiffOptimization(gradient.f);
      packGradient(gradient, dnode);
    }
    InlineAutodiffSubgraphs(copy, getAutodiffSubgraphInlining()
        ? autodiffSubgraphInlineThreshold
        : 1);
  }
} else {
  runNondiffOptimization(copy);
}
```
The fix is somewhat of a workaround that turns off `profiling_mode`. Feel free to drop comments if there is a better way to fix it.
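For illustration, here is a minimal sketch of the caller-side variant of this workaround (option 2 in the stacked description further down): flip the executor flags back to the legacy executor before the first `forward` call. Only `getExecutorMode`/`getProfilingMode` come from this PR's discussion; the surrounding snippet, `"model.pt"`, and the input shape are hypothetical placeholders.
```cpp
#include <torch/script.h>
// Depending on the PyTorch version, the flag getters below may also need the
// JIT graph executor header in addition to torch/script.h.

void run_model_on_mobile() {
  // Workaround sketch: fall back to the legacy (non-profiling) executor so
  // getPlanFor() never reaches differentiate() and the missing backward op.
  torch::jit::getExecutorMode() = false;
  torch::jit::getProfilingMode() = false;

  // Load and run the model as usual ("model.pt" is a placeholder).
  auto module = torch::jit::load("model.pt");
  auto output = module.forward({torch::ones({1, 3, 224, 224})}).toTensor();
}
```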
### Test Plan
- The error above disappears
- Don't break CI
cc @AshkanAliabadi
@zdevito - do you have a preference for how we should fix this? Is there any case where we need to enable profiling in the mobile build? If not, we can probably use the C10_MOBILE macro to turn it off by default. Autograd doesn't need this to work, correct? (We need autograd for federated learning on mobile.)
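For illustration, a hypothetical sketch of what the C10_MOBILE-gated defaults suggested above could look like; this is not the actual change in this PR, and the non-mobile defaults shown are illustrative rather than copied from the codebase.
```cpp
#include <atomic>

namespace torch {
namespace jit {

// Hypothetical sketch: default both flags to false when building for mobile,
// so ProfilingGraphExecutorImpl (and the differentiate() path) is never used.
std::atomic<bool>& getProfilingMode() {
#ifdef C10_MOBILE
  static std::atomic<bool> profiling_mode{false};
#else
  static std::atomic<bool> profiling_mode{true};  // illustrative default
#endif
  return profiling_mode;
}

std::atomic<bool>& getExecutorMode() {
#ifdef C10_MOBILE
  static std::atomic<bool> executor_mode{false};
#else
  static std::atomic<bool> executor_mode{true};  // illustrative default
#endif
  return executor_mode;
}

} // namespace jit
} // namespace torch
```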
zdevito approved these changes on Nov 19, 2019
This fix is what we want for now. It was an oversight that this got enabled for mobile.
Labels: Merged, oncall: jit, oncall: mobile
Stack from ghstack:
Summary
The mobile build has been broken since last week due to a runtime error caused by a missing operator in JIT (see the error above).
How this happens
Since we've disabled autograd for the open-source version, the `backward` ops won't get registered by JIT. When `forward` runs, a `GraphExecutor` will be created according to the value of `executor_mode`. In the mobile case, this was set to true, which gives us the `ProfilingGraphExecutorImpl` object. It seems this executor will eventually try to emit IR for the autograd schemas, which causes the error.
Fix
There are two ways to fix it:
1. Turn off `profiling_mode` as well as `executor_mode` on mobile, like what `FBCODE_CAFFE2` does here.
2. Add `torch::jit::getExecutorMode() = false;` before calling `forward`.
(IMO, the second fix is sort of a workaround, since it doesn't make sense from a user's perspective (why do I need to do this?). The upside is that we don't have to introduce yet another macro.)
Feel free to drop comments if there is a better way to fix it.
How this was not detected by our mobile CI
The mobile CI currently only verifies that the build succeeds. We're working on adding runtime tests to our mobile build to prevent similar issues.
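As a rough sketch of what such a runtime test could look like (hypothetical, not part of this PR): ship a small pre-scripted model with the job, load it in the mobile binary, and fail the job if `forward` throws. The model path and input shape below are placeholders.
```cpp
#include <torch/script.h>
#include <iostream>

// Hypothetical mobile runtime smoke test; "smoke_test_model.pt" is a placeholder.
int main() {
  try {
    auto module = torch::jit::load("smoke_test_model.pt");
    std::vector<torch::jit::IValue> inputs;
    inputs.emplace_back(torch::ones({1, 3, 224, 224}));
    auto out = module.forward(inputs).toTensor();
    std::cout << "forward() succeeded, output shape: " << out.sizes() << std::endl;
  } catch (const std::exception& e) {
    // A missing-op error like the one in this PR would be caught here and
    // fail the CI job instead of slipping through a build-only check.
    std::cerr << "runtime smoke test failed: " << e.what() << std::endl;
    return 1;
  }
  return 0;
}
```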
Test Plan
cc @AshkanAliabadi
Differential Revision: D18605998