
Conversation

@xta0 xta0 commented Nov 19, 2019

Stack from ghstack:

Summary

The mobile build has been broken since last week due to a runtime error caused by a missing operator in JIT:

```shell
libc++abi.dylib: terminating with uncaught exception of type torch::jit::script::ErrorReport: 
Unknown builtin op: aten::_adaptive_avg_pool2d_backward.
Could not find any similar ops to aten::_adaptive_avg_pool2d_backward. This op may not exist or may not be currently supported in TorchScript.
:
at <string>:9:28
                grad_self = grad.expand(self.size()) / (self_size[-1] * self_size[-2])
            else:
                grad_self = torch._adaptive_avg_pool2d_backward(grad, self)
                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE

            return grad_self
```

How this happens

Since we've disabled autograd for the open-source build, the backward ops aren't registered with the JIT.

When forward runs, a GraphExecutor is created according to the value of executor_mode. On mobile this flag was set to true, which gives us a ProfilingGraphExecutorImpl object. This executor apparently ends up trying to emit IR for the autograd schemas, which triggers the error above.

Fix

There are two ways to fix it.

  1. Add a macro that disables profiling_mode as well as executor_mode on mobile, similar to what the FBCODE_CAFFE2 build already does.
  2. Disable the two modes at runtime by calling torch::jit::getExecutorMode() = false; before calling forward (a sketch follows below).

(IMO, the second fix is more of a workaround, since it doesn't make much sense from a user's perspective (why should I need to do this?). The upside is that we don't have to introduce yet another macro.)
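
For illustration, a minimal sketch of the second option, assuming an app that loads a TorchScript model with `torch::jit::load`. The model path and input shape are made up, and depending on the PyTorch version `getExecutorMode()` may need an extra JIT header beyond `torch/script.h`:

```cpp
#include <torch/script.h>

int main() {
  // Option 2 (runtime workaround): fall back to the simple graph executor
  // so the profiling executor never tries to build autodiff graphs.
  // This must run before the first forward() call on the module.
  torch::jit::getExecutorMode() = false;

  // Hypothetical model and input, just to show where the call goes.
  auto module = torch::jit::load("model.pt");
  std::vector<torch::jit::IValue> inputs;
  inputs.emplace_back(torch::ones({1, 3, 224, 224}));
  auto output = module.forward(inputs).toTensor();
  return 0;
}
```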

Feel free to drop comments if there is a better way to fix it.

How this was not detected by our mobile CI

We're working on adding runtime tests to our mobile CI to catch issues like this.

Test Plan

  • The error above disappears
  • Don't break CI

cc @AshkanAliabadi

Differential Revision: D18605998

@xta0 xta0 requested a review from apaszke as a code owner November 19, 2019 04:49
xta0 added a commit that referenced this pull request Nov 19, 2019
ghstack-source-id: 06b4886
Pull Request resolved: #30067
@facebook-github-bot facebook-github-bot added the oncall: jit label (Add this issue/PR to JIT oncall triage queue) Nov 19, 2019
@xta0 xta0 requested a review from ljk53 November 19, 2019 05:14
@xta0 xta0 added the oncall: mobile label (Related to mobile support, including iOS and Android) Nov 19, 2019
### Summary

The mobile build on master has been broken for two weeks due to a runtime error:

```shell
libc++abi.dylib: terminating with uncaught exception of type torch::jit::script::ErrorReport: 
Unknown builtin op: aten::_adaptive_avg_pool2d_backward.
Could not find any similar ops to aten::_adaptive_avg_pool2d_backward. This op may not exist or may not be currently supported in TorchScript.
:
at <string>:9:28
                grad_self = grad.expand(self.size()) / (self_size[-1] * self_size[-2])
            else:
                grad_self = torch._adaptive_avg_pool2d_backward(grad, self)
                            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE

            return grad_self
```
Since we've disabled autograd for the open-source build, the `backward` ops aren't registered with the JIT.

When `forward` runs, `ProfilingGraphExecutorImpl::getPlanFor(Stack& stack)` gets called, which tries to invoke `differentiate` depending on the value of `needs_gradient`.

```cpp
 // TODO: insert grad propagation
  bool needs_gradient = getProfilingMode()
      ? needsGradientInProfilingMode(copy->block())
      : true;
  if (needs_gradient) {
    // for Simple Executor skip creating autodiff graphs
    // and let autograd handle backward for us
    if (getProfilingMode()) {
      auto diff_nodes = CreateAutodiffSubgraphs(
        copy,
        getAutodiffSubgraphInlining() ? autodiffSubgraphNodeThreshold : 1);
      for (Node *dnode : diff_nodes) {
        auto diff_graph = std::move(dnode->g(attr::Subgraph));
        Gradient gradient = differentiate(diff_graph);
        runOptimization(gradient.f);
        // run non diff optimization on the forward graph
        runNondiffOptimization(gradient.f);
        packGradient(gradient, dnode);
      }
      InlineAutodiffSubgraphs(copy, getAutodiffSubgraphInlining()
                                        ? autodiffSubgraphInlineThreshold
                                        : 1);
    }
  } else {
    runNondiffOptimization(copy);
  }
```

The fix is something of a workaround that gets rid of `profiling_mode`. Feel free to drop comments if there is a better way to fix it.
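
For context, a rough sketch of what the macro-based alternative (guarding the executor defaults with `C10_MOBILE`, the way `FBCODE_CAFFE2` is already special-cased) could look like. The flag names follow the `getExecutorMode()`/`getProfilingMode()` accessors, but the defaults and the exact location in `graph_executor.cpp` are assumptions for illustration, not the actual diff:

```cpp
#include <atomic>

// Illustrative defaults only: on mobile (and FBCODE) builds, default to the
// simple executor with profiling off, so autodiff graphs are never created.
// The non-mobile branch keeps whatever the tree already defines.
#if defined(FBCODE_CAFFE2) || defined(C10_MOBILE)
static std::atomic<bool> executor_mode{false};
static std::atomic<bool> profiling_mode{false};
#else
static std::atomic<bool> executor_mode{true};
static std::atomic<bool> profiling_mode{true};
#endif
```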

### Test Plan

- The error above disappears
- Don't break CI

cc @AshkanAliabadi
xta0 added a commit that referenced this pull request Nov 19, 2019
ghstack-source-id: 810d00c
Pull Request resolved: #30067
@xta0 xta0 changed the title from "[Mobile] Disable profiling_model for mobile" to "[Mobile] Disable ProfilingGraphExecutorImpl for mobile" Nov 19, 2019

ljk53 commented Nov 19, 2019

@zdevito - do you have a preference for how we should fix this? Is there any case where we need profiling enabled in the mobile build? If not, we can probably use the C10_MOBILE macro to turn it off by default. Autograd doesn't need this to work, correct? (We need autograd for federated learning on mobile.)

zdevito commented Nov 19, 2019

This fix is what we want for now. It was an oversight that this got enabled for mobile.

@xta0 merged this pull request in 2367e71.
