
Conversation

ljk53 (Contributor) commented Nov 20, 2019

Stack from ghstack:

Summary:
Create a script to produce a libtorch that only contains the ops needed by specific
models. Developers can use this workflow to further optimize mobile build size.

We need to keep a dummy stub for unused (stripped) ops because some JIT-side
logic requires certain function schemas to exist in the JIT op registry.

Test Steps:

1. Build the "dump_operator_names" binary and use it to dump the root ops needed
by a specific model (see the Python sketch after these steps for one way to
produce the model file):
```
build/bin/dump_operator_names --model=mobilenetv2.pk --output=mobilenetv2.yaml
```

2. The MobileNetV2 model should use the following ops:
```
- aten::t
- aten::dropout
- aten::mean.dim
- aten::add.Tensor
- prim::ListConstruct
- aten::addmm
- aten::_convolution
- aten::batch_norm
- aten::hardtanh_
- aten::mm
```

NOTE that for some reason it outputs "aten::addmm" while the model actually uses "aten::mm".
You need to fix this manually for now.

3. Run the custom build script locally (using Android as an example):
```
SELECTED_OP_LIST=mobilenetv2.yaml scripts/build_pytorch_android.sh armeabi-v7a
```

4. Check out the demo app that uses the locally built library instead of
downloading it from the jcenter repo:
```
git clone --single-branch --branch custom_build [email protected]:ljk53/android-demo-app.git
```

5. Copy the locally built libraries to the demo app folder:
```
find ${HOME}/src/pytorch/android -name '*.aar' -exec cp {} ${HOME}/src/android-demo-app/HelloWorldApp/app/libs/ \;
```

6. Build the demo app with the locally built libtorch:
```
cd ${HOME}/src/android-demo-app/HelloWorldApp
./gradlew clean && ./gradlew assembleDebug
```

7. Install and run the demo app.

In-APK arm-v7 libpytorch_jni.so build size reduced from 5.5M to 2.9M.
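
For readers reproducing step 1: a minimal sketch (not part of this PR) of one way to produce the mobilenetv2.pk model file, assuming torchvision is installed; the file name simply matches the --model flag above:
```
import torch
import torchvision

# Load a pretrained MobileNetV2 and put it in inference mode.
model = torchvision.models.mobilenet_v2(pretrained=True).eval()

# Trace with a dummy input to get a serializable TorchScript module.
example = torch.rand(1, 3, 224, 224)
traced = torch.jit.trace(model, example)

# Save the serialized model that dump_operator_names will read.
traced.save("mobilenetv2.pk")
```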

Differential Revision: D18612127

facebook-github-bot added the oncall: jit label Nov 20, 2019
ljk53 added a commit that referenced this pull request Nov 20, 2019
ghstack-source-id: ac089ad
Pull Request resolved: #30144
ljk53 (Contributor, Author) commented Nov 20, 2019

@iseeyuan do you know why MobileNetV2 contains the "aten::addmm" operator in the instruction list but ends up calling "aten::mm"? Do we need to first look up the JIT registry to "resolve" the operator binding before dumping?

iseeyuan (Contributor) commented

@ljk53 It may be caused by JIT optimization passes as well. Let me check if it's possible to dump the op list right before the interpreter running stage. It may be deeper in the graph executor.
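
For intuition (my illustration, not from the thread): with the default coefficients beta == alpha == 1, addmm(bias, x, w) is mathematically just x.mm(w) + bias, which is why an optimization pass can legally swap one for the other:
```
import torch

bias, x, w = torch.rand(4), torch.rand(2, 3), torch.rand(3, 4)

# With beta == alpha == 1 (the defaults), addmm reduces to mm plus an add,
# so a decompose-style pass may rewrite aten::addmm into aten::mm.
assert torch.allclose(torch.addmm(bias, x, w), x.mm(w) + bias)
```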

ljk53 (Contributor, Author) commented Nov 20, 2019

> @ljk53 It may be caused by JIT optimization passes as well. Let me check if it's possible to dump the op list right before the interpreter running stage. It may be deeper in the graph executor.

Is it possible that JIT optimization differs on mobile vs. on server (and might depend on the input as well)? If we dump the operator list on the server, how can we make sure it covers all possible optimization passes? cc: @zdevito

ljk53 (Contributor, Author) commented Nov 20, 2019

BTW, I think this script can be reviewed independently of the op-dump issue. We can fix the issue separately.

ezyang (Contributor) commented Nov 20, 2019

Can we put the docs in a place more durable than a PR description?

iseeyuan (Contributor) commented

> @ljk53 It may be caused by JIT optimization passes as well. Let me check if it's possible to dump the op list right before the interpreter running stage. It may be deeper in the graph executor.
>
> Is it possible that JIT optimization differs on mobile vs. on server (and might depend on the input as well)? If we dump the operator list on the server, how can we make sure it covers all possible optimization passes? cc: @zdevito

It may be different from the final bytecode in the full JIT. For the lite interpreter we dump the bytecode from the original module, without JIT optimization passes, so we have one set of bytecode independent of input-based optimizations. Performance should not be affected significantly (cc @zdevito); the actual performance difference has yet to be measured in detail.
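
A hedged sketch of this graph vs. executor distinction (my own example, not from the PR): .graph shows the ops as written, while graph_for(...) returns the post-optimization graph the executor actually runs for given inputs, where rewrites like addmm -> mm can appear:
```
import torch

def affine(bias, x, w):
    return torch.addmm(bias, x, w)

scripted = torch.jit.script(affine)
bias, x, w = torch.rand(4), torch.rand(2, 3), torch.rand(3, 4)
scripted(bias, x, w)  # run once so the graph executor builds its optimized graph

# The as-written graph contains aten::addmm...
print({n.kind() for n in scripted.graph.nodes()})
# ...while the executor's optimized graph may contain aten::mm instead.
print({n.kind() for n in scripted.graph_for(bias, x, w).nodes()})
```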

iseeyuan (Contributor) commented

Probably a stupid idea: it may not be hard to disable optimization passes on mobile (in general, not just for the lite interpreter). In that sense we wouldn't need to worry about input-dependent passes and the different ops introduced by each pass. Not sure if it could work as at least a short-term solution.


```
set(ONNX_NAMESPACE "onnx_torch" CACHE STRING "A namespace for ONNX; needed to build with other frameworks that share ONNX.")
set(SELECTED_OP_LIST "" CACHE STRING
  "Path to the yaml file that contains the list of operators to include for custom build. Include all operators by default.")
```
A reviewer (Contributor) commented on this snippet:

For doc discoverability, it would be good to have a link to the relevant docs here.

facebook-github-bot (Contributor) commented

@ljk53 merged this pull request in 43fb001.

ljk53 added a commit that referenced this pull request Nov 22, 2019
… custom build

Summary:
PR #30144 introduced a custom build script to tailor the build to specific
models. It requires a list of all potentially used ops at build time.

Some JIT optimization passes can transform the IR by replacing
operators, e.g. the decompose pass can replace aten::addmm with aten::mm if
the coefficients are 1s.

Disabling optimization passes ensures that the list of ops we dump from
the model is the list of ops that are needed.

Test Plan:
- rerun the test on PR #30144 to verify that the raw list without aten::mm works.

[ghstack-poisoned]
ljk53 added a commit that referenced this pull request Nov 22, 2019

ghstack-source-id: 28e4a40
Pull Request resolved: #30285
facebook-github-bot pushed a commit that referenced this pull request Nov 22, 2019

Pull Request resolved: #30285

Differential Revision: D18652777

Pulled By: ljk53

fbshipit-source-id: 084751cb9a9ee16d8df7e743e9e5782ffd8bc4e3
facebook-github-bot deleted the gh/ljk53/74/head branch November 24, 2019 15:16