
@pull pull bot commented Sep 25, 2019

See Commits and Changes for more details.


Created by pull[bot]. Want to support this open source service? Please star it : )

xta0 and others added 30 commits September 17, 2019 11:06
Summary:
Pull Request resolved: #26261

### Summary

Previously, we enabled CI jobs for pull requests and nightly builds:

-  **#25840 [iOS][Circle CI] Add PR jobs for iOS builds**
-  **#26074 [IOS][CIRCLE CI] Nightly jobs for iOS builds**

The testing phase is missing from the nightly build process. Although we are able to generate the build and upload it to AWS, there is no way to know whether the binary is valid (there could be a linking error). To add a test phase to the process, we need to:

1. Put a dummy test app in the repo.
2. After the build jobs finish, manually link the static libs to the dummy app to produce an executable using the Xcode toolchain.
3. If there is no linking error, upload the binaries to AWS. If there is an error, stop the remaining steps and report an error in CI.

The second and third steps depend on the first step, which needs to land first.

### Test Plan
- Don't break any existing CI jobs

Test Plan: Imported from OSS

Differential Revision: D17408929

Pulled By: xta0

fbshipit-source-id: e391da242639943005453d1318795f981034cc72
Summary:
This is the first step of adding CI for detecting BC-breaking changes to function schemas.
Pull Request resolved: #26321

Reviewed By: hl475

Differential Revision: D17425468

Pulled By: houseroad

fbshipit-source-id: b4bb36e5597043407c943b5b8dfe2b1ac3248cb2
Summary:
Inserting markers using the nvtx-equivalent API is not supported yet.
Pull Request resolved: #26300

Differential Revision: D17425573

Pulled By: bddppq

fbshipit-source-id: 4df6c695ba07ab68e7f4dc2f77edde06f78fdac7
Summary:
In schema matching we allow a homogeneous tuple to be matched to list arguments. This logic had not yet been extended to vartype lists, causing calls like `len((1, 2, 3))` to fail.

Fix for #20500
Pull Request resolved: #25944

Differential Revision: D17431514

Pulled By: eellison

fbshipit-source-id: 2ad98bab15eaa496471df651572735eb35183323
…rch on ROCm (#25409)

Summary:
Fix the regex (requires enabling extglob) for two digit clang releases.

While there, also fix it for three digit releases with the hope that I
do not need to touch it for some time.

Unfortunately, this regex requires extglob to be enabled in the shell.
Pull Request resolved: #25409

Differential Revision: D17431786

Pulled By: bddppq

fbshipit-source-id: a50b2ff525d9b6046deae9c8725c92d67119599a
)

Summary:
This is for testing purposes.
Pull Request resolved: #26115

Differential Revision: D17433122

Pulled By: zou3519

fbshipit-source-id: bf41327e6141e9ae589fcf18254c2a8cdd868dd7
Summary:
Add min and max of a list to JIT. Fixes #26036
Pull Request resolved: #26351

Differential Revision: D17427547

Pulled By: eellison

fbshipit-source-id: 45796b4076eef0b496b01c2cc710ec4dc15a1ee6
Summary: Pull Request resolved: #26346

Test Plan: Imported from OSS

Differential Revision: D17421345

Pulled By: gchanan

fbshipit-source-id: 03b3c61edc13994d96b1d60648da7335fb090531
Summary:
Pull Request resolved: #26135

This change adds support for calling QNNPACK using the refactored API for Linear operators (Fully Connected).
It also includes some cmake changes to enable building and using pytorch_qnnpack inside aten.
I have disabled USE_QNNPACK in CMakeLists.txt. Enabling it results in kernels being picked from third_party/QNNPACK at runtime, since the function names are the same.

Test Plan:
python test/test_quantized.py TestQNNPackOps.test_qlinear_qnnpack

Imported from OSS

Differential Revision: D17434885

fbshipit-source-id: 084698026938f4529f61d12e86dfe82534ec73dd
…ype lists

Test Plan: revert-hammer

Differential Revision:
D17431514

Original commit changeset: 2ad98bab15ea

fbshipit-source-id: 5cf445fd1e37629c700b9b3740fe13ca941e4db9
Summary:
ROCm CI jobs are running on Jenkins. They have the "-test{1,2}" parts in "JOB_BASE_NAME", not "BUILD_ENVIRONMENT".
Pull Request resolved: #26380

Differential Revision: D17439523

Pulled By: bddppq

fbshipit-source-id: 31e2a986d1b7ea40c90ab399a3c1e0a328ae3a92
Summary:
The Pickler previously had a distinction between tensors that would be inlined in 1 pickle binary (matching the format of `torch.save()`) and tensors that are saved elsewhere with only a reference stored in the binary. This PR moves that distinction out to `torch::pickle_save` to match the eager Python interface.

The change can be seen in `register_prim_ops.cpp` where the call to `jit::pickle` is now `torch::pickle_save`
Pull Request resolved: #25502

Pulled By: driazati

Differential Revision: D17175215

fbshipit-source-id: 8c9a21327cc79eaf6a0e488ea99e305be52f82b1
Summary: Pull Request resolved: #26347

Differential Revision: D17430034

Pulled By: ailzhang

fbshipit-source-id: 4d3a07617a37aa2d1ddf4fd874c0a678c716bf3e
Summary:
Pull Request resolved: #25658

This unflattens `dim` according to the shape specified in `namedshape`.
`namedshape` may be either an OrderedDict or an iterable of (name, size)
tuples.
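
To illustrate the two accepted `namedshape` forms, here is a minimal sketch that normalizes either one into parallel lists of names and sizes. `split_namedshape` is a hypothetical helper written for this example; it is not the function used in the PR.

```python
from collections import OrderedDict


def split_namedshape(namedshape):
    """Return (names, sizes) from an OrderedDict or an iterable of (name, size) pairs."""
    if isinstance(namedshape, OrderedDict):
        pairs = list(namedshape.items())
    else:
        pairs = list(namedshape)
    names = [name for name, _ in pairs]
    sizes = [size for _, size in pairs]
    return names, sizes


# Both spellings describe the same unflattened shape.
from_dict = split_namedshape(OrderedDict([("C", 3), ("H", 4), ("W", 5)]))
from_pairs = split_namedshape((("C", 3), ("H", 4), ("W", 5)))
```

Both calls yield the same names and sizes, so downstream code only has to handle one representation.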

Future:
- It is possible to make it take a dict in Python >= 3.6 because those are
ordered by default, but I'll leave that task for the future.

Test Plan: - new tests [namedtensor ci]

Differential Revision: D17192655

Pulled By: zou3519

fbshipit-source-id: fd9bd2f462c23a4df1c23d66f2aa95076ff1b160
Summary:
Pull Request resolved: #24022

In the histogram observer, add an approximation of L2 error minimization for selecting min/max.
By selecting new min/max, we filter out outliers in input distribution.

This follows the implementation of NormMinimization::NonlinearQuantizationParamsSearch in caffe2/quantization/server/norm_minimization.cc
ghstack-source-id: 90298789

Test Plan: buck test mode/dev caffe2/test:quantization -- 'test_histogram_observer'

Differential Revision: D16713239

fbshipit-source-id: 82631ba47974e25689c9c66bc3088117090e26d4
Summary:
Pull Request resolved: #26147

We may try to unpickle a byte string in py3 that was pickled from py2. Therefore we need to add encoding latin1.
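
A small standalone illustration of the problem, using only the standard `pickle` module. The byte string below is a hand-written protocol 0 pickle of the Python 2 `str` value `'\xe9'` (an assumption made for this example; it is not data taken from the PR).

```python
import pickle

# Protocol-0 pickle of the Python 2 byte string '\xe9', written out by hand.
PY2_PICKLE = b"S'\\xe9'\np0\n."


def load_py2_pickle(data):
    # latin1 maps every byte 0-255 to a code point, so decoding cannot fail.
    return pickle.loads(data, encoding="latin1")


def default_load_fails(data):
    # The default encoding is ASCII, which rejects bytes >= 0x80.
    try:
        pickle.loads(data)
    except UnicodeDecodeError:
        return True
    return False
```

With `encoding="latin1"` the payload loads as a one-character `str`, while the default ASCII decoding raises `UnicodeDecodeError`.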

Reviewed By: kennyhorror

Differential Revision: D17305677

fbshipit-source-id: c0c8a51909629a65eb72bb81cccfbabaee9f8d01
Summary:
Pull Request resolved: #26349

The directory holds a lot of private helper functions that help
implement named tensor functionality. Instead of naming each helper
function with a leading underscore, I change the name of the import to
`_namedtensor_internals` to signal it should not be used directly.

Test Plan: - [namedtensor ci]

Differential Revision: D17424178

Pulled By: zou3519

fbshipit-source-id: 8f7b74346765759303480e581038a661021acf53
Summary:
Pull Request resolved: #26350

Python 3 lets us use `...` to perform indexing. Semantically, `...`
means "the rest of the unspecified dimensions". For example, while
indexing, one can do (for 5D `tensor`) `tensor[0, 0, ..., 0]` and
the `...` is expanded into `tensor[0, 0, :, :, 0]`.

Previously, we were using '*' to represent a similar behavior in names.
For example, `tensor.refine_names` supports things like the following:

```
x = torch.randn(2, 3, 4, 5, 6)
x_out = x.refine_names('*', 'H', 'W')  # refine only the last two dimensions
```

This PR changes it so that named tensor API functions recognize `'...'`
(in Python 2 and Python 3) and `...` (in Python 3 exclusively) instead
of `'*'`.
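
As a rough sketch of the expansion described above, the helper below replaces a single `'...'` (or `Ellipsis`) entry with one placeholder per unspecified dimension. `expand_names` and the use of `None` to mean "keep the existing name" are assumptions for illustration, not the PR's implementation.

```python
def expand_names(names, ndim):
    """Expand a single '...' or Ellipsis entry into None placeholders.

    None stands for "leave this dimension's name as it is".
    """
    names = list(names)
    glob_positions = [
        i for i, name in enumerate(names) if name is Ellipsis or name == "..."
    ]
    if len(glob_positions) > 1:
        raise ValueError("at most one '...' is allowed")
    if not glob_positions:
        if len(names) != ndim:
            raise ValueError("expected one name per dimension")
        return names
    pos = glob_positions[0]
    missing = ndim - (len(names) - 1)
    if missing < 0:
        raise ValueError("more names than dimensions")
    return names[:pos] + [None] * missing + names[pos + 1:]


# Refine only the last two dimensions of a 5-D tensor.
last_two = expand_names(("...", "H", "W"), ndim=5)
```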

Test Plan: - [namedtensor ci]

Differential Revision: D17424666

Pulled By: zou3519

fbshipit-source-id: 003182879fd38ced3fea051217572a457cdaf7cf
Summary:
Cyclic reference was introduced in a previous version due to runtime overwriting of the bound method `optimizer.step`. This is now avoided by keeping a weak reference to the optimizer instance.
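
The following standalone sketch shows the general pattern under discussion. The `Optimizer` class and both patch functions are hypothetical stand-ins written for this example (they are not the scheduler code from the PR), and the "freed without GC" check relies on CPython's reference counting.

```python
import gc
import weakref


class Optimizer:
    def step(self):
        return "stepped"


def patch_with_cycle(opt):
    bound = opt.step  # the bound method holds a strong reference to opt

    def wrapper(*args, **kwargs):
        return bound(*args, **kwargs)

    opt.step = wrapper  # cycle: opt -> wrapper -> bound -> opt


def patch_with_weakref(opt):
    func = opt.step.__func__  # plain function, no reference to opt
    ref = weakref.ref(opt)

    def wrapper(*args, **kwargs):
        return func(ref(), *args, **kwargs)

    opt.step = wrapper  # opt -> wrapper, but nothing points back strongly


def is_freed_without_gc(patch):
    """Return True if the instance is reclaimed by reference counting alone."""
    gc.disable()
    try:
        opt = Optimizer()
        patch(opt)
        probe = weakref.ref(opt)
        del opt
        return probe() is None
    finally:
        gc.enable()
        gc.collect()
```

With the weak reference, the instance is released as soon as the last external reference goes away; with the cycle, it lingers until the cyclic garbage collector runs.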

Credit: https://stackoverflow.com/questions/26157952/why-set-a-bound-method-to-python-object-create-a-circular-reference
Pull Request resolved: #25776

Differential Revision: D17420770

Pulled By: ezyang

fbshipit-source-id: 546ec94cf725ebfddb310b24e6a2e146ddecd1f6
Summary:
Ignore the ios/TestApp folder and its children.
Pull Request resolved: #26399

Differential Revision: D17451239

Pulled By: houseroad

fbshipit-source-id: d6ba666bf955454eca4a10c00784ee5947a70f59
Summary: Pull Request resolved: #26150

Test Plan: Imported from OSS

Differential Revision: D17427579

Pulled By: pbelevich

fbshipit-source-id: 68d012076aa86dee9f23fad71a2d265d75f56d22
Summary:
Per #22226, the current sparse allreduce in ProcessGroupGloo pads the indices and values tensors to the maximum length across all processes and then performs a regular allgather (because they then have equal size across processes). Instead, we can use allgatherv. This is mostly a win for memory usage when there is severe size imbalance between processes.
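
To make the memory argument concrete, here is a back-of-the-envelope sketch that counts gathered elements under each scheme. The function names and the example lengths are made up for illustration and do not come from the PR.

```python
def padded_allgather_elements(lengths):
    """Elements gathered when every contribution is padded to the longest one."""
    return len(lengths) * max(lengths)


def allgatherv_elements(lengths):
    """Elements gathered when each process sends exactly what it has."""
    return sum(lengths)


# Three processes with a severe size imbalance (hypothetical numbers).
lengths = [1, 2, 1000]
padded = padded_allgather_elements(lengths)
variable = allgatherv_elements(lengths)
```

For these lengths the padded scheme gathers 3000 elements, while the variable-length scheme gathers 1003.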

close #22226
Pull Request resolved: #23917

Test Plan:
buck run mode/dev-nosan caffe2/test:c10d -- test_c10d.ProcessGroupGlooTest.test_sparse_allreduce_basics

buck run mode/dev-nosan caffe2/test:c10d -- test_c10d.ProcessGroupGlooTest.test_sparse_allreduce_basics_cuda

buck run mode/dev-nosan caffe2/test:c10d -- test_c10d.ProcessGroupGlooTest.test_sparse_allreduce_checks

Differential Revision: D16664985

Pulled By: zhaojuanmao

fbshipit-source-id: e7d3c0770cbc09f9175b3027b527e95053724843
Summary:
- Removes torchtest
- <s>Moves test_torch tests skipped on ROCm to generic device test class</s>
- Creates test_nn generic device test class

Next: adding dtypes to generic device testing framework.
Pull Request resolved: #26374

Test Plan: Change is to tests themselves.

Differential Revision: D17442218

Pulled By: mruberry

fbshipit-source-id: d7e4451d09fc9049478b35a7efb8bb580071e8c8
…ment (#26313)

Summary:
# Problem

If there are not enough threads in the RPC Agent thread pool, circularly dependent work items can cause a deadlock.

The current way to get around this deadlock is to provide an abundant number of threads.

# Solution

as titled
Pull Request resolved: #26313

Differential Revision: D17405491

Pulled By: xush6528

fbshipit-source-id: a1d9b6a84db0371cd4b63328fa00f651c0808485
Summary:
Currently calc_erfinv's float version on CPU is missing. This commit adds the float version (by templating).

I also used this opportunity to clean up calc_erfinv a bit.
Pull Request resolved: #26070

Reviewed By: ezyang

Differential Revision: D17368024

Pulled By: VitalyFedyunin

fbshipit-source-id: 00cc3097f340022b3788143e6c12b01c35d72f13
Summary:
Pull Request resolved: #26353

### Summary

Now that the new iOS build script has landed, this PR cleans up some redundant code for the PR jobs.

### Test Plan

- Don't break any existing CI jobs
- Don't break the old iOS CI jobs

Test Plan: Imported from OSS

Differential Revision: D17457253

Pulled By: xta0

fbshipit-source-id: 0d85117533a62d0b9b7b859b0044fd4388c3c9d4
…26352)

Summary:
Pull Request resolved: #26352

"named_guard: P" is the same as "supports_named_tensor: !P".
Also changed the error message to be more understandable to users.

Test Plan:
- `TEST_NAMEDTENSOR=1 pytest test/test_namedtensor.py -v`
- [namedtensor ci]

Differential Revision: D17426234

Pulled By: zou3519

fbshipit-source-id: 4cab780e6e29e184e79cdd3690f41df9ebb2ecb5
Summary: Pull Request resolved: #26155

Differential Revision: D17455540

Pulled By: Krovatkin

fbshipit-source-id: e3aee465d108b59691d6c68f85fbf212a5d6a125
…_trigamma (#25791)

Summary:
- There are some missing casts.
- Functions like ::log, ::sin will potentially always invoke the double version on host. For
  example, compiling the following code:

  ```c++
  #include <cmath>

  float log_float(float f) {
      return ::logf(f);
  }

  double log_double(double f) {
      return ::log(f);
  }

  float log_float2(float f) {
      return ::log(f);
  }

  float log_float3(float f) {
      return std::log(f);
  }
  ```

  using `g++ -c -O3` leads to:

      log_float(float):
              jmp     logf
      log_double(double):
              jmp     log
      log_float2(float):
              subq    $8, %rsp
              cvtss2sd        %xmm0, %xmm0
              call    log
              addq    $8, %rsp
              cvtsd2ss        %xmm0, %xmm0
              ret
      log_float3(float):
              jmp     logf

  Note that log_float2 delegates the call to the double version of log
  (surrounded by cast), while log_float3 delegates the call correctly to
  logf. See https://godbolt.org/z/KsRWwW
Pull Request resolved: #25791

Differential Revision: D17452312

Pulled By: izdeby

fbshipit-source-id: 6276a011a373cd7cb144f9ecd84116aa206e7d1b
Summary:
This PR has been updated since the ORIGINAL PR comment below.

ROCm CI builds have been hanging as we've been refactoring tests, even when these refactors seem entirely innocuous. This PR started by commenting out test_stft, for example, a Python test never run on ROCm, and that was sufficient to reliably hang the ROCm build in CI.

Putting ROCm tests back on the default stream appears to remove this hang. So this PR now does that. This is likely to unblock development.

ORIGINAL: Some test changes appear to be causing ROCm builds to hang in CI. This PR is an attempt to diagnose the source of the hang.
Pull Request resolved: #26394

Test Plan: Change is to tests themselves.

Differential Revision: D17456678

Pulled By: mruberry

fbshipit-source-id: 38d00d01c64b5055c1dfed01687ce3e1c9372887
davidriazati and others added 27 commits September 24, 2019 10:44
Summary:
If the `Union` contains a non-class type, `issubclass` would fail, this
adds a check for that case
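
A minimal sketch of this kind of guard, using only the standard library. `union_contains_subclass` and `unguarded_check_raises` are hypothetical names for this example; the actual check added in the PR may look different.

```python
import inspect
from typing import List, Union


def union_contains_subclass(union_type, target):
    """True if any class member of the Union is a subclass of target."""
    return any(
        inspect.isclass(member) and issubclass(member, target)
        for member in union_type.__args__
    )


def unguarded_check_raises(candidate, target):
    """Show that issubclass rejects non-class members such as List[int]."""
    try:
        issubclass(candidate, target)
    except TypeError:
        return True
    return False


has_int_like = union_contains_subclass(Union[bool, List[int]], int)
```

Checking `inspect.isclass` first lets the scan skip members like `List[int]` instead of raising `TypeError`.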
Pull Request resolved: #26312

Pulled By: driazati

Differential Revision: D17505206

fbshipit-source-id: 1331e412f938e2f08ecb079972147f11e3ec77cd
Summary:
Pull Request resolved: #26681

att

Test Plan:
ci

Imported from OSS

Differential Revision: D17542833

fbshipit-source-id: 653e906b0e146763609c69ef0de7f9cf38621586
Summary:
Pull Request resolved: #26694

Previously we would not properly populate `errorDesc` for:
```
./torch/jit/__init__.py:13:1: F401 'torch.nn.ModuleList' imported but unused
```

because we wanted only letters and spaces. Be more permissive
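
The effect can be reproduced with a pair of regular expressions. Both patterns below are assumptions written for this example (the PR's actual regex is not shown here); they differ only in what they accept for the description.

```python
import re

LINE = "./torch/jit/__init__.py:13:1: F401 'torch.nn.ModuleList' imported but unused"

PREFIX = r"(?P<file>[^:]+):(?P<line>\d+):(?P<col>\d+): (?P<code>[A-Z]\d+) "
# Letters and spaces only: rejects the quotes and dots in the message above.
STRICT = re.compile(PREFIX + r"(?P<desc>[A-Za-z ]+)")
# Anything up to the end of the line.
PERMISSIVE = re.compile(PREFIX + r"(?P<desc>.+)")


def error_desc(pattern, line):
    """Return the description group, or None if the line does not match."""
    match = pattern.fullmatch(line)
    return match.group("desc") if match else None
```

The strict pattern yields `None` for the F401 line, while the permissive one captures the full description.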

Test Plan: Imported from OSS

Differential Revision: D17551999

Pulled By: suo

fbshipit-source-id: b82567df1fa3c9729e7427dc3461bedfb40933dc
Summary:
**Summary**:
Adds `torch::nn::Identity` module support for the C++ API.

**Issue**: #25883

**Reviewer**: yf225
Pull Request resolved: #26713

Differential Revision: D17550982

Pulled By: yf225

fbshipit-source-id: f24483846e82d5d276d77a1a0c50884f3bc05112
Summary:
Pull Request resolved: #26554

Previously, in `TCPStore`'s constructor we did not pass in a timeout to
the `connect` function, which thus used the default timeout (-1, so infinite).
But the timeout variable in `TCPStore.cpp` is configurable by the user and set to
be 300 seconds by default, so we should be passing this into the connect function.

Test Plan: see above.

Differential Revision: D17486779

fbshipit-source-id: 42d38a3b8d492d9e9ff09110990a8e4a3a1292b2
Summary:
Pull Request resolved: #26728

Use Caffe2::mobile_threadpool() in linear and conv operators

Perf
Without threadpool - 76ms
With threadpool - 41 ms

Test Plan:
python test/test_quantized.py TestQNNPackOps

Imported from OSS

Differential Revision: D17553510

fbshipit-source-id: dd5b06f526f65d87727ec7e3dad0a5fa74cba9f9
Summary:
- Add support for linear and cubic interpolate in opset 11.
- Add support for 1d and 3d interpolate in nearest mode for opset 7 and 8.
- Add tests for all cases of interpolate in ORT tests (nearest/linear/cubic, 1d/2d/3d, upsample/downsample).
Pull Request resolved: #24805

Reviewed By: hl475

Differential Revision: D17330801

Pulled By: houseroad

fbshipit-source-id: 1bdefff9e72f5e70c51f4721e1d7347478b7505b
Summary:
- Normalization mean and std are specified as parameters instead of being hardcoded
- imageYUV420CenterCropToFloat32Tensor previously worked only with square tensors (width == height); generalized it to support width != height with all rotations and scalings
- javadocs
Pull Request resolved: #26690

Differential Revision: D17556006

Pulled By: IvanKobzarev

fbshipit-source-id: 63f3321ea2e6b46ba5c34f9e92c48d116f7dc5ce
Summary: Pull Request resolved: #25592

Test Plan: Imported from OSS

Differential Revision: D17552470

Pulled By: VitalyFedyunin

fbshipit-source-id: 6c8cc4f46dd390c231b2d0aac664ad2a6ac8876e
…chNorm2d.

Test Plan: revert-hammer

Differential Revision:
D17514653

Original commit changeset: 7d9cc8f619b7

fbshipit-source-id: 2cf32082a46fe169a1db4926df78a9f3256616ad
…es of the Module.

Test Plan: revert-hammer

Differential Revision:
D17513451

Original commit changeset: cf8f9b450e71

fbshipit-source-id: 319ec9399173eb06556969dc6be365b319c1ab6c
Summary:
Pull Request resolved: #26738

Someone may use torch._export directly. Here we change the default value of onnx_export_type to None:
if this is the PyTorch ONNX Caffe2 bundle, we set it to ONNX_ATEN_FALLBACK; otherwise, we set it to ONNX.
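
The defaulting rule can be sketched as follows. `ExportType`, `resolve_export_type`, and the `is_caffe2_bundle` flag are stand-ins invented for this example; they only mirror the behavior described above and are not the real exporter code.

```python
from enum import Enum


class ExportType(Enum):
    # Stand-in for the real operator export type enum.
    ONNX = "ONNX"
    ONNX_ATEN_FALLBACK = "ONNX_ATEN_FALLBACK"


def resolve_export_type(onnx_export_type=None, is_caffe2_bundle=False):
    """Pick a default only when the caller did not choose an export type."""
    if onnx_export_type is not None:
        return onnx_export_type
    if is_caffe2_bundle:
        return ExportType.ONNX_ATEN_FALLBACK
    return ExportType.ONNX
```

Using `None` as the default keeps an explicit caller choice distinguishable from "not specified".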

Test Plan: ci

Reviewed By: hl475

Differential Revision: D17546452

fbshipit-source-id: 38e53926e2b101484bbbce7b58ebcd6af8c42438
Summary:
Pull Request resolved: #26587

-
ghstack-source-id: 90557226

Test Plan: unit tests

Differential Revision: D17515048

fbshipit-source-id: 3459ee80efec29080060ec29d67642d789dd8749
Summary:
Pull Request resolved: #26696

att

Test Plan:
ci

Imported from OSS

Differential Revision: D17558701

fbshipit-source-id: 96ef87db74bd1a5d4ddc69867ae71d78c0df83fd
Summary:
Pull Request resolved: #26506

[pytorch] [distributed] Made test forgiving to allow rpc agent to return one of the two errors.
ghstack-source-id: 90667534

Test Plan: Made sure pg based UT works.

Differential Revision: D17488899

fbshipit-source-id: 41f76cf4b4a0ca5e651a5403d6e67b639f0b9c4f
Summary:
Pull Request resolved: #26656

Updating the NDK to r18 or newer triggers a path in our CI scripts so that we now build with clang instead of gcc.
Google discontinued gcc support for Android quite a while ago, so clang is the only way forward.
ghstack-source-id: 90698985

Test Plan: CI

Reviewed By: dreiss

Differential Revision: D17533570

fbshipit-source-id: 5eef4d5a539d8bb1a6682f000d0b5d33b3752819
Summary:
Pull Request resolved: #25429

Previously we were using empty to generate test tensors. This PR changes the test tensors to use
randint so that we can test things properly.
Also added a set_sizes_and_strides call and removed .contiguous() in the int_repr function to preserve the
original size and strides.

Test Plan:
python test/test_quantized_tensor.py

Imported from OSS

Differential Revision: D17559660

fbshipit-source-id: d4ce81d577296c1137270fdaa6b1359fb703896f
Summary:
Pull Request resolved: #26636

This PR defines a lot of dimname overloads so that when named tensor
support is added for those operators, we will not have to modify the
autogenerated TensorMethods.h, thereby avoiding potential merge
conflicts in the future.

Overloads were added for the following:
- all
- any
- argmax
- argmin
- cumsum
- cumprod
- index_copy
- kthvalue
- mode
- permute
- squeeze
- index_add
- index_fill
- scatter
- scatter_add
- index_select
- gather
- sort
- argsort

Test Plan: - [namedtensor ci]

Differential Revision: D17522984

Pulled By: zou3519

fbshipit-source-id: eca6dea819ba4e4e43b71b700d5cf09176f00061
…c05a57 (#26736)

Summary:
Pull Request resolved: #26736

Previous import was 23bb6ea1a71f08e200114a153f48bd7adb66d486

Included changes:
- **[ab6b9420](onnx/onnx@ab6b9420)**: Relax IF's shape inference rule (#2345) <Wei-Sheng Chin>
- **[c5af774a](onnx/onnx@c5af774a)**: Clarify behavior in ConvTranspose (#2343) <Wei-Sheng Chin>
- **[a20ba2f1](onnx/onnx@a20ba2f1)**: Fix node test case model for Gemm scalar bias case (#2342) <Hariharan Seshadri>
- **[1aa176e0](onnx/onnx@1aa176e0)**: Update pybind (#2340) <Changming Sun>
- **[7840504d](onnx/onnx@7840504d)**: Update gen_doc script to validate proto3 files (#2122) <Raymond Yang>
- **[bd35e623](onnx/onnx@bd35e623)**: Fix some backend tests  (#2335) <Hariharan Seshadri>

Test Plan: ci

Reviewed By: hl475

Differential Revision: D17552449

fbshipit-source-id: 424acb261b54fc98485f782f6922b11b28c836eb
…6740)

Summary:
Now, we skip all function schemas that contain the keyword quantize.
Pull Request resolved: #26740

Reviewed By: hl475

Differential Revision: D17561753

Pulled By: houseroad

fbshipit-source-id: c5e47ada072e71bfa2341a0af8f1743e86ef733c
…lper

Test Plan: revert-hammer

Differential Revision:
D17558701

Original commit changeset: 96ef87db74bd

fbshipit-source-id: fc398d3b8bb1cd0bae573e3fdac5cfb883b31373
Summary:
Pull Request resolved: #26558

Previously, name inference gets called after dimensions are wrapped.
This PR makes it so that name inference always wraps dimensions so that
it can be called anywhere. Ideally we would only wrap dimensions once,
but many of our operators wrap dimensions in weird places.

Wrapping dimensions in name inference is pretty inexpensive and only
happens for named tensors (name inference does not run on unnamed
tensors.)
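
For readers unfamiliar with the term, "wrapping" a dimension means mapping a possibly negative index into the range `[0, ndim)`. A minimal sketch, assuming the usual convention that valid inputs lie in `[-ndim, ndim - 1]` (`wrap_dim` is a name chosen for this example):

```python
def wrap_dim(dim, ndim):
    """Map a possibly negative dimension index into [0, ndim)."""
    if not -ndim <= dim < ndim:
        raise IndexError(
            f"dimension {dim} out of range for a tensor with {ndim} dimensions"
        )
    return dim + ndim if dim < 0 else dim
```

Because the operation is a bounds check plus one addition, repeating it inside name inference costs very little.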

Test Plan: - [namedtensor ci]

Differential Revision: D17557049

Pulled By: zou3519

fbshipit-source-id: 68c5636489e233dbf2588ab6ad4e379a6fe4c8ba
Summary: Pull Request resolved: #26688

Pulled By: driazati

Differential Revision: D17560634

fbshipit-source-id: e1c50d1ca24e0313c2b7d704c488a29ef6a47cad
… Opset 11

Test Plan: revert-hammer

Differential Revision:
D17330801

Original commit changeset: 1bdefff9e72f

fbshipit-source-id: dff07477403170c27260f736ab6e6010f0deca9f
Test Plan: revert-hammer

Differential Revision:
D17559660

Original commit changeset: d4ce81d57729

fbshipit-source-id: b6c9dc31f08935d255fa9eb3a830bafc76a13799
…26760)

Summary:
Pull Request resolved: #26760

Follow-up of D17514003 . Change Caffe2 code to use the new PackedDepthWiseConvMatrix interface.

Test Plan: CI

Reviewed By: dskhudia

Differential Revision: D17514350

fbshipit-source-id: 691d9f1fd35bdb7dd8ba152287f3a34359dc1f4c
…InitTensor for better clarity (#26756)

Summary:
This PR includes the following improvements:
1. Add comments for limitations of the multidim tensor factory function `torch::tensor(...)`, noting the fact that `torch::tensor({})` and mixed data type such as `torch::tensor({{bool, 2.0}})` are not supported at the moment. (I will also update https://pytorch.org/cppdocs/notes/tensor_creation.html to include usage examples for the multidim tensor factory function `torch::tensor(...)`)
2. Rename `ListInitTensor` to `InitListTensor`, for better naming consistency.

This addresses reviews in #26210. I will work on a separate PR to move the factory function to `at::`.
Pull Request resolved: #26756

Differential Revision: D17560136

Pulled By: yf225

fbshipit-source-id: eb8b45226e999784da48f75cc8953a998582df99
@pull pull bot added the ⤵️ pull label Sep 25, 2019
@pull pull bot merged commit d4dc844 into peterjc123:master Sep 25, 2019
pull bot pushed a commit that referenced this pull request Sep 11, 2021
…ytorch#63339)

Summary:
Pull Request resolved: pytorch#63339

# Context
https://fb.workplace.com/groups/pytorch.dev/permalink/900474523864362/?comment_id=901125403799274&reply_comment_id=905023386742809

##### WHAT IS A STACK TRACE?
A stack trace (also called stack backtrace or stack traceback) is a report of the active stack frames at a certain point in time during the execution of a program.

Typically when an exception is thrown, one would expect to see the code (file:line) that threw the exception, and every intermediate frame up to and including the main function.
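
As a language-neutral illustration of the idea (this is plain Python, not the Android backtrace code from this change), the standard `traceback` module can list the active frames at a given point. Note that Python orders frames oldest first, whereas the c10 output below is "most recent call first".

```python
import traceback


def innermost():
    # Capture the active stack frames at this point in the program.
    stack = traceback.extract_stack()
    return [frame.name for frame in stack]


def intermediate():
    return innermost()


def entry_point():
    return intermediate()


frame_names = entry_point()
```

The last three entries of `frame_names` are `entry_point`, `intermediate`, and `innermost`, preceded by whatever frames invoked the script.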

We are enabling Android stack traces to help with debugging on Android devices.

Test Plan:
## Steps to test
```
buck build fbsource//xplat/caffe2/mode/aibench_pytorch_android -c pt.enable_qpl=0 -c pt.has_backtraces=1 fbsource//xplat/caffe2/fb/lite_predictor:lite_predictorAndroid#android-x86_64

one_world android emulator android-28

adb push ~/fbsource/buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictorAndroid#android-x86_64 /data/local/tmp

cd /data/local/tmp
./lite_predictorAndroid#android-x86_64

./lite_predictorAndroid#android-x86_64 --model ./detect.bc --input_dims "1,3,192,192" --input_type float --warmup 20 --iter 5 --report_pep true
```

## Stack traces when the model file is not found

### before
```
./lite_predictorAndroid#android-x86_64 --model ./detect.bc --input_dims "1,3,192,192" --input_type float --warmup 20 --iter 5 --report_pep true

Run with 2 threads
Run with 2 threads
Loading model...
terminating with uncaught exception of type c10::Error: open file failed, file path: ./detect.bc
Exception raised from RAIIFile at xplat/caffe2/caffe2/serialize/file_adapter.cc:13 (most recent call first):
(no backtrace available)
Aborted
```

### after
```
134|generic_x86_64:/data/local/tmp $ ./lite_predictorAndroid#android-x86_64 --model ./detect.bc --input_dims "1,3,192,192" --input_type float --warmup 20 --iter 5 --report_pep true
Run with 2 threads
Run with 2 threads
Loading model...
terminating with uncaught exception of type c10::Error: open file failed, file path: ./detect.bc
Exception raised from RAIIFile at xplat/caffe2/caffe2/serialize/file_adapter.cc:13 (most recent call first):
 frame #0       c10::get_backtrace(unsigned long, unsigned long, bool)[0x59494274f10e]
 frame #1       [0x5949427b1eee]
 frame #2       [0x5949427b1eb2]
 frame #3       [0x5949427b1cdc]
 frame #4       std::__ndk1::function<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > ()>::operator()() const[0x5949427afc34]
 frame #5       c10::Error::Error(c10::SourceLocation, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >)[0x5949427b05b1]
 frame #6       c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&)[0x5949427aca5f]
 frame #7       caffe2::serialize::FileAdapter::RAIIFile::RAIIFile(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&)[0x5949426b37b2]
 frame #8       caffe2::serialize::FileAdapter::FileAdapter(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&)[0x5949426b3903]
 frame #9       torch::jit::_load_for_mobile(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, c10::optional<c10::Device>, std::__ndk1::unordered_map<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> >, std::__ndk1::hash<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > >, std::__ndk1::equal_to<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > >, std::__ndk1::allocator<std::__ndk1::pair<std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > > > >&)[0x5949422737bd]
 frame #10      torch::jit::_load_for_mobile(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, c10::optional<c10::Device>)[0x594942273769]
 frame #11      benchmark(std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, int, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&, bool, int, int, int, bool, int, bool, int, double, bool, bool, bool, std::__ndk1::basic_string<char, std::__ndk1::char_traits<char>, std::__ndk1::allocator<char> > const&)[0x59494189b21d]
 frame #12      main[0x594941882aff]
 frame #13      __libc_init[0x7b699d08578d]
```

### what we get for os:linux
```
(base) [[email protected] /data/users/pavithran/fbsource] ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor --model ./detect.bc --input_dims "1,3,192,192" --input_type float --warmup 20 --iter 5 --report_pep true
Run with 24 threads
Run with 24 threads
Loading model...
terminate called after throwing an instance of 'c10::Error'
  what():  open file failed, file path: ./detect.bc
Exception raised from RAIIFile at xplat/caffe2/caffe2/serialize/file_adapter.cc:13 (most recent call first):
frame #0: ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor() [0x20cb7fe]
frame #1: ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor() [0x20cb6c6]
frame #2: std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > ()>::operator()() const + 0x54 (0x20ca4e4 in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #3: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x57 (0x20ca9a7 in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #4: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x7a (0x20c823a in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #5: caffe2::serialize::FileAdapter::RAIIFile::RAIIFile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x96 (0x206f3d6 in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #6: caffe2::serialize::FileAdapter::FileAdapter(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x42 (0x206f502 in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #7: torch::jit::_load_for_mobile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, c10::optional<c10::Device>, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&) + 0x30 (0x1be826c in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #8: torch::jit::_load_for_mobile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, c10::optional<c10::Device>) + 0x35 (0x1be8214 in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #9: benchmark(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, int, int, int, bool, int, bool, int, double, bool, bool, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0x16d (0x12093ad in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #10: main + 0x25c (0x11f933c in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)
frame #11: __libc_start_main + 0x105 (0x7fc7b9f2ed95 in /usr/local/fbcode/platform009/lib/libc.so.6)
frame #12: _start + 0x2a (0x11f902a in ./buck-out/gen/xplat/caffe2/fb/lite_predictor/lite_predictor)

Aborted (core dumped)
```

Reviewed By: dhruvbird

Differential Revision: D30135947

fbshipit-source-id: f50c634ef4545843305cad4b4a14a8776b1aec76