
Conversation

@rohithkrn

No description provided.

ezyang and others added 30 commits September 24, 2019 08:41
Summary:
Pull Request resolved: pytorch#26417

Signed-off-by: Edward Z. Yang <[email protected]>

Test Plan: Imported from OSS

Differential Revision: D17548776

Pulled By: ezyang

fbshipit-source-id: 8c79893ee4216780edb838671e701de5518c4cd0
…orch#26685)

Summary:
Pull Request resolved: pytorch#26685

This prevents XLA from picking up on named tensor APIs. I ran into some
problems while attempting to support dimname overloads in XLA; since we
don't need the first iteration of named tensors to work with XLA this is
OK.

Test Plan: - run CI.

Differential Revision: D17538893

Pulled By: zou3519

fbshipit-source-id: 93d579c93f5b1dc68541c07c4a3d61792859507d
Summary:
GitHub commits:

facebook/litho@ff4a610
facebook/mvfst@ad81c38
pytorch/FBGEMM@518d8a1

Test Plan: n/a

Reviewed By: yns88

fbshipit-source-id: 2a9a47805569a43e05d044c5494b57f6a7996bc4
… and clean up functional test code (pytorch#26559)

Summary:
This ensures that `F::cosine_similarity` and `F::pairwise_distance` can be used simply by including `torch/torch.h` and setting `namespace F = torch::nn::functional`.
Pull Request resolved: pytorch#26559

Differential Revision: D17507421

Pulled By: yf225

fbshipit-source-id: f895dde3634d5c8ca66ee036903e327e5cdab6b1
Summary:
There is an issue with the torchvision version not matching the PyTorch version if one builds the Docker image from a tag; see issue pytorch#25917. The current solution requires one to re-init the submodules or manually change the version of torchvision. This PR allows one to build the Docker image without torchvision, which not only fixes the above-mentioned bug but also frees non-image PyTorch users from the tyranny of torchvision 😆.

In all seriousness, for NLP researchers especially, torchvision isn't a necessity for PyTorch, and non-essential items shouldn't be in the Docker image. This option removes one extra thing that can go wrong.
Pull Request resolved: pytorch#26168

Differential Revision: D17550001

Pulled By: soumith

fbshipit-source-id: 48b8b9e22b75eef3afb392c618742215d3920e9d
…h#26020)

Summary:
Currently, integer scalar exponents are always cast to double. This commit avoids the cast if the tensor is also
integral and the scalar is positive, to speed things up.

Benchmark (Debian Buster, g++ 8, Intel(R) Xeon(R) E-2136 CPU @ 3.30GHz, Debug
build, Turbo turned off):

```python
import timeit

for n, t in [(1000, 13000),
            (10_000, 1300)]:
    for e in (2, 3, 4):
        for dtype in ('torch.int16', 'torch.int32', 'torch.int64'):
            print(f'a.pow({e}) (a.numel() == {n}) for {t} times')
            print(f'dtype {dtype}, {t} times', end='\t\t')
            print(timeit.timeit(f'a.pow({e})',
                                setup=f'import torch; a = torch.arange({n}, device="cpu", dtype={dtype})',
                                number=t))
```

Before:

```
a.pow(2) (a.numel() == 1000) for 13000 times
dtype torch.int16, 13000 times		1.6958350749996498
a.pow(2) (a.numel() == 1000) for 13000 times
dtype torch.int32, 13000 times		0.7989626339999631
a.pow(2) (a.numel() == 1000) for 13000 times
dtype torch.int64, 13000 times		0.7973162800003593
a.pow(3) (a.numel() == 1000) for 13000 times
dtype torch.int16, 13000 times		1.8660746679997828
a.pow(3) (a.numel() == 1000) for 13000 times
dtype torch.int32, 13000 times		0.8101709959996697
a.pow(3) (a.numel() == 1000) for 13000 times
dtype torch.int64, 13000 times		0.8135280149999744
a.pow(4) (a.numel() == 1000) for 13000 times
dtype torch.int16, 13000 times		5.010833072999958
a.pow(4) (a.numel() == 1000) for 13000 times
dtype torch.int32, 13000 times		4.801007671999741
a.pow(4) (a.numel() == 1000) for 13000 times
dtype torch.int64, 13000 times		3.963344578000033
a.pow(2) (a.numel() == 10000) for 1300 times
dtype torch.int16, 1300 times		1.6216251330001796
a.pow(2) (a.numel() == 10000) for 1300 times
dtype torch.int32, 1300 times		0.5672429639998882
a.pow(2) (a.numel() == 10000) for 1300 times
dtype torch.int64, 1300 times		0.5544572270000572
a.pow(3) (a.numel() == 10000) for 1300 times
dtype torch.int16, 1300 times		1.656308512999658
a.pow(3) (a.numel() == 10000) for 1300 times
dtype torch.int32, 1300 times		1.502670819999821
a.pow(3) (a.numel() == 10000) for 1300 times
dtype torch.int64, 1300 times		0.5757876879997639
a.pow(4) (a.numel() == 10000) for 1300 times
dtype torch.int16, 1300 times		4.775718216999849
a.pow(4) (a.numel() == 10000) for 1300 times
dtype torch.int32, 1300 times		4.754745475000163
a.pow(4) (a.numel() == 10000) for 1300 times
dtype torch.int64, 1300 times		3.737249878000057
```

After:

```
a.pow(2) (a.numel() == 1000) for 13000 times
dtype torch.int16, 13000 times		1.1006453190002503
a.pow(2) (a.numel() == 1000) for 13000 times
dtype torch.int32, 13000 times		1.0849009019998448
a.pow(2) (a.numel() == 1000) for 13000 times
dtype torch.int64, 13000 times		1.093259106000005
a.pow(3) (a.numel() == 1000) for 13000 times
dtype torch.int16, 13000 times		1.0859826279997833
a.pow(3) (a.numel() == 1000) for 13000 times
dtype torch.int32, 13000 times		1.1076840900000207
a.pow(3) (a.numel() == 1000) for 13000 times
dtype torch.int64, 13000 times		1.0755480369998622
a.pow(4) (a.numel() == 1000) for 13000 times
dtype torch.int16, 13000 times		1.918211066999902
a.pow(4) (a.numel() == 1000) for 13000 times
dtype torch.int32, 13000 times		1.9183043200000611
a.pow(4) (a.numel() == 1000) for 13000 times
dtype torch.int64, 13000 times		1.930021430999659
a.pow(2) (a.numel() == 10000) for 1300 times
dtype torch.int16, 1300 times		0.7271483560002707
a.pow(2) (a.numel() == 10000) for 1300 times
dtype torch.int32, 1300 times		0.7289002070001516
a.pow(2) (a.numel() == 10000) for 1300 times
dtype torch.int64, 1300 times		0.7267536800000016
a.pow(3) (a.numel() == 10000) for 1300 times
dtype torch.int16, 1300 times		0.7301799359997858
a.pow(3) (a.numel() == 10000) for 1300 times
dtype torch.int32, 1300 times		0.7289195180001116
a.pow(3) (a.numel() == 10000) for 1300 times
dtype torch.int64, 1300 times		0.7270008230002531
a.pow(4) (a.numel() == 10000) for 1300 times
dtype torch.int16, 1300 times		1.5354506029998447
a.pow(4) (a.numel() == 10000) for 1300 times
dtype torch.int32, 1300 times		1.528263066999898
a.pow(4) (a.numel() == 10000) for 1300 times
dtype torch.int64, 1300 times		1.5369428439998956
```

 ---

Best viewed with whitespace changes turned off
Pull Request resolved: pytorch#26020

Differential Revision: D17485400

Pulled By: VitalyFedyunin

fbshipit-source-id: 3a16b074825a5aab0f7e7af3d8100f9e4b7011a3
Summary:
Pull Request resolved: pytorch#26709

Polishes the implementation from pytorch#25975. Primarily, we use NoopObserver to communicate that weights need to be quantized to float16. The very top-level API (quantize_dynamic) stays the same, with a `dtype` argument, but the implementation follows the common flow.

One can argue that dynamic fp16 quantization doesn't really fit into the 'observer' mechanism. It's in fact not ideal, but it's better to have the same flow than to branch on both dtype and qconfig.
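
As a rough usage sketch of the top-level API mentioned above (quantize_dynamic with a `dtype` argument); the model and module set here are illustrative, not taken from this PR:

```python
import torch
import torch.nn as nn

# A small float model; only the nn.Linear layers are targeted here.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# dtype=torch.float16 requests dynamic fp16 quantization of the weights;
# per the summary, this is routed through the same observer-based flow.
qmodel = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.float16)
```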

Test Plan: Imported from OSS

Differential Revision: D17544103

Pulled By: dzhulgakov

fbshipit-source-id: 6af3f18c35929a1a53ea734079c005f656e4925f
Summary:
Pull Request resolved: pytorch#26684

Output heights and widths are already calculated by conv_p. Remove the duplicate calculation.
ghstack-source-id: 90633432

Test Plan:
buck test mode/dev caffe2/test:quantized
```
Summary (total time 18.69s):
  PASS: 45
  FAIL: 0
  SKIP: 10
    caffe2/test:quantized - test_qadd_scalar_relu (test_quantized.TestQuantizedOps)
    caffe2/test:quantized - test_equal (test_quantized.TestQuantizedOps)
    caffe2/test:quantized - test_qnnpack_add (test_quantized.TestQNNPackOps)
    caffe2/test:quantized - test_qconv_unpack (test_quantized.TestQNNPackOps)
    caffe2/test:quantized - test_qlinear_unpack (test_quantized.TestQNNPackOps)
    caffe2/test:quantized - test_compare_tensor_scalar (test_quantized.TestComparatorOps)
    caffe2/test:quantized - test_qconv_qnnpack (test_quantized.TestQNNPackOps)
    caffe2/test:quantized - test_qlinear_qnnpack (test_quantized.TestQNNPackOps)
    caffe2/test:quantized - test_qnnpack_relu (test_quantized.TestQNNPackOps)
    caffe2/test:quantized - test_qnnpack_maxpool2d (test_quantized.TestQNNPackOps)
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
More details at https://our.intern.facebook.com/intern/buck/build/3b394f1e-ab99-4e59-bdf5-2766f46e9869
```

Differential Revision: D17538375

fbshipit-source-id: b4b60e93fdec4cc7bbf6aee7182381221dfac243
Summary:
- Ports all CUDA tests to TestAutogradDeviceType except those using multiple devices
Pull Request resolved: pytorch#26708

Differential Revision: D17549435

Pulled By: mruberry

fbshipit-source-id: b564186444201d1351934b6a7d21f67bdfca6e3b
Summary: Pull Request resolved: pytorch#22752

Differential Revision: D17543836

Pulled By: Krovatkin

fbshipit-source-id: 5cbca220943a580169bf60ac09780b6e67075d2b
…export API (pytorch#26146)

Summary:
This is a follow-up PR to pytorch#23284. In that PR we had removed the change to the default behavior of the `keep_initializers_as_input` argument to the export API. With this PR we enable that change: if `keep_initializers_as_input` is not specified, the value/behavior of this argument is chosen automatically depending on whether the export type is ONNX or not.

This change was part of the earlier PR but was removed for further review. The test points have also been updated.

This change may fail some internal tests, which may require explicitly setting `keep_initializers_as_input=True` to preserve the old behavior.
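
A minimal sketch of opting back into the old behavior explicitly, assuming the public exporter argument is spelled `keep_initializers_as_inputs` (the summary refers to it as `keep_initializers_as_input`):

```python
import torch

model = torch.nn.Linear(4, 2)
dummy = torch.randn(1, 4)

# Explicitly keep initializers (weights/biases) listed as graph inputs,
# which was the previous default behavior.
torch.onnx.export(model, dummy, "model.onnx", keep_initializers_as_inputs=True)
```
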
Pull Request resolved: pytorch#26146

Reviewed By: hl475

Differential Revision: D17369677

Pulled By: houseroad

fbshipit-source-id: 2aec2cff50d215714ee8769505ef24d2b7865a11
Summary:
Fixes pytorch#26032.
This was broken by a bad OpenSSL release in conda. It should be fixed now. Testing...
Pull Request resolved: pytorch#26697

Differential Revision: D17542095

Pulled By: ailzhang

fbshipit-source-id: ba99f9b36ef2a7c793842cf91bd46fb2634ac1aa
Summary: Pull Request resolved: pytorch#26253

Test Plan: Imported from OSS

Differential Revision: D17529994

Pulled By: jamesr66a

fbshipit-source-id: e3aff71da35b05ed61710cdb88d72b51c944168b
Summary:
Pull Request resolved: pytorch#26680

This was introduced earlier under the assumption that we'd have a qconv_per_tensor_affine
and a qconv_per_channel_affine, but it turns out we don't have these, so we'll remove
these functions.

Test Plan:
python test/test_jit.py 'TestJit.test_quant_fusion'

Imported from OSS

Differential Revision: D17542607

fbshipit-source-id: b90ce5738170f0922bdc2eb1c4dbecd930f68a48
…ytorch#26581)

Summary:
Pull Request resolved: pytorch#26581

We're currently inlining immediate values of the constants directly into
the IR when we generate it, providing no way to access these values by their
names later. This change registers such values as attributes of the
module so that they are not lost after IR generation.

Differential Revision: D17513451

Test Plan: Imported from OSS

Pulled By: ZolotukhinM

fbshipit-source-id: cf8f9b450e7178692211abd905ffd2d7ce5a6ce1
Summary: Pull Request resolved: pytorch#26584

Test Plan: Imported from OSS

Differential Revision: D17514653

Pulled By: ZolotukhinM

fbshipit-source-id: 7d9cc8f619b7dbe26fa58eac37cc131929c004d4
Summary: Pull Request resolved: pytorch#26553

Differential Revision: D17551426

Pulled By: driazati

fbshipit-source-id: 53ce05882091aca4617586bc53944ee4c8b3a622
Summary:
If the `Union` contains a non-class type, `issubclass` would fail; this
adds a check for that case.
(https://our.intern.facebook.com/intern/diff/17505206/)
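
An illustrative sketch of the kind of guard this adds (the helper name here is hypothetical, not the actual TorchScript internals):

```python
def safe_issubclass(candidate, cls):
    # issubclass() raises TypeError when `candidate` is not a class, e.g. a
    # typing construct such as Optional[int] appearing inside a Union.
    return isinstance(candidate, type) and issubclass(candidate, cls)
```
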
Pull Request resolved: pytorch#26312

Pulled By: driazati

Differential Revision: D17505206

fbshipit-source-id: 1331e412f938e2f08ecb079972147f11e3ec77cd
Summary:
Pull Request resolved: pytorch#26681

att

Test Plan:
ci

Imported from OSS

Differential Revision: D17542833

fbshipit-source-id: 653e906b0e146763609c69ef0de7f9cf38621586
Summary:
Pull Request resolved: pytorch#26694

Previously we would not properly populate `errorDesc` for:
```
./torch/jit/__init__.py:13:1: F401 'torch.nn.ModuleList' imported but unused
```

because we wanted only letters and spaces. Be more permissive.
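
An illustrative sketch of a more permissive parse (the regex here is hypothetical, not the one in the repo):

```python
import re

line = "./torch/jit/__init__.py:13:1: F401 'torch.nn.ModuleList' imported but unused"

# Capture everything after the error code, including quotes and dots,
# instead of restricting errorDesc to letters and spaces.
m = re.match(r"(?P<file>.+?):(?P<line>\d+):(?P<col>\d+): (?P<code>\S+) (?P<errorDesc>.*)$", line)
print(m.group("errorDesc"))  # 'torch.nn.ModuleList' imported but unused
```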

Test Plan: Imported from OSS

Differential Revision: D17551999

Pulled By: suo

fbshipit-source-id: b82567df1fa3c9729e7427dc3461bedfb40933dc
Summary:
**Summary**:
Adds `torch::nn::Identity` module support for the C++ API.

**Issue**: pytorch#25883

**Reviewer**: yf225
Pull Request resolved: pytorch#26713

Differential Revision: D17550982

Pulled By: yf225

fbshipit-source-id: f24483846e82d5d276d77a1a0c50884f3bc05112
Summary:
Pull Request resolved: pytorch#26554

Previously, in `TCPStore`'s constructor we did not pass in a timeout to
the `connect` function, which thus used the default timeout (-1, so infinite).
But the timeout variable in `TCPStore.cpp` is configurable by the user and set to
300 seconds by default, so we should be passing this into the connect function.
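
For context, a hedged sketch of the corresponding knob on the Python side, assuming the `TCPStore` binding exposes the same timeout argument (values are illustrative):

```python
from datetime import timedelta
import torch.distributed as dist

# The 300-second default mentioned above; with this change the timeout should
# also apply to the initial connect, not just to later store operations.
store = dist.TCPStore("127.0.0.1", 29500, 1, True, timedelta(seconds=300))
```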

Test Plan: see above.

Differential Revision: D17486779

fbshipit-source-id: 42d38a3b8d492d9e9ff09110990a8e4a3a1292b2
Summary:
Pull Request resolved: pytorch#26728

Use Caffe2::mobile_threadpool() in linear and conv operators

Perf:
Without threadpool: 76 ms
With threadpool: 41 ms

Test Plan:
python test/test_quantized.py TestQNNPackOps

Imported from OSS

Differential Revision: D17553510

fbshipit-source-id: dd5b06f526f65d87727ec7e3dad0a5fa74cba9f9
Summary:
- Add support for linear and cubic interpolate in opset 11.
- Add support for 1d and 3d interpolate in nearest mode for opset 7 and 8.
- Add tests for all cases of interpolate in ORT tests (nearest/linear/cubic, 1d/2d/3d, upsample/downsample).
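
A minimal sketch of the linear-interpolate export path at opset 11 (shapes and file name are illustrative):

```python
import torch
import torch.nn.functional as F

class Upsample(torch.nn.Module):
    def forward(self, x):
        # 1d linear interpolation; cubic/nearest and 2d/3d follow the same pattern.
        return F.interpolate(x, scale_factor=2.0, mode="linear", align_corners=False)

x = torch.randn(1, 3, 16)  # (N, C, L)
torch.onnx.export(Upsample(), x, "upsample.onnx", opset_version=11)
```
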
Pull Request resolved: pytorch#24805

Reviewed By: hl475

Differential Revision: D17330801

Pulled By: houseroad

fbshipit-source-id: 1bdefff9e72f5e70c51f4721e1d7347478b7505b
Summary:
- Normalization mean and std are specified as parameters instead of being hardcoded
- imageYUV420CenterCropToFloat32Tensor previously worked only with square tensors (width == height); added generalization to support width != height with all rotations and scalings
- Added javadocs
Pull Request resolved: pytorch#26690

Differential Revision: D17556006

Pulled By: IvanKobzarev

fbshipit-source-id: 63f3321ea2e6b46ba5c34f9e92c48d116f7dc5ce
Summary: Pull Request resolved: pytorch#25592

Test Plan: Imported from OSS

Differential Revision: D17552470

Pulled By: VitalyFedyunin

fbshipit-source-id: 6c8cc4f46dd390c231b2d0aac664ad2a6ac8876e
…chNorm2d.

Test Plan: revert-hammer

Differential Revision:
D17514653

Original commit changeset: 7d9cc8f619b7

fbshipit-source-id: 2cf32082a46fe169a1db4926df78a9f3256616ad
…es of the Module.

Test Plan: revert-hammer

Differential Revision:
D17513451

Original commit changeset: cf8f9b450e71

fbshipit-source-id: 319ec9399173eb06556969dc6be365b319c1ab6c
Summary:
Pull Request resolved: pytorch#26738

Someone may use torch._export directly. Here we change onnx_export_type's default value to None;
if it's the PyTorch ONNX Caffe2 bundle, we set it to ONNX_ATEN_FALLBACK, otherwise it's ONNX.

Test Plan: ci

Reviewed By: hl475

Differential Revision: D17546452

fbshipit-source-id: 38e53926e2b101484bbbce7b58ebcd6af8c42438
Summary:
Pull Request resolved: pytorch#26587

-
ghstack-source-id: 90557226

Test Plan: unit tests

Differential Revision: D17515048

fbshipit-source-id: 3459ee80efec29080060ec29d67642d789dd8749
raghuramank100 and others added 28 commits September 28, 2019 14:05
Summary:
Pull Request resolved: pytorch#26516

ghstack-source-id: 90982010

Test Plan:
Integrate per-channel support into conv and linear modules.
The following tests pass:
buck test caffe2/test:quantized -- 'test_linear_api \(test_quantized_nn_mods\.ModuleAPITest\)' --print-passing-details

buck test caffe2/test:quantized -- 'test_conv_api \(test_quantized_nn_mods\.ModuleAPITest\)' --print-passing-details

buck test caffe2/test:quantized -- 'test_float_quant_compare_per_channel \(test_quantized_models\.ModelNumerics\)' --print-passing-details

Differential Revision: D17342622

fbshipit-source-id: f0d618928e3d9348672c589a6b7a47049c372a2e
Summary:
For TensorRT test introduced in pytorch#26426
Pull Request resolved: pytorch#26835

Reviewed By: hl475

Differential Revision: D17580108

Pulled By: houseroad

fbshipit-source-id: c57fafec228b78c26b8a7946c92ad7434425bbd4
…motion. (pytorch#27017)

Summary:
pytorch#26981
Pull Request resolved: pytorch#27017

Differential Revision: D17651454

Pulled By: ifedan

fbshipit-source-id: c6313caa11598a0ef160e1c6d2f3c33d03ce80c5
Summary: Pull Request resolved: pytorch#27008

Test Plan: Imported from OSS

Differential Revision: D17649174

Pulled By: jamesr66a

fbshipit-source-id: e3e6c4bb31e1ad8ed1ebe27f803f90d564ecfe53
Summary: Pull Request resolved: pytorch#26819

Test Plan: Imported from OSS

Differential Revision: D17627829

Pulled By: pbelevich

fbshipit-source-id: be4d803c7d4ba2c59e54d154eeebc63794465191
Summary:
Pull Request resolved: pytorch#26582

Add the blob quantization.
Replace the op in the eval/predictor net.

Test Plan:
# Unit test:

-----

buck build fblearner/flow/projects/dper/tests/validators:test_exporter_options_validators

./buck-out/gen/fblearner/flow/projects/dper/tests/validators/test_exporter_options_validators#binary.par

----

buck build caffe2/caffe2/fb/dper/layer_models/tests:exporter_test

./buck-out/gen/caffe2/caffe2/fb/dper/layer_models/tests/exporter_test-2.7#binary.par

Reviewed By: chocjy

Differential Revision: D17439720

fbshipit-source-id: 68de5d0322b0111aeca5ed552210bf80a4cddc78
Summary:
Make the cuDNN RNN respect the current stream. After this lands, the non-default test stream can be re-enabled in pytorch#26791.
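
A hedged sketch of the scenario this fixes, running a cuDNN RNN under a non-default CUDA stream (requires a CUDA build and device; sizes are illustrative):

```python
import torch

if torch.cuda.is_available():
    rnn = torch.nn.LSTM(16, 32).cuda()
    x = torch.randn(5, 3, 16, device="cuda")
    side = torch.cuda.Stream()
    with torch.cuda.stream(side):
        out, _ = rnn(x)          # cuDNN kernels should now run on `side`
    torch.cuda.current_stream().wait_stream(side)
```
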
Pull Request resolved: pytorch#27026

Test Plan: default stream functionality is tested in existing tests, stream safety tests will be added in pytorch#26791

Differential Revision: D17656967

Pulled By: ngimel

fbshipit-source-id: 8b051aedd1df089b21f666ec553a5acefffdac88
Summary:
Pull Request resolved: pytorch#26368

A boxed fallback function is a per-TensorTypeId generic function that is called if there is no operator specific function (either for the TensorTypeId, or the UndefinedTensorId fallback for that operator).

The outline of the patch:
* Add a new `boxed_fallback_table_` in the top-level ATenDispatch struct (this is NOT per operator) for storing boxed fallback functions. Boxed fallback functions provisionally have the interface `void(const char* schema, torch::jit::Stack*)`; we expect to replace the `const char* schema` with something more detailed in the near future.
* Boxing and unboxing is not supported for all arguments, as IValue doesn't cover the full set of types that are in ATen. The `supports_boxed_fallback` type trait tests if boxing is possible. The list of exclusions here was experimentally generated. One notable exclusion is that we cannot handle any mutable functions, as they return a `Tensor&` which we cannot conveniently manufacture from a Tensor on an IValue stack.
* The actual boxed fallback is handled in two phases. First, we do a (compile time) test to see if boxing/unboxing is supported. If it is, we do a runtime test to check if a fallback function is installed. If both conditions are met, we allocate a stack, push our arguments onto it, and then call the boxed fallback function. The return value is expected to be a single argument on the top of the stack; we retrieve it, unpack it, and return it to the user.

At present, there are no tests for this diff.

This diff also makes `multi_dispatch_tensor_type_set` safer by explicitly specifying that it takes all arguments as references. This prevents the function from accidentally moving in rvalue references (this never actually happened because none of the overloads of apply move out their arguments; but they could have.)

Signed-off-by: Edward Z. Yang <[email protected]>

Differential Revision: D17448555

Test Plan: Imported from OSS

Pulled By: ezyang

fbshipit-source-id: 43a5be7bdfb5c0b466d01a55380cb79e6d938044
Summary:
Pull Request resolved: pytorch#26623

Per-channel fake-quant CPU and CUDA operators,
per-channel support in the fake-quant module, and
tests for per-channel fake quant and serializability of fake-quant modules.
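
A hedged sketch of the per-channel fake-quant operator referred to above (values are illustrative; argument dtypes may vary between releases):

```python
import torch

x = torch.randn(2, 4, 8)
scale = torch.full((4,), 0.1)                   # one scale per channel on axis 1
zero_point = torch.zeros(4, dtype=torch.int32)  # one zero point per channel

# (input, scale, zero_point, axis, quant_min, quant_max)
y = torch.fake_quantize_per_channel_affine(x, scale, zero_point, 1, 0, 255)
```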

ghstack-source-id: 91008299
ghstack-source-id: 91008299

Test Plan:
buck test mode/dev caffe2/test:fake_quant  --
 Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/1970324848875929
      ✓ caffe2/test:fake_quant - test_backward_per_tensor (test_fake_quant.TestFakeQuantizePerTensor) 0.242 1/10 (passed)
      ✓ caffe2/test:fake_quant - test_numerical_consistency_per_tensor (test_fake_quant.TestFakeQuantizePerTensor) 0.204 2/10 (passed)
      ✓ caffe2/test:fake_quant - test_fq_serializable (test_fake_quant.TestFakeQuantizePerTensor) 0.174 3/10 (passed)
      ✓ caffe2/test:fake_quant - test_numerical_consistency_per_channel (test_fake_quant.TestFakeQuantizePerChannel) 0.279 4/10 (passed)
      ✓ caffe2/test:fake_quant - test_forward_per_tensor (test_fake_quant.TestFakeQuantizePerTensor) 0.241 5/10 (passed)
      ✓ caffe2/test:fake_quant - test_forward_per_channel (test_fake_quant.TestFakeQuantizePerChannel) 0.353 6/10 (passed)
      ✓ caffe2/test:fake_quant - test_fq_module (test_fake_quant.TestFakeQuantizePerTensor) 0.354 7/10 (passed)
      ✓ caffe2/test:fake_quant - test_backward_per_channel (test_fake_quant.TestFakeQuantizePerChannel) 0.334 8/10 (passed)
      ✓ caffe2/test:fake_quant - test_fq_serializable (test_fake_quant.TestFakeQuantizePerChannel) 0.168 9/10 (passed)
      ✓ caffe2/test:fake_quant - test_fq_module (test_fake_quant.TestFakeQuantizePerChannel) 0.429 10/10 (passed)
      ✓ caffe2/test:fake_quant - main 0.000 (passed)

Differential Revision: D17439406

fbshipit-source-id: 64bfff5e4f40bc2ab8af2b432c7bc33805418077
Summary:
Pull Request resolved: pytorch#26624

For QAT we need to be able to control batch norm for all modules from the top. This adds helper functions to enable/disable batch-norm freezing during training.
ghstack-source-id: 91008297

Test Plan: buck test caffe2/test:quantization -- --print-passing-details

Differential Revision: D17512199

fbshipit-source-id: f7b981e2b1966ab01c4dbb161030177274a998b6
…st (pytorch#26625)

Summary:
Pull Request resolved: pytorch#26625

ghstack-source-id: 91008296

Test Plan: buck test caffe2/test:quantized -- 'test_weight_only_activation_only_fakequant \(test_quantized_models\.ModelNumerics\)' --print-passing-details

Differential Revision: D17520342

fbshipit-source-id: 26e148d3299afcfdfb1187aff6ab80687ed8df47
Summary:
Pull Request resolved: pytorch#26627

ghstack-source-id: 91008337

Test Plan: buck test caffe2/test:quantization -- --print-passing-details

Differential Revision: D17518194

fbshipit-source-id: 1eb8a7a85dc811c4ee5228d68563abb157613ceb
Summary:
Fixes: pytorch#26038

Somewhere between v1.1 and master, `nonzero` became `abstract` and was marked as differentiable (by mistake); we need to put it into the TH section of `tools/autograd/derivatives.yaml` to fix it.
Pull Request resolved: pytorch#26980

Differential Revision: D17632276

Pulled By: VitalyFedyunin

fbshipit-source-id: d6cabcc53348af6148cea5a1bd1af2ef12547373
Summary: Pull Request resolved: pytorch#27031

Differential Revision: D17665998

Pulled By: ezyang

fbshipit-source-id: 6926e304c75ba878520627f1e829412f633b1bec
Summary:
Fixes pytorch#26998
Pull Request resolved: pytorch#27047

Differential Revision: D17666050

Pulled By: ezyang

fbshipit-source-id: f02ebd5320ae25f8949be20d0744fe3cd3e2fee9
Summary:
Pull Request resolved: pytorch#27056

Original commit changeset: 137b1faa86b4

As per discussion in D17330625, unlanding until CPU/GPU versions are split

Test Plan: waitforsandcastle

Differential Revision: D17663237

fbshipit-source-id: 93fdc6b40c6a6fc610342a74eaaa1886ddc8c519
Summary:
Pull Request resolved: pytorch#26579

Remove the `linear_prepack` call and attach a module to the
parent class that contains the packed weight and bias.
This is to support serialization of the quantized model,
since the packed weight and bias are not serializable and
we need to override the `__getstate__` and `__setstate__`
functions to be able to serialize them.
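
In plain Python terms, the pattern looks roughly like the sketch below (`pack_params` is a stand-in for the opaque prepacked object, not the real API):

```python
class PackedLinear:
    def __init__(self, weight, bias):
        self.weight, self.bias = weight, bias
        self.packed = self.pack_params(weight, bias)   # not picklable

    @staticmethod
    def pack_params(weight, bias):
        # Stand-in for a backend-specific packed representation.
        return object()

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["packed"]                             # drop the unpicklable handle
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.packed = self.pack_params(self.weight, self.bias)  # rebuild on load
```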

Test Plan:
python test/test_jit.py

Imported from OSS

Differential Revision: D17636397

fbshipit-source-id: 3b81b6faa4413e4309453fd6acec2f0be6fd2f16
Summary:
Pull Request resolved: pytorch#26761

This PR serializes autograd ops into their own namespace by turning the
serialized op name into `torch.autograd.op`. This keeps the
original code namespace rather than moving everything into the global namespace;
this will be handled more properly in the future when we deal with module
namespaces. This change also preserves BC until we have namespace handling.

Test Plan: Imported from OSS

Differential Revision: D17645438

fbshipit-source-id: 656ec6b31d4fc2252585de73117c4d40a122678e
Summary:
Pull Request resolved: pytorch#26939

Updated quant fusion patterns to work with modules with prepack
params folded into the module.

Test Plan:
python test/test_jit.py 'TestJit.test_quant_fusion'

Imported from OSS

Differential Revision: D17636398

fbshipit-source-id: 8e7917e981260b81ed6038a1c2ccf19049726395
Summary:
This PR adds the ability to use tuples directly with the IValue constructor rather than the vector<IValue> approach.
Pull Request resolved: pytorch#26668

Differential Revision: D17668653

Pulled By: bwasti

fbshipit-source-id: aff7f62fe3b502df78b28b2355cff88d88ad288c
Test Plan: revert-hammer

Differential Revision:
D17666050

Original commit changeset: f02ebd5320ae

fbshipit-source-id: 6bc8fe583e350e2b573f767af85d1287dd048d1f
…6993)

Summary:
We always build the Release version of Sleef on gcc 7.

    Sep 26 02:59:19 cd /var/lib/jenkins/cpp-build/caffe2/build/sleef/src/libm && /opt/cache/bin/cc  -DDORENAME=1 -DENABLE_ALIAS=1 -DENABLE_BUILTIN_MATH=1 -DENABLE_PUREC_SCALAR=1 -DENABLE_SYS_getrandom=1 -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DIDEEP_USE_MKL -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DSLEEF_STATIC_LIBS=1 -DTH_BLAS_MKL -D_FILE_OFFSET_BITS=64 -I/var/lib/jenkins/cpp-build/caffe2/build/aten/src -I/var/lib/jenkins/workspace/aten/src -I/var/lib/jenkins/cpp-build/caffe2/build -I/var/lib/jenkins/workspace -isystem /var/lib/jenkins/cpp-build/caffe2/build/third_party/gloo -isystem /var/lib/jenkins/workspace/cmake/../third_party/gloo -isystem /var/lib/jenkins/workspace/cmake/../third_party/googletest/googlemock/include -isystem /var/lib/jenkins/workspace/cmake/../third_party/googletest/googletest/include -isystem /var/lib/jenkins/workspace/third_party/protobuf/src -isystem /opt/python/2.7.9/include -isystem /var/lib/jenkins/workspace/third_party/gemmlowp -isystem /var/lib/jenkins/workspace/third_party/neon2sse -I/var/lib/jenkins/workspace/cmake/../third_party/benchmark/include -isystem /var/lib/jenkins/workspace/third_party -isystem /var/lib/jenkins/workspace/cmake/../third_party/eigen -isystem /var/lib/jenkins/workspace/torch/include -isystem /opt/rocm/hip/include -isystem /include -I/var/lib/jenkins/cpp-build/caffe2/build/caffe2/contrib/aten -I/var/lib/jenkins/workspace/third_party/onnx -I/var/lib/jenkins/cpp-build/caffe2/build/third_party/onnx -I/var/lib/jenkins/workspace/third_party/foxi -I/var/lib/jenkins/cpp-build/caffe2/build/third_party/foxi -isystem /var/lib/jenkins/workspace/third_party/ideep/include -I/var/lib/jenkins/workspace/third_party/NNPACK/include -I/var/lib/jenkins/workspace/third_party/NNPACK/src -I/var/lib/jenkins/workspace/third_party/cpuinfo/include -I/var/lib/jenkins/workspace/third_party/pthreadpool/include -I/var/lib/jenkins/workspace/third_party/FXdiv/include -I/var/lib/jenkins/workspace/third_party/psimd/include -I/var/lib/jenkins/workspace/third_party/FP16/include -I/var/lib/jenkins/workspace/third_party/sleef/src/common -I/var/lib/jenkins/workspace/third_party/sleef/src/arch -I/var/lib/jenkins/cpp-build/caffe2/build/sleef/src/libm/include -I/var/lib/jenkins/workspace/third_party/sleef/src/libm  -Wall -Wno-unused -Wno-attributes -Wno-unused-result -Wno-psabi -ffp-contract=off -fno-math-errno -fno-trapping-math -g -O1 -fPIC   -DCAFFE2_USE_GLOO -DHAVE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD -std=gnu99 -o CMakeFiles/sleefpurec_scalar.dir/sleefsimdsp.c.o   -c /var/lib/jenkins/workspace/third_party/sleef/src/libm/sleefsimdsp.c
    Sep 26 02:59:20 /var/lib/jenkins/workspace/third_party/sleef/src/libm/sleefsimdsp.c: In function 'gammafk':
    Sep 26 02:59:20 /var/lib/jenkins/workspace/third_party/sleef/src/libm/sleefsimdsp.c:3103:1: internal compiler error: in trunc_int_for_mode, at explow.c:55
    Sep 26 02:59:20  }
    Sep 26 02:59:20  ^
    Sep 26 02:59:20 Please submit a full bug report,
    Sep 26 02:59:20 with preprocessed source if appropriate.
    Sep 26 02:59:20 See <file:///usr/share/doc/gcc-7/README.Bugs> for instructions.
    Sep 26 02:59:20 sleef/src/libm/CMakeFiles/sleefpurec_scalar.dir/build.make:67: recipe for target 'sleef/src/libm/CMakeFiles/sleefpurec_scalar.dir/sleefsimdsp.c.o' failed
    Sep 26 02:59:20 make[2]: Leaving directory '/var/lib/jenkins/cpp-build/caffe2/build'

Also updated the Sleef submodule to include fixes that were missed in pytorch#26749

pytorch#26994 provides a potentially cleaner fix

Close pytorch#26892
Pull Request resolved: pytorch#26993

Differential Revision: D17669103

Pulled By: ezyang

fbshipit-source-id: 1b87a4a8fecc6441de3b008aee6929537768be1a
…27061)

Summary:
Pull Request resolved: pytorch#27061

Previously the cronjobs were run on master, but now the nightly builds
count as "PRs" so we must whitelist them from should_run calculation.

Signed-off-by: Edward Z. Yang <[email protected]>

Test Plan: Imported from OSS

Differential Revision: D17669066

Pulled By: ezyang

fbshipit-source-id: 3b92bf1d09aefa7ef524ea93dfa8c6f566161887
Summary:
Fixes pytorch#26333

This effectively rewrites all infix arithmetic operators *op* as:
```python
def __op__(self, other):
    try:
        return self.op(other)
    except TypeError:
        return NotImplemented
```

Where `TypeError` is raised from the argument parser when `other` is not a `Tensor`. This should be okay, so long as `TypeError` isn't raised by the function for any other reasons. I couldn't find any examples where this was an issue.
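
A small illustration of why returning `NotImplemented` matters: Python then falls back to the reflected method on the other operand instead of raising (the `Scalar` class here is purely illustrative).

```python
import torch

class Scalar:
    def __init__(self, v):
        self.v = v
    def __radd__(self, other):
        # Called because Tensor.__add__(Scalar) now returns NotImplemented.
        return Scalar(other + self.v)

out = torch.tensor(1.0) + Scalar(2.0)   # dispatches to Scalar.__radd__
```
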
Pull Request resolved: pytorch#26507

Differential Revision: D17669097

Pulled By: ezyang

fbshipit-source-id: 2c1a1087057c9298915d713d3fea7d682d014c72
…ytorch#26908)

Summary:
Pull Request resolved: pytorch#26908

This is because, in static dispatch, Variable is not supported. But
you might still "have" a Variable somehow (e.g., in JIT initialization
code), so we need to translate between the two worlds.

Attempt to fix pytorch#26764

Signed-off-by: Edward Z. Yang <[email protected]>

Test Plan: Imported from OSS

Differential Revision: D17607550

Pulled By: ezyang

fbshipit-source-id: ea2cb931215d749efd95259ad7a1a8ae96c38aed
@rohithkrn
Author

ROCm tests passed. Merging.

@rohithkrn rohithkrn merged commit 0e6c9c8 into ROCm:bf16_bringup Oct 1, 2019