
Conversation

@rohithkrn

No description provided.

ezyang and others added 30 commits September 24, 2019 08:41
Summary:
Pull Request resolved: pytorch#26417

Signed-off-by: Edward Z. Yang <[email protected]>

Test Plan: Imported from OSS

Differential Revision: D17548776

Pulled By: ezyang

fbshipit-source-id: 8c79893ee4216780edb838671e701de5518c4cd0
…orch#26685)

Summary:
Pull Request resolved: pytorch#26685

This prevents XLA from picking up on named tensor APIs. I ran into some
problems while attempting to support dimname overloads in XLA; since we
don't need the first iteration of named tensors to work with XLA this is
OK.

Test Plan: - run CI.

Differential Revision: D17538893

Pulled By: zou3519

fbshipit-source-id: 93d579c93f5b1dc68541c07c4a3d61792859507d
Summary:
GitHub commits:

facebook/litho@ff4a610
facebook/mvfst@ad81c38
pytorch/FBGEMM@518d8a1

Test Plan: n/a

Reviewed By: yns88

fbshipit-source-id: 2a9a47805569a43e05d044c5494b57f6a7996bc4
… and clean up functional test code (pytorch#26559)

Summary:
This ensures that `F::cosine_similarity` and `F::pairwise_distance` can be used simply by including `torch/torch.h` and setting `namespace F = torch::nn::functional`.
Pull Request resolved: pytorch#26559

Differential Revision: D17507421

Pulled By: yf225

fbshipit-source-id: f895dde3634d5c8ca66ee036903e327e5cdab6b1
Summary:
There is an issue with the torchvision version not matching the PyTorch version if one builds the Docker image from a tag; see issue pytorch#25917. The current solution requires one to re-init the submodules or manually change the version of torchvision. This PR allows one to build the Docker image without torchvision, which not only fixes the above-mentioned bug but also frees non-image PyTorch users from the tyranny of torchvision 😆.

In all seriousness, for NLP researchers especially, torchvision isn't a necessity for PyTorch, and non-essential items shouldn't be in the Docker image. This option removes one extra thing that can go wrong.
Pull Request resolved: pytorch#26168

Differential Revision: D17550001

Pulled By: soumith

fbshipit-source-id: 48b8b9e22b75eef3afb392c618742215d3920e9d
…h#26020)

Summary:
Currently, integer scalar exponents are always cast to double. This commit avoids the cast if the tensor is also
integral and the scalar is positive, to speed things up.

Benchmark (Debian Buster, g++ 8, Intel(R) Xeon(R) E-2136 CPU @ 3.30GHz, Debug
build, Turbo turned off):

```python
import timeit

for n, t in [(1000, 13000),
            (10_000, 1300)]:
    for e in (2, 3, 4):
        for dtype in ('torch.int16', 'torch.int32', 'torch.int64'):
            print(f'a.pow({e}) (a.numel() == {n}) for {t} times')
            print(f'dtype {dtype}, {t} times', end='\t\t')
            print(timeit.timeit(f'a.pow({e})',
                                setup=f'import torch; a = torch.arange({n}, device="cpu", dtype={dtype})',
                                number=t))
```

Before:

```
a.pow(2) (a.numel() == 1000) for 13000 times
dtype torch.int16, 13000 times		1.6958350749996498
a.pow(2) (a.numel() == 1000) for 13000 times
dtype torch.int32, 13000 times		0.7989626339999631
a.pow(2) (a.numel() == 1000) for 13000 times
dtype torch.int64, 13000 times		0.7973162800003593
a.pow(3) (a.numel() == 1000) for 13000 times
dtype torch.int16, 13000 times		1.8660746679997828
a.pow(3) (a.numel() == 1000) for 13000 times
dtype torch.int32, 13000 times		0.8101709959996697
a.pow(3) (a.numel() == 1000) for 13000 times
dtype torch.int64, 13000 times		0.8135280149999744
a.pow(4) (a.numel() == 1000) for 13000 times
dtype torch.int16, 13000 times		5.010833072999958
a.pow(4) (a.numel() == 1000) for 13000 times
dtype torch.int32, 13000 times		4.801007671999741
a.pow(4) (a.numel() == 1000) for 13000 times
dtype torch.int64, 13000 times		3.963344578000033
a.pow(2) (a.numel() == 10000) for 1300 times
dtype torch.int16, 1300 times		1.6216251330001796
a.pow(2) (a.numel() == 10000) for 1300 times
dtype torch.int32, 1300 times		0.5672429639998882
a.pow(2) (a.numel() == 10000) for 1300 times
dtype torch.int64, 1300 times		0.5544572270000572
a.pow(3) (a.numel() == 10000) for 1300 times
dtype torch.int16, 1300 times		1.656308512999658
a.pow(3) (a.numel() == 10000) for 1300 times
dtype torch.int32, 1300 times		1.502670819999821
a.pow(3) (a.numel() == 10000) for 1300 times
dtype torch.int64, 1300 times		0.5757876879997639
a.pow(4) (a.numel() == 10000) for 1300 times
dtype torch.int16, 1300 times		4.775718216999849
a.pow(4) (a.numel() == 10000) for 1300 times
dtype torch.int32, 1300 times		4.754745475000163
a.pow(4) (a.numel() == 10000) for 1300 times
dtype torch.int64, 1300 times		3.737249878000057
```

After:

```
a.pow(2) (a.numel() == 1000) for 13000 times
dtype torch.int16, 13000 times		1.1006453190002503
a.pow(2) (a.numel() == 1000) for 13000 times
dtype torch.int32, 13000 times		1.0849009019998448
a.pow(2) (a.numel() == 1000) for 13000 times
dtype torch.int64, 13000 times		1.093259106000005
a.pow(3) (a.numel() == 1000) for 13000 times
dtype torch.int16, 13000 times		1.0859826279997833
a.pow(3) (a.numel() == 1000) for 13000 times
dtype torch.int32, 13000 times		1.1076840900000207
a.pow(3) (a.numel() == 1000) for 13000 times
dtype torch.int64, 13000 times		1.0755480369998622
a.pow(4) (a.numel() == 1000) for 13000 times
dtype torch.int16, 13000 times		1.918211066999902
a.pow(4) (a.numel() == 1000) for 13000 times
dtype torch.int32, 13000 times		1.9183043200000611
a.pow(4) (a.numel() == 1000) for 13000 times
dtype torch.int64, 13000 times		1.930021430999659
a.pow(2) (a.numel() == 10000) for 1300 times
dtype torch.int16, 1300 times		0.7271483560002707
a.pow(2) (a.numel() == 10000) for 1300 times
dtype torch.int32, 1300 times		0.7289002070001516
a.pow(2) (a.numel() == 10000) for 1300 times
dtype torch.int64, 1300 times		0.7267536800000016
a.pow(3) (a.numel() == 10000) for 1300 times
dtype torch.int16, 1300 times		0.7301799359997858
a.pow(3) (a.numel() == 10000) for 1300 times
dtype torch.int32, 1300 times		0.7289195180001116
a.pow(3) (a.numel() == 10000) for 1300 times
dtype torch.int64, 1300 times		0.7270008230002531
a.pow(4) (a.numel() == 10000) for 1300 times
dtype torch.int16, 1300 times		1.5354506029998447
a.pow(4) (a.numel() == 10000) for 1300 times
dtype torch.int32, 1300 times		1.528263066999898
a.pow(4) (a.numel() == 10000) for 1300 times
dtype torch.int64, 1300 times		1.5369428439998956
```

 ---

Best viewed with whitespace changes turned off
Pull Request resolved: pytorch#26020

Differential Revision: D17485400

Pulled By: VitalyFedyunin

fbshipit-source-id: 3a16b074825a5aab0f7e7af3d8100f9e4b7011a3
Summary:
Pull Request resolved: pytorch#26709

Polishes the implementation from pytorch#25975. Primarily, we use NoopObserver to communicate that weights need to be quantized to float16. The very top-level API (quantize_dynamic) stays the same, with a `dtype` argument, but the implementation follows the common flow.

One can argue that dynamic fp16 quantization doesn't really fit into the 'observer' mechanism. It's in fact not ideal, but it's better to have the same flow than to branch on both dtype and qconfig.
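
As a rough usage sketch of the top-level API mentioned above (quantize_dynamic with a `dtype` argument); the model and module set here are illustrative, not taken from this PR:

```python
import torch
import torch.nn as nn

# A small float model; only the nn.Linear layers are targeted here.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# dtype=torch.float16 requests dynamic fp16 quantization of the weights;
# per the summary, this is routed through the same observer-based flow.
qmodel = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.float16)
```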

Test Plan: Imported from OSS

Differential Revision: D17544103

Pulled By: dzhulgakov

fbshipit-source-id: 6af3f18c35929a1a53ea734079c005f656e4925f
Summary:
Pull Request resolved: pytorch#26684

Output heights and widths are already calculated by conv_p. Remove the duplicate calculation.
ghstack-source-id: 90633432

Test Plan:
buck test mode/dev caffe2/test:quantized
```
Summary (total time 18.69s):
  PASS: 45
  FAIL: 0
  SKIP: 10
    caffe2/test:quantized - test_qadd_scalar_relu (test_quantized.TestQuantizedOps)
    caffe2/test:quantized - test_equal (test_quantized.TestQuantizedOps)
    caffe2/test:quantized - test_qnnpack_add (test_quantized.TestQNNPackOps)
    caffe2/test:quantized - test_qconv_unpack (test_quantized.TestQNNPackOps)
    caffe2/test:quantized - test_qlinear_unpack (test_quantized.TestQNNPackOps)
    caffe2/test:quantized - test_compare_tensor_scalar (test_quantized.TestComparatorOps)
    caffe2/test:quantized - test_qconv_qnnpack (test_quantized.TestQNNPackOps)
    caffe2/test:quantized - test_qlinear_qnnpack (test_quantized.TestQNNPackOps)
    caffe2/test:quantized - test_qnnpack_relu (test_quantized.TestQNNPackOps)
    caffe2/test:quantized - test_qnnpack_maxpool2d (test_quantized.TestQNNPackOps)
  FATAL: 0
  TIMEOUT: 0
  OMIT: 0
More details at https://our.intern.facebook.com/intern/buck/build/3b394f1e-ab99-4e59-bdf5-2766f46e9869
```

Differential Revision: D17538375

fbshipit-source-id: b4b60e93fdec4cc7bbf6aee7182381221dfac243
Summary:
- Ports all CUDA tests to TestAutogradDeviceType except those using multiple devices
Pull Request resolved: pytorch#26708

Differential Revision: D17549435

Pulled By: mruberry

fbshipit-source-id: b564186444201d1351934b6a7d21f67bdfca6e3b
Summary: Pull Request resolved: pytorch#22752

Differential Revision: D17543836

Pulled By: Krovatkin

fbshipit-source-id: 5cbca220943a580169bf60ac09780b6e67075d2b
…export API (pytorch#26146)

Summary:
This is a follow-up PR to pytorch#23284. In that PR we had removed the change to the default behavior of the `keep_initializers_as_input` argument to the export API. With this PR we enable that change: if `keep_initializers_as_input` is not specified, the value/behavior of this argument is chosen automatically depending on whether the export type is ONNX or not.

This change was part of the earlier PR but was removed for further review. The test points have also been updated.

This change may fail some internal tests, which may require explicitly setting `keep_initializers_as_input=True` to preserve the old behavior.
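
A minimal sketch of opting back into the old behavior explicitly, assuming the public exporter argument is spelled `keep_initializers_as_inputs` (the summary refers to it as `keep_initializers_as_input`):

```python
import torch

model = torch.nn.Linear(4, 2)
dummy = torch.randn(1, 4)

# Explicitly keep initializers (weights/biases) listed as graph inputs,
# which was the previous default behavior.
torch.onnx.export(model, dummy, "model.onnx", keep_initializers_as_inputs=True)
```
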
Pull Request resolved: pytorch#26146

Reviewed By: hl475

Differential Revision: D17369677

Pulled By: houseroad

fbshipit-source-id: 2aec2cff50d215714ee8769505ef24d2b7865a11
Summary:
Fixes pytorch#26032.
This was broken by a bad OpenSSL release in conda. It should be fixed now. Testing...
Pull Request resolved: pytorch#26697

Differential Revision: D17542095

Pulled By: ailzhang

fbshipit-source-id: ba99f9b36ef2a7c793842cf91bd46fb2634ac1aa
Summary: Pull Request resolved: pytorch#26253

Test Plan: Imported from OSS

Differential Revision: D17529994

Pulled By: jamesr66a

fbshipit-source-id: e3aff71da35b05ed61710cdb88d72b51c944168b
Summary:
Pull Request resolved: pytorch#26680

This was introduced earlier under the assumption that we'd have a qconv_per_tensor_affine
and a qconv_per_channel_affine, but it turns out we don't have these, so we'll remove
these functions.

Test Plan:
python test/test_jit.py 'TestJit.test_quant_fusion'

Imported from OSS

Differential Revision: D17542607

fbshipit-source-id: b90ce5738170f0922bdc2eb1c4dbecd930f68a48
…ytorch#26581)

Summary:
Pull Request resolved: pytorch#26581

We're currently inlining immediate values of the constants directly into
the IR when we generate it, providing no way to access these values by their
names later. This change registers such values as attributes of the
module so that they are not lost after IR generation.

Differential Revision: D17513451

Test Plan: Imported from OSS

Pulled By: ZolotukhinM

fbshipit-source-id: cf8f9b450e7178692211abd905ffd2d7ce5a6ce1
Summary: Pull Request resolved: pytorch#26584

Test Plan: Imported from OSS

Differential Revision: D17514653

Pulled By: ZolotukhinM

fbshipit-source-id: 7d9cc8f619b7dbe26fa58eac37cc131929c004d4
Summary: Pull Request resolved: pytorch#26553

Differential Revision: D17551426

Pulled By: driazati

fbshipit-source-id: 53ce05882091aca4617586bc53944ee4c8b3a622
Summary:
If the `Union` contains a non-class type, `issubclass` would fail; this
adds a check for that case.
(https://our.intern.facebook.com/intern/diff/17505206/)
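
An illustrative sketch of the kind of guard this adds (the helper name here is hypothetical, not the actual TorchScript internals):

```python
def safe_issubclass(candidate, cls):
    # issubclass() raises TypeError when `candidate` is not a class, e.g. a
    # typing construct such as Optional[int] appearing inside a Union.
    return isinstance(candidate, type) and issubclass(candidate, cls)
```
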
Pull Request resolved: pytorch#26312

Pulled By: driazati

Differential Revision: D17505206

fbshipit-source-id: 1331e412f938e2f08ecb079972147f11e3ec77cd
Summary:
Pull Request resolved: pytorch#26681

att

Test Plan:
ci

Imported from OSS

Differential Revision: D17542833

fbshipit-source-id: 653e906b0e146763609c69ef0de7f9cf38621586
Summary:
Pull Request resolved: pytorch#26694

Previously we would not properly populate `errorDesc` for:
```
./torch/jit/__init__.py:13:1: F401 'torch.nn.ModuleList' imported but unused
```

because we wanted only letters and spaces. Be more permissive.
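
An illustrative sketch of a more permissive parse (the regex here is hypothetical, not the one in the repo):

```python
import re

line = "./torch/jit/__init__.py:13:1: F401 'torch.nn.ModuleList' imported but unused"

# Capture everything after the error code, including quotes and dots,
# instead of restricting errorDesc to letters and spaces.
m = re.match(r"(?P<file>.+?):(?P<line>\d+):(?P<col>\d+): (?P<code>\S+) (?P<errorDesc>.*)$", line)
print(m.group("errorDesc"))  # 'torch.nn.ModuleList' imported but unused
```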

Test Plan: Imported from OSS

Differential Revision: D17551999

Pulled By: suo

fbshipit-source-id: b82567df1fa3c9729e7427dc3461bedfb40933dc
Summary:
**Summary**:
Adds `torch::nn::Identity` module support for the C++ API.

**Issue**: pytorch#25883

**Reviewer**: yf225
Pull Request resolved: pytorch#26713

Differential Revision: D17550982

Pulled By: yf225

fbshipit-source-id: f24483846e82d5d276d77a1a0c50884f3bc05112
Summary:
Pull Request resolved: pytorch#26554

Previously, in `TCPStore`'s constructor we did not pass in a timeout to
the `connect` function, which thus used the default timeout (-1, so infinite).
But the timeout variable in `TCPStore.cpp` is configurable by the user and set to
300 seconds by default, so we should be passing this into the connect function.
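
For context, a hedged sketch of the corresponding knob on the Python side, assuming the `TCPStore` binding exposes the same timeout argument (values are illustrative):

```python
from datetime import timedelta
import torch.distributed as dist

# The 300-second default mentioned above; with this change the timeout should
# also apply to the initial connect, not just to later store operations.
store = dist.TCPStore("127.0.0.1", 29500, 1, True, timedelta(seconds=300))
```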

Test Plan: see above.

Differential Revision: D17486779

fbshipit-source-id: 42d38a3b8d492d9e9ff09110990a8e4a3a1292b2
Summary:
Pull Request resolved: pytorch#26728

Use Caffe2::mobile_threadpool() in linear and conv operators

Perf:
Without threadpool: 76 ms
With threadpool: 41 ms

Test Plan:
python test/test_quantized.py TestQNNPackOps

Imported from OSS

Differential Revision: D17553510

fbshipit-source-id: dd5b06f526f65d87727ec7e3dad0a5fa74cba9f9
Summary:
- Add support for linear and cubic interpolate in opset 11.
- Add support for 1d and 3d interpolate in nearest mode for opset 7 and 8.
- Add tests for all cases of interpolate in ORT tests (nearest/linear/cubic, 1d/2d/3d, upsample/downsample).
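
A minimal sketch of the linear-interpolate export path at opset 11 (shapes and file name are illustrative):

```python
import torch
import torch.nn.functional as F

class Upsample(torch.nn.Module):
    def forward(self, x):
        # 1d linear interpolation; cubic/nearest and 2d/3d follow the same pattern.
        return F.interpolate(x, scale_factor=2.0, mode="linear", align_corners=False)

x = torch.randn(1, 3, 16)  # (N, C, L)
torch.onnx.export(Upsample(), x, "upsample.onnx", opset_version=11)
```
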
Pull Request resolved: pytorch#24805

Reviewed By: hl475

Differential Revision: D17330801

Pulled By: houseroad

fbshipit-source-id: 1bdefff9e72f5e70c51f4721e1d7347478b7505b
Summary:
- Normalization mean and std are specified as parameters instead of being hardcoded
- imageYUV420CenterCropToFloat32Tensor previously worked only with square tensors (width == height); added generalization to support width != height with all rotations and scalings
- Added javadocs
Pull Request resolved: pytorch#26690

Differential Revision: D17556006

Pulled By: IvanKobzarev

fbshipit-source-id: 63f3321ea2e6b46ba5c34f9e92c48d116f7dc5ce
Summary: Pull Request resolved: pytorch#25592

Test Plan: Imported from OSS

Differential Revision: D17552470

Pulled By: VitalyFedyunin

fbshipit-source-id: 6c8cc4f46dd390c231b2d0aac664ad2a6ac8876e
…chNorm2d.

Test Plan: revert-hammer

Differential Revision:
D17514653

Original commit changeset: 7d9cc8f619b7

fbshipit-source-id: 2cf32082a46fe169a1db4926df78a9f3256616ad
…es of the Module.

Test Plan: revert-hammer

Differential Revision:
D17513451

Original commit changeset: cf8f9b450e71

fbshipit-source-id: 319ec9399173eb06556969dc6be365b319c1ab6c
Summary:
Pull Request resolved: pytorch#26738

Someone may use torch._export directly. Here we change onnx_export_type's default value to None;
if it's the PyTorch ONNX Caffe2 bundle, we set it to ONNX_ATEN_FALLBACK, otherwise it's ONNX.

Test Plan: ci

Reviewed By: hl475

Differential Revision: D17546452

fbshipit-source-id: 38e53926e2b101484bbbce7b58ebcd6af8c42438
Summary:
Pull Request resolved: pytorch#26587

-
ghstack-source-id: 90557226

Test Plan: unit tests

Differential Revision: D17515048

fbshipit-source-id: 3459ee80efec29080060ec29d67642d789dd8749
raghuramank100 and others added 28 commits September 28, 2019 14:05
Summary:
Pull Request resolved: pytorch#26516

ghstack-source-id: 90982010

Test Plan:
Integrate per-channel support into conv and linear modules.
The following tests pass:
buck test caffe2/test:quantized -- 'test_linear_api \(test_quantized_nn_mods\.ModuleAPITest\)' --print-passing-details

buck test caffe2/test:quantized -- 'test_conv_api \(test_quantized_nn_mods\.ModuleAPITest\)' --print-passing-details

buck test caffe2/test:quantized -- 'test_float_quant_compare_per_channel \(test_quantized_models\.ModelNumerics\)' --print-passing-details

Differential Revision: D17342622

fbshipit-source-id: f0d618928e3d9348672c589a6b7a47049c372a2e
Summary:
For TensorRT test introduced in pytorch#26426
Pull Request resolved: pytorch#26835

Reviewed By: hl475

Differential Revision: D17580108

Pulled By: houseroad

fbshipit-source-id: c57fafec228b78c26b8a7946c92ad7434425bbd4
…motion. (pytorch#27017)

Summary:
pytorch#26981
Pull Request resolved: pytorch#27017

Differential Revision: D17651454

Pulled By: ifedan

fbshipit-source-id: c6313caa11598a0ef160e1c6d2f3c33d03ce80c5
Summary: Pull Request resolved: pytorch#27008

Test Plan: Imported from OSS

Differential Revision: D17649174

Pulled By: jamesr66a

fbshipit-source-id: e3e6c4bb31e1ad8ed1ebe27f803f90d564ecfe53
Summary: Pull Request resolved: pytorch#26819

Test Plan: Imported from OSS

Differential Revision: D17627829

Pulled By: pbelevich

fbshipit-source-id: be4d803c7d4ba2c59e54d154eeebc63794465191
Summary:
Pull Request resolved: pytorch#26582

Add the blob quantization.
Replace the op in the eval/predictor net.

Test Plan:
# Unit test:

-----

buck build fblearner/flow/projects/dper/tests/validators:test_exporter_options_validators

./buck-out/gen/fblearner/flow/projects/dper/tests/validators/test_exporter_options_validators#binary.par

----

buck build caffe2/caffe2/fb/dper/layer_models/tests:exporter_test

./buck-out/gen/caffe2/caffe2/fb/dper/layer_models/tests/exporter_test-2.7#binary.par

Reviewed By: chocjy

Differential Revision: D17439720

fbshipit-source-id: 68de5d0322b0111aeca5ed552210bf80a4cddc78
Summary:
Make the cuDNN RNN respect the current stream. After this lands, the non-default test stream can be re-enabled in pytorch#26791.
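
A hedged sketch of the scenario this fixes, running a cuDNN RNN under a non-default CUDA stream (requires a CUDA build and device; sizes are illustrative):

```python
import torch

if torch.cuda.is_available():
    rnn = torch.nn.LSTM(16, 32).cuda()
    x = torch.randn(5, 3, 16, device="cuda")
    side = torch.cuda.Stream()
    with torch.cuda.stream(side):
        out, _ = rnn(x)          # cuDNN kernels should now run on `side`
    torch.cuda.current_stream().wait_stream(side)
```
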
Pull Request resolved: pytorch#27026

Test Plan: default stream functionality is tested in existing tests, stream safety tests will be added in pytorch#26791

Differential Revision: D17656967

Pulled By: ngimel

fbshipit-source-id: 8b051aedd1df089b21f666ec553a5acefffdac88
Summary:
Pull Request resolved: pytorch#26368

A boxed fallback function is a per-TensorTypeId generic function that is called if there is no operator specific function (either for the TensorTypeId, or the UndefinedTensorId fallback for that operator).

The outline of the patch:
* Add a new `boxed_fallback_table_` in the top-level ATenDispatch struct (this is NOT per operator) for storing boxed fallback functions. Boxed fallback functions provisionally have the interface `void(const char* schema, torch::jit::Stack*)`; we expect to replace the `const char* schema` with something more detailed in the near future.
* Boxing and unboxing is not supported for all arguments, as IValue doesn't cover the full set of types that are in ATen. The `supports_boxed_fallback` type trait tests if boxing is possible. The list of exclusions here was experimentally generated. One notable exclusion is that we cannot handle any mutable functions, as they return a `Tensor&` which we cannot conveniently manufacture from a Tensor on an IValue stack.
* The actual boxed fallback is handled in two phases. First, we do a (compile time) test to see if boxing/unboxing is supported. If it is, we do a runtime test to check if a fallback function is installed. If both conditions are met, we allocate a stack, push our arguments onto it, and then call the boxed fallback function. The return value is expected to be a single argument on the top of the stack; we retrieve it, unpack it, and return it to the user.

At present, there are no tests for this diff.

This diff also makes `multi_dispatch_tensor_type_set` safer by explicitly specifying that it takes all arguments as references. This prevents the function from accidentally moving in rvalue references (this never actually happened because none of the overloads of apply move out their arguments; but they could have.)

Signed-off-by: Edward Z. Yang <[email protected]>

Differential Revision: D17448555

Test Plan: Imported from OSS

Pulled By: ezyang

fbshipit-source-id: 43a5be7bdfb5c0b466d01a55380cb79e6d938044
Summary:
Pull Request resolved: pytorch#26623

Per-channel fake-quant CPU and CUDA operators,
per-channel support in the fake-quant module, and
tests for per-channel fake quant and serializability of fake-quant modules.
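
A hedged sketch of the per-channel fake-quant operator referred to above (values are illustrative; argument dtypes may vary between releases):

```python
import torch

x = torch.randn(2, 4, 8)
scale = torch.full((4,), 0.1)                   # one scale per channel on axis 1
zero_point = torch.zeros(4, dtype=torch.int32)  # one zero point per channel

# (input, scale, zero_point, axis, quant_min, quant_max)
y = torch.fake_quantize_per_channel_affine(x, scale, zero_point, 1, 0, 255)
```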

ghstack-source-id: 91008299
ghstack-source-id: 91008299

Test Plan:
buck test mode/dev caffe2/test:fake_quant  --
 Started new test run: https://our.intern.facebook.com/intern/testinfra/testrun/1970324848875929
      ✓ caffe2/test:fake_quant - test_backward_per_tensor (test_fake_quant.TestFakeQuantizePerTensor) 0.242 1/10 (passed)
      ✓ caffe2/test:fake_quant - test_numerical_consistency_per_tensor (test_fake_quant.TestFakeQuantizePerTensor) 0.204 2/10 (passed)
      ✓ caffe2/test:fake_quant - test_fq_serializable (test_fake_quant.TestFakeQuantizePerTensor) 0.174 3/10 (passed)
      ✓ caffe2/test:fake_quant - test_numerical_consistency_per_channel (test_fake_quant.TestFakeQuantizePerChannel) 0.279 4/10 (passed)
      ✓ caffe2/test:fake_quant - test_forward_per_tensor (test_fake_quant.TestFakeQuantizePerTensor) 0.241 5/10 (passed)
      ✓ caffe2/test:fake_quant - test_forward_per_channel (test_fake_quant.TestFakeQuantizePerChannel) 0.353 6/10 (passed)
      ✓ caffe2/test:fake_quant - test_fq_module (test_fake_quant.TestFakeQuantizePerTensor) 0.354 7/10 (passed)
      ✓ caffe2/test:fake_quant - test_backward_per_channel (test_fake_quant.TestFakeQuantizePerChannel) 0.334 8/10 (passed)
      ✓ caffe2/test:fake_quant - test_fq_serializable (test_fake_quant.TestFakeQuantizePerChannel) 0.168 9/10 (passed)
      ✓ caffe2/test:fake_quant - test_fq_module (test_fake_quant.TestFakeQuantizePerChannel) 0.429 10/10 (passed)
      ✓ caffe2/test:fake_quant - main 0.000 (passed)

Differential Revision: D17439406

fbshipit-source-id: 64bfff5e4f40bc2ab8af2b432c7bc33805418077
Summary:
Pull Request resolved: pytorch#26624

For QAT we need to be able to control batch norm for all modules from the top. This adds helper functions to enable/disable batch-norm freezing during training.
ghstack-source-id: 91008297

Test Plan: buck test caffe2/test:quantization -- --print-passing-details

Differential Revision: D17512199

fbshipit-source-id: f7b981e2b1966ab01c4dbb161030177274a998b6
…st (pytorch#26625)

Summary:
Pull Request resolved: pytorch#26625

ghstack-source-id: 91008296

Test Plan: buck test caffe2/test:quantized -- 'test_weight_only_activation_only_fakequant \(test_quantized_models\.ModelNumerics\)' --print-passing-details

Differential Revision: D17520342

fbshipit-source-id: 26e148d3299afcfdfb1187aff6ab80687ed8df47
Summary:
Pull Request resolved: pytorch#26627

ghstack-source-id: 91008337

Test Plan: buck test caffe2/test:quantization -- --print-passing-details

Differential Revision: D17518194

fbshipit-source-id: 1eb8a7a85dc811c4ee5228d68563abb157613ceb
Summary:
Fixes: pytorch#26038

Somewhere between v1.1 and master, `nonzero` became `abstract` and was marked as differentiable (by mistake); we need to put it into the TH section of `tools/autograd/derivatives.yaml` to fix it.
Pull Request resolved: pytorch#26980

Differential Revision: D17632276

Pulled By: VitalyFedyunin

fbshipit-source-id: d6cabcc53348af6148cea5a1bd1af2ef12547373
Summary: Pull Request resolved: pytorch#27031

Differential Revision: D17665998

Pulled By: ezyang

fbshipit-source-id: 6926e304c75ba878520627f1e829412f633b1bec
Summary:
Fixes pytorch#26998
Pull Request resolved: pytorch#27047

Differential Revision: D17666050

Pulled By: ezyang

fbshipit-source-id: f02ebd5320ae25f8949be20d0744fe3cd3e2fee9
Summary:
Pull Request resolved: pytorch#27056

Original commit changeset: 137b1faa86b4

As per discussion in D17330625, unlanding until CPU/GPU versions are split

Test Plan: waitforsandcastle

Differential Revision: D17663237

fbshipit-source-id: 93fdc6b40c6a6fc610342a74eaaa1886ddc8c519
Summary:
Pull Request resolved: pytorch#26579

Remove the `linear_prepack` call and attach a module to the
parent class that contains the packed weight and bias.
This is to support serialization of the quantized model,
since the packed weight and bias are not serializable and
we need to override the `__getstate__` and `__setstate__`
functions to be able to serialize them.
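
In plain Python terms, the pattern looks roughly like the sketch below (`pack_params` is a stand-in for the opaque prepacked object, not the real API):

```python
class PackedLinear:
    def __init__(self, weight, bias):
        self.weight, self.bias = weight, bias
        self.packed = self.pack_params(weight, bias)   # not picklable

    @staticmethod
    def pack_params(weight, bias):
        # Stand-in for a backend-specific packed representation.
        return object()

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["packed"]                             # drop the unpicklable handle
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.packed = self.pack_params(self.weight, self.bias)  # rebuild on load
```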

Test Plan:
python test/test_jit.py

Imported from OSS

Differential Revision: D17636397

fbshipit-source-id: 3b81b6faa4413e4309453fd6acec2f0be6fd2f16
Summary:
Pull Request resolved: pytorch#26761

This PR serializes autograd ops into their own namespace by turning the
serialized op name into `torch.autograd.op`. This keeps the
original code namespace rather than moving everything into the global namespace;
this will be handled more properly in the future when we deal with module
namespaces. This change also preserves BC until we have namespace handling.

Test Plan: Imported from OSS

Differential Revision: D17645438

fbshipit-source-id: 656ec6b31d4fc2252585de73117c4d40a122678e
Summary:
Pull Request resolved: pytorch#26939

Updated quant fusion patterns to work with modules with prepack
params folded into the module.

Test Plan:
python test/test_jit.py 'TestJit.test_quant_fusion'

Imported from OSS

Differential Revision: D17636398

fbshipit-source-id: 8e7917e981260b81ed6038a1c2ccf19049726395
Summary:
This PR adds the ability to use tuples directly with the IValue constructor rather than the vector<IValue> approach.
Pull Request resolved: pytorch#26668

Differential Revision: D17668653

Pulled By: bwasti

fbshipit-source-id: aff7f62fe3b502df78b28b2355cff88d88ad288c
Test Plan: revert-hammer

Differential Revision:
D17666050

Original commit changeset: f02ebd5320ae

fbshipit-source-id: 6bc8fe583e350e2b573f767af85d1287dd048d1f
…6993)

Summary:
We always build the Release version of Sleef on gcc 7.

    Sep 26 02:59:19 cd /var/lib/jenkins/cpp-build/caffe2/build/sleef/src/libm && /opt/cache/bin/cc  -DDORENAME=1 -DENABLE_ALIAS=1 -DENABLE_BUILTIN_MATH=1 -DENABLE_PUREC_SCALAR=1 -DENABLE_SYS_getrandom=1 -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DIDEEP_USE_MKL -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DSLEEF_STATIC_LIBS=1 -DTH_BLAS_MKL -D_FILE_OFFSET_BITS=64 -I/var/lib/jenkins/cpp-build/caffe2/build/aten/src -I/var/lib/jenkins/workspace/aten/src -I/var/lib/jenkins/cpp-build/caffe2/build -I/var/lib/jenkins/workspace -isystem /var/lib/jenkins/cpp-build/caffe2/build/third_party/gloo -isystem /var/lib/jenkins/workspace/cmake/../third_party/gloo -isystem /var/lib/jenkins/workspace/cmake/../third_party/googletest/googlemock/include -isystem /var/lib/jenkins/workspace/cmake/../third_party/googletest/googletest/include -isystem /var/lib/jenkins/workspace/third_party/protobuf/src -isystem /opt/python/2.7.9/include -isystem /var/lib/jenkins/workspace/third_party/gemmlowp -isystem /var/lib/jenkins/workspace/third_party/neon2sse -I/var/lib/jenkins/workspace/cmake/../third_party/benchmark/include -isystem /var/lib/jenkins/workspace/third_party -isystem /var/lib/jenkins/workspace/cmake/../third_party/eigen -isystem /var/lib/jenkins/workspace/torch/include -isystem /opt/rocm/hip/include -isystem /include -I/var/lib/jenkins/cpp-build/caffe2/build/caffe2/contrib/aten -I/var/lib/jenkins/workspace/third_party/onnx -I/var/lib/jenkins/cpp-build/caffe2/build/third_party/onnx -I/var/lib/jenkins/workspace/third_party/foxi -I/var/lib/jenkins/cpp-build/caffe2/build/third_party/foxi -isystem /var/lib/jenkins/workspace/third_party/ideep/include -I/var/lib/jenkins/workspace/third_party/NNPACK/include -I/var/lib/jenkins/workspace/third_party/NNPACK/src -I/var/lib/jenkins/workspace/third_party/cpuinfo/include -I/var/lib/jenkins/workspace/third_party/pthreadpool/include -I/var/lib/jenkins/workspace/third_party/FXdiv/include -I/var/lib/jenkins/workspace/third_party/psimd/include -I/var/lib/jenkins/workspace/third_party/FP16/include -I/var/lib/jenkins/workspace/third_party/sleef/src/common -I/var/lib/jenkins/workspace/third_party/sleef/src/arch -I/var/lib/jenkins/cpp-build/caffe2/build/sleef/src/libm/include -I/var/lib/jenkins/workspace/third_party/sleef/src/libm  -Wall -Wno-unused -Wno-attributes -Wno-unused-result -Wno-psabi -ffp-contract=off -fno-math-errno -fno-trapping-math -g -O1 -fPIC   -DCAFFE2_USE_GLOO -DHAVE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD -std=gnu99 -o CMakeFiles/sleefpurec_scalar.dir/sleefsimdsp.c.o   -c /var/lib/jenkins/workspace/third_party/sleef/src/libm/sleefsimdsp.c
    Sep 26 02:59:20 /var/lib/jenkins/workspace/third_party/sleef/src/libm/sleefsimdsp.c: In function 'gammafk':
    Sep 26 02:59:20 /var/lib/jenkins/workspace/third_party/sleef/src/libm/sleefsimdsp.c:3103:1: internal compiler error: in trunc_int_for_mode, at explow.c:55
    Sep 26 02:59:20  }
    Sep 26 02:59:20  ^
    Sep 26 02:59:20 Please submit a full bug report,
    Sep 26 02:59:20 with preprocessed source if appropriate.
    Sep 26 02:59:20 See <file:///usr/share/doc/gcc-7/README.Bugs> for instructions.
    Sep 26 02:59:20 sleef/src/libm/CMakeFiles/sleefpurec_scalar.dir/build.make:67: recipe for target 'sleef/src/libm/CMakeFiles/sleefpurec_scalar.dir/sleefsimdsp.c.o' failed
    Sep 26 02:59:20 make[2]: Leaving directory '/var/lib/jenkins/cpp-build/caffe2/build'

Also updated the Sleef submodule to include fixes that were missed in pytorch#26749

pytorch#26994 provides a potentially cleaner fix

Close pytorch#26892
Pull Request resolved: pytorch#26993

Differential Revision: D17669103

Pulled By: ezyang

fbshipit-source-id: 1b87a4a8fecc6441de3b008aee6929537768be1a
…27061)

Summary:
Pull Request resolved: pytorch#27061

Previously the cronjobs were run on master, but now the nightly builds
count as "PRs" so we must whitelist them from should_run calculation.

Signed-off-by: Edward Z. Yang <[email protected]>

Test Plan: Imported from OSS

Differential Revision: D17669066

Pulled By: ezyang

fbshipit-source-id: 3b92bf1d09aefa7ef524ea93dfa8c6f566161887
Summary:
Fixes pytorch#26333

This effectively rewrites all infix arithmetic operators *op* as:
```python
def __op__(self, other):
    try:
        return self.op(other)
    except TypeError:
        return NotImplemented
```

Where `TypeError` is raised from the argument parser when `other` is not a `Tensor`. This should be okay, so long as `TypeError` isn't raised by the function for any other reasons. I couldn't find any examples where this was an issue.
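
A small illustration of why returning `NotImplemented` matters: Python then falls back to the reflected method on the other operand instead of raising (the `Scalar` class here is purely illustrative).

```python
import torch

class Scalar:
    def __init__(self, v):
        self.v = v
    def __radd__(self, other):
        # Called because Tensor.__add__(Scalar) now returns NotImplemented.
        return Scalar(other + self.v)

out = torch.tensor(1.0) + Scalar(2.0)   # dispatches to Scalar.__radd__
```
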
Pull Request resolved: pytorch#26507

Differential Revision: D17669097

Pulled By: ezyang

fbshipit-source-id: 2c1a1087057c9298915d713d3fea7d682d014c72
…ytorch#26908)

Summary:
Pull Request resolved: pytorch#26908

This is because, in static dispatch, Variable is not supported. But
you might still "have" a Variable somehow (e.g., in JIT initialization
code), so we need to translate between the two worlds.

Attempt to fix pytorch#26764

Signed-off-by: Edward Z. Yang <[email protected]>

Test Plan: Imported from OSS

Differential Revision: D17607550

Pulled By: ezyang

fbshipit-source-id: ea2cb931215d749efd95259ad7a1a8ae96c38aed
@rohithkrn
Author

ROCm tests passed. Merging.

@rohithkrn rohithkrn merged commit 0e6c9c8 into ROCm:bf16_bringup Oct 1, 2019