[ONNX] Add checks in ONNXSetDynamicInputShape#49366

Closed
jiafatom wants to merge 17 commits into pytorch:onnx_ms_1 from jiafatom:shape_axis_ms_1

Conversation

@jiafatom
Contributor

@jiafatom jiafatom commented Dec 14, 2020

(1) This PR adds an exception check to ONNXSetDynamicInputShape.
(2) Import the torchvision CI tests into the PyTorch CI, so we can monitor test failures while submitting PRs.

This PR was originally a duplicate of the one from here.
We now need to merge towards pytorch:onnx_ms_1.

@facebook-github-bot facebook-github-bot added the cla signed and oncall: jit (add this issue/PR to JIT oncall triage queue) labels Dec 14, 2020
@facebook-github-bot
Contributor

facebook-github-bot commented Dec 14, 2020

💊 CI failures summary and remediations

As of commit 0fe3504 (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_backward_compatibility_check_test (1/1)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Dec 22 01:16:19 The PR is introducing backward incompatible changes to the operator library. Please contact PyTorch team to confirm whether this change is wanted or not.
Dec 22 01:16:19 processing existing schema:  gather(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor[] _1, Tensor _2, int _3) -> (__torch__.torch.classes.dist_c10d.Work _0)
Dec 22 01:16:19 processing existing schema:  scatter(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor _1, Tensor[] _2, int _3) -> (__torch__.torch.classes.dist_c10d.Work _0)
Dec 22 01:16:19 processing existing schema:  reduce_scatter(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor _1, Tensor[] _2) -> (__torch__.torch.classes.dist_c10d.Work _0)
Dec 22 01:16:19 processing existing schema:  alltoall_base(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor _1, Tensor _2, int[] _3, int[] _4) -> (__torch__.torch.classes.dist_c10d.Work _0)
Dec 22 01:16:19 processing existing schema:  alltoall(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor[] _1, Tensor[] _2) -> (__torch__.torch.classes.dist_c10d.Work _0)
Dec 22 01:16:19 processing existing schema:  send(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor[] _1, int _2, int _3) -> (__torch__.torch.classes.dist_c10d.Work _0)
Dec 22 01:16:19 processing existing schema:  recv(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor[] _1, int _2, int _3) -> (__torch__.torch.classes.dist_c10d.Work _0)
Dec 22 01:16:19 processing existing schema:  recv_anysource(__torch__.torch.classes.dist_c10d.ProcessGroup _0, Tensor[] _1, int _2) -> (__torch__.torch.classes.dist_c10d.Work _0)
Dec 22 01:16:19 processing existing schema:  barrier(__torch__.torch.classes.dist_c10d.ProcessGroup _0) -> (__torch__.torch.classes.dist_c10d.Work _0)
Dec 22 01:16:19 processing existing schema:  __init__(__torch__.torch.classes.dist_rpc.WorkerInfo _0, str _1, int _2) -> (None _0)
Dec 22 01:16:19 The PR is introducing backward incompatible changes to the operator library. Please contact PyTorch team to confirm whether this change is wanted or not. 
Dec 22 01:16:19 
Dec 22 01:16:19 Broken ops: [
Dec 22 01:16:19 	aten::_cudnn_rnn_backward(Tensor input, Tensor[] weight, int weight_stride0, Tensor weight_buf, Tensor hx, Tensor? cx, Tensor output, Tensor? grad_output, Tensor? grad_hy, Tensor? grad_cy, int mode, int hidden_size, int proj_size, int num_layers, bool batch_first, float dropout, bool train, bool bidirectional, int[] batch_sizes, Tensor? dropout_state, Tensor reserve, bool[4] output_mask) -> (Tensor, Tensor, Tensor, Tensor[])
Dec 22 01:16:19 	aten::_cudnn_rnn(Tensor input, Tensor[] weight, int weight_stride0, Tensor? weight_buf, Tensor hx, Tensor? cx, int mode, int hidden_size, int proj_size, int num_layers, bool batch_first, float dropout, bool train, bool bidirectional, int[] batch_sizes, Tensor? dropout_state) -> (Tensor, Tensor, Tensor, Tensor, Tensor)
Dec 22 01:16:19 	aten::_cudnn_rnn_flatten_weight(Tensor[] weight_arr, int weight_stride0, int input_size, int mode, int hidden_size, int proj_size, int num_layers, bool batch_first, bool bidirectional) -> (Tensor)
Dec 22 01:16:19 	aten::dequantize.list(Tensor[] qtensors) -> (Tensor[])
Dec 22 01:16:19 	aten::to_dense(Tensor self, int? dtype=None) -> (Tensor)
Dec 22 01:16:19 	aten::sinc_(Tensor(a!) self) -> (Tensor(a!))
Dec 22 01:16:19 	aten::tensor_split.tensor_indices_or_sections(Tensor(a) self, Tensor tensor_indices_or_sections, int dim=0) -> (Tensor[])
Dec 22 01:16:19 	aten::native_layer_norm(Tensor input, int[] normalized_shape, Tensor? weight, Tensor? bias, float eps) -> (Tensor, Tensor, Tensor)

This comment was automatically generated by Dr. CI.

@codecov

codecov bot commented Dec 16, 2020

Codecov Report

Merging #49366 (a531ec8) into onnx_ms_1 (5912316) will increase coverage by 0.01%.
The diff coverage is 96.44%.

@@              Coverage Diff              @@
##           onnx_ms_1   #49366      +/-   ##
=============================================
+ Coverage      80.63%   80.65%   +0.01%     
=============================================
  Files           1875     1875              
  Lines         202714   202966     +252     
=============================================
+ Hits          163453   163695     +242     
- Misses         39261    39271      +10     

auto shape_ref = input_tensor_type->symbolic_sizes().sizes();
TORCH_CHECK(
    shape_ref.has_value(), "Input tensor shape should have value.");
auto shape = shape_ref.value();
Contributor

Do we have tests to verify this part?

Contributor Author


No, we currently don't have a properly designed test for it. (1) All existing tests pass. (2) We will add a test once we find a proper one.
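In the meantime, the kind of guard this PR adds can be illustrated with a minimal Python sketch. Everything below is hypothetical: `set_dynamic_input_shape` and its signature are invented for illustration and are not the actual C++ implementation, which uses the `TORCH_CHECK` shown in the diff above.

```python
def set_dynamic_input_shape(input_shape, dynamic_axes):
    """Hypothetical sketch of the validation this PR adds.

    input_shape:  list of ints (static sizes), or None if unknown.
    dynamic_axes: iterable of axis indices to mark as dynamic.
    Raises ValueError (standing in for TORCH_CHECK) on bad input.
    """
    # Mirrors: TORCH_CHECK(shape_ref.has_value(), "Input tensor shape should have value.")
    if input_shape is None:
        raise ValueError("Input tensor shape should have value.")
    rank = len(input_shape)
    for axis in dynamic_axes:
        if not 0 <= axis < rank:
            raise ValueError(
                f"Dynamic axis {axis} is out of range for a rank-{rank} input.")
    # Replace the requested static sizes with None, i.e. "dynamic".
    return [None if i in set(dynamic_axes) else size
            for i, size in enumerate(input_shape)]

# Example: make the batch dimension dynamic on an NCHW image input.
print(set_dynamic_input_shape([1, 3, 224, 224], {0}))  # [None, 3, 224, 224]
```

On the user-facing side, dynamic axes are requested through the `dynamic_axes` argument of `torch.onnx.export`; the exporter then has to reconcile those requests with the traced graph's input types, which is roughly where a check like this would fire.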

smessmer and others added 6 commits December 20, 2020 21:52
Summary:
Pull Request resolved: pytorch#49013

I don't know why this works. I know, this is never a good way to start a PR description :P
I know that Generator is a dispatch relevant argument when called from an unboxed API and is ignored
for dispatch purposes when called from a boxed API. This should break something, but maybe we don't
have test cases for that.

We likely need to align the unboxed and boxed dispatch behavior before landing this.
The best solution would be to make Generator not dispatch relevant in unboxing. But that might be a bigger change.
An acceptable solution could be to make Generator dispatch relevant in boxing, but that needs perf measurements.

This PR needs further discussion.
ghstack-source-id: 118619230

(Note: this ignores all push blocking failures!)

Test Plan: waitforsandcastle

Reviewed By: bhosmer

Differential Revision: D25394998

fbshipit-source-id: f695c659ee6e3738f74cdf0af1a514ac0c30ebff
@jiafatom
Contributor Author

Due to an unexpected merge conflict, I am closing this PR; the duplicate is here.

@jiafatom jiafatom closed this Dec 23, 2020
@jiafatom jiafatom deleted the shape_axis_ms_1 branch December 23, 2020 04:54