[Onnx] Add constant folding to ONNX graph during export (Resubmission) #18698
Conversation
I see a failing test in one of the runs, but it does not seem related to this change. Also, I cannot repro it locally. @suo - Could you please help take a quick look?
houseroad
left a comment
briefly scanned the pr. This is really useful. It makes the onnx graph much simpler (and maybe much faster). Thanks!
Will do a thorough pass tomorrow or weekend.
I am wondering if we enable the flag, how many tests will pass?
@spandantiwari do you want to give it a try?
@houseroad, maybe as the next PR. :-)
i mean locally first, i am wondering how stable it is :-)
new line character
Fixed.
btw, the failing test should be unrelated, i have restarted it
Looks like a great start! I have only reviewed the constant folding pass so far (not yet at the runTorchBackendForOnnx() or testing stuff).
The pass looks generally correct, so my comments are mostly about making things more robust/idiomatic with the rest of the JIT code. I am going to put some general themes here and refer to them inline.
- On uniqueName(): we should avoid using uniqueName() to identify a value. This is fragile because if you ever call setUniqueName("foo"), any value that previously had the name "foo" will get renamed to something else (like "foo.1"). So values can get arbitrarily renamed by graph transformations, making the name an unreliable way to reference any given value. We typically use the direct Value* to reference a value; it's guaranteed to be a unique and stable identifier.
- On paramDict: This uses uniqueName(), which is generally an anti-pattern. However, because the onnx export pass manually assigns names to the parameter values, we can be sure that at the beginning of the pass at least, the names refer to the right values in the graph. So in keeping with using Value* to reference values (as mentioned in [1]), I think the first thing the pass should do is map each input Value* to its corresponding (name, tensor), and use that mapping as a source of truth. So isParam becomes isParam(Value* v), which just checks whether v is in the std::unordered_map<Value*, std::pair<std::string, at::Tensor>> (feel free to make that data structure whatever you prefer). At the end of the pass, you can use that same mapping to fix up paramDict to erase the folded parameters.
- It seems like the fact that constants could be either onnx::Constant nodes or parameter values is adding a lot of complexity to the pass. We're doing a lot of tracking for which input is which constant kind. If you had the data structure I mentioned in [2], isParam() is an easy map lookup, so you don't have to "save" that info when checking node inputs for constants, which eliminates the need for kindOfLeafNode and its cousin, ConstantLeafNodeKind (see the sketch below).
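To make [2] concrete, here is a minimal sketch of that mapping, assuming the JIT C++ API of this era (Value::uniqueName() was later renamed to debugName()); the helper names and the exact type of paramsDict are illustrative, not the PR's actual code:

```cpp
#include <ATen/ATen.h>
#include <torch/csrc/jit/ir/ir.h> // header path differs across PyTorch versions

#include <map>
#include <string>
#include <unordered_map>
#include <utility>

using torch::jit::Block;
using torch::jit::Value;

// Map each graph input (parameter) Value* to its (name, tensor) pair.
using ParamMap =
    std::unordered_map<Value*, std::pair<std::string, at::Tensor>>;

// Build the map once, at the very start of the pass, while the names assigned
// by the ONNX exporter still match the entries in paramsDict.
ParamMap buildValueToParamMap(
    Block* b,
    const std::map<std::string, at::Tensor>& paramsDict) {
  ParamMap valueToParams;
  for (Value* input : b->inputs()) {
    auto it = paramsDict.find(input->uniqueName());
    if (it != paramsDict.end()) {
      valueToParams.emplace(input, *it);
    }
  }
  return valueToParams;
}

// With the map in hand, "is this value a parameter?" becomes a plain lookup.
bool isParam(const ParamMap& valueToParams, Value* v) {
  return valueToParams.count(v) != 0;
}
```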
I think Block::param_node() is what you want here. Confusingly, it is called param_node, which overloads the word "parameter" in this pass 😢
That's useful. Thanks for pointing to it.
as mentioned below, blocks have direct accessors for their param nodes: b.param_node()
Fixed.
you could AT_ASSERT this instead of returning; it should be an invariant that every block has a param node.
Done.
this does not appear to be used anywhere. In general, we discourage converting a Symbol (basically an integer) to its display string unless for printing purposes.
Removed.
I think it's clearer to do inputNode->kindOf(attr::value) == AttributeKind::t
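For instance, a small helper along these lines (the name is hypothetical) makes that intent explicit:

```cpp
#include <torch/csrc/jit/ir/ir.h> // header path differs across PyTorch versions

using torch::jit::AttributeKind;
using torch::jit::Node;

// True if the node stores a tensor in its `value` attribute, i.e. it is an
// onnx::Constant whose payload we can fold.
bool hasTensorValueAttr(const Node* inputNode) {
  return inputNode->hasAttribute(c10::attr::value) &&
      inputNode->kindOf(c10::attr::value) == AttributeKind::t;
}
```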
Yes. Fixed.
I think instead of adding more outputs to the block's source node (which will add inputs to the graph), you should just insert a new onnx::Constant node. Unless you want to expose them as parameters later instead of embedding them in the graph?
As we discussed, the use of initializers is preferred from the ONNX point of view (to distinguish between learnable parameters and real constants), hence keeping it.
yeah this is fine. Or you can run an EliminateDeadCode() pass after this and it should clean up the nodes for you
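A sketch of that option (the wrapper name is hypothetical; EliminateDeadCode is the existing JIT pass being referred to):

```cpp
#include <torch/csrc/jit/ir/ir.h>                        // header paths differ
#include <torch/csrc/jit/passes/dead_code_elimination.h> // across PyTorch versions

// Once folded values have replaced all uses of the original outputs, the
// now-dead producer nodes can be swept up by the standard DCE pass instead
// of being erased by hand inside the folding pass.
void cleanUpAfterFolding(torch::jit::Block* b) {
  torch::jit::EliminateDeadCode(b);
}
```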
We are using names/indices a lot here when it's more idiomatic to use Value* directly. (see [1])
Fixed.
If you have the data structure described in [2], this check would just be:
    for input in node:
        if isParam(input):
            toRemove.push_back(input)
Agreed. Fixed.
Similarly, if you had the data structure described in [2], this would be:
    for v in toRemove:
        if !v->hasUses():
            name = data_structure[v].first
            paramsDict.erase(name)
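Put together in C++, and reusing the ParamMap type from the earlier sketch, that cleanup might look roughly like this (names are illustrative, not the PR's code):

```cpp
// Drop folded parameters that no longer have any uses, erasing them from
// paramsDict via the name recorded in valueToParams.
void eraseUnusedParams(
    const ParamMap& valueToParams,
    const std::vector<torch::jit::Value*>& toRemove,
    std::map<std::string, at::Tensor>& paramsDict) {
  for (torch::jit::Value* v : toRemove) {
    if (!v->hasUses()) {
      paramsDict.erase(valueToParams.at(v).first);
    }
  }
}
```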
Done.
b2af37f to 31d73d7
@suo - thanks for your thoughtful review. I have revised the PR extensively based on your feedback. Could you please take a look and see if it looks good?
31d73d7 to 6501c49
@houseroad, @suo - a lot of the failures, maybe most of them, are unrelated.
Doing another run.
@spandantiwari CircleCI is down now, let's wait a bit and restart all the tests.
@houseroad - OK. Thanks for letting me know.
suo
left a comment
Looks pretty good from my side! Not sure about the onnx->aten implementations, so I've asked @houseroad to eyeball those.
These changes look good! My one remaining nit is that we shouldn't operate on an entity called sourceNode; operating on the block inputs directly expresses your intent more clearly (as a reminder, block->inputs() is equivalent to param_node()->outputs()).
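For illustration, the suggested phrasing in code (the function name is hypothetical):

```cpp
#include <torch/csrc/jit/ir/ir.h> // header path differs across PyTorch versions

using torch::jit::Block;
using torch::jit::Value;

// b->inputs() and b->param_node()->outputs() refer to the same values, so
// iterating the block inputs avoids naming the param node at all.
void visitBlockInputs(Block* b) {
  for (Value* input : b->inputs()) {
    // ... decide whether `input` is a parameter that can be folded away ...
  }
}
```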
Good point. Updated.
@houseroad, can you take a look at these cases? I'm not familiar with the onnx->aten semantics
Yes, that will be great. Just fyi - I have tested this part on multiple large production models, including BERT, to make sure that these computations are correct, and the numbers match exactly.
Updated based on @houseroad 's feedback on this section.
I think it would be clearer for you to return a c10::optional<at::Tensor> from this function, and return c10::nullopt in the case that the node is not a supported op for constant folding.
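A sketch of what that signature change looks like for runTorchBackendForOnnx (the Transpose handling shown is illustrative, not the PR's actual implementation):

```cpp
#include <ATen/ATen.h>
#include <c10/util/Optional.h>
#include <torch/csrc/jit/ir/ir.h> // header path differs across PyTorch versions

#include <vector>

// Return the folded tensor for a supported op, or c10::nullopt so the caller
// simply skips folding that node.
c10::optional<at::Tensor> runTorchBackendForOnnx(
    const torch::jit::Node* node,
    const std::vector<at::Tensor>& inputTensorValues) {
  if (node->kind() == c10::Symbol::fromQualString("onnx::Transpose")) {
    // Illustrative handling of one op; skip (rather than throw) if the
    // expected attribute is missing.
    auto permAttr = c10::Symbol::attr("perm");
    if (!node->hasAttribute(permAttr)) {
      return c10::nullopt;
    }
    return inputTensorValues[0].permute(node->is(permAttr));
  }
  // ... Slice, Concat, Unsqueeze, ... would follow the same pattern ...
  return c10::nullopt; // not a supported op: caller leaves the node alone
}
```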
Done.
same as above; returning c10::nullopt instead of an empty tensor will express your intent more clearly.
Done.
facebook-github-bot
left a comment
@houseroad has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Import to check whether it breaks anything internal :-)
houseroad
left a comment
Great work. This is very helpful. For testing, we should create test cases to cover 1) Slice, 2) Concat, 3) Transpose, 4) Unsqueeze, 5) No optimization.
It can be added here: https://github.com/pytorch/pytorch/blob/master/test/onnx/test_utility_funs.py
We can just run the optimization pass and iterate over the graph to check the change after the optimization. Here are some examples of how we check that the optimizer works in ONNX: https://github.com/onnx/onnx/blob/master/onnx/test/optimizer_test.py#L137
test/onnx/verify.py
Outdated
can we just call _jit_flatten unconditionally?
I tried all the tests in test_pytorch_onnx_caffe2.py and nothing seems to break when we call _jit_flatten unconditionally. Conditional removed.
torch/onnx/utils.py
Outdated
Document the argument do_constant_folding like the other arguments?
Done.
Can we change run_model_test's interface as well? Both run_actual_test and run_debug_test will need to add a do_constant_folding argument.
Yes, sure. Updated.
why do we need this?
We don't. Removed.
what's this for?
Vestige from an older version. Removed.
Usually, the optimization pass should not directly throw an exception. And axes are optional... we should handle such cases. Probably just skip the node.
OK. Removed the exception, just skipping node instead.
check that axis exists before running?
Done.
torch/onnx/utils.py
Outdated
We can also check whether the opset is supported by our constant folding. Op spec may change, and the optimization may not be applicable any more.
Yes, good point. Now checking for opset version (opset 9).
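Whether the check lives in torch/onnx/utils.py or at the entry of the C++ pass, the idea is a simple guard; a sketch in C++ (the helper name is hypothetical):

```cpp
// Constant folding here has only been validated against ONNX opset 9;
// for any other opset the exporter should skip the pass entirely.
bool isConstantFoldingSupported(int opset_version) {
  constexpr int kSupportedOpsetVersion = 9;
  return opset_version == kSupportedOpsetVersion;
}
```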
can we extract this as a helper function?
OK. Done.
We can put everything into an anonymous namespace except ConstantFoldONNX, so they don't need to be static, I think.
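A sketch of that file layout (the helper and the simplified signature are illustrative; only ConstantFoldONNX is named in the review):

```cpp
#include <torch/csrc/jit/ir/ir.h> // header path differs across PyTorch versions

namespace torch {
namespace jit {

namespace {
// Helpers in the anonymous namespace get internal linkage automatically,
// so marking them `static` is redundant.
bool isOnnxConstant(const Node* node) {
  return node->kind() == c10::Symbol::fromQualString("onnx::Constant");
}
} // namespace

// Only the entry point is visible outside this translation unit.
// (The real signature also takes the params map and opset version.)
void ConstantFoldONNX(Block* b) {
  for (Node* node : b->nodes()) {
    if (isOnnxConstant(node)) {
      // ... attempt to fold the consumers of this constant ...
    }
  }
}

} // namespace jit
} // namespace torch
```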
I think both are viable and have the same effect. But your suggestion is cleaner. Updated with anonymous namespace.
fabd6d3 to fed8e27
@houseroad and @suo - Thanks for your feedback. I have updated the PR. Could you please review and see if this is good to merge?
@houseroad - As per your suggestion, I have added tests in I could not add test for individual ops because if we create a minimal model with a given op, then the Between the two test points added to
facebook-github-bot
left a comment
@houseroad has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
7514985 to 4954cb0
@houseroad - as per your suggestion, I have now added unit tests for each individual node type supported for constant folding ( @houseroad and @suo - could you please review and see if this is ready for merging. Thanks!
facebook-github-bot
left a comment
@houseroad has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
The changes look good to me overall, we just need to figure out why
Wasn't able to repro the test failure on my side. But based on a hunch, I updated the PR with a fix related to
@houseroad, @suo - Fingers crossed, but the failing platform from previous runs is passing now. The issue seems to have been in setting the opset version using I will wait for all the runs to finish, and if all green I will update the PR to clean up the debugging instrumentation I added in the last commit.
@spandantiwari the tests are passing now :-)
@houseroad - Yes. And I have updated the PR to clean up any debug-related messages too. |
lint please
facebook-github-bot
left a comment
@houseroad has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
houseroad
left a comment
Looks good to me.
@suo, friendly ping for another review :-)
This looks good to merge
@houseroad merged this pull request in a64cce3.
Summary: Rewritten version of pytorch/pytorch#17771 using graph C++ APIs. This PR adds the ability to do constant folding on ONNX graphs during PT->ONNX export. This is done mainly to optimize the graph and make it leaner. The two attached snapshots show a multiple-node LSTM model before and after constant folding.
A couple of notes:
1. Constant folding is by default turned off for now. The goal is to turn it on by default once we have validated it through all the tests.
2. Support for folding in nested blocks is not in place, but will be added in the future, if needed.
**Original Model:** 
**Constant-folded model:** 
Pull Request resolved: pytorch/pytorch#18698
Differential Revision: D14889768
Pulled By: houseroad
fbshipit-source-id: b6616b1011de9668f7c4317c880cb8ad4c7b631a
@houseroad @suo - thank you for your review and all the work on this. Much appreciated.
Summary: This is a follow up PR of pytorch#18698 to lint the code using clang-format.
Pull Request resolved: pytorch#19398
Differential Revision: D14994517
Pulled By: houseroad
fbshipit-source-id: 2ae9f93e66ce66892a1edc9543ea03932cd82bee
assert node.kind() != "onnx::Concat"
assert node.kind() != "onnx::Cast"
assert node.kind() != "onnx::Constant"
assert len(list(graph.nodes())) == 1
@spandantiwari, this is a very late comment, but I don't understand what this test is doing. To me, it looks like you are tracing a program that returns a single constant, and what I expect to happen here is to get a graph with a single node that returns a constant. But what I get instead is:
graph(%8 : Float(1, 3),
      %9 : Float(2, 3)):
  %6 : Float(2, 3) = onnx::Add(%8, %9), scope: ConcatModule # test/onnx/test_utility_funs.py:143:0
  return (%6)
This is extremely unexpected! What happened to the constant folding?
After filing #28834 I think I understand why the output is the way it is. But I still think the way this test is written, versus the way the optimization pass is written, is very strange. According to the optimization pass, you never move just Constant nodes to the initialization graph. But then you assert that the optimized output should never have a Constant node in it. Which one is it? In the current master implementation, this "happens" to work because every constant node always has a cast immediately after it. But they really shouldn't!
@ezyang - I would love to be able to improve this test point (or constant folding for that matter), but I am not sure I fully understand the question.
First, as I asked in #28834 also (I saw that issue before I saw your comment here), I am not sure I understand what is meant by initialization graph.
The goal of this test is simple. I have a graph where I have two constants feeding into a Concat node, followed by an Add op. Regarding the Cast op, I recall that they were not intentionally part of this test, but appeared in the graph when we set _disable_torch_constant_prop=True. However, since Cast is supported in constant folding they get folded too. Anyway, this test was supposed to test that the Concat node is constant folded (together with Cast op). The new folded constant after collapsing the onnx::Constant and the onnx::Concat node shows up as onnx initializer and not onnx::Constant in the final ONNX graph. This is why we check that there are no onnx::Constant, cast, or concat nodes.
As an aside - looking back at this I feel this test would be better if I use the input x to the forward() method in the model. I think it will make things clearer. I will try to update the test.
what I expect to happen here is to get a graph with a single node that returns a constant.
Forgot to mention - we do not support Add in constant folding yet, but if we did, you are right that the Add op should have been folded too. You would be left with a single initializer in the final ONNX graph.
Let's move this discussion to the issue.
Rewritten version of #17771 using graph C++ APIs.
This PR adds the ability to do constant folding on ONNX graphs during PT->ONNX export. This is done mainly to optimize the graph and make it leaner. The two attached snapshots show a multiple-node LSTM model before and after constant folding.
A couple of notes:
1. Constant folding is by default turned off for now. The goal is to turn it on by default once we have validated it through all the tests.
2. Support for folding in nested blocks is not in place, but will be added in the future, if needed.
Original Model:

Constant-folded model:
