Unify IR operator representation (stop using attributes in the JIT) #9807
Conversation
@pytorchbot retest this please

Force-pushed from f0130b0 to 9ee72f1

@pytorchbot retest this please

1 similar comment

@pytorchbot retest this please
facebook-github-bot left a comment

@apaszke has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
zdevito left a comment
This looks good. I only had minor comments. I like the approach to fixing the ONNX symbolics. It keeps them working without burdening us with expanding them to non-const inputs yet.
Review threads (all comments marked off-topic and hidden) were left on now-outdated lines of tools/autograd/gen_variable_type.py, torch/csrc/jit/autodiff.cpp, torch/csrc/jit/fusion_compiler.cpp, torch/csrc/jit/script/compiler.cpp, torch/onnx/symbolic.py, and torch/onnx/utils.py.
Force-pushed from bd1a7e1 to 38162eb
- Adapt AD to work with nodes without attributes.
- Add OperatorSet for efficient matching of nodes with sets of schemas (sketched below).
- Start using schema matching to determine differentiability and select gradient formulas.
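As a rough illustration of what the new `OperatorSet` buys: gradient selection stops dispatching on node kind plus attributes and instead matches a node's full call signature against a set of known schemas. The real implementation lives in the C++ JIT; the Python sketch below, including the `contains` API and the example schema strings, is a hypothetical stand-in, not the actual interface.

```python
DIFFERENTIABLE = [
    "aten::mul(Tensor self, Tensor other) -> Tensor",
    "aten::add(Tensor self, Tensor other, Scalar alpha) -> Tensor",
]

class OperatorSet:
    """Group schema strings by operator name so lookup is one dict probe."""
    def __init__(self, schemas):
        self._by_name = {}
        for schema in schemas:
            name = schema.split("(", 1)[0]
            self._by_name.setdefault(name, set()).add(schema)

    def contains(self, node_kind, node_schema):
        # A real matcher compares parsed argument/return types;
        # string equality stands in for that here.
        return node_schema in self._by_name.get(node_kind, set())

diff_ops = OperatorSet(DIFFERENTIABLE)
print(diff_ops.contains(
    "aten::mul", "aten::mul(Tensor self, Tensor other) -> Tensor"))  # True
print(diff_ops.contains(
    "aten::relu", "aten::relu(Tensor self) -> Tensor"))              # False
```

The point of grouping by name is that a graph pass can ask "is this node one of ours?" in constant time per node, even when the set holds many schemas.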
@pytorchbot retest this please
Review thread on torch/onnx/symbolic.py at `def _parse_arg(value, desc):` (comments marked off-topic and hidden).
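`_parse_arg` is the helper the symbolics use to unwrap scalar arguments, which after this PR arrive as constant IR inputs rather than node attributes. A minimal self-contained sketch of the pattern; the `Node`/`Value` stand-ins and the exact descriptor handling are assumptions, not the real implementation.

```python
class Node:
    """Hypothetical stand-in for a JIT IR node."""
    def __init__(self, kind, payload=None):
        self._kind = kind
        self.payload = payload  # value carried by an onnx::Constant

    def kind(self):
        return self._kind

class Value:
    """Hypothetical stand-in for a JIT IR value."""
    def __init__(self, node):
        self._node = node

    def node(self):
        return self._node

def _parse_arg(value, desc):
    """Unwrap a constant-valued input according to a one-letter
    description: 'v' = raw value, 'i' = int, 'f' = float."""
    if desc == "v":
        return value
    node = value.node()
    if node.kind() != "onnx::Constant":
        raise RuntimeError("ONNX export expected a constant input")
    if desc == "i":
        return int(node.payload)
    if desc == "f":
        return float(node.payload)
    raise ValueError("unknown descriptor: " + desc)

# e.g. recovering the integer `dim` that used to be a node attribute:
print(_parse_arg(Value(Node("onnx::Constant", 2)), "i"))  # -> 2
```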
Review thread on torch/onnx/symbolic.py at `def min(g, self, dim_or_y, keepdim=None):`, replacing the old `keepdim = kwargs.get("keepdim", False)` logic and still carrying a `# TODO: export it as ReduceMin` (comments marked off-topic and hidden).
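With attributes gone, both `aten::min` overloads reach a single symbolic, which is why the signature takes a `dim_or_y` argument: the function must check whether that input is a constant integer (a reduction dim) or a tensor (elementwise min). A sketch of the dispatch, under the assumption that constant detection and the emitted ONNX ops look roughly like this:

```python
class Graph:
    """Hypothetical recorder for emitted ONNX ops."""
    def op(self, kind, *inputs, **attrs):
        return (kind, inputs, attrs)

def _is_const_int(value):
    # Stand-in for "is this input a Constant node holding an int?"
    return isinstance(value, int)

def min_symbolic(g, self, dim_or_y, keepdim=None):
    if _is_const_int(dim_or_y):
        # Reduction overload: aten::min(self, dim, keepdim).
        return g.op("ReduceMin", self, axes_i=[dim_or_y],
                    keepdims_i=1 if keepdim else 0)
    # Elementwise overload: aten::min(self, other).
    return g.op("Min", self, dim_or_y)

g = Graph()
print(min_symbolic(g, "x", 1, keepdim=False))  # ReduceMin over dim 1
print(min_symbolic(g, "x", "y"))               # elementwise Min
```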
Review thread on torch/onnx/symbolic.py at `def pow(g, self, exponent):`, where the body begins `exponent = _maybe_get_scalar(exponent)` (comments marked off-topic and hidden).
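The `_maybe_get_scalar` call in the `pow` fragment is the idiom zdevito's review praises: unwrap the exponent when it is a constant, and simply refuse the non-constant case for now, so the symbolics keep working without being expanded to dynamic inputs yet. A self-contained sketch of that behaviour; the `Constant` type and the failure mode are assumptions.

```python
class Constant:
    """Hypothetical stand-in for a constant-producing IR value."""
    def __init__(self, value):
        self.value = value

def _maybe_get_scalar(value):
    """Return the Python scalar if the input is a constant,
    otherwise pass the graph value through untouched."""
    return value.value if isinstance(value, Constant) else value

def pow_symbolic(g, self, exponent):
    exponent = _maybe_get_scalar(exponent)
    if isinstance(exponent, (int, float)):
        return ("Pow", self, exponent)  # constant exponent: easy export
    # Dynamic exponents aren't handled yet; failing loudly keeps the
    # exporter correct for the constant cases it does support.
    raise NotImplementedError("non-constant exponent not supported yet")

print(pow_symbolic(None, "x", Constant(2.0)))  # ('Pow', 'x', 2.0)
```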
Unify IR operator representation (stop using attributes in the JIT) (pytorch#9807)

Summary:
Based on top of pytorch#9763 (first 3 commits belong to that PR). The first commits from this PR are "Stop using attributes ...".

I tried to separate the changes into fairly meaningful commits. I can't split them up into smaller PRs, because everything starts working and all tests pass only after the whole sequence, but hopefully this will make reviewing somewhat easier.

Known issues/regressions/future tasks:
- `aten::lerp` and `aten::clamp` are no longer fusable.
- `CreateAutodiffSubgraphs` needs a rewrite:
  - It is much more strict now, and will miss a lot of opportunities, especially when viewing ops are involved. Our previous approach was "ignore the assumption on shape availability in gradient formulas to determine differentiability, and hope that shape prop will be robust enough to actually deliver them before we differentiate", which obviously doesn't scale well to more complex cases. We should either work on reducing the size dependency of grad formulas (feasible e.g. for `view`/`reshape`, unfeasible for `squeeze`/`unsqueeze`), or make `CreateAutodiffSubgraphs` integrate some kind of "I could integrate this node into an AD subgraph, but will I be able to infer the shape of its input?" reasoning (kind of like a limited shape prop that doesn't infer anything and only tells if it *could* infer something).
  - It sometimes creates constant-only (or constants + one node) graphs, which is useless.
- Broken `aten::add` in auto-batching, because it gained a non-tensor input. I changed the test for pointwise operations to use `aten::mul` instead, but I needed to disable the LSTM cell test. I'm not sure how scalar constants should be implemented in this case, because I don't fully understand our format. cc: ChunliF
- Graph import does some hacks to recover the types of constants. This code should be removed once we gain the ability to export the IR along with value types.
- There's still a fair amount of dead code that can be removed. I didn't want to make this diff any bigger, and removing it is an easy task.
- The graph fuser could be improved to use signature matching (possibly via `OperatorSet`) instead of relying on node kinds.
- Manual constant propagation for the `ListConstruct` node in `torch/onnx/utils.py` should be replaced with a proper constant propagation pass (or we should ensure that the one we have handles at least this case before we remove this code).

cc zdevito

Pull Request resolved: pytorch#9807
Reviewed By: ezyang
Differential Revision: D9004285
Pulled By: apaszke
fbshipit-source-id: fe88026a765f6b687354add034c86402362508b7
Based on top of #9763 (first 3 commits belong to that PR). The first commits from this PR are "Stop using attributes ..."
I tried to separate the changes into fairly meaningful commits. I can't split them up into smaller PRs, because everything starts working and all tests pass only after the whole sequence, but hopefully this will make reviewing somewhat easier.
Known issues/regressions/future tasks:
- `aten::lerp` and `aten::clamp` are no longer fusable.
- `CreateAutodiffSubgraphs` needs a rewrite:
  - It is much more strict now, and will miss a lot of opportunities, especially when viewing ops are involved. Our previous approach was "ignore the assumption on shape availability in gradient formulas to determine differentiability, and hope that shape prop will be robust enough to actually deliver them before we differentiate", which obviously doesn't scale well to more complex cases. We should either work on reducing the size dependency of grad formulas (feasible e.g. for `view`/`reshape`, unfeasible for `squeeze`/`unsqueeze`), or make `CreateAutodiffSubgraphs` integrate some kind of "I could integrate this node into an AD subgraph, but will I be able to infer the shape of its input?" reasoning (kind of like a limited shape prop that doesn't infer anything and only tells if it *could* infer something).
  - It sometimes creates constant-only (or constants + one node) graphs, which is useless.
- Broken `aten::add` in auto-batching, because it gained a non-tensor input. I changed the test for pointwise operations to use `aten::mul` instead, but I needed to disable the LSTM cell test. I'm not sure how scalar constants should be implemented in this case, because I don't fully understand our format. cc: @ChunliF
- Graph import does some hacks to recover the types of constants. This code should be removed once we gain the ability to export the IR along with value types.
- There's still a fair amount of dead code that can be removed. I didn't want to make this diff any bigger, and removing it is an easy task.
- The graph fuser could be improved to use signature matching (possibly via `OperatorSet`) instead of relying on node kinds.
- Manual constant propagation for the `ListConstruct` node in `torch/onnx/utils.py` should be replaced with a proper constant propagation pass, or we should ensure that the one we have handles at least this case before we remove this code; a toy sketch of this folding follows below.

cc @zdevito
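For that last bullet: the manual `ListConstruct` handling amounts to one special case of constant propagation, folding a list built entirely from constants into a single constant. A toy sketch of what that case does; the IR classes here are stand-ins, not the real JIT types.

```python
class Const:
    """Hypothetical toy IR: a constant node."""
    def __init__(self, value):
        self.kind = "prim::Constant"
        self.value = value

class ListConstruct:
    """Hypothetical toy IR: a node assembling a list from its inputs."""
    def __init__(self, inputs):
        self.kind = "prim::ListConstruct"
        self.inputs = inputs

def fold_list_construct(node):
    """If every input is a constant, collapse the node into a single
    constant holding the assembled list; otherwise leave it alone."""
    if (node.kind == "prim::ListConstruct"
            and all(i.kind == "prim::Constant" for i in node.inputs)):
        return Const([i.value for i in node.inputs])
    return node

folded = fold_list_construct(ListConstruct([Const(2), Const(3)]))
print(folded.kind, folded.value)  # prim::Constant [2, 3]
```

A general constant propagation pass would subsume this by evaluating any node whose inputs are all constants, which is why the description calls the manual version a stopgap.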