Conversation

@pianpwk (Contributor) commented Aug 15, 2024

Starter version of automatic dynamic shapes for export.

Creates enums `DIM.AUTO` and `DIM.STATIC`, allowing the user to specify `AUTO` for dims in `dynamic_shapes` specs, meaning that the corresponding dims are treated as dynamic, and relevant guards will do what's necessary (e.g. refine ValueRanges, set replacements based on equality, or even set static) without raising ConstraintViolationErrors. Basically this allows the user to say, "a bunch of these dims can be dynamic; let export do model analysis and return the program with maximum possible dynamism, without complaining".

The usage for specifying `dynamic_shapes` is now:

```
AUTO -> dynamic by default; return whatever produce_guards() says, even if it's static
None/int/STATIC -> static
Dim/DerivedDim -> same as before; will complain if the min/max range is invalid, or if dims related to this are unspecified
```
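For example, a minimal usage sketch is below. The import path for `DIM` here is an assumption based on this PR's examples (the hints were later folded into `torch.export.Dim` by #134484):

```
import torch
from torch.export import export
from torch.export.dynamic_shapes import DIM  # assumed import path for this PR's enum

class M(torch.nn.Module):
    def forward(self, x):
        return x * 2

# AUTO on dim 0: export does model analysis and returns the most dynamic
# program it can, without raising ConstraintViolationErrors; STATIC pins dim 1.
ep = export(M(), (torch.randn(4, 8),), dynamic_shapes={"x": (DIM.AUTO, DIM.STATIC)})
print(ep)
```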

Caveat 1: specifying AUTO for a dim won't guarantee it'll be dynamic:

  • specifying AUTO for a dim will return the maximum possible dynamism given your program and other specified constraints, but this can still mean you'll get a static program. For example, with the program below, x is specified dynamic, but it's equal to y, which is specified static, and with how we currently do things we won't promote y to dynamic, but will demote(?) x to static. So this can be surprising if you don't fully know your model, and/or missed one of your other inputs when specifying auto-dynamic shapes.
```
class Foo(torch.nn.Module):
    def forward(self, x, y):
        return x + y

inputs = (torch.randn(6), torch.randn(6))
export(Foo(), inputs, dynamic_shapes={"x": (DIM.AUTO,), "y": None})
```

Caveat 2: specifying AUTO and Dims in the same spec is still problematic:

  • The way Dims/DerivedDims are currently handled is very strict. A Dim represents a symbol, and we require a user to specify the symbol for all dims governed by the symbol - that's why we've seen errors in the past like The values of x must always be related to y by ..., asking the user to specify the exact relation as in the program. We also require the specified min/max range to be a subset of the valid range from model analysis. All this doesn't compose well with specifying AUTO just yet - for example in the program below, ideal behavior could be to return a dynamic program, where dx = x.size(0) = y.size(0) has range (3,6). Unfortunately this crashes, and correct behavior is to specify dx for both inputs. So currently we raise a UserError and crash if both Dims + AUTO are present in the spec.
```
class Foo(torch.nn.Module):
    def forward(self, x, y):
        return x + y

inputs = (torch.randn(6), torch.randn(6))
export(Foo(), inputs, dynamic_shapes={"x": (DIM.AUTO,), "y": {0: Dim("dx", min=3, max=6)}})  # this doesn't work, because x & y are related
```

Implementation details:

This is done by setting `assume_static_by_default=False` and doing a transform on the `dynamic_shapes` spec to preserve semantics. `assume_static_by_default=False` will treat unspecified dims or Nones as dynamic. This is the opposite of what `export.export()` currently does - unspecified Dims/Nones are treated as static. Historically this static-by-default behavior, where the user deals with fewer guards, has been desirable, and we would like to respect that in this implementation. So an internal spec transformation, `_transform_shapes_for_default_dynamic()`, is added to do the spec conversion necessary to be compatible with dynamic-by-default. Specifically, AUTOs are converted into Nones, and Nones/unspecified dims are filled in with explicitly static constraints.

For example, for a 3-d tensor whose dim 1 has traced size 32, this would look like: `{0: DIM.AUTO, 1: None, 2: Dim("dx")} -> {0: None, 1: 32, 2: Dim("dx")}`
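A rough sketch of this inversion, for a single tensor, is below. This is illustrative only - the real `_transform_shapes_for_default_dynamic()` also walks nested pytree specs and handles error cases, and the `DIM`/`_Dim` import locations are assumptions:

```
import torch
from torch.export.dynamic_shapes import DIM, _Dim  # assumed locations (PR-era API)

def transform_for_default_dynamic(tensor, spec):
    # Invert the spec for assume_static_by_default=False:
    # AUTO -> None (dynamic by default); None/int/STATIC -> concrete traced size.
    out = {}
    for d in range(tensor.dim()):
        v = spec.get(d) if spec else None
        if v is DIM.AUTO:
            out[d] = None                # unspecified under dynamic-by-default -> dynamic
        elif isinstance(v, _Dim):
            out[d] = v                   # Dim/DerivedDim pass through unchanged
        else:                            # None, an int, or DIM.STATIC
            out[d] = tensor.size(d)      # pin to the traced size -> explicitly static
    return out
```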

This does seem overly complicated, but it's done to preserve dynamic-shapes semantics for `torch._dynamo.export()`, which already uses `assume_static_by_default=False` and follows the same process for generating shape constraints, via `_process_dynamic_shapes`. There the semantics are:

```
None/unspecified -> dynamic by default
Dim/DerivedDim -> also a strict assertion
```

If we don't care about BC for `_dynamo.export(dynamic_shapes)`, then we can just modify semantics for `_process_dynamic_shapes()` and change all the relevant tests in `test/dynamo/test_export.py`.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames @rec

@pytorch-bot (bot) commented Aug 15, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/133620

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 8dce93b with merge base f8fbfe5:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D61352432

@facebook-github-bot (Contributor)

@pianpwk has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

pianpwk added a commit that referenced this pull request Aug 15, 2024
Summary:
cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames

Pull Request resolved: #133620

Differential Revision: D61352432

Pulled By: pianpwk
@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D61352432

@facebook-github-bot (Contributor)

@pianpwk has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

```
@testing.expectedFailureSerDer  # We don't preserve metadata on graph module
@testing.expectedFailureNonStrict
@testing.expectedFailureTrainingIRToRunDecompNonStrict
def test_retrace_graph_level_meta_preservation(self):
```
@pianpwk (Contributor, Author):

It seems like now we just consistently store static constraints for unspecified dims, like `{"t_id": ..., "dim": ..., "min": 4, "max": 4}`, so this doesn't break anymore. Any thoughts on this?

Contributor:

Oof, why do we even have this metadata lol? Please kill it, I don't want to keep this format for too much longer. grep shows no users (aside from tests), and only Dynamo populates it.

```
ep = export(m, ())
self.assertEqual(ep.graph_signature.lifted_tensor_constants, ["x"])

@unittest.expectedFailure
```
@pianpwk (Contributor, Author):

This PR seems to expose an already-existing problem between explicitly static constraints + dataclasses (or equality constraints)? Marking this as a failure to open the rest of the PR for review, but I'll work on this and resolve it before this PR is merged in.

Contributor:

OK, as long as you get to it. I'd guess info on unused inputs could easily be perturbed.

@pianpwk (Contributor, Author):

About to add a fix in the next commit.

```
if node.op == "call_function":
    self.assertTrue(False)

def test_automatic_dynamic_shapes_simple_equality(self):
```
@pianpwk (Contributor, Author):

these 3 tests explain a lot of the behavior for this API
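A hypothetical sketch of what the first (equality) case exercises - this is not the actual test body from the PR, just an illustration assuming `export` and the PR-era `DIM` enum are in scope:

```
def test_automatic_dynamic_shapes_simple_equality(self):
    # case 1: direct equality s0 == s1; AUTO on both dims should yield one
    # shared symbol instead of a ConstraintViolationError
    class Foo(torch.nn.Module):
        def forward(self, x, y):
            return x + y

    inputs = (torch.randn(6), torch.randn(6))
    ep = export(Foo(), inputs, dynamic_shapes=((DIM.AUTO,), (DIM.AUTO,)))
    ep.module()(torch.randn(8), torch.randn(8))  # both sizes move together
```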

```
)


def _transform_shapes_for_default_dynamic(
```
@pianpwk (Contributor, Author):

This entire function is unnecessary if we could ignore BC and completely change _dynamo.export() behavior, but this seems safer to start with.

Contributor:

I think it is better to make the full spec explicit early in the export pathway anyway, rather than carry the meanings of defaults and interpret them everywhere. Makes it easier to adapt to automated use cases where we can just generate complete specs from side info (in addition to continuing to support user-defined specs, which have some pressure to be minimal).

@pianpwk pianpwk requested a review from SherlockNoMad August 16, 2024 18:30
@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D61352432

pianpwk added a commit that referenced this pull request Aug 16, 2024
Summary:
Starter version of automatic dynamic shapes for export. Allows user to specify `True` in dynamic_shapes specs, meaning that corresponding dims are treated as dynamic, and relevant guards will do what's necessary (e.g. refine ValueRanges, set replacements based on equality, or even set static) without raising ConstraintViolationErrors. Basically allows the user to say, "a bunch of these dims can be dynamic, let export do model analysis and return the program with maximum possible dynamism, without complaining".

The usage for specifying `dynamic_shapes` is now:
```
True -> dynamic by default, return whatever produce_guards() says, even if it's static
None -> static
Dim/DerivedDim -> same as before - will complain if the min/max range is invalid, or if dims related to this are unspecified.
```
Similarly to `None`, specifying `dynamic_shapes=True` for the entire program, or `{"x": True}` for an entire tensor, is possible.

Caveat 1: specifying `True` for a dim won't guarantee it'll be dynamic:

- specifying `True` will return the maximum possible dynamism given your program and other specified constraints, but this can still mean you'll get a static program. For example, with the program below, x is specified dynamic, but it's equal to y, which is specified static, and with how we currently do things we won't promote y to dynamic, but will demote(?) x to static. So this can be surprising if you don't fully know your model, and/or missed one of your other inputs when specifying auto-dynamic shapes.
```
class Foo(torch.nn.Module):
    def forward(self, x, y):
        return x + y
inputs = (torch.randn(6), torch.randn(6))
export(Foo(), inputs, dynamic_shapes={"x": True, "y": None})
```

Caveat 2: specifying `True` and Dims in the same spec is still problematic:

- The way Dims/DerivedDims are currently handled is very strict. A Dim represents a symbol, and we require a user to specify the symbol for all dims governed by the symbol - that's why we've seen errors in the past like `The values of x must always be related to y by ...`, asking the user to specify the exact relation as in the program. We also require the specified min/max range to be a subset of the valid range from model analysis. All this doesn't compose well with specifying `True` just yet - for example in the program below, ideal behavior could be to return a dynamic program, where `dx = x.size(0) = y.size(0)` has range (3,6). Unfortunately this crashes, and correct behavior is to specify `dx` for both inputs. So currently we raise a UserError and crash if both Dims + `True` are present in the spec.
```
class Foo(torch.nn.Module):
    def forward(self, x, y):
        return x + y
inputs = (torch.randn(6), torch.randn(6))
export(Foo(), inputs, dynamic_shapes={"x": True, "y": {0: Dim("dx", min=3, max=6)}})  # this doesn't work, because x & y and related
```

Implementation details:

This is done by setting `assume_static_by_default=False` and doing a transform on the `dynamic_shapes` spec to preserve semantics. `assume_static_by_default=False` will treat unspecified dims or Nones as dynamic. This is the opposite of what `export.export()` currently does - unspecified Dims/Nones are treated as static. Historically this static-by-default behavior, where the user deals with fewer guards, has been desirable, and we would like to respect that in this implementation. So an internal spec transformation, `_transform_shapes_for_default_dynamic()`, is added to do the spec conversion necessary to be compatible with dynamic-by-default. Specifically, Trues are converted into Nones, and Nones/unspecified dims are filled in with explicitly static constraints.

For example, this would look like, for a 3-d tensor: `{0: True, 1: None, 2: Dim("dx")} -> {0: None, 1: 32, 2: Dim("dx")}`

This does seem overly complicated, but it's done to preserve dynamic shapes semantics for `torch._dynamo.export()`, which already uses `assume_static_by_default=False` and follows the same process for generating shape constraints, via `_process_dynamic_shapes`. There the semantics are:
```
None/unspecified: dynamic by default
Dim/DerivedDim: also a strict assertion
```

If we don't care about BC for `_dynamo.export(dynamic_shapes)`, then we can just modify semantics for `_process_dynamic_shapes()` and change all the relevant tests in `test/dynamo/test_export.py`.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames

Pull Request resolved: #133620

Differential Revision: D61352432

Pulled By: pianpwk
@avikchaudhuri (Contributor) left a comment:

Haven't looked at the code yet, but just want to record some high-level comments so far:

  • Great summary!
  • I don't like True, and I hate None. For True, I instead want something like torch.export.Dim.AUTO (ideally an enum we can grow). For None, you can do this in a follow-up, but I'd really like a distinction between the two roles it plays: something like torch.export.Dim.STATIC in the future, versus not bothering to specify a subtree (which can remain None and would align with not specifying dynamic shapes at all, implicitly None).
  • For interactions of True with static, say in x + y, I think we should move to a design where the exported program records the dynamic shape spec explicitly, and any specialization would just show up there for those who care. Warnings are a bonus.
  • For interactions of True with Dim, likewise we should aspire to constrain the True to match Dim. This is kind of like how the Python type Any would merge with a more informative type. Then the interaction with static would match the semantics in principle: True starts off with minimal assumptions and gets more constrained over time.

```
)

def test_automatic_dynamic_shapes_constant_relation(self):
    # case 2: related by constant: s0 + 4 = s1
```
Contributor:
nice!
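As above, a hypothetical sketch of this constant-relation case (assumed body, for illustration only):

```
def test_automatic_dynamic_shapes_constant_relation(self):
    # case 2: sizes related by a constant offset, s1 = s0 + 4
    class Foo(torch.nn.Module):
        def forward(self, x, y):
            return x + y[4:]

    inputs = (torch.randn(6), torch.randn(10))
    ep = export(Foo(), inputs, dynamic_shapes=((DIM.AUTO,), (DIM.AUTO,)))
    ep.module()(torch.randn(8), torch.randn(12))  # relation is preserved
```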

@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D61352432

@facebook-github-bot (Contributor)

@pytorchbot merge -f 'Landed internally'

(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status.

avikchaudhuri added a commit to avikchaudhuri/pytorch that referenced this pull request Aug 26, 2024
Summary:
Pull Request resolved: pytorch#134484

Recently pytorch#133620 added support for automatic dynamic shapes, where a new enum, `DIM`, was introduced to provide hints like `AUTO` and `STATIC`. This PR is a nominal change where we expose the hints via the existing public `Dim` API, and remove `DIM` from the public API. The main motivation is to avoid having users need to import too many things.

Test Plan: existing

Reviewed By: angelayi

Differential Revision: D61807361
pytorchmergebot pushed a commit that referenced this pull request Aug 28, 2024
Summary: Recently #133620 added support for automatic dynamic shapes, where a new enum, `DIM`, was introduced to provide hints like `AUTO` and `STATIC`. This PR is a nominal change where we expose the hints via the existing public `Dim` API, and remove `DIM` from the public API. The main motivation is to avoid having users need to import too many things.

Test Plan: existing

Differential Revision: D61807361

Pull Request resolved: #134484
Approved by: https://github.com/angelayi
pianpwk added a commit that referenced this pull request Sep 25, 2024
Summary:
Removing `_transform_shapes_for_default_dynamic` and `assume_static_by_default=False` as added in #133620.

This reverts back to `assume_static_by_default=True`, using dynamo decorators (e.g. `maybe_mark_dynamic`, `mark_static`) to handle Dim.AUTO & Dim.STATIC instead. This is easier to maintain, as it doesn't require reasoning about "inverting" the dynamic_shapes specs, and it also opens up usage of other decorators (`mark_dynamic`, `mark_unbacked`).

On the user side this change has no effect, but internally it means dynamic behavior is determined only by the `dynamic_shapes` specs (ignoring user-side input decorators, following #135536), with this information transferred for _DimHints via decorators so that Dynamo/non-strict can create symbolic_contexts accordingly, e.g. https://github.com/pytorch/pytorch/blob/7c6d543a5b3d65d8f49420e60cda150faaa5b8a0/torch/_dynamo/variables/builder.py#L2646-L2666

One caveat is we don't raise errors for dynamic decorators on the user side, since we don't know if they're from user markings, or from re-exporting with inputs we've previously marked.

Pull Request resolved: #136591

Differential Revision: D63358628

Pulled By: pianpwk
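A rough sketch of the decorator-based approach that commit describes is below. The mapping shown is an assumption for illustration, not the actual implementation:

```
import torch
import torch._dynamo
from torch.export import Dim

def apply_dim_hints(tensor, spec):
    # Assumed lowering of Dim.AUTO / Dim.STATIC hints onto dynamo decorators
    for d, hint in spec.items():
        if hint is Dim.AUTO:
            # dynamic if possible, but don't error if it specializes
            torch._dynamo.maybe_mark_dynamic(tensor, d)
        elif hint is Dim.STATIC:
            torch._dynamo.mark_static(tensor, d)

x = torch.randn(4, 8)
apply_dim_hints(x, {0: Dim.AUTO, 1: Dim.STATIC})
```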
@github-actions github-actions bot deleted the export-D61352432 branch October 1, 2024 02:16
pytorchmergebot pushed a commit that referenced this pull request Dec 18, 2024
…h.export.Dim.AUTO (#143158)

With #133620 introducing Dim.AUTO, we can now automatically convert dynamic_axes to dynamic_shapes without specifying min and max. However, exporting could still crash when the same specs are shared between inputs, and there is no guarantee that the axes will be dynamic (see PR description).

A follow-up PR should create a post-processing ONNX side pass to rename the dynamic shapes (s0, s1, ...) to dynamic_axes (user-specified names).

This PR does:
(1) Apply torch.export.Dim.AUTO to dynamic_axes when dynamic_shapes is not provided.
(2) Convert args/kwargs to tuple inputs, which follows the generated dynamic_shapes format to avoid errors during torch.export.export.
(3) Avoid KeyError in _rename_dynamic_shapes_with_model_inputs function.
(4) Add real world case of a HF model with kv_cache to test on ONNX exporter.
Pull Request resolved: #143158
Approved by: https://github.com/xadupre, https://github.com/shubhambhokare1
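A sketch of the conversion that point (1) above describes - the function name and structure here are assumptions for illustration, not the ONNX exporter's actual code:

```
import torch
from torch.export import Dim

def dynamic_axes_to_dynamic_shapes(dynamic_axes, input_names):
    # Every axis the user marked dynamic becomes Dim.AUTO; min/max are
    # inferred during export, so they no longer need to be specified.
    return {
        name: {axis: Dim.AUTO for axis in dynamic_axes.get(name, {})}
        for name in input_names
    }

spec = dynamic_axes_to_dynamic_shapes({"input_ids": {0: "batch", 1: "seq"}}, ["input_ids"])
# -> {"input_ids": {0: Dim.AUTO, 1: Dim.AUTO}}
```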