Conversation

@pianpwk (Contributor) commented Aug 15, 2024

Starter version of automatic dynamic shapes for export.

Creates enums `DIM.AUTO` and `DIM.STATIC`, allowing the user to specify `AUTO` for dims in `dynamic_shapes` specs, meaning that the corresponding dims are treated as dynamic, and relevant guards will do what's necessary (e.g. refine ValueRanges, set replacements based on equality, or even set static) without raising ConstraintViolationErrors. Basically this allows the user to say, "a bunch of these dims can be dynamic; let export do model analysis and return the program with maximum possible dynamism, without complaining".

The usage for specifying `dynamic_shapes` is now:

```
AUTO -> dynamic by default; return whatever produce_guards() says, even if it's static
None/int/STATIC -> static
Dim/DerivedDim -> same as before; will complain if the min/max range is invalid, or if dims related to this are unspecified
```
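For example, a minimal usage sketch is below. The import path for `DIM` here is an assumption based on this PR's examples (the hints were later folded into `torch.export.Dim` by #134484):

```
import torch
from torch.export import export
from torch.export.dynamic_shapes import DIM  # assumed import path for this PR's enum

class M(torch.nn.Module):
    def forward(self, x):
        return x * 2

# AUTO on dim 0: export does model analysis and returns the most dynamic
# program it can, without raising ConstraintViolationErrors; STATIC pins dim 1.
ep = export(M(), (torch.randn(4, 8),), dynamic_shapes={"x": (DIM.AUTO, DIM.STATIC)})
print(ep)
```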

Caveat 1: specifying AUTO for a dim won't guarantee it'll be dynamic:

  • specifying AUTO for a dim will return the maximum possible dynamism given your program and other specified constraints, but this can still mean you'll get a static program. For example, with the program below, x is specified dynamic, but it's equal to y, which is specified static, and with how we currently do things we won't promote y to dynamic, but will demote(?) x to static. So this can be surprising if you don't fully know your model, and/or missed one of your other inputs when specifying auto-dynamic shapes.
```
class Foo(torch.nn.Module):
    def forward(self, x, y):
        return x + y

inputs = (torch.randn(6), torch.randn(6))
export(Foo(), inputs, dynamic_shapes={"x": (DIM.AUTO,), "y": None})
```

Caveat 2: specifying AUTO and Dims in the same spec is still problematic:

  • The way Dims/DerivedDims are currently handled is very strict. A Dim represents a symbol, and we require a user to specify the symbol for all dims governed by the symbol - that's why we've seen errors in the past like The values of x must always be related to y by ..., asking the user to specify the exact relation as in the program. We also require the specified min/max range to be a subset of the valid range from model analysis. All this doesn't compose well with specifying AUTO just yet - for example in the program below, ideal behavior could be to return a dynamic program, where dx = x.size(0) = y.size(0) has range (3,6). Unfortunately this crashes, and correct behavior is to specify dx for both inputs. So currently we raise a UserError and crash if both Dims + AUTO are present in the spec.
```
class Foo(torch.nn.Module):
    def forward(self, x, y):
        return x + y

inputs = (torch.randn(6), torch.randn(6))
export(Foo(), inputs, dynamic_shapes={"x": (DIM.AUTO,), "y": {0: Dim("dx", min=3, max=6)}})  # this doesn't work, because x & y are related
```

Implementation details:

This is done by setting `assume_static_by_default=False` and doing a transform on the `dynamic_shapes` spec to preserve semantics. `assume_static_by_default=False` will treat unspecified dims or Nones as dynamic. This is the opposite of what `export.export()` currently does - unspecified Dims/Nones are treated as static. Historically this static-by-default behavior, where the user deals with fewer guards, has been desirable, and we would like to respect that in this implementation. So an internal spec transformation, `_transform_shapes_for_default_dynamic()`, is added to do the spec conversion necessary to be compatible with dynamic-by-default. Specifically, AUTOs are converted into Nones, and Nones/unspecified dims are filled in with explicitly static constraints.

For example, for a 3-d tensor whose dim 1 has traced size 32, this would look like: `{0: DIM.AUTO, 1: None, 2: Dim("dx")} -> {0: None, 1: 32, 2: Dim("dx")}`
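A rough sketch of this inversion, for a single tensor, is below. This is illustrative only - the real `_transform_shapes_for_default_dynamic()` also walks nested pytree specs and handles error cases, and the `DIM`/`_Dim` import locations are assumptions:

```
import torch
from torch.export.dynamic_shapes import DIM, _Dim  # assumed locations (PR-era API)

def transform_for_default_dynamic(tensor, spec):
    # Invert the spec for assume_static_by_default=False:
    # AUTO -> None (dynamic by default); None/int/STATIC -> concrete traced size.
    out = {}
    for d in range(tensor.dim()):
        v = spec.get(d) if spec else None
        if v is DIM.AUTO:
            out[d] = None                # unspecified under dynamic-by-default -> dynamic
        elif isinstance(v, _Dim):
            out[d] = v                   # Dim/DerivedDim pass through unchanged
        else:                            # None, an int, or DIM.STATIC
            out[d] = tensor.size(d)      # pin to the traced size -> explicitly static
    return out
```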

This does seem overly complicated, but it's done to preserve dynamic-shapes semantics for `torch._dynamo.export()`, which already uses `assume_static_by_default=False` and follows the same process for generating shape constraints, via `_process_dynamic_shapes`. There the semantics are:

```
None/unspecified -> dynamic by default
Dim/DerivedDim -> also a strict assertion
```

If we don't care about BC for `_dynamo.export(dynamic_shapes)`, then we can just modify semantics for `_process_dynamic_shapes()` and change all the relevant tests in `test/dynamo/test_export.py`.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames @rec

@pytorch-bot (bot) commented Aug 15, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/133620

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 8dce93b with merge base f8fbfe5:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D61352432

@facebook-github-bot (Contributor)

@pianpwk has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

pianpwk added a commit that referenced this pull request Aug 15, 2024
Summary:
cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames

Pull Request resolved: #133620

Differential Revision: D61352432

Pulled By: pianpwk
@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D61352432

@facebook-github-bot (Contributor)

@pianpwk has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

```
@testing.expectedFailureSerDer  # We don't preserve metadata on graph module
@testing.expectedFailureNonStrict
@testing.expectedFailureTrainingIRToRunDecompNonStrict
def test_retrace_graph_level_meta_preservation(self):
```
@pianpwk (Contributor, Author):

It seems like now we just consistently store static constraints for unspecified dims, like `{"t_id": ..., "dim": ..., "min": 4, "max": 4}`, so this doesn't break anymore. Any thoughts on this?

Contributor:

Oof, why do we even have this metadata lol? Please kill it, I don't want to keep this format for too much longer. grep shows no users (aside from tests), and only Dynamo populates it.

```
ep = export(m, ())
self.assertEqual(ep.graph_signature.lifted_tensor_constants, ["x"])

@unittest.expectedFailure
```
@pianpwk (Contributor, Author):

This PR seems to expose an already-existing problem between explicitly static constraints + dataclasses (or equality constraints)? Marking this as a failure to open the rest of the PR for review, but I'll work on this and resolve it before this PR is merged in.

Contributor:

OK, as long as you get to it. I'd guess info on unused inputs could easily be perturbed.

@pianpwk (Contributor, Author):

About to add a fix in the next commit.

```
if node.op == "call_function":
    self.assertTrue(False)

def test_automatic_dynamic_shapes_simple_equality(self):
```
@pianpwk (Contributor, Author):

these 3 tests explain a lot of the behavior for this API
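A hypothetical sketch of what the first (equality) case exercises - this is not the actual test body from the PR, just an illustration assuming `export` and the PR-era `DIM` enum are in scope:

```
def test_automatic_dynamic_shapes_simple_equality(self):
    # case 1: direct equality s0 == s1; AUTO on both dims should yield one
    # shared symbol instead of a ConstraintViolationError
    class Foo(torch.nn.Module):
        def forward(self, x, y):
            return x + y

    inputs = (torch.randn(6), torch.randn(6))
    ep = export(Foo(), inputs, dynamic_shapes=((DIM.AUTO,), (DIM.AUTO,)))
    ep.module()(torch.randn(8), torch.randn(8))  # both sizes move together
```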

```
)


def _transform_shapes_for_default_dynamic(
```
@pianpwk (Contributor, Author):

This entire function is unnecessary if we could ignore BC and completely change _dynamo.export() behavior, but this seems safer to start with.

Contributor:

I think it is better to make the full spec explicit early in the export pathway anyway, rather than carry the meanings of defaults and interpret them everywhere. Makes it easier to adapt to automated use cases where we can just generate complete specs from side info (in addition to continuing to support user-defined specs, which have some pressure to be minimal).

@pianpwk pianpwk requested a review from SherlockNoMad August 16, 2024 18:30
@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D61352432

pianpwk added a commit that referenced this pull request Aug 16, 2024
Summary:
Starter version of automatic dynamic shapes for export. Allows user to specify `True` in dynamic_shapes specs, meaning that corresponding dims are treated as dynamic, and relevant guards will do what's necessary (e.g. refine ValueRanges, set replacements based on equality, or even set static) without raising ConstraintViolationErrors. Basically allows the user to say, "a bunch of these dims can be dynamic, let export do model analysis and return the program with maximum possible dynamism, without complaining".

The usage for specifying `dynamic_shapes` is now:
```
True -> dynamic by default, return whatever produce_guards() says, even if it's static
None -> static
Dim/DerivedDim -> same as before - will complain if the min/max range is invalid, or if dims related to this are unspecified.
```
Similarly to `None`, specifying `dynamic_shapes=True` for the entire program, or `{"x": True}` for an entire tensor, is possible.

Caveat 1: specifying `True` for a dim won't guarantee it'll be dynamic:

- specifying `True` will return the maximum possible dynamism given your program and other specified constraints, but this can still mean you'll get a static program. For example, with the program below, x is specified dynamic, but it's equal to y, which is specified static, and with how we currently do things we won't promote y to dynamic, but will demote(?) x to static. So this can be surprising if you don't fully know your model, and/or missed one of your other inputs when specifying auto-dynamic shapes.
```
class Foo(torch.nn.Module):
    def forward(self, x, y):
        return x + y
inputs = (torch.randn(6), torch.randn(6))
export(Foo(), inputs, dynamic_shapes={"x": True, "y": None})
```

Caveat 2: specifying `True` and Dims in the same spec is still problematic:

- The way Dims/DerivedDims are currently handled is very strict. A Dim represents a symbol, and we require a user to specify the symbol for all dims governed by the symbol - that's why we've seen errors in the past like `The values of x must always be related to y by ...`, asking the user to specify the exact relation as in the program. We also require the specified min/max range to be a subset of the valid range from model analysis. All this doesn't compose well with specifying `True` just yet - for example in the program below, ideal behavior could be to return a dynamic program, where `dx = x.size(0) = y.size(0)` has range (3,6). Unfortunately this crashes, and correct behavior is to specify `dx` for both inputs. So currently we raise a UserError and crash if both Dims + `True` are present in the spec.
```
class Foo(torch.nn.Module):
    def forward(self, x, y):
        return x + y
inputs = (torch.randn(6), torch.randn(6))
export(Foo(), inputs, dynamic_shapes={"x": True, "y": {0: Dim("dx", min=3, max=6)}})  # this doesn't work, because x & y and related
```

Implementation details:

This is done by setting `assume_static_by_default=False` and doing a transform on the `dynamic_shapes` spec to preserve semantics. `assume_static_by_default=False` will treat unspecified dims or Nones as dynamic. This is the opposite of what `export.export()` currently does - unspecified Dims/Nones are treated as static. Historically this static-by-default behavior, where the user deals with fewer guards, has been desirable, and we would like to respect that in this implementation. So an internal spec transformation, `_transform_shapes_for_default_dynamic()`, is added to do the spec conversion necessary to be compatible with dynamic-by-default. Specifically, Trues are converted into Nones, and Nones/unspecified dims are filled in with explicitly static constraints.

For example, this would look like, for a 3-d tensor: `{0: True, 1: None, 2: Dim("dx")} -> {0: None, 1: 32, 2: Dim("dx")}`

This does seem overly complicated, but it's done to preserve dynamic shapes semantics for `torch._dynamo.export()`, which already uses `assume_static_by_default=False` and follows the same process for generating shape constraints, via `_process_dynamic_shapes`. There the semantics are:
```
None/unspecified: dynamic by default
Dim/DerivedDim: also a strict assertion
```

If we don't care about BC for `_dynamo.export(dynamic_shapes)`, then we can just modify semantics for `_process_dynamic_shapes()` and change all the relevant tests in `test/dynamo/test_export.py`.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx chenyang78 kadeng chauhang amjames

Pull Request resolved: #133620

Differential Revision: D61352432

Pulled By: pianpwk
@avikchaudhuri (Contributor) left a comment:

Haven't looked at the code yet, but just want to record some high-level comments so far:

  • Great summary!
  • I don't like True, and I hate None. For True, I instead want something like torch.export.Dim.AUTO (ideally an enum we can grow). For None, you can do this in a follow-up, but I'd really like a distinction between the two roles it plays: something like torch.export.Dim.STATIC in the future, versus not bothering to specify a subtree (which can remain None and would align with not specifying dynamic shapes at all, implicitly None).
  • For interactions of True with static, say in x + y, I think we should move to a design where the exported program records the dynamic shape spec explicitly, and any specialization would just show up there for those who care. Warnings are a bonus.
  • For interactions of True with Dim, likewise we should aspire to constrain the True to match Dim. This is kind of like how the Python type Any would merge with a more informative type. Then the interaction with static would match the semantics in principle: True starts off with minimal assumptions and gets more constrained over time.

```
)

def test_automatic_dynamic_shapes_constant_relation(self):
    # case 2: related by constant: s0 + 4 = s1
```
Contributor:
nice!
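As above, a hypothetical sketch of this constant-relation case (assumed body, for illustration only):

```
def test_automatic_dynamic_shapes_constant_relation(self):
    # case 2: sizes related by a constant offset, s1 = s0 + 4
    class Foo(torch.nn.Module):
        def forward(self, x, y):
            return x + y[4:]

    inputs = (torch.randn(6), torch.randn(10))
    ep = export(Foo(), inputs, dynamic_shapes=((DIM.AUTO,), (DIM.AUTO,)))
    ep.module()(torch.randn(8), torch.randn(12))  # relation is preserved
```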

@facebook-github-bot (Contributor)

This pull request was exported from Phabricator. Differential Revision: D61352432

@facebook-github-bot (Contributor)

@pytorchbot merge -f 'Landed internally'

(Initiating merge automatically since Phabricator Diff has merged, using force because this PR might not pass merge_rules.json but landed internally)

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status.

avikchaudhuri added a commit to avikchaudhuri/pytorch that referenced this pull request Aug 26, 2024
Summary:
Pull Request resolved: pytorch#134484

Recently pytorch#133620 added support for automatic dynamic shapes, where a new enum, `DIM`, was introduced to provide hints like `AUTO` and `STATIC`. This PR is a nominal change where we expose the hints via the existing public `Dim` API, and remove `DIM` from the public API. The main motivation is to avoid having users need to import too many things.

Test Plan: existing

Reviewed By: angelayi

Differential Revision: D61807361
pytorchmergebot pushed a commit that referenced this pull request Aug 28, 2024
Summary: Recently #133620 added support for automatic dynamic shapes, where a new enum, `DIM`, was introduced to provide hints like `AUTO` and `STATIC`. This PR is a nominal change where we expose the hints via the existing public `Dim` API, and remove `DIM` from the public API. The main motivation is to avoid having users need to import too many things.

Test Plan: existing

Differential Revision: D61807361

Pull Request resolved: #134484
Approved by: https://github.com/angelayi
pianpwk added a commit that referenced this pull request Sep 25, 2024
Summary:
Removing `_transform_shapes_for_default_dynamic` and `assume_static_by_default=False` as added in #133620.

This reverts back to `assume_static_by_default=True`, using dynamo decorators (e.g. `maybe_mark_dynamic`, `mark_static`) to handle Dim.AUTO & Dim.STATIC instead. This is easier to maintain, as it doesn't require reasoning about "inverting" the dynamic_shapes specs, and it also opens up usage of other decorators (`mark_dynamic`, `mark_unbacked`).

On the user side this change has no effect, but internally it means dynamic behavior is determined only by the `dynamic_shapes` specs (ignoring user-side input decorators, following #135536), with this information transferred for _DimHints via decorators so that Dynamo/non-strict can create symbolic_contexts accordingly, e.g. https://github.com/pytorch/pytorch/blob/7c6d543a5b3d65d8f49420e60cda150faaa5b8a0/torch/_dynamo/variables/builder.py#L2646-L2666

One caveat is we don't raise errors for dynamic decorators on the user side, since we don't know if they're from user markings, or from re-exporting with inputs we've previously marked.

Pull Request resolved: #136591

Differential Revision: D63358628

Pulled By: pianpwk
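A rough sketch of the decorator-based approach that commit describes is below. The mapping shown is an assumption for illustration, not the actual implementation:

```
import torch
import torch._dynamo
from torch.export import Dim

def apply_dim_hints(tensor, spec):
    # Assumed lowering of Dim.AUTO / Dim.STATIC hints onto dynamo decorators
    for d, hint in spec.items():
        if hint is Dim.AUTO:
            # dynamic if possible, but don't error if it specializes
            torch._dynamo.maybe_mark_dynamic(tensor, d)
        elif hint is Dim.STATIC:
            torch._dynamo.mark_static(tensor, d)

x = torch.randn(4, 8)
apply_dim_hints(x, {0: Dim.AUTO, 1: Dim.STATIC})
```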
@github-actions github-actions bot deleted the export-D61352432 branch October 1, 2024 02:16
pytorchmergebot pushed a commit that referenced this pull request Dec 18, 2024
…h.export.Dim.AUTO (#143158)

With #133620 introducing Dim.AUTO, we can now automatically convert dynamic_axes to dynamic_shapes without specifying min and max. However, exporting could still crash when the same specs are shared between inputs, and there is no guarantee that the axes will be dynamic (see PR description).

A follow-up PR should create a post-processing ONNX side pass to rename the dynamic shapes (s0, s1, ...) to dynamic_axes (user-specified names).

This PR does:
(1) Apply torch.export.Dim.AUTO to dynamic_axes when dynamic_shapes is not provided.
(2) Convert args/kwargs to tuple inputs, which follows the generated dynamic_shapes format to avoid errors during torch.export.export.
(3) Avoid KeyError in _rename_dynamic_shapes_with_model_inputs function.
(4) Add real world case of a HF model with kv_cache to test on ONNX exporter.
Pull Request resolved: #143158
Approved by: https://github.com/xadupre, https://github.com/shubhambhokare1
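A sketch of the conversion that point (1) above describes - the function name and structure here are assumptions for illustration, not the ONNX exporter's actual code:

```
import torch
from torch.export import Dim

def dynamic_axes_to_dynamic_shapes(dynamic_axes, input_names):
    # Every axis the user marked dynamic becomes Dim.AUTO; min/max are
    # inferred during export, so they no longer need to be specified.
    return {
        name: {axis: Dim.AUTO for axis in dynamic_axes.get(name, {})}
        for name in input_names
    }

spec = dynamic_axes_to_dynamic_shapes({"input_ids": {0: "batch", 1: "seq"}}, ["input_ids"])
# -> {"input_ids": {0: Dim.AUTO, 1: Dim.AUTO}}
```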