Skip to content

[ONNX] Fix float point detection for optional tensor (with unknown rank) within a list#81386

Closed
thiagocrepaldi wants to merge 3 commits intopytorch:masterfrom
thiagocrepaldi:thiagofc/fix-squeeze-11
Closed

[ONNX] Fix float point detection for optional tensor (with unknown rank) within a list#81386
thiagocrepaldi wants to merge 3 commits intopytorch:masterfrom
thiagocrepaldi:thiagofc/fix-squeeze-11

Conversation

@thiagocrepaldi
Copy link
Copy Markdown
Collaborator

@thiagocrepaldi thiagocrepaldi commented Jul 13, 2022

In some scenarios, by combining a traced model with a scripted function in it, a %74 : Tensor?[] = prim::ListConstruct(%35, %y_int, %x_int) (aka List of Optional Tensor) can be generated, which will make symbolic_helper._is_fp() fail to read the data type of the specified input.

In such scenario, something like type = value.type().scalarType() raises RuntimeError: r INTERNAL ASSERT FAILED at "/github/pytorch/aten/src/ATen/core/jit_type_base.h":545, please report a bug to PyTorch. that refers to

  template <typename T>
  T& expectRef() {
    auto* r = castRaw<T>();
    AT_ASSERT(r);
    return *r;
  }

What happens is that for torch._C.TypeList in this repro, input.type() is torch._C.TypeList which does not have scalarType() method. Instead, value.type().getElementType().dtype() should be used to get the underlying type.

This PR tries to use value.type().getElementType().dtype() when isinstance(value.type(), torch.ListType).
A unit test is proposed along with the fix.

@facebook-github-bot
Copy link
Copy Markdown
Contributor

facebook-github-bot commented Jul 13, 2022

🔗 Helpful links

✅ No Failures (0 Pending)

As of commit 9e5b44e (more details on the Dr. CI page):

Expand to see more

💚 💚 Looks good so far! There are no failures yet. 💚 💚


This comment was automatically generated by Dr. CI (expand for details).

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Click here to manually regenerate this comment.

@thiagocrepaldi thiagocrepaldi force-pushed the thiagofc/fix-squeeze-11 branch 3 times, most recently from 19b47dc to 633af41 Compare July 13, 2022 16:53
@thiagocrepaldi thiagocrepaldi changed the title [WIP] Improve shape inference for squeeze-11 and float point detection [WIP][ONNX] Fix float point detection during ONNX export Jul 13, 2022
thiagocrepaldi pushed a commit to thiagocrepaldi/detectron2 that referenced this pull request Jul 13, 2022
The fix involves changes to both PyTorch and detectron2:
	* Pytorch had a bug which prevented some tensors to be identified as float
	  (refer to pytorch/pytorch#81386)

	* `detectron2/structures/keypoints.py::heatmaps_to_keypoints` internally
	  does advanced indexing on a `squeeze`d `roi_map`.
	  The aforentioned `squeeze` fails rank inference due to the
	  presence of `onnx::If` on its composition to support dynamic
	  dims. By replacing `squeeze` by `replace` on
	  detectron's `heatmaps_to_keypoints`, shape inference succeeds,
	  allowing ONNX export to succeed with dynamic axes support.
@thiagocrepaldi thiagocrepaldi changed the title [WIP][ONNX] Fix float point detection during ONNX export [ONNX] Fix float point detection during ONNX export Jul 13, 2022
@thiagocrepaldi thiagocrepaldi marked this pull request as ready for review July 13, 2022 23:31
thiagocrepaldi pushed a commit to thiagocrepaldi/detectron2 that referenced this pull request Jul 13, 2022
The fix involves changes to both PyTorch and detectron2:
	* Pytorch had a bug which prevented some tensors to be identified as float
	  (refer to pytorch/pytorch#81386)

	* `detectron2/structures/keypoints.py::heatmaps_to_keypoints` internally
	  does advanced indexing on a `squeeze`d `roi_map`.
	  The aforentioned `squeeze` fails rank inference due to the
	  presence of `onnx::If` on its composition to support dynamic
	  dims. By replacing `squeeze` by `replace` on
	  detectron's `heatmaps_to_keypoints`, shape inference succeeds,
	  allowing ONNX export to succeed with dynamic axes support.
@ngimel ngimel added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Jul 14, 2022
@thiagocrepaldi thiagocrepaldi force-pushed the thiagofc/fix-squeeze-11 branch from 2a11f67 to 9ea1699 Compare July 14, 2022 04:34
@thiagocrepaldi thiagocrepaldi changed the title [ONNX] Fix float point detection during ONNX export [ONNX] Fix float point detection for optional tensor (with unknown rank) within a list Jul 14, 2022
@thiagocrepaldi thiagocrepaldi self-assigned this Jul 14, 2022
@thiagocrepaldi thiagocrepaldi linked an issue Jul 14, 2022 that may be closed by this pull request
@thiagocrepaldi thiagocrepaldi force-pushed the thiagofc/fix-squeeze-11 branch from 9ea1699 to dcbf595 Compare July 14, 2022 04:43
@thiagocrepaldi thiagocrepaldi added module: onnx Related to torch.onnx release notes: onnx torch.onnx related changes that should show up in the release notes topic: bug fixes topic category labels Jul 14, 2022
@BowenBao BowenBao self-assigned this Jul 15, 2022
@thiagocrepaldi thiagocrepaldi force-pushed the thiagofc/fix-squeeze-11 branch from 0b91ae4 to 982c9d9 Compare July 15, 2022 22:08
thiagocrepaldi pushed a commit to thiagocrepaldi/detectron2 that referenced this pull request Aug 11, 2022
The fix involves changes to both PyTorch and detectron2:
	* Pytorch had a bug which prevented some tensors to be identified as float
	  (refer to pytorch/pytorch#81386)

	* `detectron2/structures/keypoints.py::heatmaps_to_keypoints` internally
	  does advanced indexing on a `squeeze`d `roi_map`.
	  The aforentioned `squeeze` fails rank inference due to the
	  presence of `onnx::If` on its composition to support dynamic
	  dims. By replacing `squeeze` by `replace` on
	  detectron's `heatmaps_to_keypoints`, shape inference succeeds,
	  allowing ONNX export to succeed with dynamic axes support.
@thiagocrepaldi thiagocrepaldi force-pushed the thiagofc/fix-squeeze-11 branch 2 times, most recently from a03a094 to 345bf26 Compare August 12, 2022 00:38
@justinchuby
Copy link
Copy Markdown
Collaborator

justinchuby commented Aug 12, 2022

Just making sure: Do we want to say a list of floats "is fp"? Or would it potentially make sense to treat lists differently and create something of an "is_float_list"?

@thiagocrepaldi thiagocrepaldi force-pushed the thiagofc/fix-squeeze-11 branch from 345bf26 to 9e5b44e Compare August 12, 2022 01:57
@thiagocrepaldi
Copy link
Copy Markdown
Collaborator Author

Just making sure: Do we want to say a list of floats "is fp"? Or would it potentially make sense to treat lists differently and create something of an "is_float_list"?

IIRC after discussing with Bowen, we concluded it was ok to treat both the same way

Copy link
Copy Markdown
Collaborator

@BowenBao BowenBao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@thiagocrepaldi
Copy link
Copy Markdown
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Copy link
Copy Markdown
Collaborator

@pytorchbot successfully started a merge job. Check the current status here.
The merge job was triggered without a flag. This means that your change will be merged once all checks on your PR have passed (ETA: 0-4 Hours). If this is not the intended behavior, feel free to use some of the other merge options in the wiki.
Please reach out to the PyTorch DevX Team with feedback or questions!

facebook-github-bot pushed a commit that referenced this pull request Aug 14, 2022
…nk) within a list (#81386) (#81386)

Summary:
In some scenarios, by combining a traced model with a scripted function in it, a `%74 : Tensor?[] = prim::ListConstruct(%35, %y_int, %x_int)` (aka List of Optional Tensor) can be generated, which will make `symbolic_helper._is_fp()` fail to read the data type of the specified input.

In such scenario, something like `type = value.type().scalarType()` raises `RuntimeError: r INTERNAL ASSERT FAILED at "/github/pytorch/aten/src/ATen/core/jit_type_base.h":545, please report a bug to PyTorch. ` that refers to

```
  template <typename T>
  T& expectRef() {
    auto* r = castRaw<T>();
    AT_ASSERT(r);
    return *r;
  }
```

What happens is that for `torch._C.TypeList` in this repro, `input.type()` is `torch._C.TypeList` which does not have `scalarType()` method. Instead, `value.type().getElementType().dtype()` should be used to get the underlying type.

This PR tries to use `value.type().getElementType().dtype()` when `isinstance(value.type(), torch.ListType)`.
A unit test is proposed along with the fix.

Pull Request resolved: #81386
Approved by: https://github.com/BowenBao

Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/2c089290b676a221817e48c7de42d1b2bd13609a

Reviewed By: atalman

Differential Revision: D38671263

fbshipit-source-id: 0e3162b393d3ae268bf1eaca08e3aafaf4a72f9a
thiagocrepaldi pushed a commit to thiagocrepaldi/detectron2 that referenced this pull request Aug 25, 2022
The fix involves changes to both PyTorch and detectron2:
	* Pytorch had a bug which prevented some tensors to be identified as float
	  (refer to pytorch/pytorch#81386)

	* `detectron2/structures/keypoints.py::heatmaps_to_keypoints` internally
	  does advanced indexing on a `squeeze`d `roi_map`.
	  The aforentioned `squeeze` fails rank inference due to the
	  presence of `onnx::If` on its composition to support dynamic
	  dims. By replacing `squeeze` by `replace` on
	  detectron's `heatmaps_to_keypoints`, shape inference succeeds,
	  allowing ONNX export to succeed with dynamic axes support.
thiagocrepaldi pushed a commit to thiagocrepaldi/detectron2 that referenced this pull request Sep 7, 2022
The fix involves changes to both PyTorch and detectron2:
	* Pytorch had a bug which prevented some tensors to be identified as float
	  (refer to pytorch/pytorch#81386)

	* `detectron2/structures/keypoints.py::heatmaps_to_keypoints` internally
	  does advanced indexing on a `squeeze`d `roi_map`.
	  The aforentioned `squeeze` fails rank inference due to the
	  presence of `onnx::If` on its composition to support dynamic
	  dims. By replacing `squeeze` by `replace` on
	  detectron's `heatmaps_to_keypoints`, shape inference succeeds,
	  allowing ONNX export to succeed with dynamic axes support.
thiagocrepaldi pushed a commit to thiagocrepaldi/detectron2 that referenced this pull request Sep 20, 2022
The fix involves changes to both PyTorch and detectron2:
	* Pytorch had a bug which prevented some tensors to be identified as float
	  (refer to pytorch/pytorch#81386)

	* `detectron2/structures/keypoints.py::heatmaps_to_keypoints` internally
	  does advanced indexing on a `squeeze`d `roi_map`.
	  The aforentioned `squeeze` fails rank inference due to the
	  presence of `onnx::If` on its composition to support dynamic
	  dims. By replacing `squeeze` by `replace` on
	  detectron's `heatmaps_to_keypoints`, shape inference succeeds,
	  allowing ONNX export to succeed with dynamic axes support.
thiagocrepaldi pushed a commit to thiagocrepaldi/detectron2 that referenced this pull request Oct 3, 2022
The fix involves changes to both PyTorch and detectron2:
	* Pytorch had a bug which prevented some tensors to be identified as float
	  (refer to pytorch/pytorch#81386)

	* `detectron2/structures/keypoints.py::heatmaps_to_keypoints` internally
	  does advanced indexing on a `squeeze`d `roi_map`.
	  The aforentioned `squeeze` fails rank inference due to the
	  presence of `onnx::If` on its composition to support dynamic
	  dims. By replacing `squeeze` by `replace` on
	  detectron's `heatmaps_to_keypoints`, shape inference succeeds,
	  allowing ONNX export to succeed with dynamic axes support.
thiagocrepaldi pushed a commit to thiagocrepaldi/detectron2 that referenced this pull request Oct 31, 2022
The fix involves changes to both PyTorch and detectron2:
	* Pytorch had a bug which prevented some tensors to be identified as float
	  (refer to pytorch/pytorch#81386)

	* `detectron2/structures/keypoints.py::heatmaps_to_keypoints` internally
	  does advanced indexing on a `squeeze`d `roi_map`.
	  The aforentioned `squeeze` fails rank inference due to the
	  presence of `onnx::If` on its composition to support dynamic
	  dims. By replacing `squeeze` by `replace` on
	  detectron's `heatmaps_to_keypoints`, shape inference succeeds,
	  allowing ONNX export to succeed with dynamic axes support.
thiagocrepaldi pushed a commit to thiagocrepaldi/detectron2 that referenced this pull request Jan 19, 2023
The fix involves changes to both PyTorch and detectron2:
	* Pytorch had a bug which prevented some tensors to be identified as float
	  (refer to pytorch/pytorch#81386)

	* `detectron2/structures/keypoints.py::heatmaps_to_keypoints` internally
	  does advanced indexing on a `squeeze`d `roi_map`.
	  The aforentioned `squeeze` fails rank inference due to the
	  presence of `onnx::If` on its composition to support dynamic
	  dims. By replacing `squeeze` by `replace` on
	  detectron's `heatmaps_to_keypoints`, shape inference succeeds,
	  allowing ONNX export to succeed with dynamic axes support.
thiagocrepaldi pushed a commit to thiagocrepaldi/detectron2 that referenced this pull request Jan 25, 2023
The fix involves changes to both PyTorch and detectron2:
	* Pytorch had a bug which prevented some tensors to be identified as float
	  (refer to pytorch/pytorch#81386)

	* `detectron2/structures/keypoints.py::heatmaps_to_keypoints` internally
	  does advanced indexing on a `squeeze`d `roi_map`.
	  The aforentioned `squeeze` fails rank inference due to the
	  presence of `onnx::If` on its composition to support dynamic
	  dims. By replacing `squeeze` by `replace` on
	  detectron's `heatmaps_to_keypoints`, shape inference succeeds,
	  allowing ONNX export to succeed with dynamic axes support.
thiagocrepaldi pushed a commit to thiagocrepaldi/detectron2 that referenced this pull request Feb 3, 2023
The fix involves changes to both PyTorch and detectron2:
	* Pytorch had a bug which prevented some tensors to be identified as float
	  (refer to pytorch/pytorch#81386)

	* `detectron2/structures/keypoints.py::heatmaps_to_keypoints` internally
	  does advanced indexing on a `squeeze`d `roi_map`.
	  The aforentioned `squeeze` fails rank inference due to the
	  presence of `onnx::If` on its composition to support dynamic
	  dims. By replacing `squeeze` by `replace` on
	  detectron's `heatmaps_to_keypoints`, shape inference succeeds,
	  allowing ONNX export to succeed with dynamic axes support.
thiagocrepaldi pushed a commit to thiagocrepaldi/detectron2 that referenced this pull request Feb 8, 2023
The fix involves changes to both PyTorch and detectron2:
	* Pytorch had a bug which prevented some tensors to be identified as float
	  (refer to pytorch/pytorch#81386)

	* `detectron2/structures/keypoints.py::heatmaps_to_keypoints` internally
	  does advanced indexing on a `squeeze`d `roi_map`.
	  The aforentioned `squeeze` fails rank inference due to the
	  presence of `onnx::If` on its composition to support dynamic
	  dims. By replacing `squeeze` by `replace` on
	  detectron's `heatmaps_to_keypoints`, shape inference succeeds,
	  allowing ONNX export to succeed with dynamic axes support.
facebook-github-bot pushed a commit to facebookresearch/detectron2 that referenced this pull request Feb 10, 2023
Summary:
In order to export `KRCNNConvDeconvUpsampleHead` to ONNX using `torch.jit.script`, changes to both PyTorch and detectron2:

- Pytorch has a bug which prevents a tensor wrapped by a list as float. Refer to the required [fix](pytorch/pytorch#81386)

- `detectron2/structures/keypoints.py::heatmaps_to_keypoints` internally does advanced indexing on a `squeeze`d tensor. The aforementioned `squeeze` fails rank inference due to the presence of `onnx::If` on its implementation (to support dynamic dims). The fix is replacing `squeeze` by `reshape`.  A possible fix to `squeeze` on PyTorch side might be done too (TBD and would take some time), but the proposed change here does not bring any consequence to detectron2 while it enables ONNX support with scriptable `KRCNNConvDeconvUpsampleHead `.

After the proposed changes, the `KRCNNConvDeconvUpsampleHead` does include a `Loop` node to represent a for-loop inside the model and `dynamic outputs`, as shown below:

![image](https://user-images.githubusercontent.com/5469809/179559001-f60fb8af-ec79-4758-b271-736467b5d96f.png)

This PR has been tested with ONNX Runtime (this [PR](#4205)) to ensure the ONNX output matches PyTorch's for different `gen_input(X, Y)` combinations and it succeeded. The model was converted to ONNX once with a particular input and tested with inputs of different shapes and compared to equality to PyTorch's
Depends on: pytorch/pytorch#81386 and #4291

Pull Request resolved: #4315

Reviewed By: newstzpz

Differential Revision: D42756423

fbshipit-source-id: dc410df18da07f48c14f4cae9a4a91530a0ec602
@thiagocrepaldi thiagocrepaldi deleted the thiagofc/fix-squeeze-11 branch May 4, 2023 17:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla signed Merged module: onnx Related to torch.onnx open source release notes: onnx torch.onnx related changes that should show up in the release notes topic: bug fixes topic category triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Unable to export PointRend model (from detectron2) to ONNX

7 participants