[inductor] patterns to remove pointless view/permute pairs #139136
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/139136
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 342e83a with merge base e6ff07f.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
These are not artificial patterns I came up with; they show up in the linear+CrossEntropyLoss graph.
Consider this snippet:
```
class LinearAndCEL(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(C, V)
        self.ce = nn.CrossEntropyLoss()

    def forward(self, x, y):
        return self.ce(self.linear(x).view(B * T, V), y.view(-1))
```
`x` passed to `forward` is a 3D tensor of shape [B, T, C].
`self.linear` will first view `x` as a [B*T, C] tensor, do the matmul to produce a [B*T, V] tensor, and then view that output back to a 3D tensor of shape [B, T, V]. The user code then adds another view op to convert the tensor back to [B*T, V]. This generates a pair of redundant views. A pair of redundant permutes shows up in the backward graph when we compute gradients.
These view ops make it hard to chunk linear+CEL: when a view op breaks up the dimension being chunked, what should the chunker do (even if we merge those dimensions again later)? Removing these pointless view pairs makes the chunker simpler, and I think it's generally a nice thing to do.
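For reference, here is a minimal standalone repro of the redundant view pair; the concrete values of `B`, `T`, `C`, `V` are chosen only for illustration, and the decomposition shown mirrors what `linear` does to a 3D input in the traced graph:

```
import torch

B, T, C, V = 2, 4, 8, 16
x = torch.randn(B, T, C)
w = torch.randn(V, C)

flat = x.view(B * T, C)        # linear first flattens the 3D input: [B*T, C]
out = flat @ w.t()             # matmul: [B*T, V]
out3d = out.view(B, T, V)      # linear views the output back to 3D ...
out2d = out3d.view(B * T, V)   # ... and user code immediately flattens again

assert out2d.shape == out.shape  # the view pair is a no-op
```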
cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang aakhundov
eellison left a comment:
Should this be part of `remove_noops`?
Looks like `remove_noops` contains those single-op patterns (and it does not use the pattern matcher), but the patterns added in this PR involve a pair of ops.
@shunting314 I meant more as part of the cleanup canonicalization passes we run on generated patterns. But this is pretty rare, so not necessary, I think.
```
 node = match.output_node()
 arg_size = list(node.args[0].meta["val"].shape)  # type: ignore[union-attr]
-if size == arg_size:
+if guard_size_oblivious(size == arg_size):
```
Hmm, I don't think this is sufficient. The individual elements will still be using the guarding version. You'd want `len(size) == len(arg_size) and all(guard_size_oblivious(dim1 == dim2) for dim1, dim2 in zip(size, arg_size))`.
cc @ezyang for ergonomic pytree / guard_size_oblivious api
Not really sure about the subtlety of this API. If people need to take care when comparing a pair of size lists, maybe an API like `guard_sizes_oblivious` would be helpful. It would just do what @eellison wrote above.
The API takes a symbolic bool and returns True if it is True without adding additional guards. Here you are calling `List.__eq__`; that's not going to return a symbolic bool. It's going to recursively call the normal, non-size-oblivious `eq` on all of its elements.
Got it. Do you want me to add a `guard_sizes_oblivious` (plural form) API to symbolic_shapes.py? I need to call this in at least two places (in two patterns).
SGTM. We could also do a pytree-recursive `guard_size_oblivious` impl, but the sizes version sounds good.
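A minimal sketch of the plural helper as discussed, assuming it lands next to `guard_size_oblivious` in `torch.fx.experimental.symbolic_shapes`; the version that actually landed may differ in details:

```
from torch.fx.experimental.symbolic_shapes import guard_size_oblivious

def guard_sizes_oblivious(lhs_sizes, rhs_sizes) -> bool:
    """Compare two size lists elementwise without installing new guards."""
    if len(lhs_sizes) != len(rhs_sizes):
        return False
    return all(
        guard_size_oblivious(lhs == rhs)
        for lhs, rhs in zip(lhs_sizes, rhs_sizes)
    )
```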
@pytorchbot merge
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Partially fixing #138685. Add a (relatively safe?) heuristic to skip fusion if we can potentially increase peak memory. The docstring mainly explains what this PR is doing:
```
The implementation is more like a heuristic since we don't really know if we
are at peak or not when trying to fuse these two nodes. The order of nodes
may change later, which makes the peak memory estimation hard.

Here is how we decide the LOWER BOUND of extra memory allocation if we fuse
these 2 nodes:
1. find all buffers read by each node with a single user. These buffers are
   supposed to be reused if we don't fuse these 2 nodes
2. find the intersection of these buffers for the two nodes and sum the total
   buffer size. If we don't fuse these two nodes, we can at least avoid this
   much memory allocation.

Note that the extra memory allocation is not necessarily causing a peak
memory increase. This is just a heuristic. We return true only if the saving
from fusion cannot trade off the extra memory allocation.
```
Pull Request resolved: #138756 Approved by: https://github.com/jansel ghstack dependencies: #139136
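A hedged sketch of that lower-bound check, following the docstring above. The `Buf`/`Node` types and `estimate_fusion_saving` are hypothetical stand-ins, not Inductor's real scheduler types:

```
from dataclasses import dataclass

@dataclass(frozen=True)
class Buf:  # hypothetical stand-in for a scheduler buffer
    name: str
    size_bytes: int
    n_users: int

@dataclass
class Node:  # hypothetical stand-in for a scheduler node
    reads: frozenset = frozenset()

def estimate_fusion_saving(node1: Node, node2: Node) -> int:
    return 0  # hypothetical: e.g. intermediate traffic avoided by fusing

def fusion_risks_peak_memory(node1: Node, node2: Node) -> bool:
    # Buffers a node reads that have a single user: if we do NOT fuse,
    # such a buffer can be freed and reused right after that read.
    def single_user_reads(node: Node) -> frozenset:
        return frozenset(b for b in node.reads if b.n_users == 1)

    # Intersect the two sets and sum sizes: a LOWER BOUND on the extra
    # allocation fusion would force, per the docstring above.
    shared = single_user_reads(node1) & single_user_reads(node2)
    extra_alloc = sum(b.size_bytes for b in shared)

    # Skip fusion only when the fusion saving cannot pay for it.
    return extra_alloc > estimate_fusion_saving(node1, node2)
```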
[inductor] patterns to remove pointless view/permute pairs (pytorch#139136). Pull Request resolved: pytorch#139136 Approved by: https://github.com/Chillee, https://github.com/jansel
…ytorch#139136)" This reverts commit 2b86cd7. Reverted pytorch#139136 on behalf of https://github.com/ZainRizvi due to Sorry but this PR seems to have broken on trunk. The failure: distributed/_composable/test_replicate_with_compiler.py::ReplicateTest::test_bucketing_coalesced_op [GH job link](https://github.com/pytorch/pytorch/actions/runs/11615060962/job/32346609889) [HUD commit link](https://hud.pytorch.org/pytorch/pytorch/commit/2b86cd74a60ca2483173ba3012506aeac85ab2d7) ([comment](pytorch#139136 (comment)))
```
 # view + linear + view(joint_graph+freeze pass)
 match_count = match_count + 5 if is_inplace else match_count + 3
-match_nodes = match_nodes + 7 if is_inplace else match_nodes + 5
+match_nodes = match_nodes + 8 if is_inplace else match_nodes + 5
```
@shunting314 I think this line is failing in trunk for inductor/test_cpu_cpp_wrapper.py::DynamicShapesCppWrapperCpuTests::test_linear_binary_dynamic_shapes_cpp_wrapper (GH job link, HUD commit link).
It's a slow test that requires ciflow/slow to run, so it was missed on your PR. The failure can be reproduced locally with `pytest -v test/inductor/test_cpu_cpp_wrapper.py -k test_linear_binary_dynamic_shapes_cpp_wrapper`.
To collect a memory snapshot for a generated wrapper, run the wrapper with `--cuda-memory-snapshot`. E.g.
```
python /tmp/torchinductor_shunting/tmpyhtfwdlv/wp/cwpulanbieu4beruc6w5uc3podcs2x3rzdk5okftu37c4k3bnd4b.py --cuda-memory-snapshot
```
gives me: [screenshot of the resulting memory snapshot, 2024-11-05]
Pull Request resolved: #138429 Approved by: https://github.com/eellison, https://github.com/jansel ghstack dependencies: #139136, #138756
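For context, a sketch of what such a flag could wire up inside the generated wrapper, using PyTorch's public memory-snapshot hooks; the body is an assumption about the wrapper's internals, not the actual implementation:

```
import torch

# Record allocator events, run the workload, then dump a snapshot that
# can be inspected at https://pytorch.org/memory_viz.
torch.cuda.memory._record_memory_history()
# ... run the compiled wrapper's benchmark here ...
torch.cuda.memory._dump_snapshot("snapshot.pickle")
```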
I recently added a new pattern in #139136 to remove pointless view/permute pairs. In that PR, I already updated the matched pattern/node count in `test_linear_binary` to account for the new pattern. But it looks like with cpp wrapper, one more pattern gets matched.
```
7 patterns without cpp-wrapper:
========== pattern matched <code object pointless_view at 0x7f6d25c67aa0, file "/home/shunting/ws/pytorch/torch/_inductor/fx_passes/joint_graph.py", line 568> =======
========== pattern matched <code object pointless_view_pair at 0x7f6d25c67b50, file "/home/shunting/ws/pytorch/torch/_inductor/fx_passes/joint_graph.py", line 581> =======
========== pattern matched <code object pointless_view at 0x7f6d25c67aa0, file "/home/shunting/ws/pytorch/torch/_inductor/fx_passes/joint_graph.py", line 568> =======
========== pattern matched <code object pointless_view at 0x7f6d25c67aa0, file "/home/shunting/ws/pytorch/torch/_inductor/fx_passes/joint_graph.py", line 568> =======
========== pattern matched <code object linear at 0x7f6d176e5dc0, file "/home/shunting/ws/pytorch/torch/_inductor/fx_passes/mkldnn_fusion.py", line 1121> =======
========== pattern matched <code object reshape_linear_reshape_pattern at 0x7f6d176e5210, file "/home/shunting/ws/pytorch/torch/_inductor/fx_passes/mkldnn_fusion.py", line 732> =======
========== pattern matched <code object fn at 0x7f6d176d3ec0, file "/home/shunting/ws/pytorch/torch/_inductor/fx_passes/mkldnn_fusion.py", line 476> =======

8 patterns with cpp wrapper:
========== pattern matched <code object pointless_view at 0x7f8e78bf07c0, file "/home/shunting/ws/pytorch/torch/_inductor/fx_passes/joint_graph.py", line 568> =======
========== pattern matched <code object pointless_view_pair at 0x7f8e78bf0870, file "/home/shunting/ws/pytorch/torch/_inductor/fx_passes/joint_graph.py", line 581> =======
========== pattern matched <code object pointless_view at 0x7f8e78bf07c0, file "/home/shunting/ws/pytorch/torch/_inductor/fx_passes/joint_graph.py", line 568> =======
========== pattern matched <code object pointless_view at 0x7f8e78bf07c0, file "/home/shunting/ws/pytorch/torch/_inductor/fx_passes/joint_graph.py", line 568> =======
========== pattern matched <code object pointless_view at 0x7f8e78bf07c0, file "/home/shunting/ws/pytorch/torch/_inductor/fx_passes/joint_graph.py", line 568> =======
========== pattern matched <code object linear at 0x7f8e59c04190, file "/home/shunting/ws/pytorch/torch/_inductor/fx_passes/mkldnn_fusion.py", line 1121> =======
========== pattern matched <code object reshape_linear_reshape_pattern at 0x7f8e59dfb520, file "/home/shunting/ws/pytorch/torch/_inductor/fx_passes/mkldnn_fusion.py", line 732> =======
========== pattern matched <code object fn at 0x7f8e59dfa290, file "/home/shunting/ws/pytorch/torch/_inductor/fx_passes/mkldnn_fusion.py", line 476> =======
```
I fixed this test by adding 1 to the expected number when cpp wrapper is enabled. But fundamentally, can we avoid asserting on the total number of patterns matched in the test? That makes the test very fragile: people adding new patterns may keep breaking these 'unrelated' tests. One possible improvement: keep a counter for each specific pattern, and in the tests, instead of checking the total number of patterns matched, check the match count for only the ***RELEVANT*** patterns. That should reduce false positives for broken tests.
cc possible test creator @jgong5. Fixes #139812 (we need to have this to run the disabled test on your PR). Pull Request resolved: #139942 Approved by: https://github.com/huydhn, https://github.com/jgong5
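A sketch of the per-pattern counter idea suggested above. `torch._dynamo.utils.counters` is the existing counter table; the specific `pointless_view_pair` key is a hypothetical example, not a counter that necessarily exists:

```
from torch._dynamo.utils import counters

# Inside a pattern's replacement callback, bump a counter keyed by the
# pattern's own name (hypothetical key):
counters["inductor"]["pointless_view_pair"] += 1

# In the test, assert only on the pattern under test rather than on the
# total number of matches, so unrelated new patterns don't break it:
assert counters["inductor"]["pointless_view_pair"] == 1
```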
…ess_view (#154154) The change is direct and clear: the optimization removes pointless_view iff all sizes are the same; if they aren't, we want to return false, so there is no need for size-oblivious reasoning. This was added in #139136; run the existing tests added in that PR. Pull Request resolved: #154154 Approved by: https://github.com/bobrenjc93
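A sketch of the semantics described above, assuming the check is expressed with `statically_known_true` from `torch.fx.experimental.symbolic_shapes` (which returns False rather than guarding when equality can't be proven); the actual diff in #154154 may be written differently:

```
from torch.fx.experimental.symbolic_shapes import statically_known_true

def sizes_definitely_equal(lhs_sizes, rhs_sizes) -> bool:
    # Rewrite only when every dimension is provably equal; any unprovable
    # dimension makes this return False without installing a guard.
    return len(lhs_sizes) == len(rhs_sizes) and all(
        statically_known_true(lhs == rhs)
        for lhs, rhs in zip(lhs_sizes, rhs_sizes)
    )
```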