Skip to content

Conversation

@chunyuan-w
Copy link
Collaborator

@chunyuan-w chunyuan-w commented Sep 24, 2024

Stack from ghstack (oldest at bottom):

The template_buffer_has_other_users function checks the case where there're epilogue nodes and the template output has users other than these epilogue nodes. When there's no epilogue nodes, the function could return False directly.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang

@pytorch-bot
Copy link

pytorch-bot bot commented Sep 24, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/136518

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 289f83b with merge base e9bfbf7 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

[ghstack-poisoned]
[ghstack-poisoned]
@chunyuan-w chunyuan-w marked this pull request as ready for review September 25, 2024 02:37
@chunyuan-w chunyuan-w added the ciflow/trunk Trigger trunk jobs on your pull request label Sep 25, 2024
@chunyuan-w
Copy link
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

[ghstack-poisoned]
[ghstack-poisoned]
BoyuanFeng pushed a commit to BoyuanFeng/pytorch that referenced this pull request Sep 25, 2024
…no epilogue_nodes (pytorch#136518)

The `template_buffer_has_other_users` function checks the case where there're epilogue nodes and the template output has users other than these epilogue nodes.  When there's no epilogue nodes, the function could return `False` directly.

Pull Request resolved: pytorch#136518
Approved by: https://github.com/leslie-fang-intel, https://github.com/jgong5
ghstack dependencies: pytorch#136418
pytorchmergebot pushed a commit that referenced this pull request Sep 27, 2024
Fixes the max-autotune failure of `soft_actor_critic` of Torchbench in FP32 single thread dynamic shape case:
```log
  File "/home/user/inductor/pytorch/torch/_inductor/codegen/cpp_micro_gemm.py", line 136, in codegen_call
    C_ptr = f"&({kernel.index(C, [0, 0])})"
  File "/home/user/inductor/pytorch/torch/_inductor/codegen/cpp_template_kernel.py", line 135, in index
    else self.args.input(node.get_name())
  File "/home/user/inductor/pytorch/torch/_inductor/codegen/common.py", line 1251, in input
    assert name not in V.graph.removed_buffers, name
AssertionError: buf_GemmOut
```

The 1st and 2nd linear does not need to use local buffer while the 3rd linear needs to use local buffer.
The 3rd linear which uses local buffer will add its global buffer (named as `buf_GemmOut`) into `V.graph.removed_buffers`.

When scheduling the nodes, the 1st linear (won't use local buffer) will get its output buffer (also named as `buf_GemmOut`) from the input and found that it's in the `V.graph.removed_buffers` and raise AssertionError. The issue is that the output buffer of all these linears are all names with `buf_GemmOut`, which have a conflict.

Rename these buffers by adding the name of the `template_buffer` as the prefix.

Pull Request resolved: #136419
Approved by: https://github.com/leslie-fang-intel, https://github.com/jgong5
ghstack dependencies: #136418, #136518
@github-actions github-actions bot deleted the gh/chunyuan-w/33/head branch October 27, 2024 02:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants