[Unity][Dlight] Fix decode-GeMV rule when spatial-inner without broadcasting #15330

MasterJH5574 · 2023-07-16T05:48:10Z

This PR fixes a bug in the previous decode-GeMV dlight scheduling rule.

Previously, when the inner dimension of the largest tensor was spatial, the fused epilogue block ended up not bound to any thread axis. This is incorrect and produces GPU code with wrong numerical results. The cause is that after reverse-compute-at is applied to the epilogue block, at least one spatial axis remains, and that axis is supposed to be bound to threadIdx.
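To illustrate why the unbound axis matters, here is a toy model in plain Python. It is not TVM code and not the dlight rule itself: the function name `gemv_with_epilogue`, the bias-add epilogue, and the assumption that each thread keeps its accumulator in thread-private storage are all hypothetical, chosen only to show the effect. Threads are simulated sequentially, so the buggy variant simply shows one possible outcome of what would be a race on a real GPU.

```python
def gemv_with_epilogue(W, x, bias, threads, bind_epilogue):
    """Toy model of a spatial-inner GeMV followed by a fused bias-add epilogue.

    The outer loop stands in for blockIdx.x and the `tx` loop for threadIdx.x.
    Each simulated thread owns one output row and a private accumulator `acc`.
    """
    n = len(W)
    out = [0] * n
    for block_start in range(0, n, threads):      # models blockIdx.x
        for tx in range(threads):                 # models threadIdx.x
            row = block_start + tx
            if row >= n:
                continue
            # Thread-private reduction result for this thread's row.
            acc = sum(w * v for w, v in zip(W[row], x))
            if bind_epilogue:
                # Epilogue axis bound to the thread: touch only our own row.
                out[row] = acc + bias[row]
            else:
                # Epilogue axis left as a serial loop: every thread walks the
                # whole tile and writes its own `acc` into every row.
                for s in range(threads):
                    if block_start + s < n:
                        out[block_start + s] = acc + bias[block_start + s]
    return out


W = [[1, 0], [0, 1], [1, 1], [2, 0]]
x = [1, 2]
bias = [10, 20, 30, 40]
reference = [sum(w * v for w, v in zip(r, x)) + b for r, b in zip(W, bias)]

bound = gemv_with_epilogue(W, x, bias, threads=2, bind_epilogue=True)
unbound = gemv_with_epilogue(W, x, bias, threads=2, bind_epilogue=False)
print(bound == reference, unbound == reference)
```

In this model the bound variant matches the reference, while the unbound variant overwrites rows with accumulators that belong to other rows.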

This PR fixes the issue and adds three test cases, which cover both the reduction-inner and spatial-inner cases, with or without broadcasting.

tvm-bot · 2023-07-16T05:48:13Z

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

cc @quic-sanirudh (See #10317 for details)

Generated by tvm-bot

Hzfengsy approved these changes Jul 16, 2023

MasterJH5574 added the branch: unity label Jul 16, 2023

MasterJH5574 mentioned this pull request Jul 16, 2023

[Bugfix][Dlight] Fix the schedule rule for decode-gemv #15279

Closed

MasterJH5574 force-pushed the unity-dev/2023-07-15-dlight-decode-gemv-spatial-inner branch from 3a31781 to 4c04c40 Compare July 16, 2023 10:12

tqchen merged commit cf401bc into apache:unity Jul 16, 2023
