Conversation

@jgong5
Collaborator

@jgong5 jgong5 commented May 13, 2024

@pytorch-bot

pytorch-bot bot commented May 13, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/126068

Note: Links to docs will display an error until the docs builds have been completed.

❌ 21 Cancelled Jobs, 70 Unrelated Failures

As of commit cd4decd with merge base 94fea82 (image):

CANCELLED JOBS - The following jobs were cancelled. Please retry:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

UNSTABLE - The following jobs failed but were likely due to flakiness present on trunk and have been marked as unstable:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jgong5 pushed a commit that referenced this pull request May 13, 2024
@jgong5 jgong5 marked this pull request as draft May 13, 2024 14:22
@jgong5 jgong5 changed the title [inductor][cpp] bf16/fp16 gemm template computed with fp32 [inductor][cpp] bf16/fp16 gemm template computed with fp32 w/o epilogue fusion May 14, 2024
jgong5 pushed a commit that referenced this pull request May 14, 2024
@jgong5 jgong5 marked this pull request as ready for review May 14, 2024 09:16
… w/o epilogue fusion"


This PR adds initial bf16/fp16 gemm template support, with the micro-gemm implemented using fused type casting and fp32 computation. It does not provide epilogue fusion support yet; that will be added in the next PR.
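To make "fused type casting and fp32 computation" concrete, here is a minimal pure-Python sketch. It is not the C++ micro-gemm from this PR. The helper names (`to_fp32`, `to_bf16`, `gemm_bf16`) are hypothetical, bf16 is emulated by rounding the fp32 bit pattern, and NaN/inf handling is ignored. It only illustrates why accumulating in fp32 and downcasting once at the end differs from rounding the accumulator to bf16 at every step.

```python
import struct


def to_fp32(x: float) -> float:
    """Round a Python float (fp64) to the nearest fp32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_bf16(x: float) -> float:
    """Round a finite value to bfloat16 (round-to-nearest-even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bf16 keeps the upper 16 bits of the fp32 encoding; add a bias so the
    # truncation below rounds to nearest, with ties going to the even value.
    bias = 0x7FFF + ((bits >> 16) & 1)
    bits = (((bits + bias) >> 16) << 16) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def gemm_bf16(a, b, accumulate_in_fp32=True):
    """C = A @ B for bf16-valued inputs, returning bf16-valued outputs.

    accumulate_in_fp32=True  : upcast inputs, accumulate in fp32, downcast once.
    accumulate_in_fp32=False : round the accumulator to bf16 after every step
                               (shown only for contrast).
    """
    m, k, n = len(a), len(b), len(b[0])
    round_acc = to_fp32 if accumulate_in_fp32 else to_bf16
    c = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                # The product of two bf16 values is exactly representable in fp32.
                acc = round_acc(acc + to_fp32(a[i][p] * b[p][j]))
            c[i][j] = to_bf16(acc)  # single downcast of the final result
    return c


if __name__ == "__main__":
    a = [[256.0, 1.0, 1.0, 1.0, 1.0]]
    b = [[1.0], [1.0], [1.0], [1.0], [1.0]]
    print(gemm_bf16(a, b, accumulate_in_fp32=True))
    print(gemm_bf16(a, b, accumulate_in_fp32=False))
```

In this example the bf16 accumulator stalls at 256 because 257 is not representable in bf16 and rounds back down, while the fp32 accumulator reaches 260, which is representable in bf16.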

cc voznesenskym penguinwu EikanWang Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang

@jgong5 jgong5 requested review from jansel and lezcano and removed request for jansel May 14, 2024 12:16
jgong5 pushed a commit that referenced this pull request May 14, 2024
… w/o epilogue fusion"


As part of #125683, this PR adds initial bf16/fp16 gemm template support, with the micro-gemm implemented using fused type casting and fp32 computation. It does not provide epilogue fusion support yet; that will be added in the next PR.

cc voznesenskym penguinwu EikanWang Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 ColinPeppler amjames desertfire chauhang

Differential Revision: [D58017580](https://our.internmc.facebook.com/intern/diff/D58017580)

@desertfire
Contributor

@pytorchbot revert -m "failing internal tests" -c "ghfirst"

The cpu_select_algorithm tests you added/changed seem to be failing internally

Example of 1 error: pastebin.com/BB8E1Ngx

Generally the succinct lines are in the form of

222/torch/include/ATen/cpu/vec/vec_half.h:19:10: error: '__builtin_ia32_vcvtps2ph' needs target feature f16c

616/torch/include/ATen/cpu/vec/vec256/vec256_bfloat16.h:107:7: error: always_inline function '_mm256_cvtph_ps' requires target feature 'f16c', but would be inlined into function 'cvtfp16_fp32' that is compiled without support for 'f16c'

34/torch/include/ATen/cpu/vec/vec_half.h:37:10: error: always_inline function '_cvtsh_ss' requires target feature 'f16c', but would be inlined into function 'half2float_scalar' that is compiled without support for 'f16c'

etc.
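For triaging a long build log with errors of this shape, a small script can tally which target features the compiler says are missing. This is only a sketch: the function name `missing_target_features` is hypothetical, and the two regular expressions cover just the two message forms quoted above, so other compilers or versions may word these errors differently.

```python
import re

# The two message forms seen in the errors quoted above.
_PATTERNS = (
    re.compile(r"needs target feature (?P<feat>[\w.]+)"),
    re.compile(r"requires target feature '(?P<feat>[\w.]+)'"),
)


def missing_target_features(log_lines):
    """Count how often each missing target feature is reported in a build log."""
    counts = {}
    for line in log_lines:
        for pattern in _PATTERNS:
            match = pattern.search(line)
            if match:
                feat = match["feat"]
                counts[feat] = counts.get(feat, 0) + 1
                break  # one feature per error line is enough for a tally
    return counts


if __name__ == "__main__":
    sample = [
        "vec_half.h:19:10: error: '__builtin_ia32_vcvtps2ph' needs target feature f16c",
        "vec256_bfloat16.h:107:7: error: always_inline function '_mm256_cvtph_ps' "
        "requires target feature 'f16c', but would be inlined into function "
        "'cvtfp16_fp32' that is compiled without support for 'f16c'",
    ]
    print(missing_target_features(sample))
```

A tally dominated by a single feature, as with the lines quoted above, suggests the generated code is being compiled without the corresponding compiler flag rather than hitting several unrelated problems.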

@PaliC, fbcode and GitHub are out of sync on this one. This PR is reverted but D58015598 isn't. Please help resolve the problem.

@desertfire
Contributor

@desertfire has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@jgong5
Collaborator Author

jgong5 commented Jun 5, 2024

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge failed

Reason: This PR has internal changes and must be landed via Phabricator

Details for Dev Infra team Raised by workflow job

petrex pushed a commit to petrex/pytorch that referenced this pull request Jun 5, 2024
@jgong5
Collaborator Author

jgong5 commented Jun 7, 2024

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge failed

Reason: This PR has internal changes and must be landed via Phabricator

Details for Dev Infra team Raised by workflow job

@jgong5
Collaborator Author

jgong5 commented Jun 8, 2024

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge failed

Reason: This PR has internal changes and must be landed via Phabricator

Details for Dev Infra team Raised by workflow job

Jiong Gong added 2 commits June 8, 2024 01:27
[ghstack-poisoned]
[ghstack-poisoned]
9 participants