Conversation

@jerrymannil
Collaborator

Enable *_load_dwordx4 ISA for BFloat16 and Half by using vector size of 8

Co-author: @akadutta

@okakarpa
Collaborator

Jenkins build for 1626a877c9703e7ad341e8710b7bcf62e0ece7d7 commit finished as FAILURE

Detected error during Pytorch building:

Warning: Unused direct dependencies:
	/var/lib/jenkins/pytorch/build/lib/libshm.so
	/opt/rocm/lib/libhsa-runtime64.so.1
	/lib/x86_64-linux-gnu/libm.so.6
[8000/8635] Building HIPCC object caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o
FAILED: caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o 
cd /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip && /opt/conda/envs/py_3.10/bin/cmake -E make_directory /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/. && /opt/conda/envs/py_3.10/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=RELEASE -D generated_file:STRING=/var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/./torch_hip_generated_attention.hip.o -P /var/lib/jenkins/pytorch/build/caffe2/CMakeFiles/torch_hip.dir/__/aten/src/ATen/native/transformers/hip/torch_hip_generated_attention.hip.o.cmake
In file included from /var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/attention.hip:84:
/var/lib/jenkins/pytorch/aten/src/ATen/native/transformers/hip/aotriton_adapter.h:120:10: error: no matching constructor for initialization of 'aotriton::TensorView<0>'
  120 |   return aotriton::TensorView<0>(reinterpret_cast<intptr_t>(q.data_ptr()),
      |          ^                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@pruthvistony
Collaborator

@jerrymannil
Can we upstream this change?

@pruthvistony merged commit 8f9b9d3 into ROCm:release/2.4 Nov 21, 2024
1 check failed
@jerrymannil
Collaborator Author

Can we upstream this change?

Let's wait for QA testing to complete. Also, @jeffdaily has asked me to do a full UT suite run; doing that now.

@jithunnair-amd changed the title [ROCm] Enable vector size for 8 for half precision types in elementwise kernels (#1671) [release/2.4] [ROCm] Enable vector size for 8 for half precision types in elementwise kernels (#1671) Nov 25, 2024
@jerrymannil
Collaborator Author

!cherry-pick --onto release/2.5

rocm-mici pushed a commit that referenced this pull request Jan 13, 2025
…se kernels (#1671) (#1738)

Enable *_load_dwordx4 ISA for BFloat16 and Half by using vector size of 8

Co-author: @akadutta
@rocm-mici

Created branch release/2.5_cherry-pick_pr-1738 and #1831

pruthvistony pushed a commit that referenced this pull request Jan 14, 2025
…f precision types in elementwise kernels (#1831)

Cherry-pick of #1738

Co-authored-by: Jerry Mannil <[email protected]>
jithunnair-amd pushed a commit that referenced this pull request Mar 17, 2025
…se kernels (#1671) (#1738)

Enable *_load_dwordx4 ISA for BFloat16 and Half by using vector size of 8

Co-author: @akadutta
