
Conversation

@pytorchbot pytorchbot (Collaborator) commented Aug 13, 2024

This PR is needed to resolve usability issues with PyTorch ROCm nightly wheels on non-gfx90a/gfx94x architectures as a result of #127944.

Addresses #119081 (comment)
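
For context, the gate this PR relies on can be probed from Python. A minimal sketch, assuming a ROCm build where device properties expose `gcnArchName`; the gfx940/941/942 expansion of "gfx94x" is my illustration, not a list taken from this PR:

```
import torch

# Assumption: hipBLASLt is only usable on gfx90a and the gfx94x family.
# gfx940/941/942 is an illustrative expansion of "gfx94x", not the PR's list.
HIPBLASLT_ARCHS = {"gfx90a", "gfx940", "gfx941", "gfx942"}

def hipblaslt_supported(device: int = 0) -> bool:
    # ROCm builds report the GFX target, e.g. "gfx90a:sramecc+:xnack-";
    # drop the feature suffix before comparing against the allow-list.
    arch = torch.cuda.get_device_properties(device).gcnArchName
    return arch.split(":")[0] in HIPBLASLT_ARCHS
```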

### With this PR's changes, I get the following on a gfx908 (unsupported by hipblasLT) architecture:

_Using setter function:_
```
>>> torch.backends.cuda.preferred_blas_library(backend="cublaslt")
[W617 19:58:58.286088851 Context.cpp:280] Warning: torch.backends.cuda.preferred_blas_library is an experimental feature. If you see any error or unexpected behavior when this flag is set please file an issue on GitHub. (function operator())
[W617 19:59:02.125161985 Context.cpp:291] Warning: Attempting to use hipBLASLt on an unsupported architecture! Overriding blas backend to hipblas (function operator())
<_BlasBackend.Cublas: 0>
```

_Using `TORCH_BLAS_PREFER_CUBLASLT` env var:_
```
root@9d47bf40d4d4:/tmp/pytorch# TORCH_BLAS_PREFER_CUBLASLT=1 python
>>> import torch
>>> torch.backends.cuda.preferred_blas_library()
[W619 06:14:11.627715807 Context.cpp:274] Warning: Attempting to use hipBLASLt on an unsupported architecture! Overriding blas backend to hipblas (function operator())
<_BlasBackend.Cublas: 0>
```
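
Either path downgrades the request instead of failing at GEMM time, so the getter can be used to verify what actually took effect. A minimal check on the unsupported part, assuming the `torch._C._BlasBackend` enum shown in the transcripts above:

```
import torch

# Ask for hipBLASLt anyway; with this PR the setter warns and falls back.
effective = torch.backends.cuda.preferred_blas_library(backend="cublaslt")

# On gfx908 the effective backend is plain hipBLAS, reported as Cublas (0).
assert effective == torch._C._BlasBackend.Cublas, f"unexpected backend: {effective}"
```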

### and the following on a gfx90a (supported by hipblasLT) architecture:

_Using setter function:_
```
>>> import torch
>>> torch.backends.cuda.preferred_blas_library()
<_BlasBackend.Cublaslt: 1>
>>> torch.backends.cuda.preferred_blas_library(backend="cublas")
<_BlasBackend.Cublas: 0>
>>> torch.backends.cuda.preferred_blas_library(backend="cublaslt")
[W620 18:38:29.404265518 Context.cpp:293] Warning: torch.backends.cuda.preferred_blas_library is an experimental feature. If you see any error or unexpected behavior when this flag is set please file an issue on GitHub. (function operator())
<_BlasBackend.Cublaslt: 1>
```

_Using `TORCH_BLAS_PREFER_HIPBLASLT` env var:_
```
root@9d47bf40d4d4:/tmp/pytorch# TORCH_BLAS_PREFER_HIPBLASLT=1 python
>>> import torch
>>> torch.backends.cuda.preferred_blas_library()
<_BlasBackend.Cublaslt: 1>
```

(Using the `TORCH_BLAS_PREFER_CUBLASLT` env var gives the same result.)
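
The upshot is that downstream code does not need to special-case the architecture: set a preference, then branch on whatever the getter reports. A hedged usage sketch, reusing the enum names from the transcripts above:

```
import torch

backend = torch.backends.cuda.preferred_blas_library()
if backend == torch._C._BlasBackend.Cublaslt:
    # hipBLASLt (cuBLASLt on NVIDIA) is active on this device.
    print("Lt BLAS backend in use")
else:
    # The preference was overridden (or never set); plain hipBLAS/cuBLAS is used.
    print("plain BLAS backend in use")
```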

cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @jataylo @hongxiayang

… hipblasLT (#128753)

Pull Request resolved: #128753
Approved by: https://github.com/malfet

(cherry picked from commit e16276b)
pytorch-bot bot commented Aug 13, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/133359

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit 79c1390 with merge base b66e3f0:


This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot bot added the ciflow/rocm (Trigger "default" config CI on ROCm) and module: rocm (AMD GPU support for Pytorch) labels Aug 13, 2024
@pruthvistony pruthvistony (Collaborator) left a comment

This patch is required to control the hipblasLT backend on the few GPU architectures where it is NOT supported.

@pruthvistony pruthvistony (Collaborator) commented

@atalman Can you please help review and merge this cherry-pick?

@atalman atalman merged commit 7e0ef34 into release/2.4 Aug 15, 2024
@atalman atalman deleted the cherry-pick-128753-by-pytorch_bot_bot_ branch August 15, 2024 19:33