Add safety check so that TransposeBigMLFloat16 test passes #77

Merged
TedThemistokleous merged 2 commits into rocm6.3_internal_testing from fix_transpose_big_mlfloat16_test
Nov 19, 2024

Conversation

@sstamenk
Member

Added a check that queries hipGetDeviceProperties and returns false if the requested gridDim.y exceeds the maximum gridDim.y the device supports.

This check already existed in the CUDA EP but was missing from the ROCm EP; as a result, the TransposeBigMLFloat16 test was failing.
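
For readers who want the shape of the check without opening the diff, here is a minimal host-side sketch. It is not the code from this PR: `GridLimits`, `kTileDim`, `TilesFor` and `CanUseTiledTranspose` are hypothetical names, and the 32-element tile edge is an assumption. In a real build the limits would be filled from the `maxGridSize` field of the `hipDeviceProp_t` returned by hipGetDeviceProperties; they are passed in here so the logic can be compiled and tested without a GPU.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the per-dimension grid limits that
// hipGetDeviceProperties reports in hipDeviceProp_t::maxGridSize.
struct GridLimits {
  int64_t max_x;  // would come from maxGridSize[0]
  int64_t max_y;  // would come from maxGridSize[1]
};

// Assumed tile edge: one thread block handles a kTileDim x kTileDim tile.
constexpr int64_t kTileDim = 32;

// Number of tiles needed to cover `extent` elements (ceiling division).
constexpr int64_t TilesFor(int64_t extent) {
  return (extent + kTileDim - 1) / kTileDim;
}

// Returns true when an m x n transpose can be launched with one block per
// tile without exceeding the device's grid limits in x or y.
inline bool CanUseTiledTranspose(int64_t m, int64_t n, const GridLimits& limits) {
  assert(m > 0 && n > 0);
  const int64_t grid_x = TilesFor(n);  // tiles along the columns
  const int64_t grid_y = TilesFor(m);  // tiles along the rows
  return grid_x <= limits.max_x && grid_y <= limits.max_y;
}
```

A caller would fall back to another transpose path whenever this returns false, which is what lets a test with a very large row count pass instead of launching an invalid grid.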

…ap kernel so that TransposeBigMLFloat16 test passes
@TedThemistokleous
Collaborator

Fix the formatting using their lintrunner -a tool.

@sstamenk sstamenk changed the title Added maximum gridDim.y overflow check before calling transposeNoOverlap kernel so that TransposeBigMLFloat16 test passes Add safety check to that TransposeBigMLFloat16 test passes Nov 18, 2024
return rocblas_dgeam(handle, transa, transb, m, n, alpha, A, lda, beta, B, ldb, C, ldc);
}

inline bool CanUse_rocblasTransposeHelper_MLFloat16(int /*m*/, int /*n*/) { return true; } // CUDA has a limited grid size of 65536, ROCm has higher limits.
Collaborator

Why did you remove the inline here?

Member Author

I moved the function's implementation into fpgeneric.cu to mirror how it is done in the CUDA EP.

@TedThemistokleous TedThemistokleous merged commit 061c493 into rocm6.3_internal_testing Nov 19, 2024
@TedThemistokleous
Collaborator

Upstreaming this.

@sstamenk sstamenk deleted the fix_transpose_big_mlfloat16_test branch November 19, 2024 16:02
@sstamenk sstamenk changed the title Add safety check to that TransposeBigMLFloat16 test passes Add safety check so that TransposeBigMLFloat16 test passes Nov 20, 2024
TedThemistokleous pushed a commit that referenced this pull request Jan 3, 2025
* Added maximum gridDim.y overflow check before calling transposeNoOverlap kernel so that TransposeBigMLFloat16 test passes

* Fix formatting