ROCm contributions week 28 #9432
Bring our fork up to date.
Merge from upstream master into our fork.
sync w/ upstream
add pyHIPIFY to PyTorch repo
Update the copyright dates and reflect the reality that this has been a collaborative effort between AMD and Facebook in the last years.
Script is now included in pytorch source.
first round of changes to update PR
merge from upstream
This reverts commit 864dbe4.
next round of fixes to address comments
merge from upstream
After discussion in review, disable flake8 on pyHIPIFY for now.
smem_size = block.x == 1 ? 0 : block_threads * sizeof(accscalar_t);
int max_active_blocks;
#ifdef __HIP_PLATFORM_HCC__
max_active_blocks = 16;
  if (range > 1ULL << 32) {
    generate_random_64<<<NUM_BLOCKS, BLOCK_SIZE, 0, THCState_getCurrentStream(state)>>>(
-       gen->state.gen_states, size, data, min_val, range);
+       gen->state.gen_states, static_cast<int32_t>(size), data, min_val, range);
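The launch above shows the distinction that matters for casting: the values inside `<<<...>>>` are launch arguments, while the parenthesized list that follows holds the kernel arguments. A minimal Python sketch of casting only the kernel arguments follows. It is not pyHIPIFY's actual code: the helpers `split_top_level` and `cast_kernel_args` are hypothetical, and the per-argument type list is supplied by hand rather than read from a kernel signature.

```python
import re

# Matches `kernel<<<launch config>>>(kernel args);` (simplified, an assumption).
LAUNCH_RE = re.compile(
    r"(?P<kernel>\w+)\s*<<<(?P<config>.*?)>>>\s*\((?P<args>.*)\)\s*;",
    re.DOTALL,
)


def split_top_level(text):
    """Split on commas that are not nested inside (), [] or {}.

    Angle brackets are deliberately ignored because `->` and `<<` would
    confuse the depth count; template arguments containing commas are
    therefore not handled by this sketch.
    """
    parts, depth, current = [], 0, []
    for ch in text:
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def cast_kernel_args(launch, param_types):
    """Wrap kernel arguments in static_cast; leave launch arguments alone.

    `param_types` holds one entry per kernel argument; None means
    "do not cast this argument".
    """
    match = LAUNCH_RE.fullmatch(launch.strip())
    if match is None:
        raise ValueError("not a kernel launch")
    args = split_top_level(match.group("args"))
    if len(args) != len(param_types):
        raise ValueError("argument count does not match type list")
    casted = [
        arg if typ is None else "static_cast<{}>({})".format(typ, arg)
        for typ, arg in zip(param_types, args)
    ]
    # The launch configuration is copied through unchanged.
    return "{}<<<{}>>>({});".format(
        match.group("kernel"), match.group("config"), ", ".join(casted)
    )


launch = (
    "generate_random_64<<<NUM_BLOCKS, BLOCK_SIZE, 0, "
    "THCState_getCurrentStream(state)>>>("
    "gen->state.gen_states, size, data, min_val, range);"
)
# Only the second kernel argument is cast; the type list is illustrative.
print(cast_kernel_args(launch, [None, "int", None, None, None]))
```

Splitting on top-level commas keeps `THCState_getCurrentStream(state)` intact as a single launch argument, which a naive `str.split(",")` would also do here but would not for nested calls with several arguments.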
PyTorch specific translations.
Each of the entries in these maps translates a CUDA string to a tuple containing the
ROCm/HIP string, a type and API annotation and - optionally - an annotation if it is not
supported in ROCm/HIP yet.
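The data format described in this docstring can be illustrated with a small Python sketch. The constant names, the map name `CUDA_TO_HIP`, the helper `translate`, and the `cudaMadeUpCall` entry are all assumptions for illustration and are not taken from the pyHIPIFY source; only the `cudaMalloc`/`cudaFree` to `hipMalloc`/`hipFree` correspondence reflects the real HIP API.

```python
# Hypothetical annotation constants; the real script defines its own.
CONV_MEM = "memory"            # type annotation (illustrative)
API_RUNTIME = "runtime"        # API annotation (illustrative)
HIP_UNSUPPORTED = "unsupported"  # optional marker (illustrative)

# CUDA string -> (ROCm/HIP string, type annotation, API annotation[, unsupported])
CUDA_TO_HIP = {
    "cudaMalloc": ("hipMalloc", CONV_MEM, API_RUNTIME),
    "cudaFree": ("hipFree", CONV_MEM, API_RUNTIME),
    # Made-up entry that only demonstrates the optional fourth element.
    "cudaMadeUpCall": ("hipMadeUpCall", CONV_MEM, API_RUNTIME, HIP_UNSUPPORTED),
}


def translate(cuda_name):
    """Return (hip_name, supported) for a CUDA identifier in the map."""
    entry = CUDA_TO_HIP[cuda_name]
    hip_name = entry[0]
    # Anything past the two mandatory annotations is optional.
    supported = HIP_UNSUPPORTED not in entry[3:]
    return hip_name, supported


print(translate("cudaMalloc"))
print(translate("cudaMadeUpCall"))
```

Keeping the unsupported marker as an optional trailing element means existing three-element entries need no change when an API becomes supported or unsupported.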
ezyang left a comment:
Looks basically reasonable.
facebook-github-bot left a comment:
@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@iotamudelta Do you mind fixing those minor nits? Then I can merge it.
As per review, change cast to just int.
…hin hipify-python
Merge from upstream
Refactoring & Fixing the pyhipify script.
Hardcode getNumGPUs() to 1 for ROCm builds …
… work for multi-GPU setup anyway, and gives a seg fault on call to getNumGPUs()
Merge from upstream
facebook-github-bot left a comment:
@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary: This PR contains the ROCm contributions of last week:
* documentation of the pyHIPIFY data format, originating from ezyang's review comments on #8812
* removal of most patch files from the `amd_build` directory and integration into the code base
* enabling of previously disabled_features that do compile now
* improvement to the static_cast feature in pyHIPIFY (it will only apply static_cast to kernel arguments, not launch arguments)
* addition of two workarounds to pyHIPIFY for ROCm/HIP shortcomings:
  a) `__forceinline__` does not imply `static`, hence change to `__inline__`,
  b) `std::[exp,log,pow]` math functions cannot be selected in device code, use `::[exp,log,pow]` instead.

Both of these workarounds will be removed once the issues are fixed upstream. Neither of these issues has surfaced on the CI, but both were reproduced internally.

Pull Request resolved: pytorch/pytorch#9432
Differential Revision: D8887441
Pulled By: ezyang
fbshipit-source-id: 71cf5c6b13772a66d10be369a45ebf06e4e268e1
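The two workarounds named in the summary amount to textual rewrites, which could be sketched in Python roughly as follows. This is an illustration under stated assumptions and not the code added to pyHIPIFY: the function `apply_workarounds` and both regular expressions are hypothetical, and the sketch rewrites the whole source string rather than restricting itself to device code.

```python
import re

# Workaround a): `__forceinline__` does not imply `static` on ROCm/HIP,
# so it is replaced with `__inline__`.
FORCEINLINE_RE = re.compile(r"\b__forceinline__\b")

# Workaround b): std::exp / std::log / std::pow are rewritten to the
# global-namespace ::exp / ::log / ::pow. The trailing \b keeps names
# such as std::expm1 untouched.
STD_MATH_RE = re.compile(r"\bstd::(exp|log|pow)\b")


def apply_workarounds(source):
    """Apply both textual workarounds to a source string."""
    source = FORCEINLINE_RE.sub("__inline__", source)
    source = STD_MATH_RE.sub(r"::\1", source)
    return source


snippet = (
    "__device__ __forceinline__ float f(float x) "
    "{ return std::exp(x) + std::pow(x, 2.0f); }"
)
print(apply_workarounds(snippet))
```

Because both rewrites are isolated in their own patterns, dropping one once the corresponding issue is fixed upstream would mean deleting a single pattern and its `sub` call.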