
Conversation

@iotamudelta
Contributor

Missing __device__ and __host__ annotations in the complex case. Make it less UB.

Note that this is still rather unsavory code: std::real is only constexpr from C++14 onwards ( https://en.cppreference.com/w/cpp/numeric/complex/real2 ), which is the requirement for calling it from __device__ code.

What I am trying to say is: this particular piece of code should not have passed review and should not have been merged, IMHO, as it codifies UB.

Also note that the benchmarks referenced in the source were CPU- and CUDA-only.
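To illustrate the shape of the problem, a minimal sketch (with a hypothetical helper name, not the actual PyTorch source): once a function is compiled for the device, every call inside it must itself be usable from device code, which for std::real means the C++14 constexpr version.

    #include <complex>

    // Minimal sketch, assuming a hypothetical helper. Under nvcc/hipcc this
    // function is compiled for both host and device; the std::real call is
    // only usable from the device side once std::real is constexpr (C++14,
    // with relaxed-constexpr in effect). Against C++11 headers this is
    // exactly the UB discussed above.
    __host__ __device__ inline float real_part(std::complex<float> v) {
      return std::real(v);
    }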

@iotamudelta added the module: rocm (AMD GPU support for PyTorch) and open source labels Nov 11, 2019
@iotamudelta requested review from bddppq and ezyang November 11, 2019 14:21
@ezyang
Contributor

ezyang commented Nov 11, 2019

cc @dylanbespalko

@ezyang
Contributor

ezyang commented Nov 11, 2019

cc @zasdfgbnm

@ezyang
Contributor

ezyang commented Nov 11, 2019

waiting on CI

@iotamudelta
Contributor Author

Looks like this is already failing CI.

@bddppq
Contributor

bddppq commented Nov 11, 2019

master doesn't seem broken. The CentOS build failure looks to be an infra issue:

14:48:11 error: failed to execute compile
14:48:11 caused by: error reading compile response from server
14:48:11 caused by: Failed to read response header
14:48:11 caused by: failed to fill whole buffer

I have kicked off another run.

Contributor

@facebook-github-bot left a comment

@bddppq has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@dylanbespalko mentioned this pull request Nov 11, 2019
@dylanbespalko
Contributor

@iotamudelta,

I'm working on adding complex number support to PyTorch. I'm sorry about the issue; I have added it to the discussion on complex numbers, #755. Please provide your feedback there.


  template<>
- inline void cast_and_store<std::complex<float>>(const ScalarType dest_type, void *ptr, std::complex<float> value_) {
+ C10_HOST_DEVICE inline void cast_and_store<std::complex<float>>(const ScalarType dest_type, void *ptr, std::complex<float> value_) {
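For readers without the file open, a rough sketch of what a specialization of this shape does (simplified body, not the actual PyTorch source): it switches on the runtime destination dtype, and the real-valued branches go through std::real, the constexpr-since-C++14 call discussed above.

    // Hedged sketch, not the actual body; assumes at::ScalarType and the
    // c10 C10_HOST_DEVICE macro are in scope.
    template <>
    C10_HOST_DEVICE inline void cast_and_store<std::complex<float>>(
        const ScalarType dest_type, void* ptr, std::complex<float> value_) {
      switch (dest_type) {
        case ScalarType::Float:
          // Storing into a real dtype keeps only the real part.
          *static_cast<float*>(ptr) = std::real(value_);
          break;
        // ... cases for the other destination dtypes ...
        default:
          break; // unsupported cast
      }
    }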
Collaborator

@zasdfgbnm Nov 11, 2019

Hi @iotamudelta, is there any specific warning you are seeing? IIRC, when I wrote this piece of code I did this intentionally, because this function shouldn't be called inside a __device__ or __device__ __host__ function (we didn't support complex on CUDA).

Contributor Author

We are seeing issues with this with clang (not hcc), which does instantiate the template for complex. I agree that it's not UB if it's never instantiated for complex.

Contributor Author

I should add: with a vanilla PyTorch code base.

Collaborator

Strange; this could only be a problem if, inside the kernel, the dispatch has scalar_t being one of the complex types, and I don't think any dispatch does.
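For concreteness, a hedged sketch of the dispatch shape in question (simplified; not the actual AT_DISPATCH macros): the generic kernel body, and anything it calls such as cast_and_store, is only instantiated for the scalar types that actually appear in the switch.

    #include <complex>

    enum class ScalarType { Float, Double, ComplexFloat /* ... */ };

    // Sketch of a type dispatch: f is instantiated once per case. If no
    // case passes a complex type, no complex instantiation ever reaches
    // device code -- which is the invariant being discussed here.
    template <typename F>
    void dispatch(ScalarType t, F&& f) {
      switch (t) {
        case ScalarType::Float:  f(float{});  break;
        case ScalarType::Double: f(double{}); break;
        // Adding this case is what would instantiate f (and any cast
        // helpers it calls) for std::complex<float>:
        // case ScalarType::ComplexFloat: f(std::complex<float>{}); break;
        default: break;
      }
    }

A caller would pass a generic lambda, e.g. dispatch(t, [](auto scalar) { /* kernel body, with decltype(scalar) playing the role of scalar_t */ });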

Collaborator

@zasdfgbnm left a comment

Please see my reply here: #755 (comment)

cast_and_store is not expected to be called anywhere inside device code in current PyTorch; that is why the C10_HOST_DEVICE qualifier was not there, and there is no UB in current PyTorch. But since we are switching to C++14 soon, it is nice to have the C10_HOST_DEVICE qualifier, which will be necessary once we enable complex on GPU.
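For context, roughly how a C10_HOST_DEVICE-style macro works (a simplified sketch; see c10/macros/Macros.h for the real definition):

    // Sketch: expands to the dual annotation under a CUDA/HIP compiler and
    // to nothing in a plain host build, so annotated functions remain
    // valid C++ in CPU-only builds.
    #if defined(__CUDACC__) || defined(__HIPCC__)
    #define C10_HOST_DEVICE __host__ __device__
    #else
    #define C10_HOST_DEVICE
    #endif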

@bddppq
Contributor

bddppq commented Nov 11, 2019

The build fails when compiling the hipified BinaryOpsKernel.cu (unfortunately I don't see any useful error message in the build log):

23:27:31 CMake Error at torch_generated_BinaryOpsKernel.hip.o.cmake:174 (message):
23:27:31   Error generating file
23:27:31   /var/lib/jenkins/workspace/build/caffe2/CMakeFiles/torch.dir/__/aten/src/ATen/native/hip/./torch_generated_BinaryOpsKernel.hip.o
23:27:31 
23:27:31 

This file has been split into smaller ones in #29428 (thanks to @zasdfgbnm) because the compilation unit was too big. @iotamudelta, could you rebase this one onto the current master to see if the build failure goes away?

facebook-github-bot pushed a commit that referenced this pull request Nov 12, 2019
Summary:
Currently, the dynamic casting mechanism is implemented assuming no support for complex on GPU. This will no longer be true in the near future.

#29547 could clear some clang warnings, but complex support on GPU is still not complete:
- fetch is not supported
- casting between complex64 and complex128 is not supported
- complex scalar types are not tested

This PR is what should be done for type promotion in order to add support for complex dtypes on GPU, as suggested in #755 (comment)
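For orientation, a minimal sketch of the dynamic-cast pair that this type promotion relies on (simplified signatures in the spirit of the helpers touched by this PR; the "fetch" in the list above likely refers to the load side of this pair):

    // Hedged sketch with simplified signatures. For type promotion a kernel
    // loads operands of a runtime-selected dtype, computes in the common
    // dtype, and stores back with a runtime-selected cast:
    template <typename dest_t>
    C10_HOST_DEVICE dest_t fetch_and_cast(ScalarType src_type, const void* ptr);

    template <typename src_t>
    C10_HOST_DEVICE void cast_and_store(ScalarType dest_type, void* ptr, src_t value);

    // Inside the loop, roughly:
    //   float a = fetch_and_cast<float>(input_dtype, in_ptr);
    //   cast_and_store<float>(output_dtype, out_ptr, op(a));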

Note that what is newly added in this PR is not tested, due to the lack of basic support for complex dtypes (I cannot construct a complex tensor). But this PR shouldn't break any existing part of PyTorch.

For merging this PR, consider two options:
- We could merge this PR now so that dylanbespalko can conveniently work on top of master; if there is something wrong here that code review missed, dylanbespalko would find it when adding complex integration.
- Or we could leave this PR open and not merge it, but then dylanbespalko might need to manually apply it to his branch in order to support type promotion for complex.
Pull Request resolved: #29612

Differential Revision: D18451061

Pulled By: ezyang

fbshipit-source-id: 6d4817e87f0cc2e844dc28c0355a7e53220933a6
@bddppq
Contributor

bddppq commented Nov 12, 2019

@iotamudelta Could you resolve merge conflicts?

@iotamudelta
Contributor Author

@bddppq should be resolved. Looks like the rebase minimized the required changes.

Contributor

@facebook-github-bot left a comment

@bddppq has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@iotamudelta
Contributor Author

@bddppq any insight into the one failing internal test target?

@ezyang
Contributor

ezyang commented Nov 13, 2019

It's fake, you're good to go.

@facebook-github-bot
Contributor

@bddppq merged this pull request in 6d54c5d.

@iotamudelta deleted the missing_host_device branch November 13, 2019 21:45