Conversation

@izdeby
Contributor

@izdeby izdeby commented Jun 17, 2019

Stack from ghstack:

Differential Revision: D15856091


Enabled bfloat16 on CPU for all methods in declaration.cwrap that are supported by Half.
Tested via unit tests.
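
As a rough illustration of the kind of usage this enables (a minimal sketch, not code from the PR; whether a particular method works depends on the Half-backed subset mentioned above):

import torch

# Minimal sketch: create a bfloat16 tensor on CPU and upcast it.
x = torch.tensor([1.0, 2.0, 3.0], dtype=torch.bfloat16)
print(x.dtype)              # torch.bfloat16
print(x.to(torch.float32))  # upcast to float32 for full-precision ops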

@pytorchbot pytorchbot added module: cpu CPU specific problem (e.g., perf, algorithm) module: internals Related to internal abstractions in c10 and ATen module: operators module: pybind Related to our Python bindings / interactions with other Python libraries module: tests Issues related to tests (not the torch.testing module) labels Jun 17, 2019
@izdeby izdeby changed the title Added Bfloat16 tensor for cpu with very limited support [WIP] Added Bfloat16 tensor for cpu with very limited support Jun 17, 2019
izdeby added 5 commits June 17, 2019 17:38
…ort"

Added Bfloat16 tensor for cpu with very limited support

gh-metadata: pytorch pytorch 21860 gh/izdeby/10/head
…ort"

Added Bfloat16 tensor for cpu with very limited support

gh-metadata: pytorch pytorch 21860 gh/izdeby/10/head
…ort"

Added Bfloat16 tensor for cpu with very limited support

gh-metadata: pytorch pytorch 21860 gh/izdeby/10/head
…ort"

Added Bfloat16 tensor for cpu with very limited support

gh-metadata: pytorch pytorch 21860 gh/izdeby/10/head
…ort"

Added Bfloat16 tensor for cpu with very limited support

gh-metadata: pytorch pytorch 21860 gh/izdeby/10/head
@pytorchbot pytorchbot added the module: printing Issues related to the printing format of tensors label Jun 19, 2019
izdeby added 11 commits June 20, 2019 08:28
…ort"

Added Bfloat16 tensor for cpu with very limited support

gh-metadata: pytorch pytorch 21860 gh/izdeby/10/head
…ort"

Added Bfloat16 tensor for cpu with very limited support

gh-metadata: pytorch pytorch 21860 gh/izdeby/10/head
…ort"

Added Bfloat16 tensor for cpu with very limited support

gh-metadata: pytorch pytorch 21860 gh/izdeby/10/head
…ort"

Added Bfloat16 tensor for cpu with very limited support

gh-metadata: pytorch pytorch 21860 gh/izdeby/10/head
…ort"

Added Bfloat16 tensor for cpu with very limited support

gh-metadata: pytorch pytorch 21860 gh/izdeby/10/head
…ort"

Added Bfloat16 tensor for cpu with very limited support

gh-metadata: pytorch pytorch 21860 gh/izdeby/10/head
…ort"

Added Bfloat16 tensor for cpu with very limited support

gh-metadata: pytorch pytorch 21860 gh/izdeby/10/head
…ort"

Added Bfloat16 tensor for cpu with very limited support

gh-metadata: pytorch pytorch 21860 gh/izdeby/10/head
…ort"

Added Bfloat16 tensor for cpu with very limited support

gh-metadata: pytorch pytorch 21860 gh/izdeby/10/head
…ort"

Added Bfloat16 tensor for cpu with very limited support

gh-metadata: pytorch pytorch 21860 gh/izdeby/10/head
…ort"

Added Bfloat16 tensor for cpu with very limited support

gh-metadata: pytorch pytorch 21860 gh/izdeby/10/head
@izdeby izdeby changed the title [WIP] Added Bfloat16 tensor for cpu with very limited support Added Bfloat16 tensor for cpu with very limited support Jun 28, 2019
@izdeby izdeby requested review from gchanan and jerryzh168 and removed request for gchanan and jerryzh168 June 28, 2019 18:01
Added Bfloat16 tensor for cpu with very limited support

gh-metadata: pytorch pytorch 21860 gh/izdeby/10/head
} \
}()

#define AT_DISPATCH_ALL_TYPES_AND_COMPLEX_AND(SCALARTYPE1, SCALARTYPE2, SCALARTYPE3, TYPE, NAME, ...) \
Contributor

this is technically BC breaking, but probably no one uses it.

Contributor Author

Should I create separate versions of AT_DISPATCH_ALL_TYPES_AND_COMPLEX_AND for 2 and 3 scalar types, to be safe?

Contributor

nah, I think it's fine. I'd want to know if someone actually uses this so they can complain :).

all_dtypes = torch.testing.get_all_dtypes()
do_test_dtypes(self, all_dtypes, torch.strided, torch.device('cpu'))
if torch.cuda.is_available():
    all_dtypes.remove(torch.bfloat16)  # Remove once _th_zero_ is enabled on cuda for bfloat16
Contributor

as with the unfold test below, it's better to run the actual test and assert it raises an exception so you force the test to get updated. As it is, we'll probably forget to actually test this.
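
A minimal sketch of the suggested pattern (hypothetical code, not the PR's; it assumes the CUDA bfloat16 path still raises a RuntimeError): run the unsupported case and assert the failure, so the test has to be revisited once support lands.

import torch

if torch.cuda.is_available():
    try:
        # zero_() on a CUDA bfloat16 tensor is expected to fail for now.
        torch.zeros(3, dtype=torch.bfloat16, device='cuda').zero_()
    except RuntimeError:
        pass  # still unsupported, as expected
    else:
        raise AssertionError("bfloat16 now works on CUDA; update this test")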

Contributor Author

Fixed for all the tests in this file, but I can't update this test due to a weird behavior in common_utils.py. This test will be updated in the next PR in the stack, and the CUDA path will work.

Contributor

> can't update this test due to a weird behavior in common_utils.py

OOC, what kind of behavior?

Contributor Author

In common_utils.py, in do_test_dtypes, I would add something like
if dtype == torch.bfloat16 and device == 'cuda:0':, but the branch is never taken even when
dtype is torch.bfloat16 and device is "cuda:0". The problem was the comparison of the device variable with 'cuda:0' - it never passed.

The same check works in test_torch.py with no problems.

Contributor

I was not aware that device comparison with a string was a supported operation :) What I would expect to work is if you did device == torch.device('cuda:0')
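
A quick illustration of the difference (a sketch; the behavior of comparing a torch.device to a plain string has varied across PyTorch versions, which is why the string check above was silently skipped):

import torch

device = torch.device('cuda:0')

# Comparing against a plain string is not reliable; in the version under
# discussion this evaluates to False even though the device is cuda:0.
print(device == 'cuda:0')

# Comparing two torch.device objects is well defined, as the reviewer suggests.
print(device == torch.device('cuda:0'))  # True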

Added Bfloat16 tensor for cpu with very limited support

gh-metadata: pytorch pytorch 21860 gh/izdeby/10/head
@izdeby izdeby requested a review from gchanan July 8, 2019 23:25
This was referenced Jul 9, 2019
Added Bfloat16 tensor for cpu with very limited support

gh-metadata: pytorch pytorch 21860 gh/izdeby/10/head
Contributor

@gchanan gchanan left a comment

Just some follow-on issues to file/fix.

for dt in torch.testing.get_all_dtypes():
    x = torch.tensor((1, 1), dtype=dt, device=device)
    if (device == 'cuda' and dt == torch.bfloat16):
        self.assertRaises(RuntimeError, lambda: x.clone())
Contributor

nit: for assertRaises, is there a message you can set for when the tests fail? It would be helpful to put something like "placeholder for future functionality -- check if you should update the test" so it's clear we don't actually intend for this to be the perfect behavior.
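
A hedged sketch of one way to attach such a message: the context-manager form of unittest's assertRaises accepts a msg argument that is reported when the expected exception is not raised (the tensor setup mirrors the snippet above; this is not the PR's code):

import unittest
import torch

class Bfloat16Placeholder(unittest.TestCase):
    @unittest.skipUnless(torch.cuda.is_available(), "needs CUDA")
    def test_bfloat16_clone_placeholder(self):
        x = torch.tensor((1, 1), dtype=torch.bfloat16, device='cuda')
        with self.assertRaises(
                RuntimeError,
                msg="placeholder for future functionality -- "
                    "check if you should update the test"):
            x.clone()

if __name__ == '__main__':
    unittest.main()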

Added Bfloat16 tensor for cpu with very limited support

gh-metadata: pytorch pytorch 21860 gh/izdeby/10/head
@zou3519 zou3519 deleted the gh/izdeby/10/head branch July 10, 2019 16:11
zdevito pushed a commit to zdevito/ATen that referenced this pull request Jul 10, 2019
Summary:
Pull Request resolved: pytorch/pytorch#21860
ghimport-source-id: 5290755b63033cdfdeb911a4ecf4aa282b3db02d

Test Plan: Imported from OSS

Differential Revision: D15856091

Pulled By: izdeby

fbshipit-source-id: 54e7e17be1b5c5a2e80a41feaeaeba75dbb8108f
@ailzhang
Contributor

This PR broke lint.

@facebook-github-bot
Contributor

@izdeby merged this pull request in 10c60b6.

@izdeby
Contributor Author

izdeby commented Jul 10, 2019

> This PR broke lint.

@ailzhang, I'm working on a fix PR.
