
Conversation

@izdeby (Contributor) commented Jun 7, 2019

Stack from ghstack:

Introducing BFloat16 tensor type
BFloat16 is a float32 truncated to its first 16 bits: it keeps the same sign bit and the same 8 exponent bits, but only 7 mantissa bits. It is therefore cheap to convert to and from float32, and because it covers essentially the same range as float32, it minimizes the risk of NaNs or exploding/vanishing gradients when switching from float32. In addition, bfloat16 can represent every integer in [-256, 256] exactly, so an int8 value can be encoded in bfloat16 without loss of accuracy.
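For illustration only (this code is not part of the PR): a minimal, hypothetical round-trip check of the integer claim above, where truncation to and widening from bfloat16 are done with shifts on the raw float32 bits.

#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration: keep only the upper 16 bits of a
// float32 (truncation, no rounding) and widen them back, zero-filling the
// low mantissa bits.
static uint16_t to_bf16_bits(float f) {
  uint32_t u;
  std::memcpy(&u, &f, sizeof(u));
  return static_cast<uint16_t>(u >> 16);
}
static float from_bf16_bits(uint16_t b) {
  uint32_t u = static_cast<uint32_t>(b) << 16;
  float f;
  std::memcpy(&f, &u, sizeof(f));
  return f;
}

int main() {
  // Every integer in [-256, 256] survives the float32 -> bfloat16 -> float32
  // round trip exactly, which is why an int8 fits in bfloat16 losslessly.
  for (int i = -256; i <= 256; ++i) {
    float v = static_cast<float>(i);
    assert(from_bf16_bits(to_bf16_bits(v)) == v);
  }
  return 0;
}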

In this PR:

  • Introduced BFloat16 type

Differential Revision: D15819369

@pytorchbot added the module: internals label (Related to internal abstractions in c10 and ATen) Jun 7, 2019
@izdeby mentioned this pull request Jun 7, 2019
@izdeby changed the title from "Introducing bfloat16 type" to "[WIP] Introducing bfloat16 type" Jun 7, 2019
izdeby added 2 commits June 7, 2019 10:39
Introducing bfloat16 type

gh-metadata: pytorch pytorch 21522 gh/izdeby/8/head
Introducing bfloat16 type

gh-metadata: pytorch pytorch 21522 gh/izdeby/8/head
@izdeby changed the title from "[WIP] Introducing bfloat16 type" to "Introducing bfloat16 type" Jun 7, 2019
@izdeby changed the title from "Introducing bfloat16 type" to "[WIP] Introducing bfloat16 type" Jun 10, 2019
Introducing bfloat16 type

gh-metadata: pytorch pytorch 21522 gh/izdeby/8/head
@izdeby changed the title from "[WIP] Introducing bfloat16 type" to "Introducing bfloat16 type" Jun 10, 2019
@izdeby requested a review from gchanan June 10, 2019 20:19
@ezyang (Contributor) left a comment

Looks reasonable enough

@gchanan (Contributor) left a comment

This looks pretty good!

It would be great if:

  1. you could use memcpy, as suggested by @ezyang.
  2. you looked into big-endianness; we don't test it (which is fine for now, you can even assert in your bfloat16_test for now), but it should only be a few lines of change and you can just test it once (a sketch of why byte order matters follows this list).
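For illustration only (this is not the reviewer's code and not necessarily the PR's approach): byte order matters for this kind of conversion when the "upper half" of a float32 is copied byte-by-byte, because the offset of the most significant bytes differs between little- and big-endian hosts.

#include <cstdint>
#include <cstring>

// Illustrative sketch: extract the high 16 bits of a float32 by copying
// two raw bytes. On a little-endian host the most significant bytes sit at
// the end of the object, so the copy offset depends on the byte order.
inline uint16_t bf16_bits_bytewise(float f, bool host_is_little_endian) {
  uint16_t out;
  const char* p = reinterpret_cast<const char*>(&f);
  std::memcpy(&out, p + (host_is_little_endian ? 2 : 0), sizeof(out));
  return out;
}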

Introducing bfloat16 type

gh-metadata: pytorch pytorch 21522 gh/izdeby/8/head
@izdeby (Contributor, Author) commented Jun 12, 2019

@ezyang, @gchanan: used memcpy and added an endianness check.


namespace detail {
inline bool isSmallEndian() {
  int num = 1;
  // A non-zero first byte means the host is little-endian.
  return *reinterpret_cast<char*>(&num) == 1;
}
A Contributor commented:

you should test this statically.

A Contributor commented:

You link https://stackoverflow.com/questions/1583791/constexpr-and-endianness, presumably because you wanted to implement this using constexpr. But I don't think we are actually requiring a constexpr implementation; all we are asking is that there isn't a conditional check in the generated machine code. I don't know whether your current implementation gets compiled away; if it does, that would be sufficient for me (and it's a good thing to note in a comment).
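For context only (not part of this PR, which targeted pre-C++20 toolchains): with a C++20 compiler the endianness query can be made a constant expression via std::endian, so no runtime conditional exists at all. A minimal sketch, assuming C++20 is available:

#include <bit>  // std::endian (C++20)

// Compile-time endianness query: the comparison is a constant expression,
// so no conditional check survives into the generated machine code.
constexpr bool isLittleEndianHost() {
  return std::endian::native == std::endian::little;
}

static_assert(isLittleEndianHost() || std::endian::native == std::endian::big,
              "mixed-endian hosts are not supported");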

Introducing bfloat16 type

gh-metadata: pytorch pytorch 21522 gh/izdeby/8/head
@jgong5 (Collaborator) commented Jun 17, 2019

Would this PR also add operator overloading (+, -, *, /) for the bfloat16 data type, like what is done for half?
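For illustration only (not from this PR): half-style arithmetic operators for a 16-bit storage type are typically implemented by widening to float32, computing, and truncating back. A minimal sketch with a hypothetical BF16 struct and hypothetical helpers; the real c10::BFloat16 may differ.

#include <cstdint>
#include <cstring>

// Hypothetical 16-bit storage type for illustration; not the c10::BFloat16
// class introduced by this PR.
struct BF16 {
  uint16_t x;
};

// Widen/truncate via the upper 16 bits of a float32.
inline float to_float(BF16 a) {
  uint32_t u = static_cast<uint32_t>(a.x) << 16;
  float f;
  std::memcpy(&f, &u, sizeof(f));
  return f;
}
inline BF16 from_float(float f) {
  uint32_t u;
  std::memcpy(&u, &f, sizeof(u));
  return BF16{static_cast<uint16_t>(u >> 16)};
}

// Arithmetic is carried out in float32 and then truncated back, mirroring
// how half-precision operators are commonly implemented on the CPU.
inline BF16 operator+(BF16 a, BF16 b) { return from_float(to_float(a) + to_float(b)); }
inline BF16 operator-(BF16 a, BF16 b) { return from_float(to_float(a) - to_float(b)); }
inline BF16 operator*(BF16 a, BF16 b) { return from_float(to_float(a) * to_float(b)); }
inline BF16 operator/(BF16 a, BF16 b) { return from_float(to_float(a) / to_float(b)); }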

@izdeby (Contributor, Author) commented Jun 28, 2019

@ezyang, @gchanan, @colesbury, could you please take another look at this PR?
Due to a few build issues I had to update this PR several times.

Important to note:

  1. we can't use memcpy in f32_from_bits and bits_from_f32, as it will fail the build on HIP (a sketch of one workaround follows this list).
  2. making isSmallEndian() a static check isn't 'clean'. Should I implement it?
  3. verified big/little endianness: via unit tests for little-endian and Godbolt for big-endian.
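For context only: a minimal, hypothetical sketch of how a conversion helper could keep memcpy on the ordinary path while avoiding it under HIP, as described in point 1 above. The macro name EXAMPLE_HIP_BUILD and the function name f32_from_bits_sketch are placeholders, not the PR's actual code.

#include <cstdint>
#include <cstring>

// Hypothetical sketch: rebuild a float32 from a bfloat16 payload.
// The HIP branch avoids memcpy, which reportedly breaks the HIP build;
// memcpy remains the portable, strict-aliasing-safe choice elsewhere.
inline float f32_from_bits_sketch(uint16_t src) {
  uint32_t tmp = static_cast<uint32_t>(src) << 16;
  float res;
#if defined(EXAMPLE_HIP_BUILD)  // placeholder macro, not the real guard
  res = *reinterpret_cast<float*>(&tmp);
#else
  std::memcpy(&res, &tmp, sizeof(res));
#endif
  return res;
}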

@izdeby changed the title from "[WIP] Introducing bfloat16 type" to "Introducing bfloat16 type" Jun 28, 2019
@izdeby requested review from colesbury, ezyang, and gchanan June 28, 2019 17:24
@ezyang (Contributor) commented Jun 28, 2019

LGTM modulo comments.

Introducing bfloat16 type

gh-metadata: pytorch pytorch 21522 gh/izdeby/8/head
Introducing bfloat16 type

gh-metadata: pytorch pytorch 21522 gh/izdeby/8/head
@izdeby requested a review from gchanan July 8, 2019 21:52
@ngimel (Collaborator) commented Jul 9, 2019

@gchanan (Contributor) commented Jul 9, 2019

@ngimel that's discussed above (I think the discussion is hidden though).

@izdeby did you file an issue for tracking the rounding here?

c10::BFloat16 a = c10::BFloat16(in);
float out = c10::detail::f32_from_bits(a.x);

EXPECT_FLOAT_EQ(1.1663108e-38, out);
A Contributor commented:

where does this number come from?

A Contributor commented:

I'd comment on what you are trying to test here.

The author replied:

Killed this test, as it's not too representative now that I think about it.

Introducing bfloat16 type

gh-metadata: pytorch pytorch 21522 gh/izdeby/8/head
@izdeby requested a review from colesbury July 9, 2019 21:10
@zou3519 deleted the gh/izdeby/8/head branch July 10, 2019 04:16
@facebook-github-bot: @izdeby merged this pull request in e72b617.

