@bstriner (Contributor) commented May 4, 2018

Hi everybody,

I'm getting an unresolved external symbol error for ThenBlasGemm when compiling for GPU with CMake using MSVC 2015. The script that creates the def file for the DLL doesn't pick up this function, so it isn't exported. This PR just adds that function to the regex so the build works. Linker error below.

Does Windows CMake GPU not go through CI? I'm curious why this hasn't come up before.

Cheers

[D:\Projects\tensorflow\tensorflow\contrib\cmake\buildout\_gru_ops.vcxproj]
   199>blas_gemm.obj : error LNK2019: unresolved external symbol "public: class stream_executor::Stream & __cdecl stream_executor::Stream::ThenBlasGemm(enum stream_executor::blas::Transpose,enum stream_executor::blas::Transpose,unsigned __int64,unsigned __int64,unsigned __int64,double,class stream_executor::DeviceMemory<double> const &,int,class stream_executor::DeviceMemory<double> const &,int,double,class stream_executor::DeviceMemory<double> *,int)" (?ThenBlasGemm@Stream@stream_executor@@QEAAAEAV12@W4Transpose@blas@2@0_K11NAEBV?$DeviceMemory@N@2@H2HNPEAV52@H@Z) referenced in function "public: void __cdecl tensorflow::functor::TensorCuBlasGemm<double>::operator()(class tensorflow::OpKernelContext *,bool,bool,unsigned __int64,unsigned __int64,unsigned __int64,double,double const *,int,double const *,int,double,double *,int)" (??R?$TensorCuBlasGemm@N@functor@tensorflow@@QEAAXPEAVOpKernelContext@2@_N1_K22NPEBNH3HNPEANH@Z) 
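For readers unfamiliar with this kind of failure, here is a minimal sketch of how a regex-based export filter can silently drop a symbol from a generated def file. It is not the actual TensorFlow build script: the two patterns, the `select_exports` and `render_def` helpers, and the library name are hypothetical, and the undecorated signatures are abbreviated from the linker error above.

```python
import re

# Hypothetical include pattern standing in for the one in the build script.
NARROW_INCLUDE_RE = re.compile(r"tensorflow::")
# The same pattern widened to also accept the missing method.
WIDENED_INCLUDE_RE = re.compile(
    r"tensorflow::|stream_executor::Stream::ThenBlasGemm"
)

# (decorated name, abbreviated undecorated signature) pairs taken from the
# linker error; parameter lists are omitted for brevity.
SYMBOLS = [
    (
        "??R?$TensorCuBlasGemm@N@functor@tensorflow@@QEAAXPEAVOpKernelContext"
        "@2@_N1_K22NPEBNH3HNPEANH@Z",
        "public: void __cdecl "
        "tensorflow::functor::TensorCuBlasGemm<double>::operator()",
    ),
    (
        "?ThenBlasGemm@Stream@stream_executor@@QEAAAEAV12@W4Transpose@blas"
        "@2@0_K11NAEBV?$DeviceMemory@N@2@H2HNPEAV52@H@Z",
        "public: class stream_executor::Stream & __cdecl "
        "stream_executor::Stream::ThenBlasGemm",
    ),
]


def select_exports(symbols, include_re):
    """Return the sorted decorated names whose undecorated form matches."""
    return sorted(
        {decorated for decorated, undecorated in symbols
         if include_re.search(undecorated)}
    )


def render_def(library_name, exports):
    """Render a module-definition file with one exported name per line."""
    lines = ["LIBRARY " + library_name, "EXPORTS"]
    lines += ["    " + name for name in exports]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # With the narrow pattern, ThenBlasGemm never reaches the EXPORTS list,
    # so any kernel DLL that references it fails to link.
    print(render_def("example_library",
                     select_exports(SYMBOLS, NARROW_INCLUDE_RE)))
    # Widening the pattern puts the symbol back into the def file.
    print(render_def("example_library",
                     select_exports(SYMBOLS, WIDENED_INCLUDE_RE)))
```

The trade-off with an allow-list like this is that every newly used framework symbol needs a pattern update, which is the situation this PR runs into.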

@tensorflowbutler (Member)

Nagging Assignee @protoget: It has been 21 days with no activity and this issue has an assignee. Please update the label and/or status accordingly.

@bstriner (Contributor, Author) commented Jun 5, 2018

Related PR by @gunan resolves this issue: #19415

If the build is that close to the export limit, would it make sense to just split it into multiple DLLs? Then we could be less stingy about what gets exported and what doesn't.

@bstriner closed this Jun 5, 2018
@bstriner deleted the include_thenblasgemm branch June 5, 2018 08:12
@gunan (Contributor) commented Jun 5, 2018

It actually does. One other thing we are planning is to define a C/C++ API that can be used when building kernels, and to restrict exported symbols to only those. Right now, kernels can use anything in the core TF framework.

copybara-service bot pushed a commit that referenced this pull request Dec 3, 2024
Imported from GitHub PR openxla/xla#19096

This PR adds the F4E2M1FN primitive type (4-bit float with 2 exponent bits and 1 mantissa bit) and the F8E8M0FNU primitive type (8-bit float with 8 exponent bits, no mantissa and no sign), and enables loads/stores in the same way as is implemented for the S4/U4 types.

This will enable using microscaling (MX) formats ([RFC](openxla/xla#18085)), such as MXFP4.
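As background on how the two types fit together, here is a rough sketch of dequantizing one MXFP4 block. It assumes the block layout described in the OCP Microscaling Formats specification, where a group of elements (32 in that spec) shares a single power-of-two scale. The function name `dequantize_mx_block` and the lookup table are illustrative only and are not part of XLA.

```python
import math

# Magnitudes of the eight non-negative E2M1 codes, indexed by the low three
# bits (two exponent bits followed by one mantissa bit).
E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def dequantize_mx_block(scale_byte, codes):
    """Expand 4-bit E2M1 codes that share one E8M0 scale into Python floats.

    scale_byte: the shared scale as an 8-bit E8M0 code (0..255).
    codes: iterable of 4-bit E2M1 codes (0..15), sign in the top bit.
    """
    codes = list(codes)
    if scale_byte == 0xFF:
        # The all-ones scale encodes NaN, so the whole block is NaN.
        return [math.nan] * len(codes)
    scale = 2.0 ** (scale_byte - 127)  # E8M0 is a bare biased exponent
    values = []
    for code in codes:
        sign = -1.0 if code & 0x8 else 1.0
        values.append(sign * E2M1_MAGNITUDES[code & 0x7] * scale)
    return values


if __name__ == "__main__":
    # A scale byte of 128 means 2^(128 - 127) = 2, so every element doubles.
    print(dequantize_mx_block(128, [0b0111, 0b1111, 0b0001]))
```

Because the scale is a pure power of two, applying it only shifts the exponent of each element and never changes its mantissa.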

```c
F4E2M1FN
- Exponent bias: 1
- Maximum stored exponent value: 3 (binary 11)
- Maximum unbiased exponent value: 3 - 1 = 2
- Minimum stored exponent value: 1 (binary 01)
- Minimum unbiased exponent value: 1 - 1 = 0
- Has positive and negative zero
- Doesn't have infinity
- Doesn't have NaNs

Additional details:
- Zeros (+/-): S.00.0
- Max normal number: S.11.1 = ±2^(2) x (1 + 0.5) = ±6.0
- Min normal number: S.01.0 = ±2^(0) = ±1.0
- Min subnormal number: S.00.1 = ±2^(0) x 0.5 = ±0.5

F8E8M0FNU
- Exponent bias: 127
- Maximum stored exponent value: 254 (binary 1111'1110)
- Maximum unbiased exponent value: 254 - 127 = 127
- Minimum stored exponent value: 0 (binary 0000'0000)
- Minimum unbiased exponent value: 0 - 127 = -127
- Doesn't have zero
- Doesn't have infinity
- NaN is encoded as binary 1111'1111

Additional details:
- Zeros cannot be represented
- Negative values cannot be represented
- Mantissa is always 1
```
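The parameters listed above are enough to decode both encodings by hand. The following is a small sketch that follows the listed bias and bit layout (sign, exponent, mantissa for F4E2M1FN; a bare biased exponent for F8E8M0FNU). The function names are hypothetical and this is not the conversion code added by the PR.

```python
import math


def decode_f4e2m1fn(bits):
    """Decode a 4-bit S.EE.M code (0..15) into a Python float."""
    sign = -1.0 if (bits >> 3) & 1 else 1.0
    exponent = (bits >> 1) & 0b11  # stored (biased) exponent
    mantissa = bits & 1
    if exponent == 0:
        # Zero and the single subnormal: 2^0 * (mantissa / 2).
        return sign * 0.5 * mantissa
    # Normal numbers: 2^(exponent - bias) * (1 + mantissa / 2), bias = 1.
    return sign * 2.0 ** (exponent - 1) * (1.0 + 0.5 * mantissa)


def decode_f8e8m0fnu(bits):
    """Decode an 8-bit exponent-only code (0..255) into a Python float."""
    if bits == 0xFF:
        return math.nan  # the only non-finite encoding
    # No sign and no mantissa: the value is exactly 2^(bits - 127).
    return 2.0 ** (bits - 127)


if __name__ == "__main__":
    # Enumerate all sixteen F4E2M1FN codes.
    for code in range(16):
        print(format(code, "04b"), decode_f4e2m1fn(code))
    # Smallest, unit and largest F8E8M0FNU values.
    for code in (0, 127, 254):
        print(code, decode_f8e8m0fnu(code))
```

Enumerating the sixteen F4E2M1FN codes this way gives the magnitudes 0, 0.5, 1, 1.5, 2, 3, 4 and 6 with either sign, which agrees with the maximum, minimum normal and minimum subnormal values listed above.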

Related PRs:
- openxla/stablehlo#2582
- jax-ml/ml_dtypes#181
- llvm/llvm-project#95392
- llvm/llvm-project#108877
- jax-ml/ml_dtypes#166
- llvm/llvm-project#107127
- llvm/llvm-project#111028

The PR is split into multiple commits only to make the review easier; some tests may fail if only some (i.e. not all) of these commits are applied.
Copybara import of the project:

--
fa539fbde987ff6421fd2937fade495baf633630 by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: import mxfloat.h

--
2c014035923e0394b2cfcb81eaf090a96621b0aa by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: primitive type

--
e919ed54e825f2e905aaf0cc279dd21cd80f1ce9 by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: literal support

--
ca16839096feb93e0454ec380c5c707c30199346 by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: conversion codegen

--
eedc079ca9a4db9e611d84877a25b3da21386f16 by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: python interface

--
8e0305cd47002f0c1f8668a3cbcbce5428f2a4c6 by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: FFI

--
aabe9c68d964609f78f29e17ee0680798ad0c6ac by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: HLO evaluator

--
87da2ebfab388f113482e852009401a9e416974a by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: add tests

--
e0ee48c3a37018ba985c850931592d62eadf7c2e by Sergey Kozub <[email protected]>:

Add F8E8M0FNU type

--
be2e457922e2cddeaf5aca13dd022f3ac2a1393b by Sergey Kozub <[email protected]>:

Addressing PR#19096 review comments

Merging this change closes #19096

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#19096 from openxla:skozub/e2m1 be2e457922e2cddeaf5aca13dd022f3ac2a1393b
PiperOrigin-RevId: 702273510
copybara-service bot pushed a commit that referenced this pull request Dec 19, 2024
Imported from GitHub PR openxla/xla#19096

Copybara import of the project:

--
f493e4803eaa5ff3da3ceb130e9348c014b4a2e8 by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: import mxfloat.h

--
87d005630b310a355d7c30b22828c35237373f17 by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: primitive type

--
70ca82093faeec98f2dc5e8b82f617d99ca96849 by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: literal support

--
c479f0940da490e9668e2f48e14a7466f0c4a97f by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: conversion codegen

--
daaa3af3ce3af456f2ef44dbc291ebeb09e86d9b by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: python interface

--
1f0e19ff14733eff790726936b68ef0cf607a766 by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: FFI

--
999bf96092e57c7b3039811f2887281f347ff17a by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: HLO evaluator

--
d7d5af74c5f8a94522779a121c0a4a962156fb64 by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: add tests

--
9e8c7bc02849f241d0f05941221d99f1d08d9e67 by Sergey Kozub <[email protected]>:

Add F8E8M0FNU type

--
1e344174b931cea4978770ab740dfed67186c2f4 by Sergey Kozub <[email protected]>:

Addressing PR#19096 review comments

--
d4de0a369d9dc853f34f3cf3bf7dcc5a47502106 by Sergey Kozub <[email protected]>:

Addressing PR#19096 review comments (round 2)

Merging this change closes #19096

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#19096 from openxla:skozub/e2m1 d4de0a369d9dc853f34f3cf3bf7dcc5a47502106
PiperOrigin-RevId: 707638099
copybara-service bot pushed a commit that referenced this pull request Dec 20, 2024
Imported from GitHub PR openxla/xla#19096

Copybara import of the project:

--
f493e4803eaa5ff3da3ceb130e9348c014b4a2e8 by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: import mxfloat.h

--
87d005630b310a355d7c30b22828c35237373f17 by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: primitive type

--
70ca82093faeec98f2dc5e8b82f617d99ca96849 by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: literal support

--
c479f0940da490e9668e2f48e14a7466f0c4a97f by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: conversion codegen

--
daaa3af3ce3af456f2ef44dbc291ebeb09e86d9b by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: python interface

--
1f0e19ff14733eff790726936b68ef0cf607a766 by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: FFI

--
999bf96092e57c7b3039811f2887281f347ff17a by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: HLO evaluator

--
d7d5af74c5f8a94522779a121c0a4a962156fb64 by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: add tests

--
9e8c7bc02849f241d0f05941221d99f1d08d9e67 by Sergey Kozub <[email protected]>:

Add F8E8M0FNU type

--
1e344174b931cea4978770ab740dfed67186c2f4 by Sergey Kozub <[email protected]>:

Addressing PR#19096 review comments

--
d4de0a369d9dc853f34f3cf3bf7dcc5a47502106 by Sergey Kozub <[email protected]>:

Addressing PR#19096 review comments (round 2)

Merging this change closes #19096

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#19096 from openxla:skozub/e2m1 d4de0a369d9dc853f34f3cf3bf7dcc5a47502106
PiperOrigin-RevId: 707638099
copybara-service bot pushed a commit that referenced this pull request Dec 20, 2024
Imported from GitHub PR openxla/xla#19096

This PR adds F4E2M1FN primitive type (4-bit float with 2 bits exponent and 1 bit mantissa), F8E8M0FNU primitive type (8-bit float with 8 bits exponent, no mantissa and no sign) and enables loads/stores in the same way S4/U4 type is implemented.

This will enable using microscaling (MX) formats ([RFC](openxla/xla#18085)), such as MXFP4.

```text
F4E2M1FN
- Exponent bias: 1
- Maximum stored exponent value: 3 (binary 11)
- Maximum unbiased exponent value: 3 - 1 = 2
- Minimum stored exponent value: 1 (binary 01)
- Minimum unbiased exponent value: 1 - 1 = 0
- Has positive and negative zero
- Doesn't have infinity
- Doesn't have NaNs

Additional details:
- Zeros (+/-): S.00.0
- Max normal number: S.11.1 = ±2^(2) x (1 + 0.5) = ±6.0
- Min normal number: S.01.0 = ±2^(0) = ±1.0
- Min subnormal number: S.00.1 = ±2^(0) x 0.5 = ±0.5

F8E8M0FNU
- Exponent bias: 127
- Maximum stored exponent value: 254 (binary 1111'1110)
- Maximum unbiased exponent value: 254 - 127 = 127
- Minimum stored exponent value: 0 (binary 0000'0000)
- Minimum unbiased exponent value: 0 - 127 = -127
- Doesn't have zero
- Doesn't have infinity
- NaN is encoded as binary 1111'1111

Additional details:
- Zeros cannot be represented
- Negative values cannot be represented
- Mantissa is always 1
```
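The encodings listed above can be checked with a small decoder. The following Python sketch (not part of the PR) turns raw bit patterns into floats according to the stated rules; the function names `decode_f4e2m1fn` and `decode_f8e8m0fnu` are hypothetical.

```python
import math


def decode_f4e2m1fn(bits: int) -> float:
    """Decode a 4-bit F4E2M1FN pattern laid out as S.EE.M."""
    if not 0 <= bits <= 0xF:
        raise ValueError("expected a 4-bit value")
    sign = -1.0 if bits & 0b1000 else 1.0
    exponent = (bits >> 1) & 0b11
    mantissa = bits & 0b1
    if exponent == 0:
        # Zero or subnormal: no implicit leading 1, exponent fixed at 2^0.
        return sign * (mantissa * 0.5)
    # Normal number: implicit leading 1 and an exponent bias of 1.
    return sign * 2.0 ** (exponent - 1) * (1.0 + mantissa * 0.5)


def decode_f8e8m0fnu(bits: int) -> float:
    """Decode an 8-bit F8E8M0FNU pattern (exponent only, no sign bit)."""
    if not 0 <= bits <= 0xFF:
        raise ValueError("expected an 8-bit value")
    if bits == 0xFF:
        return math.nan  # the only NaN encoding
    return 2.0 ** (bits - 127)  # mantissa is always 1, bias is 127


# Every non-negative F4E2M1FN value, in encoding order.
print([decode_f4e2m1fn(b) for b in range(8)])
```

Running it lists the eight non-negative F4E2M1FN values, from 0.0 up to the maximum normal number 6.0; the negative half mirrors them, including a negative zero.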

Related PRs:
- openxla/stablehlo#2582
- jax-ml/ml_dtypes#181
- llvm/llvm-project#95392
- llvm/llvm-project#108877
- jax-ml/ml_dtypes#166
- llvm/llvm-project#107127
- llvm/llvm-project#111028

The PR is split into multiple commits only to make the review easier; some tests may fail if only a subset of these commits is applied.
Copybara import of the project:

--
f493e4803eaa5ff3da3ceb130e9348c014b4a2e8 by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: import mxfloat.h

--
87d005630b310a355d7c30b22828c35237373f17 by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: primitive type

--
70ca82093faeec98f2dc5e8b82f617d99ca96849 by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: literal support

--
c479f0940da490e9668e2f48e14a7466f0c4a97f by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: conversion codegen

--
daaa3af3ce3af456f2ef44dbc291ebeb09e86d9b by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: python interface

--
1f0e19ff14733eff790726936b68ef0cf607a766 by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: FFI

--
999bf96092e57c7b3039811f2887281f347ff17a by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: HLO evaluator

--
d7d5af74c5f8a94522779a121c0a4a962156fb64 by Sergey Kozub <[email protected]>:

Add F4E2M1FN type: add tests

--
9e8c7bc02849f241d0f05941221d99f1d08d9e67 by Sergey Kozub <[email protected]>:

Add F8E8M0FNU type

--
1e344174b931cea4978770ab740dfed67186c2f4 by Sergey Kozub <[email protected]>:

Addressing PR#19096 review comments

--
d4de0a369d9dc853f34f3cf3bf7dcc5a47502106 by Sergey Kozub <[email protected]>:

Addressing PR#19096 review comments (round 2)

Merging this change closes #19096

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#19096 from openxla:skozub/e2m1 d4de0a369d9dc853f34f3cf3bf7dcc5a47502106
PiperOrigin-RevId: 707638099
copybara-service bot pushed a commit that referenced this pull request Dec 20, 2024
Imported from GitHub PR openxla/xla#19096

Merging this change closes #19096

PiperOrigin-RevId: 708390061