Description
Hi, using this sample code:
#include <cfenv>
#include <cstdio>
struct bc_t
{
float minDist, maxDist;
float hudSize;
float a1, a2, a3 = 0.0f;
float minSize;
float b1, b2, b3 = 0.0f;
float maxSize;
float c1, c2, c3 = 0.0f;
};
int main(int, char**)
{
feenableexcept(FE_DIVBYZERO);
bc_t bc;
sscanf("1 1 1", "%f %f %f", &bc.hudSize, &bc.minSize, &bc.maxSize);
bc.minDist = bc.hudSize / bc.maxSize;
bc.maxDist = bc.hudSize / bc.minSize;
printf("%f\n", bc.minDist);
return 0;
}
When compiled with -O2, the resulting program raises a division-by-zero exception when executed:
Process 3247973 stopped
* thread #1, name = 'a.out', stop reason = signal SIGFPE: floating point divide by zero
frame #0: 0x00005555555551c5 a.out`main + 101
a.out`main:
-> 0x5555555551c5 <+101>: divps %xmm2, %xmm0
0x5555555551c8 <+104>: movlps %xmm0, (%rsp)
0x5555555551cc <+108>: cvtss2sd %xmm0, %xmm0
0x5555555551d0 <+112>: leaq 0xe3c(%rip), %rdi
Process 3249378 stopped
* thread #1, name = 'a.out', stop reason = signal SIGFPE: floating point divide by zero
frame #0: 0x00005555555551c5 a.out`main((null)=<unavailable>, (null)=<unavailable>) at a.cpp:20:26
17 feenableexcept(FE_DIVBYZERO);
18 bc_t bc;
19 sscanf("1 1 1", "%f %f %f", &bc.hudSize, &bc.maxSize, &bc.minSize);
-> 20 bc.minDist = bc.hudSize / bc.maxSize;
21 bc.maxDist = bc.hudSize / bc.minSize;
22 printf("%f\n", bc.minDist);
23 return 0;
My guess is that the SSE-optimized code raises division-by-zero exceptions from the unused lanes of the xmm registers, even though those results are discarded.
It is curious that -O2 needs either -ftrapping-math or -ffast-math to work, since -ffast-math sets -fno-trapping-math.
Also, I thought -ftrapping-math was the default and that -fno-trapping-math was part of -ffast-math, but plain -O2 behaves as if -fno-trapping-math were used, so it behaves as if part of -ffast-math were enabled anyway… But maybe -ffast-math just modifies other behaviors that make -ftrapping-math or -fno-trapping-math irrelevant.
Using Godbolt's Compiler Explorer (here Clang 19.1.0), -O2 only works when combined with -ftrapping-math or -ffast-math (or the deprecated -Ofast):
| compiler flags | status |
|---|---|
| (none) | ✅️ |
| -O0 | ✅️ |
| -Os | ❌️ |
| -Os -fno-trapping-math | ❌️ |
| -Os -ftrapping-math | ✅️ |
| -O1 | ✅️ |
| -O2 | ❌️ |
| -O2 -fno-trapping-math | ❌️ |
| -O2 -ftrapping-math | ✅️ |
| -ffast-math | ✅️ |
| -O2 -ffast-math | ✅️ |
| -O2 -ffast-math -ftrapping-math | ✅️ |
| -O2 -ffast-math -fno-trapping-math | ✅️ |
| -Ofast | ✅️ |
See: https://godbolt.org/z/395c8nMef
And (with -ftrapping-math added): https://godbolt.org/z/czbGTjs6E
And (with -fno-trapping-math added): https://godbolt.org/z/zYr5Pba3W
On 32-bit i686 I can reproduce the bug when SSE is enabled, but not when SSE is disabled.
| compiler flags | -m32 -msse | -m32 -mno-sse |
|---|---|---|
| (none) | ✅️ | ✅️ |
| -Os | ❌️ | ✅️ |
| -O1 | ✅️ | ✅️ |
| -O2 | ❌️ | ✅️ |
| -ffast-math | ✅️ | ✅️ |
| -O2 -ffast-math | ✅️ | ✅️ |
32-bit i686 build with SSE, where I get the same failure: https://godbolt.org/z/199eqhzGh
32-bit i686 build without SSE, where I get no failure: https://godbolt.org/z/sfd9TsTj4
On my end, on Ubuntu 24.04 with amd64, I get the same results with clang 19.1.5.
On Ubuntu 24.04 with amd64 and different versions of the clang compiler, it only works with clang 13 and 14; every later version is broken:
| | -O0 | -O1 | -O2 | -O2 -ftrapping-math | -ffast-math | -O2 -ffast-math |
|---|---|---|---|---|---|---|
| clang 13.0.1 | ✅️ | ✅️ | ✅️ | ✅️ | ✅️ | ✅️ |
| clang 14.0.6 | ✅️ | ✅️ | ✅️ | ✅️ | ✅️ | ✅️ |
| clang 15.0.7 | ✅️ | ✅️ | ❌️ | ✅️ | ✅️ | ✅️ |
| clang 16.0.6 | ✅️ | ✅️ | ❌️ | ✅️ | ✅️ | ✅️ |
| clang 17.0.6 | ✅️ | ✅️ | ❌️ | ✅️ | ✅️ | ✅️ |
| clang 18.1.3 | ✅️ | ✅️ | ❌️ | ✅️ | ✅️ | ✅️ |
| clang 19.1.5 | ✅️ | ✅️ | ❌️ | ✅️ | ✅️ | ✅️ |
With GCC 14.02 I get none of those issues (no false-positive division-by-zero exception is raised, whatever compiler flags are used): https://godbolt.org/z/rxMWM68Mc
Note: Disabling SSE on amd64 just produces garbage results (1.0/1.0 gives 0.0), but I don't know whether disabling SSE on amd64 makes sense at all. GCC produces the same garbage (1.0/1.0 gives 0.0). Still, if that is not a legitimate thing to do, I'm surprised to get no warning, and surprised that the generated code runs at all. See: https://godbolt.org/z/77PYGTc5b (Clang) and https://godbolt.org/z/cvW4Er3Kd (GCC).