Commit 1a9fe17
libclc: Update remquo (#187998)
This was failing in the float case without -cl-denorms-are-zero
and failing for double. This now passes in all cases.
This was originally ported from the ROCm device libs in
8db45e4. This update is mostly a port
of more recent changes there, with a few modifications:
- Templatification, which almost, but not quite, enables
vectorization; the outer branch and loop still prevent it.
- Merging of the 3 types into one shared code path, instead of
three separate per-type functions implemented side by side.
There are only some slight differences for the half case, which mostly
evaluates as float.
- Splitting out of the is_odd tracking, instead of deriving it from the
accumulated quotient. This costs an extra register, but saves several
instructions. This also enables automatic elimination of all of the quo
output handling when this code is reused for remainder. I'm guessing
this would be unnecessary if SimplifyDemandedBits handled phis.
- Removal of the slow FMA path. I don't see how this would ever be
faster with the number of instructions replacing it. This is really a
problem for the compiler to solve anyway.

1 parent d6373b4, commit 1a9fe17
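
To illustrate the is_odd point above, here is a minimal sketch in plain C of a shift-and-subtract remquo that tracks the parity of the quotient in its own variable. It is not the libclc implementation: `remquo_sketch` is a hypothetical name, it handles only double, it assumes x is finite and y is finite and nonzero, and it makes no attempt at the performance work described in this commit.

```c
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not the libclc code.
   Assumes x is finite and y is finite and nonzero; NaN, infinity and
   zero-divisor handling are deliberately omitted. Pass quo == NULL to
   get remainder-only behavior. */
static double remquo_sketch(double x, double y, int *quo) {
    double ax = fabs(x), ay = fabs(y);
    int ex, ey;
    (void)frexp(ax, &ex); /* only the binary exponents are needed */
    (void)frexp(ay, &ey);

    unsigned q = 0;       /* low bits of the integral quotient */
    bool is_odd = false;  /* parity of the quotient, tracked separately */

    /* Binary long division: subtract ay * 2^i whenever it fits.
       Each subtraction is exact because ax < 2 * (ay * 2^i) holds here. */
    for (int i = ex - ey; i >= 0; --i) {
        double t = ldexp(ay, i);
        bool ge = ax >= t;
        if (ge)
            ax -= t;
        q = (q << 1) | (unsigned)ge;
        is_odd = ge;      /* after the last step this is the lowest bit */
    }

    /* Round the quotient to nearest, ties to even. The tie test reads
       is_odd, never q, so q is only needed to produce *quo. */
    double twice = ax + ax; /* overflow to infinity still compares correctly */
    if (twice > ay || (twice == ay && is_odd)) {
        ax -= ay;
        q += 1;
    }

    if (quo != NULL) {
        int mag = (int)(q & 0x7fu); /* keep the low 7 bits of the magnitude */
        *quo = ((x < 0) != (y < 0)) ? -mag : mag;
    }
    return (x < 0) ? -ax : ax; /* the remainder takes its sign from x */
}
```

In this sketch, calling `remquo_sketch(x, y, NULL)` gives remainder-style behavior. Because the rounding step depends only on `is_odd`, every update of `q` is dead in that case, which is the kind of elimination the commit message describes.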
File tree
5 files changed, +316 -287 lines changed
- libclc/clc
- include/clc
- math
- lib/generic/math
Lines changed: 82 additions & 0 deletions