ARM: fix excessive immediate values for pc-relative ldr#994
ARM: fix excessive immediate values for pc-relative ldr#994whitequark wants to merge 1 commit intoocaml:trunkfrom
Conversation
|
You may have noticed I haven't attached a testcase. I do have a testcase, and it passes now. Unfortunately:
|
|
I can take a look at this. Can you try to construct a testcase from scratch? It doesn't sound like it should be too difficult. |
|
OK, this triggers the bug: https://hastebin.com/tejipomose.ml |
|
Can you add that into the testsuite as part of your patch? |
9d01926 to
9da2122
Compare
|
done |
|
Please would you do a Changes entry? Could the test file also include a single-line comment referring to this GPR (just so the motivation for the test is clear)? |
9da2122 to
24666db
Compare
|
@dra27 done However, don't merge this yet as I'm getting reports of crashes with this patch. |
24666db to
f1df6ca
Compare
|
Just in case it could give inspiration: the same problem showed up recently in CompCert (guess who wrote the two ARM code emitters in question?) and here is how @bschommer addressed it: AbsInt/CompCert#155 |
|
Thanks - I'll leave to Mark/Xavier to merge as and when you're ready! |
|
@mshinwell @xavierleroy I have now fixed the bug that caused the crashes (falling through into the constant island prior to a switch) and I believe this is mergeable. @xavierleroy I feel like my implementation is both simpler and more rigorous, not to mention easier to review. |
|
@whitequark The actual fix was commited with some unfortunate whitespace changes, you can see the fix without them here AbsInt/CompCert@ab6c84c?w=1 . Basically both fixes, but instead of introducing a special case for the switch I emit constants before the instruction if my estimated size is large enough, which is more conservative and not always optimal. |
asmcomp/arm/emit.mlp
Outdated
| let emit_instr_before_pool i = | ||
| match i.desc with | ||
| | Lswitch jumptbl -> 2 + Array.length jumptbl | ||
| | _ -> emit_instr i |
There was a problem hiding this comment.
It seems fragile to me to have three repeated matches all of which have to be in sync. How about changing emit_instr_before_pool to be called emit_or_defer_instr and return the size together with either None (no instruction deferred) or Some instr (for Lswitch). Then instead of changing fallthrough, which seems a bit misleading, you could just test for Some and avoid the third match; you just emit the instruction in the Some. It might not be worth having emit_instr_after_pool at that point.
There was a problem hiding this comment.
Then instead of changing fallthrough, which seems a bit misleading, you could just test for Some and avoid the third match; you just emit the instruction in the Some. It might not be worth having emit_instr_after_pool at that point.
I don't understand what you are saying here (quite ironic given that you evidently consider it so obvious that you've used "just" twice).
asmcomp/arm/emit.mlp
Outdated
| (* consider the degenerate case where a single literal is followed by | ||
| a jump table longer than 4KB; we have to emit the constant pool | ||
| before the jump table or it will be too late *) | ||
| let emit_instr_before_pool i = |
There was a problem hiding this comment.
I think this comment could be improved. Perhaps start with "It may be necessary to emit a constant pool before emitting the next instruction..." or something?
ARM does not have an instruction with a 32-bit immediate. Therefore, on ARM, to load arbitrary 32-bit constants, there are two main strategies: load the halves of the constant one by one (in Thumb, this would be done with movw/movt), or periodically emit the so-called constant islands, that is, small chunks of data inside executable code that are jumped around. Note that when loading constant islands, care must be taken to avoid the same problem as what they are solving: the ldr instruction only has a 12-bit offset, and so there may not be more than 4K of code between the load and the constant island it refers to. The OCaml ARM backend opts for the constant islands. It implements it as follows: after emitting an instruction, it checks whether, if the first instruction it has emitted since emitting a previous constant island, can still address the last constant it is going to emit. This works in most cases, but not when emitting Lswitch. Consider that a switch can have an arbitrarily large jump table. If the jump table is larger than 4K, then, by the time we have emitted it, all ldr instructions prior to the switch are irreparably broken. This commit changes the constant island emission logic to first calculate the size of an Lswitch, then consider emitting a constant island, and only afterwards emitting Lswitch. For all other instructions the logic is unchanged. To do this in a fully rigorous way, arguably it would be better to have a separate function that returns the size of an instruction, and a separate one that emits an instruction. However, emit_instr has quite a bit of logic, which can affect the size of the instruction, and so I have opted against duplicating that logic, on the grounds that this will make maintenance much trickier.
|
Updated. |
f1df6ca to
41bcd73
Compare
|
I'm not sure your code compiles (e.g. line 812). Could you try to add some more test cases to exercise each of the three cases in the switch code you've changed in the emitter (emitting of a constant pool without a branch over it; emitting of a constant pool with a branch over it; not emitting the constant pool right now)? This shouldn't take long and would give some more certainty to this change, although I think it is correct. |
|
I still don't find the proposed code readable enough, so I went ahead and wrote an alternate fix more in the style of @bschommer . See pull request #1022 . |
|
Also: the |
|
Superceded by #1022 |
* render README/CHANGELOG/LICENSE as .prose and on their own pages
ARM does not have an instruction with a 32-bit immediate. Therefore,
on ARM, to load arbitrary 32-bit constants, there are two main
strategies: load the halves of the constant one by one (in Thumb,
this would be done with movw/movt), or periodically emit
the so-called constant islands, that is, small chunks of data inside
executable code that are jumped around.
Note that when loading constant islands, care must be taken to avoid
the same problem as what they are solving: the ldr instruction only
has a 12-bit offset, and so there may not be more than 4K of code
between the load and the constant island it refers to.
The OCaml ARM backend opts for the constant islands. It implements
it as follows: after emitting an instruction, it checks whether,
if the first instruction it has emitted since emitting a previous
constant island, can still address the last constant it is going
to emit.
This works in most cases, but not when emitting Lswitch. Consider
that a switch can have an arbitrarily large jump table. If the jump
table is larger than 4K, then, by the time we have emitted it,
all ldr instructions prior to the switch are irreparably broken.
This commit changes the constant island emission logic to first
calculate the size of an Lswitch, then consider emitting a constant
island, and only afterwards emitting Lswitch. For all other
instructions the logic is unchanged.
To do this in a fully rigorous way, arguably it would be better to
have a separate function that returns the size of an instruction,
and a separate one that emits an instruction. However, emit_instr
has quite a bit of logic, which can affect the size of
the instruction, and so I have opted against duplicating that logic,
on the grounds that this will make maintenance much trickier.