ARM64: Enable Long Address #4896

kyulee1 · 2016-05-10T22:26:17Z

Fixes #3668
Currently, ARM64 codegen can only reference targets within +/-1 MB because of the
encoding limits of the b<cond>/adr/ldr instructions. This is normally fine
when each function is reasonably small, but it does not work for large methods, which
can also result from aggressive inlining, likely in crossgen/corert scenarios.
In addition, long addresses are a prerequisite for hot/cold code separation,
since a reference can cross regions that are placed arbitrarily.
That scenario also needs additional relocations, which are not part of this change yet.

In detail, this change supports long addresses for conditional jump, address loading, and
constant loading operations by default; emitJumpDistBind() can shorten them later
if they fit into the smaller encoding. Logically,
those operations can now reach a +/-4 GB address range.
Note that I haven't extended the unconditional jump in this change, for simplicity,
so it still reaches +/-128 MB, the same as before.
emitOutputLJ is extended to do the final encoding of these operations.
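
As a rough illustration of where the +/-1 MB and +/-128 MB limits come from, the sketch below checks whether a byte distance fits the signed PC-relative immediate of each short encoding. The immediate widths (imm19 and imm26 scaled by 4, byte-granular imm21 for adr) come from the A64 instruction encodings; the function names are hypothetical and are not JIT functions.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when `delta` (target - pc, in bytes) is encodable in a signed
// immediate that is `bits` wide and scaled by `scale` bytes.
inline bool fitsPcRel(int64_t delta, unsigned bits, int64_t scale)
{
    assert(bits >= 1 && bits < 32 && scale > 0);
    const int64_t limit = (int64_t{1} << (bits - 1)) * scale; // e.g. 2^18 * 4 = 1 MB
    return (delta % scale == 0) && (delta >= -limit) && (delta < limit);
}

// b<cond> and ldr (literal): imm19 scaled by 4, so +/-1 MB.
inline bool fitsCondBranch(int64_t delta) { return fitsPcRel(delta, 19, 4); }

// adr: byte-granular imm21, so +/-1 MB.
inline bool fitsAdr(int64_t delta) { return fitsPcRel(delta, 21, 1); }

// b (unconditional): imm26 scaled by 4, so +/-128 MB.
inline bool fitsUncondBranch(int64_t delta) { return fitsPcRel(delta, 26, 4); }
```

A shortening pass in the spirit of emitJumpDistBind() would keep the long form whenever a check like this fails for the estimated distance.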

Three pseudo instructions are introduced. Each can be expanded into either a
short or a long form.

1. Conditional jump. See emitIns_J()
a. Short form (IF_BI_0B): b<cond> rel_addr
b. Long form (IF_LARGEJMP):
```
 b<rev cond> $LABEL
 b rel_addr (unconditional jump)
$LABEL:
```
2. Load label (address computation). See emitIns_R_L()
a. Short form (IF_DI_1E): adr x, [rel_addr]
b. Long form (IF_LARGEADR):
```
  adrp x, [rel_page_addr]
  add x, x, page_offs
```
3. Load constant (from JIT data). See emitIns_R_C()
a. Short form (IF_LS_1A): ldr x, [rel_addr]
b. Long form (IF_LARGELDC):
```
  adrp x, [rel_page_addr]
  ldr x, [x, page_offs]
  (fmov v, x when loading a vector constant)
```
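
To make the long forms of items 2 and 3 concrete, here is a minimal sketch of how a target address splits into the two operands named above: a signed 4 KB page delta for adrp (rel_page_addr) and the low 12 bits for the following add or ldr (page_offs). The struct and function names are hypothetical; only the adrp semantics (page of PC plus a 21-bit signed page count) are taken from the A64 instruction set.

```cpp
#include <cassert>
#include <cstdint>

struct PageRel
{
    int64_t  pageDelta; // signed count of 4 KB pages for adrp (must fit in 21 bits)
    uint32_t pageOffs;  // low 12 bits of the target, used by add / ldr
};

// Split `target` into adrp/add operands relative to the instruction at `pc`.
inline PageRel splitPageRel(uint64_t pc, uint64_t target)
{
    PageRel r;
    r.pageDelta = static_cast<int64_t>(target >> 12) - static_cast<int64_t>(pc >> 12);
    r.pageOffs  = static_cast<uint32_t>(target & 0xFFF);
    // 21-bit signed page count: this is the +/-4 GB reach.
    assert(r.pageDelta >= -(int64_t{1} << 20) && r.pageDelta < (int64_t{1} << 20));
    return r;
}

// What the adrp + add pair computes at run time.
inline uint64_t resolve(uint64_t pc, PageRel r)
{
    const uint64_t page = (pc & ~uint64_t{0xFFF}) + (static_cast<uint64_t>(r.pageDelta) << 12);
    return page + r.pageOffs;
}
```

Because adrp works on whole pages, the pair resolves to the same target regardless of where within its page the adrp instruction itself sits.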

In addition, JIT data is now 8-byte aligned so that it is accessible from the
long load form. JitLargeBranches is replaced by JitLongAddress to stress-test these
operations.
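
One plausible reading of the alignment requirement, offered here as an assumption rather than something the PR states: the 64-bit `ldr x, [x, #offs]` unsigned-offset form encodes its offset as a 12-bit immediate scaled by 8, so page_offs must be a multiple of 8. The sketch below uses hypothetical helper names to show the alignment and the encodability check.

```cpp
#include <cassert>
#include <cstdint>

// Round `offs` up to the next multiple of `alignment` (a power of two).
inline uint32_t alignUp(uint32_t offs, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offs + alignment - 1) & ~(alignment - 1);
}

// Try to encode a byte offset for the 64-bit ldr unsigned-offset form.
// The immediate is 12 bits wide and scaled by 8, so only multiples of 8
// in the range 0..32760 are representable.
inline bool tryEncodeLdrX(uint32_t byteOffs, uint32_t& imm12)
{
    if (byteOffs % 8 != 0 || byteOffs > 32760)
    {
        return false;
    }
    imm12 = byteOffs / 8;
    return true;
}
```

With every data item placed at an 8-byte-aligned offset, the low 12 bits of its address always pass this check.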

There are no asm diffs other than different label numbers when crossgening mscorlib.
I also validated locally that all tests pass with Complus_JitLongAddress=1.

kyulee1 · 2016-05-11T13:47:12Z

@dotnet-bot test Windows_NT arm64 Checked
@dotnet-bot test Windows_NT arm64 Release

kyulee1 · 2016-05-11T14:04:36Z

src/jit/codegenarm64.cpp

    getEmitter()->emitIns_R_C(INS_lea,
        emitTypeSize(TYP_I_IMPL),
        treeNode->gtRegNum,
+        REG_NA,


JumpTable code path is dead (NYI in this function). Will revisit it when it's enabled.

kyulee1 · 2016-05-11T14:39:33Z

@dotnet/jit-contrib @dotnet/arm64-contrib PTAL

Fixes https://github.com/dotnet/coreclr/issues/3332

To validate the various addressing forms in dotnet/coreclr#4896, I just enabled this.
Previously, we only allowed a load operation to JIT data (`ldr` or `IF_LARGELDC`). For switch expansion, the jump table is also recorded in JIT data. In this case, we only get the address of the jump table head and load the right entry after computing the offset. So `adr` or `IF_LARGEADR` is used not only to load a label within code but also to refer to a location in JIT data.
The typical code sequence for switch expansion looks like this:
```
adr     x8, [@RWD00]            // load address of jump table head
ldr     w8, [x8, x0, LSL #2]    // load jump entry from table addr + x0 * 4
adr     x9, [G_M56320_IG02]     // load address of current basic block
add     x8, x8, x9              // add them to compute the final target
br      x8                      // indirectly jump to the target
```
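
The address arithmetic of that switch sequence can be modelled in a few lines. This is only a sketch under stated assumptions: the table is taken to hold 4-byte unsigned offsets relative to the base block (consistent with the zero-extending `ldr w8` and the `x0 * 4` scaling), the index is assumed to have been range-checked earlier, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Model of the switch sequence: pick the 4-byte entry at `index`
// (ldr w8, [x8, x0, LSL #2]) and add it to the base block address
// (add x8, x8, x9) to get the indirect jump target (br x8).
inline uint64_t switchTarget(const std::vector<uint32_t>& table, uint64_t baseBlock, uint32_t index)
{
    assert(index < table.size()); // assumed to be guaranteed by an earlier bounds check
    const uint64_t entry = table[index]; // zero-extended, as with a 32-bit ldr
    return baseBlock + entry;
}
```

Storing offsets instead of absolute addresses keeps each entry at 4 bytes and avoids absolute code addresses in the table.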

briansull · 2016-05-12T00:35:43Z

src/jit/emit.cpp

+            assert(dataOffs < emitDataSize());
+
+            // Conservately assume JIT data starts after the entire code size.
+            // TODO: we might consider only hot code size which will be computed later in emitComputeCodeSizes().


Ultimately we will want to layout the read-only data next to the hot code so that we can use the small instruction to access it. Accessing the read-only data from the cold section will typically use the large instructions.

kyulee1 · 2016-05-12

We already allocate read-only data next to the hot code, and in this change a reference from cold code is always kept long. The only minor issue is that emitComputeCodeSizes() runs later than this point, although I could replicate its code to get the hot code size. I'd like to swap the order of emitComputeCodeSizes() and emitJumpDistBind(), which appears to be okay, but given the risk of changing common code this close to RTM, I will add an ARM64-TODO and follow up later.

briansull · 2016-05-12T01:33:11Z

Very Nice,
Looks Good (with comments)

kyulee1 · 2016-05-12T04:37:08Z

Thank you for the quick review of this large change.
I updated the change per the feedback; the other major points are commented on individually, and I will follow up on them. So, I'm merging it.

ARM64: Enable Long Address Commit migrated from dotnet/coreclr@3e98666

The switch expansion change quoted above (Fixes dotnet/coreclr#3332): Commit migrated from dotnet/coreclr@a0c6144

dnfclas added the cla-already-signed label May 10, 2016

dotnet-bot added the 2 - In Progress label May 10, 2016

kyulee1 force-pushed the longjmp branch from fe04641 to c408f6c Compare May 11, 2016 13:46

kyulee1 reviewed May 11, 2016

kyulee1 changed the title ~~ARM64: Enable Long Address (Testing)~~ ARM64: Enable Long Address May 11, 2016

kyulee1 mentioned this pull request May 11, 2016

ARM64: Switch Expansion Using Jump Table #4919

Merged

briansull reviewed May 12, 2016

kyulee1 force-pushed the longjmp branch from c408f6c to 61fe464 Compare May 12, 2016 04:34

kyulee1 merged commit 3e98666 into dotnet:master May 12, 2016

dotnet-bot removed the 2 - In Progress label May 12, 2016

picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022

Merge pull request dotnet/coreclr#4896 from kyulee1/longjmp

b2fb7f7

ARM64: Enable Long Address Commit migrated from dotnet/coreclr@3e98666
