[LLD][COFF] Align EC code ranges to page boundaries #168222
We already ensure that code for different architectures is always placed in different pages in assignAddresses. We represent those ranges using their first and last chunks. However, the RVAs of those chunks may not be page-aligned, for example, due to extra padding for entry-thunk offsets. Align the chunk RVAs to the page boundary so that the emitted ranges correctly include the entire region. This change affects an existing test that checks corner cases triggered by merging a data section into a code section. We may now include such data in the code range. This differs from MSVC’s behavior, but it should not cause practical issues, and the new behavior is arguably more correct. Fixes llvm#168119.
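The range computation described above can be sketched in isolation. This is a minimal illustration, not the LLD code: `ChunkInfo`, `CodeRange`, and `alignedCodeRange` are hypothetical names, and the chunk size `0xe` is an assumed value chosen so that the range ends at `0x4016`, as in the updated test expectation.

```cpp
#include <cstdint>

// Hypothetical stand-in for an LLD chunk; only the RVA and size matter here.
struct ChunkInfo {
  uint32_t rva;
  uint32_t size;
};

struct CodeRange {
  uint32_t start;  // page-aligned start RVA
  uint32_t length; // bytes from the aligned start to the end of the last chunk
};

// Round the first chunk's RVA down to a 4 KiB page boundary, then measure
// the length from that aligned start so the range still ends where the
// last chunk ends.
constexpr CodeRange alignedCodeRange(ChunkInfo first, ChunkInfo last) {
  return CodeRange{first.rva & ~0xfffu,
                   last.rva + last.size - (first.rva & ~0xfffu)};
}

// A single chunk at 0x4008 that ends at 0x4016 (assumed size 0xe).
static_assert(alignedCodeRange({0x4008, 0xe}, {0x4008, 0xe}).start == 0x4000,
              "start is rounded down to the page boundary");
static_assert(alignedCodeRange({0x4008, 0xe}, {0x4008, 0xe}).length == 0x16,
              "range still ends at 0x4016");
```

Because the length is measured from the aligned start, the end of the range does not move; only the beginning is extended back to the page boundary.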
@llvm/pr-subscribers-lld @llvm/pr-subscribers-lld-coff
Author: Jacek Caban (cjacek)
2 Files Affected:
diff --git a/lld/COFF/Chunks.cpp b/lld/COFF/Chunks.cpp
index 548d87bdaefe5..409491d4a1f89 100644
--- a/lld/COFF/Chunks.cpp
+++ b/lld/COFF/Chunks.cpp
@@ -946,7 +946,7 @@ void ECCodeMapChunk::writeTo(uint8_t *buf) const {
auto table = reinterpret_cast<chpe_range_entry *>(buf);
for (uint32_t i = 0; i < map.size(); i++) {
const ECCodeMapEntry &entry = map[i];
- uint32_t start = entry.first->getRVA();
+ uint32_t start = entry.first->getRVA() & ~0xfff;
table[i].StartOffset = start | entry.type;
table[i].Length = entry.last->getRVA() + entry.last->getSize() - start;
}
diff --git a/lld/test/COFF/arm64ec-codemap.test b/lld/test/COFF/arm64ec-codemap.test
index 050261117be2e..bbc682d19920f 100644
--- a/lld/test/COFF/arm64ec-codemap.test
+++ b/lld/test/COFF/arm64ec-codemap.test
@@ -7,6 +7,7 @@ RUN: llvm-mc -filetype=obj -triple=arm64ec-windows arm64ec-func-sym2.s -o arm64e
RUN: llvm-mc -filetype=obj -triple=arm64ec-windows data-sec.s -o data-sec.obj
RUN: llvm-mc -filetype=obj -triple=arm64ec-windows data-sec2.s -o data-sec2.obj
RUN: llvm-mc -filetype=obj -triple=arm64ec-windows empty-sec.s -o arm64ec-empty-sec.obj
+RUN: llvm-mc -filetype=obj -triple=arm64ec-windows entry-thunk.s -o entry-thunk.obj
RUN: llvm-mc -filetype=obj -triple=x86_64-windows x86_64-func-sym.s -o x86_64-func-sym.obj
RUN: llvm-mc -filetype=obj -triple=x86_64-windows empty-sec.s -o x86_64-empty-sec.obj
RUN: llvm-mc -filetype=obj -triple=aarch64-windows %S/Inputs/loadconfig-arm64.s -o loadconfig-arm64.obj
@@ -162,15 +163,17 @@ RUN: loadconfig-arm64ec.obj -dll -noentry -merge:test=.testdata -merge:
RUN: llvm-readobj --coff-load-config testcm.dll | FileCheck -check-prefix=CODEMAPCM %s
CODEMAPCM: CodeMap [
-CODEMAPCM-NEXT: 0x4008 - 0x4016 X64
+CODEMAPCM-NEXT: 0x4000 - 0x4016 X64
CODEMAPCM-NEXT: ]
RUN: llvm-objdump -d testcm.dll | FileCheck -check-prefix=DISASMCM %s
DISASMCM: Disassembly of section .testdat:
DISASMCM-EMPTY:
DISASMCM-NEXT: 0000000180004000 <.testdat>:
-DISASMCM-NEXT: 180004000: 00000001 udf #0x1
-DISASMCM-NEXT: 180004004: 00000000 udf #0x0
+DISASMCM-NEXT: 180004000: 01 00 addl %eax, (%rax)
+DISASMCM-NEXT: 180004002: 00 00 addb %al, (%rax)
+DISASMCM-NEXT: 180004004: 00 00 addb %al, (%rax)
+DISASMCM-NEXT: 180004006: 00 00 addb %al, (%rax)
DISASMCM-NEXT: 180004008: b8 03 00 00 00 movl $0x3, %eax
DISASMCM-NEXT: 18000400d: c3 retq
DISASMCM-NEXT: 18000400e: 00 00 addb %al, (%rax)
@@ -207,6 +210,14 @@ DISASMMS-NEXT: 0000000180006000 <test2>:
DISASMMS-NEXT: 180006000: 528000a0 mov w0, #0x5 // =5
DISASMMS-NEXT: 180006004: d65f03c0 ret
+Test the code map that includes an ARM64EC function padded by its entry-thunk offset.
+
+RUN: lld-link -out:testpad.dll -machine:arm64ec entry-thunk.obj loadconfig-arm64ec.obj -dll -noentry -include:func
+RUN: llvm-readobj --coff-load-config testpad.dll | FileCheck -check-prefix=CODEMAPPAD %s
+CODEMAPPAD: CodeMap [
+CODEMAPPAD: 0x1000 - 0x1010 ARM64EC
+CODEMAPPAD-NEXT: ]
+
#--- arm64-func-sym.s
.text
@@ -266,3 +277,22 @@ x86_64_func_sym2:
.section .empty1, "xr"
.section .empty2, "xr"
.section .empty3, "xr"
+
+#--- entry-thunk.s
+ .section .text,"xr",discard,func
+ .globl func
+ .p2align 2, 0x0
+func:
+ mov w0, #1
+ ret
+
+ .section .wowthk$aa,"xr",discard,thunk
+ .globl thunk
+ .p2align 2
+thunk:
+ ret
+
+ .section .hybmp$x,"yi"
+ .symidx func
+ .symidx thunk
+ .word 1 // entry thunk
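To read the `CodeMap` expectations in the test against the `writeTo` hunk, it can help to decode a packed entry by hand. The sketch below is illustrative only: `PackedRange`, `typeName`, and `describe` are hypothetical names, the type numbering (0 = ARM64, 1 = ARM64EC, 2 = X64) is an assumption modeled on LLVM's `chpe_range_type`, and the printed end address is assumed to be start plus length, which matches the disassembly addresses in the test.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Simplified, host-endian stand-in for a code map entry.
struct PackedRange {
  uint32_t startOffset; // start RVA with the range type in the low two bits
  uint32_t length;      // size of the range in bytes
};

// Assumed type numbering; see the note above.
inline const char *typeName(uint32_t type) {
  switch (type) {
  case 0:
    return "ARM64";
  case 1:
    return "ARM64EC";
  case 2:
    return "X64";
  default:
    return "unknown";
  }
}

// Split the packed field into start and type, then format the range as
// "start - end type".
inline std::string describe(const PackedRange &r) {
  unsigned start = r.startOffset & ~3u;
  unsigned type = r.startOffset & 3u;
  unsigned end = start + r.length;
  char buf[64];
  std::snprintf(buf, sizeof(buf), "0x%x - 0x%x %s", start, end,
                typeName(type));
  return buf;
}
```

Under these assumptions, `describe(PackedRange{0x4002, 0x16})` yields `0x4000 - 0x4016 X64`, and `describe(PackedRange{0x1001, 0x10})` yields `0x1000 - 0x1010 ARM64EC`.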
mstorsjo
left a comment
Looks ok, one minor comment.
for (uint32_t i = 0; i < map.size(); i++) {
const ECCodeMapEntry &entry = map[i];
- uint32_t start = entry.first->getRVA();
+ uint32_t start = entry.first->getRVA() & ~0xfff;
This looks reasonable, but do we use explicit 0xfff for page alignment here/elsewhere so far, or do we have some constant for that?
Yes, we already use 0xfff in several places in LLD, mostly, but not exclusively, in Chunks.cpp.
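For comparison, a named constant for the page mask could look like the sketch below. The names `pageSize`, `pageMask`, and `alignDownToPage` are hypothetical and do not come from LLD, which uses the literal `0xfff` directly, as noted in the reply above.

```cpp
#include <cstdint>

// Hypothetical names for illustration only; not existing LLD identifiers.
constexpr uint32_t pageSize = 0x1000;       // 4 KiB page
constexpr uint32_t pageMask = pageSize - 1; // 0xfff

// Clear the low 12 bits so the RVA lands on the start of its page.
constexpr uint32_t alignDownToPage(uint32_t rva) { return rva & ~pageMask; }

static_assert(alignDownToPage(0x4008) == 0x4000,
              "an unaligned RVA is rounded down");
static_assert(alignDownToPage(0x1000) == 0x1000,
              "an aligned RVA is unchanged");
```

The behavior is identical to `rva & ~0xfff`; the only difference is that the page size is spelled out in one place.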
/cherry-pick af45b02
Error: Command failed due to missing milestone.
/pull-request #168369
We already ensure that code for different architectures is always placed in different pages in `assignAddresses`. We represent those ranges using their first and last chunks. However, the RVAs of those chunks may not be page-aligned, for example, due to extra padding for entry-thunk offsets. Align the chunk RVAs to the page boundary so that the emitted ranges correctly include the entire region. This change affects an existing test that checks corner cases triggered by merging a data section into a code section. We may now include such data in the code range. This differs from MSVC’s behavior, but it should not cause practical issues, and the new behavior is arguably more correct. Fixes llvm#168119. (cherry picked from commit af45b02)