[llvm][AArch64] ARM64e ptrauth: constant lowering, tail call auth, coroutine resume signing #188650
oskarwirga wants to merge 4 commits into llvm:main
Conversation
Handle several edge cases in constant lowering that arise when pointer authentication is enabled.

AsmPrinter.cpp:
- Skip the GOT PC-relative optimization for null or target-specific MCExprs (AArch64AuthMCExpr) that don't support it
- Defensively handle GOT equivalents with non-GlobalValue initializers (e.g., ConstantPtrAuth) to prevent crashes
- Emit zeros for unsupported ptrauth patterns to preserve struct layout

AArch64AsmPrinter.cpp:
- Handle Swift's inttoptr(add(ptrtoint(@global), offset)) pattern in lowerConstantPtrAuth, which stripAndAccumulateConstantOffsets otherwise misses
When a tail call has a non-zero FPDiff (the callee needs a different amount of stack argument space), SP has already been adjusted by the epilogue before the return address is authenticated. AUTIBSP authenticates LR against the current SP, which no longer matches the entry SP that PACIBSP used at function entry, causing EXC_ARM_PAC_FAIL. Fix by computing the entry SP into X16 and using an explicit AUTIB x30, x16 instead of AUTIBSP when FPDiff != 0. The entry SP is reconstructed as: current_SP + (-FPDiff).
Sign resume function pointers in yield-once (retcon) coroutines when ptrauth.resume metadata is attached to the coro.id intrinsic. Swift IRGen attaches this metadata for modify accessors, specifying the PAC key, the discriminator, and whether to use address discrimination. The signing is implemented with the ptrauth.sign and ptrauth.blend intrinsics: the resume pointer is converted with ptrtoint, signed with the specified key and discriminator (optionally blended with the continuation buffer address for address diversity), then converted back with inttoptr for storage. This enables Swift modify accessors (yield-once coroutines) to work correctly on arm64e, where the runtime authenticates resume pointers with blraa before calling them.
@llvm/pr-subscribers-coroutines @llvm/pr-subscribers-backend-aarch64

Author: Oskar Wirga (oskarwirga)

Changes

This is part of work being done in #188378 and #188638. This PR fixes three arm64e crashes I hit while testing Swift async/coroutine workloads. These are all LLVM codegen fixes.

What this PR adds

- Constant lowering edge cases (AsmPrinter.cpp, AArch64AsmPrinter.cpp): Swift generates inttoptr(add(ptrtoint(@global), offset)) inside ptrauth constants, which stripAndAccumulateConstantOffsets missed because it wasn't looking through inttoptr, so we just enable LookThroughIntToPtr.
- Tail call return address auth (AArch64PointerAuth.cpp, AArch64FrameLowering.h): When a tail call has FPDiff != 0, the epilogue adjusts SP before authenticating the return address. AUTI[AB]SP uses the current SP as context, but the return address was signed with the entry SP, and the mismatch leads to an EXC_ARM_PAC_FAIL. Fix: compute the entry SP into x16 and use an explicit AUTI[AB] x30, x16. Handles both A-key and B-key. Moves getArgumentStackToRestore from private to public so AArch64PointerAuth.cpp can call it.
- Coroutine resume pointer signing (CoroSplit.cpp): New !ptrauth.resume metadata on coro.id.retcon.once/coro.id.retcon tells CoroSplit to sign the resume function pointer with llvm.ptrauth.sign, optionally blending the discriminator with the buffer address for address diversity. This is what Swift IRGen emits; without it, the runtime's blraa on the resume pointer fails immediately.

Testing

- ptrauth-reloc.ll — new inttoptr(add(ptrtoint)) case, both ELF and MachO output
- ptrauth-tail-call-autib.ll — FPDiff != 0 gets autib x30, x16; FPDiff == 0 still gets autibsp
- coro-retcon-ptrauth.ll — address-diversified and non-address-diversified resume signing

AI Disclosure

This was all Claude; I am not even a novice in coroutines, but I was able to get all changes tested on ARM64e hardware, which is where I discovered these issues in the first place.

9 Files Affected:
diff --git a/llvm/docs/PointerAuth.md b/llvm/docs/PointerAuth.md
index 84e0af7577c7d..40c4e55f79faf 100644
--- a/llvm/docs/PointerAuth.md
+++ b/llvm/docs/PointerAuth.md
@@ -311,6 +311,34 @@ derived from the parent function's name, using the SipHash stable discriminator:
```
+### Coroutine Resume Pointer Signing
+
+The ``!ptrauth.resume`` metadata can be attached to ``@llvm.coro.id.retcon``
+and ``@llvm.coro.id.retcon.once`` intrinsic calls to request that the resume
+function pointer stored in the continuation be signed using pointer
+authentication.
+
+The metadata takes three operands:
+
+```
+!ptrauth.resume !{i32 <key>, i64 <discriminator>, i1 <addr_discriminated>}
+```
+
+- **key**: The ptrauth key to use for signing (e.g., 0 for IA).
+- **discriminator**: A constant discriminator value.
+- **addr_discriminated**: If true, the discriminator is blended with the
+ address of the continuation buffer using ``llvm.ptrauth.blend`` before
+ signing, providing address diversity.
+
+When this metadata is present, ``CoroSplit`` emits ``llvm.ptrauth.sign``
+(and optionally ``llvm.ptrauth.blend``) calls in the ramp function to sign
+the resume pointer before returning it to the caller.
+
+This is used by Swift IRGen to sign resume pointers for modify accessors
+(yield-once coroutines) on arm64e, where the runtime authenticates them
+with ``blraa`` before calling.
+
+
## AArch64 Support
AArch64 is currently the only architecture with full support of the pointer
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index 083b83567e47f..405c6cb593a0b 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -4250,6 +4250,9 @@ static void emitGlobalConstantLargeInt(const ConstantInt *CI, AsmPrinter &AP) {
static void handleIndirectSymViaGOTPCRel(AsmPrinter &AP, const MCExpr **ME,
const Constant *BaseCst,
uint64_t Offset) {
+ // GOT PC-relative optimization doesn't apply to ptrauth (Target) MCExprs.
+ if (!*ME || (*ME)->getKind() == MCExpr::Target)
+ return;
// The global @foo below illustrates a global that uses a got equivalent.
//
// @bar = global i32 42
@@ -4318,7 +4321,11 @@ static void handleIndirectSymViaGOTPCRel(AsmPrinter &AP, const MCExpr **ME,
AsmPrinter::GOTEquivUsePair Result = AP.GlobalGOTEquivs[GOTEquivSym];
const GlobalVariable *GV = Result.first;
int NumUses = (int)Result.second;
+ if (!GV || !GV->hasInitializer())
+ return;
const GlobalValue *FinalGV = dyn_cast<GlobalValue>(GV->getOperand(0));
+ if (!FinalGV)
+ return;
const MCSymbol *FinalSym = AP.getSymbol(FinalGV);
*ME = AP.getObjFileLowering().getIndirectSymViaGOTPCRel(
FinalGV, FinalSym, MV, Offset, AP.MMI, *AP.OutStreamer);
@@ -4433,6 +4440,12 @@ static void emitGlobalConstantImpl(const DataLayout &DL, const Constant *CV,
// thread the streamer with EmitValue.
const MCExpr *ME = AP.lowerConstant(CV, BaseCV, Offset);
+ // Emit zeros if lowerConstant returned null to preserve struct layout.
+ if (!ME) {
+ AP.OutStreamer->emitZeros(Size);
+ return;
+ }
+
// Since lowerConstant already folded and got rid of all IR pointer and
// integer casts, detect GOT equivalent accesses by looking into the MCExpr
// directly.
diff --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
index 2b8db27599d3c..47281791ff7a0 100644
--- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -2680,9 +2680,13 @@ AArch64AsmPrinter::lowerConstantPtrAuth(const ConstantPtrAuth &CPA) {
MCContext &Ctx = OutContext;
// Figure out the base symbol and the addend, if any.
+ // LookThroughIntToPtr handles Swift patterns like:
+ // inttoptr (i64 add (i64 ptrtoint (ptr @global to i64), i64 2) to ptr)
APInt Offset(64, 0);
const Value *BaseGV = CPA.getPointer()->stripAndAccumulateConstantOffsets(
- getDataLayout(), Offset, /*AllowNonInbounds=*/true);
+ getDataLayout(), Offset, /*AllowNonInbounds=*/true,
+ /*AllowInvariantGroup=*/false, /*ExternalAnalysis=*/nullptr,
+ /*LookThroughIntToPtr=*/true);
auto *BaseGVB = dyn_cast<GlobalValue>(BaseGV);
diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.h b/llvm/lib/Target/AArch64/AArch64FrameLowering.h
index 7ef2c4f388c7c..19cc93d01013d 100644
--- a/llvm/lib/Target/AArch64/AArch64FrameLowering.h
+++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.h
@@ -87,6 +87,15 @@ class AArch64FrameLowering : public TargetFrameLowering {
/// Can this function use the red zone for local allocations.
bool canUseRedZone(const MachineFunction &MF) const;
+ /// Returns how much of the incoming argument stack area (in bytes) we should
+ /// clean up in an epilogue. For the C calling convention this will be 0, for
+ /// guaranteed tail call conventions it can be positive (a normal return or a
+ /// tail call to a function that uses less stack space for arguments) or
+ /// negative (for a tail call to a function that needs more stack space than
+ /// us for arguments).
+ int64_t getArgumentStackToRestore(MachineFunction &MF,
+ MachineBasicBlock &MBB) const;
+
bool hasReservedCallFrame(const MachineFunction &MF) const override;
bool
@@ -239,15 +248,6 @@ class AArch64FrameLowering : public TargetFrameLowering {
const AArch64InstrInfo &TII,
MachineInstr::MIFlag Flag) const;
- /// Returns how much of the incoming argument stack area (in bytes) we should
- /// clean up in an epilogue. For the C calling convention this will be 0, for
- /// guaranteed tail call conventions it can be positive (a normal return or a
- /// tail call to a function that uses less stack space for arguments) or
- /// negative (for a tail call to a function that needs more stack space than
- /// us for arguments).
- int64_t getArgumentStackToRestore(MachineFunction &MF,
- MachineBasicBlock &MBB) const;
-
// Find a scratch register that we can use at the start of the prologue to
// re-align the stack pointer. We avoid using callee-save registers since
// they may appear to be free when this is called from canUseAsPrologue
diff --git a/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp b/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
index 517b8a4c1737b..a3f03f6068e93 100644
--- a/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
+++ b/llvm/lib/Target/AArch64/AArch64PointerAuth.cpp
@@ -9,6 +9,7 @@
#include "AArch64PointerAuth.h"
#include "AArch64.h"
+#include "AArch64FrameLowering.h"
#include "AArch64InstrInfo.h"
#include "AArch64MachineFunctionInfo.h"
#include "AArch64Subtarget.h"
@@ -196,9 +197,29 @@ void AArch64PointerAuth::authenticateLR(
.setMIFlag(MachineInstr::FrameDestroy);
emitPACCFI(MBB, MBBI, MachineInstr::FrameDestroy, EmitAsyncCFI);
}
- BuildMI(MBB, MBBI, DL,
- TII->get(UseBKey ? AArch64::AUTIBSP : AArch64::AUTIASP))
- .setMIFlag(MachineInstr::FrameDestroy);
+ // When a tail call has a non-zero FPDiff (callee needs different stack
+ // arg space), the epilogue adjusts SP before reaching here. SP no
+ // longer equals the entry SP used by PACI[AB]SP. Compute the entry SP
+ // into X16 and use explicit AUTI[AB] instead of AUTI[AB]SP.
+ // entry_SP = SP - FPDiff (FPDiff is negative when callee needs more
+ // space, positive when less).
+ auto &AFL = *static_cast<const AArch64FrameLowering *>(
+ MF.getSubtarget().getFrameLowering());
+ int64_t FPDiff = AFL.getArgumentStackToRestore(MF, MBB);
+ if (FPDiff != 0) {
+ emitFrameOffset(MBB, MBBI, DL, AArch64::X16, AArch64::SP,
+ StackOffset::getFixed(-FPDiff), TII,
+ MachineInstr::FrameDestroy);
+ unsigned AutOpc = UseBKey ? AArch64::AUTIB : AArch64::AUTIA;
+ BuildMI(MBB, MBBI, DL, TII->get(AutOpc), AArch64::LR)
+ .addUse(AArch64::LR)
+ .addUse(AArch64::X16)
+ .setMIFlag(MachineInstr::FrameDestroy);
+ } else {
+ BuildMI(MBB, MBBI, DL,
+ TII->get(UseBKey ? AArch64::AUTIBSP : AArch64::AUTIASP))
+ .setMIFlag(MachineInstr::FrameDestroy);
+ }
if (!MFnI->branchProtectionPAuthLR())
emitPACCFI(MBB, MBBI, MachineInstr::FrameDestroy, EmitAsyncCFI);
}
diff --git a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
index f83b6a601572d..b659fe199e7b8 100644
--- a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
+++ b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
@@ -1913,6 +1913,48 @@ void coro::AnyRetconABI::splitCoroutine(Function &F, coro::Shape &Shape,
auto *CastedContinuation =
Builder.CreateBitCast(ContinuationPhi, CastedContinuationTy);
+ // Sign the resume function pointer if !ptrauth.resume metadata is present.
+ if (Id) {
+ if (auto *MD = Id->getMetadata("ptrauth.resume")) {
+ auto *KeyMD = cast<ConstantAsMetadata>(MD->getOperand(0));
+ auto *DiscMD = cast<ConstantAsMetadata>(MD->getOperand(1));
+ auto *AddrDivMD = cast<ConstantAsMetadata>(MD->getOperand(2));
+ unsigned Key =
+ cast<ConstantInt>(KeyMD->getValue())->getZExtValue();
+ uint64_t Disc =
+ cast<ConstantInt>(DiscMD->getValue())->getZExtValue();
+ bool AddrDiv =
+ cast<ConstantInt>(AddrDivMD->getValue())->getZExtValue();
+
+ auto &Ctx = F.getContext();
+ auto *Int64Ty = Type::getInt64Ty(Ctx);
+ auto *Int32Ty = Type::getInt32Ty(Ctx);
+
+ auto *ContInt =
+ Builder.CreatePtrToInt(CastedContinuation, Int64Ty);
+
+ Value *Modifier;
+ if (AddrDiv) {
+ auto *BufAddr = Builder.CreatePtrToInt(
+ Id->getStorage(), Int64Ty);
+ auto *BlendFn = Intrinsic::getOrInsertDeclaration(
+ F.getParent(), Intrinsic::ptrauth_blend);
+ Modifier = Builder.CreateCall(
+ BlendFn, {BufAddr, ConstantInt::get(Int64Ty, Disc)});
+ } else {
+ Modifier = ConstantInt::get(Int64Ty, Disc);
+ }
+
+ auto *SignFn = Intrinsic::getOrInsertDeclaration(
+ F.getParent(), Intrinsic::ptrauth_sign);
+ auto *Signed = Builder.CreateCall(
+ SignFn,
+ {ContInt, ConstantInt::get(Int32Ty, Key), Modifier});
+ CastedContinuation =
+ Builder.CreateIntToPtr(Signed, CastedContinuationTy);
+ }
+ }
+
Value *RetV = CastedContinuation;
if (!ReturnPHIs.empty()) {
auto ValueIdx = 0;
diff --git a/llvm/test/CodeGen/AArch64/ptrauth-reloc.ll b/llvm/test/CodeGen/AArch64/ptrauth-reloc.ll
index 14f5571fc2deb..befb726bcdab8 100644
--- a/llvm/test/CodeGen/AArch64/ptrauth-reloc.ll
+++ b/llvm/test/CodeGen/AArch64/ptrauth-reloc.ll
@@ -113,6 +113,20 @@
@g.weird_ref.da.0 = constant i64 ptrtoint (ptr inttoptr (i64 ptrtoint (ptr ptrauth (ptr getelementptr (i8, ptr @g, i64 16), i32 2) to i64) to ptr) to i64)
+; Swift generates inttoptr(add(ptrtoint(@global), offset)) inside ptrauth.
+
+; CHECK-ELF-LABEL: .globl g.inttoptr_add.da.0
+; CHECK-ELF-NEXT: .p2align 3
+; CHECK-ELF-NEXT: g.inttoptr_add.da.0:
+; CHECK-ELF-NEXT: .xword (g+2)@AUTH(da,0)
+
+; CHECK-MACHO-LABEL: .globl _g.inttoptr_add.da.0
+; CHECK-MACHO-NEXT: .p2align 3
+; CHECK-MACHO-NEXT: _g.inttoptr_add.da.0:
+; CHECK-MACHO-NEXT: .quad (_g+2)@AUTH(da,0)
+
+@g.inttoptr_add.da.0 = constant ptr ptrauth (ptr inttoptr (i64 add (i64 ptrtoint (ptr @g to i64), i64 2) to ptr), i32 2)
+
; CHECK-ELF-LABEL: .globl g_weak.ref.ia.42
; CHECK-ELF-NEXT: .p2align 3
; CHECK-ELF-NEXT: g_weak.ref.ia.42:
diff --git a/llvm/test/CodeGen/AArch64/ptrauth-tail-call-autib.ll b/llvm/test/CodeGen/AArch64/ptrauth-tail-call-autib.ll
new file mode 100644
index 0000000000000..0f2f500ccc6ff
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/ptrauth-tail-call-autib.ll
@@ -0,0 +1,38 @@
+; RUN: llc -mtriple arm64e-apple-darwin -o - %s | FileCheck %s
+;
+; Crash 13 repro: In Swift async functions using swifttailcc, a tail call
+; with stack arguments adjusts SP in the epilogue before return address
+; authentication. AUTIBSP uses the current (adjusted) SP, not the entry
+; SP from PACIBSP, causing EXC_ARM_PAC_FAIL on arm64e.
+;
+; Fix: When FPDiff != 0, compute the entry SP into x16 and use explicit
+; autib x30, x16 instead of autibsp.
+
+declare swifttailcc void @callee_async(ptr swiftasync %ctx, i64, i64, i64, i64, i64, i64, i64, i64, i64)
+
+; FPDiff != 0: callee has stack args that this function doesn't.
+; Must use explicit autib with computed entry SP, NOT autibsp.
+define swifttailcc void @test_async_tail_call(ptr swiftasync %ctx) #0 {
+; CHECK-LABEL: _test_async_tail_call:
+; CHECK: pacibsp
+; CHECK-NOT: autibsp
+; CHECK: autib x30, x16
+; CHECK: b _callee_async
+ musttail call swifttailcc void @callee_async(ptr swiftasync %ctx, i64 1, i64 2, i64 3, i64 4, i64 5, i64 6, i64 7, i64 8, i64 9)
+ ret void
+}
+
+declare swifttailcc void @callee_no_stack_args(ptr swiftasync %ctx)
+
+; FPDiff == 0: callee has same stack arg layout. autibsp is correct here.
+define swifttailcc void @test_no_fpdiff_tail_call(ptr swiftasync %ctx) #0 {
+; CHECK-LABEL: _test_no_fpdiff_tail_call:
+; CHECK: pacibsp
+; CHECK: autibsp
+; CHECK-NOT: autib x30, x16
+; CHECK: b _callee_no_stack_args
+ musttail call swifttailcc void @callee_no_stack_args(ptr swiftasync %ctx)
+ ret void
+}
+
+attributes #0 = { nounwind "ptrauth-returns" "ptrauth-auth-traps" "sign-return-address"="all" "frame-pointer"="all" }
diff --git a/llvm/test/Transforms/Coroutines/coro-retcon-ptrauth.ll b/llvm/test/Transforms/Coroutines/coro-retcon-ptrauth.ll
new file mode 100644
index 0000000000000..0e999b7834b30
--- /dev/null
+++ b/llvm/test/Transforms/Coroutines/coro-retcon-ptrauth.ll
@@ -0,0 +1,86 @@
+; RUN: opt < %s -passes='cgscc(coro-split),simplifycfg,early-cse' -S | FileCheck %s
+;
+; Crash 10 repro: Swift modify accessors use yield-once (retcon) coroutines.
+; The resume function pointer must be PAC-signed. On arm64e, the runtime
+; authenticates resume pointers with blraa using a key and discriminator.
+; Without signing, authentication fails -> EXC_ARM_PAC_FAIL.
+;
+; When !ptrauth.resume metadata is attached to coro.id.retcon.once,
+; CoroSplit must sign the resume pointer using llvm.ptrauth.sign.
+; Metadata format: !{i32 key, i64 discriminator, i1 addr_discriminated}
+
+target triple = "arm64e-apple-darwin"
+
+declare void @prototype(ptr, i1)
+declare noalias ptr @allocate(i32)
+declare void @deallocate(ptr)
+declare void @print(i32)
+
+; Test address-diversified signing (addr_div=true).
+; The discriminator is blended with the buffer address before signing.
+define {ptr, ptr} @test_ptrauth_resume(ptr %buffer, ptr %ptr) presplitcoroutine {
+entry:
+ %temp = alloca i32, align 4
+ %id = call token @llvm.coro.id.retcon.once(i32 8, i32 8, ptr %buffer, ptr @prototype, ptr @allocate, ptr @deallocate), !ptrauth.resume !0
+ %hdl = call ptr @llvm.coro.begin(token %id, ptr null)
+ %oldvalue = load i32, ptr %ptr
+ store i32 %oldvalue, ptr %temp
+ %unwind = call i1 (...) @llvm.coro.suspend.retcon.i1(ptr %temp)
+ br i1 %unwind, label %cleanup, label %cont
+
+cont:
+ %newvalue = load i32, ptr %temp
+ store i32 %newvalue, ptr %ptr
+ br label %cleanup
+
+cleanup:
+ call void @llvm.coro.end(ptr %hdl, i1 0, token none)
+ unreachable
+}
+
+; CHECK-LABEL: define { ptr, ptr } @test_ptrauth_resume(
+; Address-diversified: blend buffer address with discriminator, then sign.
+; CHECK: %[[BUFINT:.*]] = ptrtoint ptr %buffer to i64
+; CHECK: %[[BLEND:.*]] = call i64 @llvm.ptrauth.blend(i64 %[[BUFINT]], i64 3909)
+; CHECK: %[[SIGNED:.*]] = call i64 @llvm.ptrauth.sign(i64 ptrtoint (ptr @test_ptrauth_resume.resume.0 to i64), i32 0, i64 %[[BLEND]])
+; CHECK: %[[PTR:.*]] = inttoptr i64 %[[SIGNED]] to ptr
+; CHECK: ret { ptr, ptr }
+
+; Test non-address-diversified signing (addr_div=false).
+; The discriminator is used directly without blending.
+define {ptr, ptr} @test_ptrauth_resume_no_addrdiv(ptr %buffer, ptr %ptr) presplitcoroutine {
+entry:
+ %temp = alloca i32, align 4
+ %id = call token @llvm.coro.id.retcon.once(i32 8, i32 8, ptr %buffer, ptr @prototype, ptr @allocate, ptr @deallocate), !ptrauth.resume !1
+ %hdl = call ptr @llvm.coro.begin(token %id, ptr null)
+ %oldvalue = load i32, ptr %ptr
+ store i32 %oldvalue, ptr %temp
+ %unwind = call i1 (...) @llvm.coro.suspend.retcon.i1(ptr %temp)
+ br i1 %unwind, label %cleanup, label %cont
+
+cont:
+ %newvalue = load i32, ptr %temp
+ store i32 %newvalue, ptr %ptr
+ br label %cleanup
+
+cleanup:
+ call void @llvm.coro.end(ptr %hdl, i1 0, token none)
+ unreachable
+}
+
+; CHECK-LABEL: define { ptr, ptr } @test_ptrauth_resume_no_addrdiv(
+; Non-address-diversified: no blend, discriminator passed directly to sign.
+; CHECK-NOT: @llvm.ptrauth.blend
+; CHECK: %[[SIGNED2:.*]] = call i64 @llvm.ptrauth.sign(i64 ptrtoint (ptr @test_ptrauth_resume_no_addrdiv.resume.0 to i64), i32 0, i64 3909)
+; CHECK: %[[PTR2:.*]] = inttoptr i64 %[[SIGNED2]] to ptr
+; CHECK: ret { ptr, ptr }
+
+declare token @llvm.coro.id.retcon.once(i32, i32, ptr, ptr, ptr, ptr)
+declare ptr @llvm.coro.begin(token, ptr)
+declare i1 @llvm.coro.suspend.retcon.i1(...)
+declare void @llvm.coro.end(ptr, i1, token)
+
+; key=0 (IA), disc=3909 (0x0F45), addr_div=true
+!0 = !{i32 0, i64 3909, i1 true}
+; key=0 (IA), disc=3909 (0x0F45), addr_div=false
+!1 = !{i32 0, i64 3909, i1 false}
✅ With the latest revision this PR passed the C/C++ code formatter.
@oskarwirga How does this work in the Apple downstream fork? (https://github.com/swiftlang/llvm-project/)
I tested it on that fork specifically. I also needed some patches to upstream Swift here: swiftlang/swift#88115 and swiftlang/swift#88114. The latter one has a dependent PR I haven't yet submitted, but all the commits are in https://github.com/oskarwirga/swift/tree/arm64e-upstream
efriedma-quic left a comment:
Please split this into three patches: one for the asmprinter, one for the epilogue lowering, and one for coroutines.
The changes to llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp look very suspicious.
As far as I understood, this was intended to fix ptrauth-related issues with Swift on arm64e. When re-submitting patches after splitting, please consider the swift fork of llvm-project (https://github.com/swiftlang/llvm-project/). To the best of my knowledge, it works fine with arm64e, and you probably want to align your work with existing arm64e-specific changes in the fork (if any).
…rAuth (#189474)

This is part of work being done in #188378 and #188638, split out from #188650. `lowerConstantPtrAuth` silently miscompiled `ConstantPtrAuth` constants with non-`GlobalValue` pointer bases, emitting `0@AUTH(da,0)` instead of erroring.

Changes:
- Handle `ConstantPointerNull` bases explicitly
- Error via `reportFatalUsageError` on any remaining unresolved base (e.g. nested ptrauth) instead of silently miscompiling

This PR was mostly developed with LLM assistance.