release/22.x: [AArch64][SME] Disable tail calls in new ZA/ZT0 functions (#177152) #177169

Merged
c-rhodes merged 1 commit into llvm:release/22.x from llvmbot:issue177152 on Jan 22, 2026

llvmbot commented Jan 21, 2026

Backport 10aca26

Requested by: @MacDue

llvmbot commented Jan 21, 2026

@sdesmalen-arm What do you think about merging this PR to the release branch?

llvmbot commented Jan 21, 2026

@llvm/pr-subscribers-backend-aarch64

Author: None (llvmbot)

Changes

Backport 10aca26

Requested by: @MacDue


Full diff: https://github.com/llvm/llvm-project/pull/177169.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+2-1)
  • (added) llvm/test/CodeGen/AArch64/sme-new-za-zt0-no-tail-call.ll (+78)
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 74ee8ff8ab5f5..093927049e9d1 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -9328,7 +9328,8 @@ bool AArch64TargetLowering::isEligibleForTailCallOptimization(
   if (CallAttrs.requiresSMChange() || CallAttrs.requiresLazySave() ||
       CallAttrs.requiresPreservingAllZAState() ||
       CallAttrs.requiresPreservingZT0() ||
-      CallAttrs.caller().hasStreamingBody())
+      CallAttrs.caller().hasStreamingBody() || CallAttrs.caller().isNewZA() ||
+      CallAttrs.caller().isNewZT0())
     return false;
 
   // Functions using the C or Fast calling convention that have an SVE signature
diff --git a/llvm/test/CodeGen/AArch64/sme-new-za-zt0-no-tail-call.ll b/llvm/test/CodeGen/AArch64/sme-new-za-zt0-no-tail-call.ll
new file mode 100644
index 0000000000000..3c76132556600
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/sme-new-za-zt0-no-tail-call.ll
@@ -0,0 +1,78 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 6
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sme2 -O3 -verify-machineinstrs < %s | FileCheck %s
+
+declare void @inout_za_zt0() "aarch64_inout_za" "aarch64_inout_zt0"
+
+define void @new_za_zt0() "aarch64_new_za" "aarch64_new_zt0" {
+; CHECK-LABEL: new_za_zt0:
+; CHECK:       // %bb.0: // %entry
+; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-NEXT:    .cfi_offset w30, -16
+; CHECK-NEXT:    mrs x8, TPIDR2_EL0
+; CHECK-NEXT:    cbz x8, .LBB0_2
+; CHECK-NEXT:  // %bb.1: // %entry
+; CHECK-NEXT:    bl __arm_tpidr2_save
+; CHECK-NEXT:    msr TPIDR2_EL0, xzr
+; CHECK-NEXT:    zero {za}
+; CHECK-NEXT:    zero { zt0 }
+; CHECK-NEXT:  .LBB0_2: // %entry
+; CHECK-NEXT:    smstart za
+; CHECK-NEXT:    bl inout_za_zt0
+; CHECK-NEXT:    smstop za
+; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-NEXT:    ret
+entry:
+  tail call void @inout_za_zt0()
+  ret void
+}
+
+declare void @inout_za() "aarch64_inout_za"
+
+define void @new_za() "aarch64_new_za" {
+; CHECK-LABEL: new_za:
+; CHECK:       // %bb.0: // %entry
+; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-NEXT:    .cfi_offset w30, -16
+; CHECK-NEXT:    mrs x8, TPIDR2_EL0
+; CHECK-NEXT:    cbz x8, .LBB1_2
+; CHECK-NEXT:  // %bb.1: // %entry
+; CHECK-NEXT:    bl __arm_tpidr2_save
+; CHECK-NEXT:    msr TPIDR2_EL0, xzr
+; CHECK-NEXT:    zero {za}
+; CHECK-NEXT:  .LBB1_2: // %entry
+; CHECK-NEXT:    smstart za
+; CHECK-NEXT:    bl inout_za
+; CHECK-NEXT:    smstop za
+; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-NEXT:    ret
+entry:
+  tail call void @inout_za()
+  ret void
+}
+
+declare void @inout_zt0() "aarch64_inout_zt0"
+
+define void @new_zt0() "aarch64_new_zt0" {
+; CHECK-LABEL: new_zt0:
+; CHECK:       // %bb.0: // %entry
+; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
+; CHECK-NEXT:    .cfi_def_cfa_offset 16
+; CHECK-NEXT:    .cfi_offset w30, -16
+; CHECK-NEXT:    mrs x8, TPIDR2_EL0
+; CHECK-NEXT:    cbz x8, .LBB2_2
+; CHECK-NEXT:  // %bb.1: // %entry
+; CHECK-NEXT:    bl __arm_tpidr2_save
+; CHECK-NEXT:    msr TPIDR2_EL0, xzr
+; CHECK-NEXT:    zero { zt0 }
+; CHECK-NEXT:  .LBB2_2: // %entry
+; CHECK-NEXT:    smstart za
+; CHECK-NEXT:    bl inout_zt0
+; CHECK-NEXT:    smstop za
+; CHECK-NEXT:    ldr x30, [sp], #16 // 8-byte Folded Reload
+; CHECK-NEXT:    ret
+entry:
+  tail call void @inout_zt0()
+  ret void
+}

github-project-automation bot moved this from Needs Triage to Needs Merge in LLVM Release Status on Jan 21, 2026
Allowing this can result in invalid tail calls to shared ZA functions.

It may be possible to limit this to the case where the caller is private
ZA and the callee shares ZA, but for now it is generally disabled.

(cherry picked from commit 10aca26)
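
As a reading aid, the condition changed in `isEligibleForTailCallOptimization` can be modelled in isolation. The sketch below is not LLVM code: `FnAttrsModel`, `CallAttrsModel`, and both predicate functions are hypothetical stand-ins that only mirror the boolean structure visible in the diff. One plausible reading of the test's expected output is that a new ZA/ZT0 caller must still run `smstop za` after the call returns, which a tail call would leave no room for.

```cpp
// Hypothetical model of the SME part of the tail-call eligibility check.
// These types are illustrative only and are not part of LLVM's API.
struct FnAttrsModel {
  bool hasStreamingBody = false;
  bool isNewZA = false;  // models the "aarch64_new_za" attribute
  bool isNewZT0 = false; // models the "aarch64_new_zt0" attribute
};

struct CallAttrsModel {
  FnAttrsModel caller;
  bool requiresSMChange = false;
  bool requiresLazySave = false;
  bool requiresPreservingAllZAState = false;
  bool requiresPreservingZT0 = false;
};

// Condition as it reads on the removed side of the diff: true means the
// call is rejected as a tail call by this particular check.
bool smeBlocksTailCallBeforePatch(const CallAttrsModel &C) {
  return C.requiresSMChange || C.requiresLazySave ||
         C.requiresPreservingAllZAState || C.requiresPreservingZT0 ||
         C.caller.hasStreamingBody;
}

// Condition as it reads on the added side of the diff: callers that set up
// new ZA or new ZT0 state are additionally rejected.
bool smeBlocksTailCallAfterPatch(const CallAttrsModel &C) {
  return smeBlocksTailCallBeforePatch(C) || C.caller.isNewZA ||
         C.caller.isNewZT0;
}
```

In this model, a caller with only `isNewZA` set passes the pre-patch condition and is rejected by the post-patch one, which corresponds to the `new_za` case in the added test. Other eligibility checks in the real function are not modelled here.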
c-rhodes merged commit 092c1fc into llvm:release/22.x on Jan 22, 2026
1 check was pending
github-project-automation bot moved this from Needs Merge to Done in LLVM Release Status on Jan 22, 2026