Skip to content

[AArch64][GlobalISel] Add pattern to prevent scalar uqxtn fallback#201546

Merged
JoshdRod merged 3 commits into
llvm:mainfrom
JoshdRod:uqxtn-fallback
Jun 4, 2026
Merged

[AArch64][GlobalISel] Add pattern to prevent scalar uqxtn fallback#201546
JoshdRod merged 3 commits into
llvm:mainfrom
JoshdRod:uqxtn-fallback

Conversation

@JoshdRod
Copy link
Copy Markdown
Contributor

@JoshdRod JoshdRod commented Jun 4, 2026

Previously, attempting to select the intrinsic @llvm.aarch64.neon.scalar.uqxtn would cause GlobalISel to fall back to SDAG.
This was both due to:

  1. RegBankSelect placing the operands on gpr banks.
  2. No instruction selection patterns for the intrinsic.
    Add pattern, and fix RegBankSelect to place operands on the correct banks

JoshdRod added 3 commits June 4, 2026 10:28
Previously, these operands would be on gpr, which neon intrinsics cannot handle.
To fix, place these operands on fpr.
This still doesn't seem to fix the issue - I assume we need some more patterns.
neon_scalar_uqxtn was failing instruction selection as there was no pattern matching it to UQXTN. Add that pattern.
Add a GI run to the only test the checks scalar.uqxtn.
@llvmorg-github-actions
Copy link
Copy Markdown

@llvm/pr-subscribers-backend-aarch64

Author: Joshua Rodriguez (JoshdRod)

Changes

Previously, attempting to select the intrinsic @llvm.aarch64.neon.scalar.uqxtn would cause GlobalISel to fall back to SDAG.
This was both due to:

  1. RegBankSelect placing the operands on gpr banks.
  2. No instruction selection patterns for the intrinsic.
    Add pattern, and fix RegBankSelect to place operands on the correct banks

Full diff: https://github.com/llvm/llvm-project/pull/201546.diff

3 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.td (+3)
  • (modified) llvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.cpp (+1)
  • (modified) llvm/test/CodeGen/AArch64/arm64-arith-saturating.ll (+8-1)
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.td b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
index f23ae6501b285..151cbd9bc5a7c 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -6696,6 +6696,9 @@ defm UQXTN  : SIMDTwoScalarMixedBHS<1, 0b10100, "uqxtn", int_aarch64_neon_scalar
 defm USQADD : SIMDTwoScalarBHSDTied< 1, 0b00011, "usqadd", AArch64usqadd,
                                     int_aarch64_neon_usqadd>;
 
+def : Pat<(i32 (int_aarch64_neon_scalar_uqxtn (i64 FPR64:$Rn))),
+          (i32 (UQXTNv1i32 FPR64:$Rn))>;
+
 // ssub_sat(0, R) -> sqneg(R)
 def : Pat<(v16i8 (ssubsat immAllZerosV, V128:$reg)),
           (v16i8 (SQNEGv16i8 V128:$reg))>;
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.cpp b/llvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.cpp
index 17350435ea0b8..ee9e57e705750 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.cpp
@@ -653,6 +653,7 @@ static bool isFPIntrinsic(const MachineRegisterInfo &MRI,
   case Intrinsic::aarch64_neon_uqrshrn:
   case Intrinsic::aarch64_neon_sqneg:
   case Intrinsic::aarch64_neon_sqabs:
+  case Intrinsic::aarch64_neon_scalar_uqxtn:
   case Intrinsic::aarch64_crypto_sha1h:
   case Intrinsic::aarch64_crypto_sha1c:
   case Intrinsic::aarch64_crypto_sha1p:
diff --git a/llvm/test/CodeGen/AArch64/arm64-arith-saturating.ll b/llvm/test/CodeGen/AArch64/arm64-arith-saturating.ll
index 07c4dbcf41096..15812d2d52d40 100644
--- a/llvm/test/CodeGen/AArch64/arm64-arith-saturating.ll
+++ b/llvm/test/CodeGen/AArch64/arm64-arith-saturating.ll
@@ -1,5 +1,12 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc < %s -mtriple=arm64-eabi -mcpu=cyclone | FileCheck %s
+; RUN: llc < %s -mtriple=arm64-eabi -mcpu=cyclone | FileCheck %s --check-prefixes=CHECK
+; RUN: llc < %s -mtriple=arm64-eabi -mcpu=cyclone -global-isel -global-isel-abort=2 2>&1 | FileCheck %s --check-prefixes=CHECK,CHECK-GI
+
+; CHECK-GI: warning: Instruction selection used fallback path for vqmovund
+; CHECK-GI-NEXT: warning: Instruction selection used fallback path for vqmovnd_s
+; CHECK-GI-NEXT: warning: Instruction selection used fallback path for sqxtn_ins
+; CHECK-GI-NEXT: warning: Instruction selection used fallback path for sqxtun_insext
+; CHECK-GI-NEXT: warning: Instruction selection used fallback path for saddluse
 
 define i32 @qadds(<4 x i32> %b, <4 x i32> %c) nounwind readnone optsize ssp {
 ; CHECK-LABEL: qadds:

defm USQADD : SIMDTwoScalarBHSDTied< 1, 0b00011, "usqadd", AArch64usqadd,
int_aarch64_neon_usqadd>;

def : Pat<(i32 (int_aarch64_neon_scalar_uqxtn (i64 FPR64:$Rn))),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't expect to write a new pattern for an intrinsic which works with selectiondag already

Copy link
Copy Markdown
Contributor

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems OK to me. There is presumably a sqxtn and sqxtun too that could work the same way. Did you want to add them here or do it in a followup?

@JoshdRod
Copy link
Copy Markdown
Contributor Author

JoshdRod commented Jun 4, 2026

Seems OK to me. There is presumably a sqxtn and sqxtun too that could work the same way. Did you want to add them here or do it in a followup?

I was planning to get feedback here and do the rest in a followup - as I'm pretty sure the rest will be exactly the same fix.

I wouldn't expect to write a new pattern for an intrinsic which works with selectiondag already

The vector version of this intrinsic lowers using a generic opcode. I feel like it doesn't make sense to create a new GI node for this one case - what do you think?

@JoshdRod JoshdRod requested review from arsenm and davemgreen June 4, 2026 13:22
@JoshdRod JoshdRod merged commit 017380d into llvm:main Jun 4, 2026
12 checks passed
JoshdRod added a commit to JoshdRod/llvm-project that referenced this pull request Jun 4, 2026
Follow on from [llvm#201546])(llvm#201546).
Add patterns for signed versions of scalar extend intrinsics as well.
JoshdRod added a commit that referenced this pull request Jun 5, 2026
#201617)

Follow on from #201546
Add patterns for signed versions of scalar extend intrinsics as well.
llvm-upstreamsync Bot pushed a commit to qualcomm/cpullvm-toolchain that referenced this pull request Jun 5, 2026
…d intrinsics (#201617)

Follow on from llvm/llvm-project#201546
Add patterns for signed versions of scalar extend intrinsics as well.
llvm-sync Bot pushed a commit to arm/arm-toolchain that referenced this pull request Jun 5, 2026
…d intrinsics (#201617)

Follow on from llvm/llvm-project#201546
Add patterns for signed versions of scalar extend intrinsics as well.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants