Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PowerPC] Support conversion between f16 and f128 #130158

Merged
merged 7 commits into from
Mar 19, 2025
Merged

Conversation

lei137
Copy link
Contributor

@lei137 lei137 commented Mar 6, 2025

Enables conversion between f16 and f128.
Expanding on pre-Power9 targets and using HW instructions on Power9.

Fixes #92866
Commandeer of: #97677

@llvmbot
Copy link
Member

llvmbot commented Mar 6, 2025

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-backend-powerpc

Author: Lei Huang (lei137)

Changes

Enables conversion between f16 and f128.
Expanding on pre-Power9 targets and using HW instructions on Power9.

Fixes #92866
Commandeer of: #97677


Patch is 58.66 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/130158.diff

4 Files Affected:

  • (modified) llvm/lib/Target/PowerPC/PPCISelLowering.cpp (+6)
  • (modified) llvm/lib/Target/PowerPC/PPCInstrVSX.td (+4)
  • (modified) llvm/test/CodeGen/PowerPC/f128-conv.ll (+187-300)
  • (modified) llvm/test/CodeGen/PowerPC/fp128-libcalls.ll (+17)
diff --git a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
index 91df5f467e59c..ab78f33f5a630 100644
--- a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+++ b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
@@ -223,13 +223,19 @@ PPCTargetLowering::PPCTargetLowering(const PPCTargetMachine &TM,
     setLoadExtAction(ISD::SEXTLOAD, VT, MVT::i8, Expand);
   }
 
+  setTruncStoreAction(MVT::f128, MVT::f16, Expand);
+  setOperationAction(ISD::FP_TO_FP16, MVT::f128, Expand);
+
   if (Subtarget.isISA3_0()) {
+    setLoadExtAction(ISD::EXTLOAD, MVT::f128, MVT::f16, Legal);
     setLoadExtAction(ISD::EXTLOAD, MVT::f64, MVT::f16, Legal);
     setLoadExtAction(ISD::EXTLOAD, MVT::f32, MVT::f16, Legal);
     setTruncStoreAction(MVT::f64, MVT::f16, Legal);
     setTruncStoreAction(MVT::f32, MVT::f16, Legal);
   } else {
     // No extending loads from f16 or HW conversions back and forth.
+    setLoadExtAction(ISD::EXTLOAD, MVT::f128, MVT::f16, Expand);
+    setOperationAction(ISD::FP16_TO_FP, MVT::f128, Expand);
     setLoadExtAction(ISD::EXTLOAD, MVT::f64, MVT::f16, Expand);
     setOperationAction(ISD::FP16_TO_FP, MVT::f64, Expand);
     setOperationAction(ISD::FP_TO_FP16, MVT::f64, Expand);
diff --git a/llvm/lib/Target/PowerPC/PPCInstrVSX.td b/llvm/lib/Target/PowerPC/PPCInstrVSX.td
index d9e88b283a749..19448210f5db1 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrVSX.td
+++ b/llvm/lib/Target/PowerPC/PPCInstrVSX.td
@@ -3995,6 +3995,8 @@ defm : ScalToVecWPermute<
   (SUBREG_TO_REG (i64 1), (VEXTSH2Ds (LXSIHZX ForceXForm:$src)), sub_64)>;
 
 // Load/convert and convert/store patterns for f16.
+def : Pat<(f128 (extloadf16 ForceXForm:$src)),
+          (f128 (XSCVDPQP (XSCVHPDP (LXSIHZX ForceXForm:$src))))>;
 def : Pat<(f64 (extloadf16 ForceXForm:$src)),
           (f64 (XSCVHPDP (LXSIHZX ForceXForm:$src)))>;
 def : Pat<(truncstoref16 f64:$src, ForceXForm:$dst),
@@ -4003,6 +4005,8 @@ def : Pat<(f32 (extloadf16 ForceXForm:$src)),
           (f32 (COPY_TO_REGCLASS (XSCVHPDP (LXSIHZX ForceXForm:$src)), VSSRC))>;
 def : Pat<(truncstoref16 f32:$src, ForceXForm:$dst),
           (STXSIHX (XSCVDPHP (COPY_TO_REGCLASS $src, VSFRC)), ForceXForm:$dst)>;
+def : Pat<(f128 (f16_to_fp i32:$A)),
+          (f128 (XSCVDPQP (XSCVHPDP (MTVSRWZ $A))))>;
 def : Pat<(f64 (f16_to_fp i32:$A)),
           (f64 (XSCVHPDP (MTVSRWZ $A)))>;
 def : Pat<(f32 (f16_to_fp i32:$A)),
diff --git a/llvm/test/CodeGen/PowerPC/f128-conv.ll b/llvm/test/CodeGen/PowerPC/f128-conv.ll
index d8eed1fb4092c..3ecf00bd84ae4 100644
--- a/llvm/test/CodeGen/PowerPC/f128-conv.ll
+++ b/llvm/test/CodeGen/PowerPC/f128-conv.ll
@@ -10,11 +10,11 @@
 @umem = global [5 x i64] [i64 560, i64 100, i64 34, i64 2, i64 5], align 8
 @swMem = global [5 x i32] [i32 5, i32 2, i32 3, i32 4, i32 0], align 4
 @uwMem = global [5 x i32] [i32 5, i32 2, i32 3, i32 4, i32 0], align 4
-@uhwMem = local_unnamed_addr global [5 x i16] [i16 5, i16 2, i16 3, i16 4, i16 0], align 2
-@ubMem = local_unnamed_addr global [5 x i8] c"\05\02\03\04\00", align 1
+@uhwMem = global [5 x i16] [i16 5, i16 2, i16 3, i16 4, i16 0], align 2
+@ubMem = global [5 x i8] c"\05\02\03\04\00", align 1
 
 ; Function Attrs: norecurse nounwind
-define void @sdwConv2qp(ptr nocapture %a, i64 %b) {
+define void @sdwConv2qp(ptr nocapture %a, i64 %b) nounwind {
 ; CHECK-LABEL: sdwConv2qp:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    mtvsrd v2, r4
@@ -25,9 +25,6 @@ define void @sdwConv2qp(ptr nocapture %a, i64 %b) {
 ; CHECK-P8-LABEL: sdwConv2qp:
 ; CHECK-P8:       # %bb.0: # %entry
 ; CHECK-P8-NEXT:    mflr r0
-; CHECK-P8-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-P8-NEXT:    .cfi_offset lr, 16
-; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    std r30, -16(r1) # 8-byte Folded Spill
 ; CHECK-P8-NEXT:    stdu r1, -48(r1)
 ; CHECK-P8-NEXT:    mr r30, r3
@@ -50,13 +47,10 @@ entry:
 }
 
 ; Function Attrs: norecurse nounwind
-define void @sdwConv2qp_01(ptr nocapture %a, i128 %b) {
+define void @sdwConv2qp_01(ptr nocapture %a, i128 %b) nounwind {
 ; CHECK-LABEL: sdwConv2qp_01:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    mflr r0
-; CHECK-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-NEXT:    .cfi_offset lr, 16
-; CHECK-NEXT:    .cfi_offset r30, -16
 ; CHECK-NEXT:    std r30, -16(r1) # 8-byte Folded Spill
 ; CHECK-NEXT:    stdu r1, -48(r1)
 ; CHECK-NEXT:    mr r30, r3
@@ -75,9 +69,6 @@ define void @sdwConv2qp_01(ptr nocapture %a, i128 %b) {
 ; CHECK-P8-LABEL: sdwConv2qp_01:
 ; CHECK-P8:       # %bb.0: # %entry
 ; CHECK-P8-NEXT:    mflr r0
-; CHECK-P8-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-P8-NEXT:    .cfi_offset lr, 16
-; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    std r30, -16(r1) # 8-byte Folded Spill
 ; CHECK-P8-NEXT:    stdu r1, -48(r1)
 ; CHECK-P8-NEXT:    mr r30, r3
@@ -101,7 +92,7 @@ entry:
 }
 
 ; Function Attrs: norecurse nounwind
-define void @sdwConv2qp_02(ptr nocapture %a) {
+define void @sdwConv2qp_02(ptr nocapture %a) nounwind {
 ; CHECK-LABEL: sdwConv2qp_02:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    addis r4, r2, .LC0@toc@ha
@@ -114,9 +105,6 @@ define void @sdwConv2qp_02(ptr nocapture %a) {
 ; CHECK-P8-LABEL: sdwConv2qp_02:
 ; CHECK-P8:       # %bb.0: # %entry
 ; CHECK-P8-NEXT:    mflr r0
-; CHECK-P8-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-P8-NEXT:    .cfi_offset lr, 16
-; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    std r30, -16(r1) # 8-byte Folded Spill
 ; CHECK-P8-NEXT:    stdu r1, -48(r1)
 ; CHECK-P8-NEXT:    mr r30, r3
@@ -134,16 +122,16 @@ define void @sdwConv2qp_02(ptr nocapture %a) {
 ; CHECK-P8-NEXT:    mtlr r0
 ; CHECK-P8-NEXT:    blr
 entry:
-  %0 = load i64, ptr getelementptr inbounds
+  %i = load i64, ptr getelementptr inbounds
                         ([5 x i64], ptr @mem, i64 0, i64 2), align 8
-  %conv = sitofp i64 %0 to fp128
+  %conv = sitofp i64 %i to fp128
   store fp128 %conv, ptr %a, align 16
   ret void
 
 }
 
 ; Function Attrs: norecurse nounwind
-define void @sdwConv2qp_03(ptr nocapture %a, ptr nocapture readonly %b) {
+define void @sdwConv2qp_03(ptr nocapture %a, ptr nocapture readonly %b) nounwind {
 ; CHECK-LABEL: sdwConv2qp_03:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    lxsd v2, 0(r4)
@@ -154,9 +142,6 @@ define void @sdwConv2qp_03(ptr nocapture %a, ptr nocapture readonly %b) {
 ; CHECK-P8-LABEL: sdwConv2qp_03:
 ; CHECK-P8:       # %bb.0: # %entry
 ; CHECK-P8-NEXT:    mflr r0
-; CHECK-P8-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-P8-NEXT:    .cfi_offset lr, 16
-; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    std r30, -16(r1) # 8-byte Folded Spill
 ; CHECK-P8-NEXT:    stdu r1, -48(r1)
 ; CHECK-P8-NEXT:    std r0, 64(r1)
@@ -172,15 +157,15 @@ define void @sdwConv2qp_03(ptr nocapture %a, ptr nocapture readonly %b) {
 ; CHECK-P8-NEXT:    mtlr r0
 ; CHECK-P8-NEXT:    blr
 entry:
-  %0 = load i64, ptr %b, align 8
-  %conv = sitofp i64 %0 to fp128
+  %i = load i64, ptr %b, align 8
+  %conv = sitofp i64 %i to fp128
   store fp128 %conv, ptr %a, align 16
   ret void
 
 }
 
 ; Function Attrs: norecurse nounwind
-define void @sdwConv2qp_04(ptr nocapture %a, i1 %b) {
+define void @sdwConv2qp_04(ptr nocapture %a, i1 %b) nounwind {
 ; CHECK-LABEL: sdwConv2qp_04:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    andi. r4, r4, 1
@@ -195,9 +180,6 @@ define void @sdwConv2qp_04(ptr nocapture %a, i1 %b) {
 ; CHECK-P8-LABEL: sdwConv2qp_04:
 ; CHECK-P8:       # %bb.0: # %entry
 ; CHECK-P8-NEXT:    mflr r0
-; CHECK-P8-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-P8-NEXT:    .cfi_offset lr, 16
-; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    std r30, -16(r1) # 8-byte Folded Spill
 ; CHECK-P8-NEXT:    stdu r1, -48(r1)
 ; CHECK-P8-NEXT:    mr r30, r3
@@ -223,7 +205,7 @@ entry:
 }
 
 ; Function Attrs: norecurse nounwind
-define void @udwConv2qp(ptr nocapture %a, i64 %b) {
+define void @udwConv2qp(ptr nocapture %a, i64 %b) nounwind {
 ; CHECK-LABEL: udwConv2qp:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    mtvsrd v2, r4
@@ -234,9 +216,6 @@ define void @udwConv2qp(ptr nocapture %a, i64 %b) {
 ; CHECK-P8-LABEL: udwConv2qp:
 ; CHECK-P8:       # %bb.0: # %entry
 ; CHECK-P8-NEXT:    mflr r0
-; CHECK-P8-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-P8-NEXT:    .cfi_offset lr, 16
-; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    std r30, -16(r1) # 8-byte Folded Spill
 ; CHECK-P8-NEXT:    stdu r1, -48(r1)
 ; CHECK-P8-NEXT:    mr r30, r3
@@ -259,13 +238,10 @@ entry:
 }
 
 ; Function Attrs: norecurse nounwind
-define void @udwConv2qp_01(ptr nocapture %a, i128 %b) {
+define void @udwConv2qp_01(ptr nocapture %a, i128 %b) nounwind {
 ; CHECK-LABEL: udwConv2qp_01:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    mflr r0
-; CHECK-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-NEXT:    .cfi_offset lr, 16
-; CHECK-NEXT:    .cfi_offset r30, -16
 ; CHECK-NEXT:    std r30, -16(r1) # 8-byte Folded Spill
 ; CHECK-NEXT:    stdu r1, -48(r1)
 ; CHECK-NEXT:    mr r30, r3
@@ -284,9 +260,6 @@ define void @udwConv2qp_01(ptr nocapture %a, i128 %b) {
 ; CHECK-P8-LABEL: udwConv2qp_01:
 ; CHECK-P8:       # %bb.0: # %entry
 ; CHECK-P8-NEXT:    mflr r0
-; CHECK-P8-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-P8-NEXT:    .cfi_offset lr, 16
-; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    std r30, -16(r1) # 8-byte Folded Spill
 ; CHECK-P8-NEXT:    stdu r1, -48(r1)
 ; CHECK-P8-NEXT:    mr r30, r3
@@ -310,7 +283,7 @@ entry:
 }
 
 ; Function Attrs: norecurse nounwind
-define void @udwConv2qp_02(ptr nocapture %a) {
+define void @udwConv2qp_02(ptr nocapture %a) nounwind {
 ; CHECK-LABEL: udwConv2qp_02:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    addis r4, r2, .LC1@toc@ha
@@ -323,9 +296,6 @@ define void @udwConv2qp_02(ptr nocapture %a) {
 ; CHECK-P8-LABEL: udwConv2qp_02:
 ; CHECK-P8:       # %bb.0: # %entry
 ; CHECK-P8-NEXT:    mflr r0
-; CHECK-P8-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-P8-NEXT:    .cfi_offset lr, 16
-; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    std r30, -16(r1) # 8-byte Folded Spill
 ; CHECK-P8-NEXT:    stdu r1, -48(r1)
 ; CHECK-P8-NEXT:    mr r30, r3
@@ -343,16 +313,16 @@ define void @udwConv2qp_02(ptr nocapture %a) {
 ; CHECK-P8-NEXT:    mtlr r0
 ; CHECK-P8-NEXT:    blr
 entry:
-  %0 = load i64, ptr getelementptr inbounds
+  %i = load i64, ptr getelementptr inbounds
                         ([5 x i64], ptr @umem, i64 0, i64 4), align 8
-  %conv = uitofp i64 %0 to fp128
+  %conv = uitofp i64 %i to fp128
   store fp128 %conv, ptr %a, align 16
   ret void
 
 }
 
 ; Function Attrs: norecurse nounwind
-define void @udwConv2qp_03(ptr nocapture %a, ptr nocapture readonly %b) {
+define void @udwConv2qp_03(ptr nocapture %a, ptr nocapture readonly %b) nounwind {
 ; CHECK-LABEL: udwConv2qp_03:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    lxsd v2, 0(r4)
@@ -363,9 +333,6 @@ define void @udwConv2qp_03(ptr nocapture %a, ptr nocapture readonly %b) {
 ; CHECK-P8-LABEL: udwConv2qp_03:
 ; CHECK-P8:       # %bb.0: # %entry
 ; CHECK-P8-NEXT:    mflr r0
-; CHECK-P8-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-P8-NEXT:    .cfi_offset lr, 16
-; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    std r30, -16(r1) # 8-byte Folded Spill
 ; CHECK-P8-NEXT:    stdu r1, -48(r1)
 ; CHECK-P8-NEXT:    std r0, 64(r1)
@@ -381,15 +348,15 @@ define void @udwConv2qp_03(ptr nocapture %a, ptr nocapture readonly %b) {
 ; CHECK-P8-NEXT:    mtlr r0
 ; CHECK-P8-NEXT:    blr
 entry:
-  %0 = load i64, ptr %b, align 8
-  %conv = uitofp i64 %0 to fp128
+  %i = load i64, ptr %b, align 8
+  %conv = uitofp i64 %i to fp128
   store fp128 %conv, ptr %a, align 16
   ret void
 
 }
 
 ; Function Attrs: norecurse nounwind
-define void @udwConv2qp_04(ptr nocapture %a, i1 %b) {
+define void @udwConv2qp_04(ptr nocapture %a, i1 %b) nounwind {
 ; CHECK-LABEL: udwConv2qp_04:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    clrlwi r4, r4, 31
@@ -401,9 +368,6 @@ define void @udwConv2qp_04(ptr nocapture %a, i1 %b) {
 ; CHECK-P8-LABEL: udwConv2qp_04:
 ; CHECK-P8:       # %bb.0: # %entry
 ; CHECK-P8-NEXT:    mflr r0
-; CHECK-P8-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-P8-NEXT:    .cfi_offset lr, 16
-; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    std r30, -16(r1) # 8-byte Folded Spill
 ; CHECK-P8-NEXT:    stdu r1, -48(r1)
 ; CHECK-P8-NEXT:    mr r30, r3
@@ -439,9 +403,6 @@ define ptr @sdwConv2qp_testXForm(ptr returned %sink,
 ; CHECK-P8-LABEL: sdwConv2qp_testXForm:
 ; CHECK-P8:       # %bb.0: # %entry
 ; CHECK-P8-NEXT:    mflr r0
-; CHECK-P8-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-P8-NEXT:    .cfi_offset lr, 16
-; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    std r30, -16(r1) # 8-byte Folded Spill
 ; CHECK-P8-NEXT:    stdu r1, -48(r1)
 ; CHECK-P8-NEXT:    mr r30, r3
@@ -459,11 +420,11 @@ define ptr @sdwConv2qp_testXForm(ptr returned %sink,
 ; CHECK-P8-NEXT:    ld r30, -16(r1) # 8-byte Folded Reload
 ; CHECK-P8-NEXT:    mtlr r0
 ; CHECK-P8-NEXT:    blr
-                                    ptr nocapture readonly %a) {
+                                 ptr nocapture readonly %a) nounwind {
 entry:
   %add.ptr = getelementptr inbounds i8, ptr %a, i64 73333
-  %0 = load i64, ptr %add.ptr, align 8
-  %conv = sitofp i64 %0 to fp128
+  %i = load i64, ptr %add.ptr, align 8
+  %conv = sitofp i64 %i to fp128
   store fp128 %conv, ptr %sink, align 16
   ret ptr %sink
 
@@ -483,9 +444,6 @@ define ptr @udwConv2qp_testXForm(ptr returned %sink,
 ; CHECK-P8-LABEL: udwConv2qp_testXForm:
 ; CHECK-P8:       # %bb.0: # %entry
 ; CHECK-P8-NEXT:    mflr r0
-; CHECK-P8-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-P8-NEXT:    .cfi_offset lr, 16
-; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    std r30, -16(r1) # 8-byte Folded Spill
 ; CHECK-P8-NEXT:    stdu r1, -48(r1)
 ; CHECK-P8-NEXT:    mr r30, r3
@@ -503,18 +461,18 @@ define ptr @udwConv2qp_testXForm(ptr returned %sink,
 ; CHECK-P8-NEXT:    ld r30, -16(r1) # 8-byte Folded Reload
 ; CHECK-P8-NEXT:    mtlr r0
 ; CHECK-P8-NEXT:    blr
-                                    ptr nocapture readonly %a) {
+                                 ptr nocapture readonly %a) nounwind {
 entry:
   %add.ptr = getelementptr inbounds i8, ptr %a, i64 73333
-  %0 = load i64, ptr %add.ptr, align 8
-  %conv = uitofp i64 %0 to fp128
+  %i = load i64, ptr %add.ptr, align 8
+  %conv = uitofp i64 %i to fp128
   store fp128 %conv, ptr %sink, align 16
   ret ptr %sink
 
 }
 
 ; Function Attrs: norecurse nounwind
-define void @swConv2qp(ptr nocapture %a, i32 signext %b) {
+define void @swConv2qp(ptr nocapture %a, i32 signext %b) nounwind {
 ; CHECK-LABEL: swConv2qp:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    mtvsrwa v2, r4
@@ -525,9 +483,6 @@ define void @swConv2qp(ptr nocapture %a, i32 signext %b) {
 ; CHECK-P8-LABEL: swConv2qp:
 ; CHECK-P8:       # %bb.0: # %entry
 ; CHECK-P8-NEXT:    mflr r0
-; CHECK-P8-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-P8-NEXT:    .cfi_offset lr, 16
-; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    std r30, -16(r1) # 8-byte Folded Spill
 ; CHECK-P8-NEXT:    stdu r1, -48(r1)
 ; CHECK-P8-NEXT:    mr r30, r3
@@ -550,7 +505,7 @@ entry:
 }
 
 ; Function Attrs: norecurse nounwind
-define void @swConv2qp_02(ptr nocapture %a, ptr nocapture readonly %b) {
+define void @swConv2qp_02(ptr nocapture %a, ptr nocapture readonly %b) nounwind {
 ; CHECK-LABEL: swConv2qp_02:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    lxsiwax v2, 0, r4
@@ -561,9 +516,6 @@ define void @swConv2qp_02(ptr nocapture %a, ptr nocapture readonly %b) {
 ; CHECK-P8-LABEL: swConv2qp_02:
 ; CHECK-P8:       # %bb.0: # %entry
 ; CHECK-P8-NEXT:    mflr r0
-; CHECK-P8-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-P8-NEXT:    .cfi_offset lr, 16
-; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    std r30, -16(r1) # 8-byte Folded Spill
 ; CHECK-P8-NEXT:    stdu r1, -48(r1)
 ; CHECK-P8-NEXT:    std r0, 64(r1)
@@ -579,15 +531,15 @@ define void @swConv2qp_02(ptr nocapture %a, ptr nocapture readonly %b) {
 ; CHECK-P8-NEXT:    mtlr r0
 ; CHECK-P8-NEXT:    blr
 entry:
-  %0 = load i32, ptr %b, align 4
-  %conv = sitofp i32 %0 to fp128
+  %i = load i32, ptr %b, align 4
+  %conv = sitofp i32 %i to fp128
   store fp128 %conv, ptr %a, align 16
   ret void
 
 }
 
 ; Function Attrs: norecurse nounwind
-define void @swConv2qp_03(ptr nocapture %a) {
+define void @swConv2qp_03(ptr nocapture %a) nounwind {
 ; CHECK-LABEL: swConv2qp_03:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    addis r4, r2, .LC2@toc@ha
@@ -601,9 +553,6 @@ define void @swConv2qp_03(ptr nocapture %a) {
 ; CHECK-P8-LABEL: swConv2qp_03:
 ; CHECK-P8:       # %bb.0: # %entry
 ; CHECK-P8-NEXT:    mflr r0
-; CHECK-P8-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-P8-NEXT:    .cfi_offset lr, 16
-; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    std r30, -16(r1) # 8-byte Folded Spill
 ; CHECK-P8-NEXT:    stdu r1, -48(r1)
 ; CHECK-P8-NEXT:    mr r30, r3
@@ -621,16 +570,16 @@ define void @swConv2qp_03(ptr nocapture %a) {
 ; CHECK-P8-NEXT:    mtlr r0
 ; CHECK-P8-NEXT:    blr
 entry:
-  %0 = load i32, ptr getelementptr inbounds
+  %i = load i32, ptr getelementptr inbounds
                         ([5 x i32], ptr @swMem, i64 0, i64 3), align 4
-  %conv = sitofp i32 %0 to fp128
+  %conv = sitofp i32 %i to fp128
   store fp128 %conv, ptr %a, align 16
   ret void
 
 }
 
 ; Function Attrs: norecurse nounwind
-define void @uwConv2qp(ptr nocapture %a, i32 zeroext %b) {
+define void @uwConv2qp(ptr nocapture %a, i32 zeroext %b) nounwind {
 ; CHECK-LABEL: uwConv2qp:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    mtvsrwz v2, r4
@@ -641,9 +590,6 @@ define void @uwConv2qp(ptr nocapture %a, i32 zeroext %b) {
 ; CHECK-P8-LABEL: uwConv2qp:
 ; CHECK-P8:       # %bb.0: # %entry
 ; CHECK-P8-NEXT:    mflr r0
-; CHECK-P8-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-P8-NEXT:    .cfi_offset lr, 16
-; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    std r30, -16(r1) # 8-byte Folded Spill
 ; CHECK-P8-NEXT:    stdu r1, -48(r1)
 ; CHECK-P8-NEXT:    mr r30, r3
@@ -666,7 +612,7 @@ entry:
 }
 
 ; Function Attrs: norecurse nounwind
-define void @uwConv2qp_02(ptr nocapture %a, ptr nocapture readonly %b) {
+define void @uwConv2qp_02(ptr nocapture %a, ptr nocapture readonly %b) nounwind {
 ; CHECK-LABEL: uwConv2qp_02:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    lxsiwzx v2, 0, r4
@@ -677,9 +623,6 @@ define void @uwConv2qp_02(ptr nocapture %a, ptr nocapture readonly %b) {
 ; CHECK-P8-LABEL: uwConv2qp_02:
 ; CHECK-P8:       # %bb.0: # %entry
 ; CHECK-P8-NEXT:    mflr r0
-; CHECK-P8-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-P8-NEXT:    .cfi_offset lr, 16
-; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    std r30, -16(r1) # 8-byte Folded Spill
 ; CHECK-P8-NEXT:    stdu r1, -48(r1)
 ; CHECK-P8-NEXT:    std r0, 64(r1)
@@ -695,15 +638,15 @@ define void @uwConv2qp_02(ptr nocapture %a, ptr nocapture readonly %b) {
 ; CHECK-P8-NEXT:    mtlr r0
 ; CHECK-P8-NEXT:    blr
 entry:
-  %0 = load i32, ptr %b, align 4
-  %conv = uitofp i32 %0 to fp128
+  %i = load i32, ptr %b, align 4
+  %conv = uitofp i32 %i to fp128
   store fp128 %conv, ptr %a, align 16
   ret void
 
 }
 
 ; Function Attrs: norecurse nounwind
-define void @uwConv2qp_03(ptr nocapture %a) {
+define void @uwConv2qp_03(ptr nocapture %a) nounwind {
 ; CHECK-LABEL: uwConv2qp_03:
 ; CHECK:       # %bb.0: # %entry
 ; CHECK-NEXT:    addis r4, r2, .LC3@toc@ha
@@ -717,9 +660,6 @@ define void @uwConv2qp_03(ptr nocapture %a) {
 ; CHECK-P8-LABEL: uwConv2qp_03:
 ; CHECK-P8:       # %bb.0: # %entry
 ; CHECK-P8-NEXT:    mflr r0
-; CHECK-P8-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-P8-NEXT:    .cfi_offset lr, 16
-; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    std r30, -16(r1) # 8-byte Folded Spill
 ; CHECK-P8-NEXT:    stdu r1, -48(r1)
 ; CHECK-P8-NEXT:    mr r30, r3
@@ -737,9 +677,9 @@ define void @uwConv2qp_03(ptr nocapture %a) {
 ; CHECK-P8-NEXT:    mtlr r0
 ; CHECK-P8-NEXT:    blr
 entry:
-  %0 = load i32, ptr getelementptr inbounds
+  %i = load i32, ptr getelementptr inbounds
                         ([5 x i32], ptr @uwMem, i64 0, i64 3), align 4
-  %conv = uitofp i32 %0 to fp128
+  %conv = uitofp i32 %i to fp128
   store fp128 %conv, ptr %a, align 16
   ret void
 
@@ -759,9 +699,6 @@ define void @uwConv2qp_04(ptr nocapture %a,
 ; CHECK-P8-LABEL: uwConv2qp_04:
 ; CHECK-P8:       # %bb.0: # %entry
 ; CHECK-P8-NEXT:    mflr r0
-; CHECK-P8-NEXT:    .cfi_def_cfa_offset 48
-; CHECK-P8-NEXT:    .cfi_offset lr, 16
-; CHECK-P8-NEXT:    .cfi_offset r30, -16
 ; CHECK-P8-NEXT:    ...
[truncated]

Copy link
Collaborator

@RolandF77 RolandF77 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines 57 to 58
; CHECK-LABEL: trunctfhf2:
; CHECK: __trunctfhf2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs an update at

setLibcallName(RTLIB::FPROUND_F128_F32, "__trunckfsf2");
setLibcallName(RTLIB::FPROUND_F128_F64, "__trunckfdf2");
so __trunckfhf2 is used rather than the tf version.

@alexrp
Copy link
Member

alexrp commented Mar 9, 2025

FWIW, the backend crash that this is fixing affects Zig, so it would be great to backport this to 20.x if possible.

Copy link
Contributor

@tgross35 tgross35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm, thanks!

@lei137 lei137 merged commit ade22fc into llvm:main Mar 19, 2025
12 checks passed
@lei137 lei137 added this to the LLVM 20.X Release milestone Mar 19, 2025
@lei137
Copy link
Contributor Author

lei137 commented Mar 19, 2025

/cherry-pick ade22fc

@llvmbot
Copy link
Member

llvmbot commented Mar 19, 2025

/pull-request #132049

lei137 added a commit to lei137/llvm-project that referenced this pull request Mar 27, 2025
Enables conversion between f16 and f128.
Expanding on pre-Power9 targets and using HW instructions on Power9.

Fixes llvm#92866
Commandeer of:  llvm#97677

---------

Co-authored-by: esmeyi <[email protected]>
(cherry picked from commit ade22fc)
lei137 added a commit to lei137/llvm-project that referenced this pull request Mar 27, 2025
Enables conversion between f16 and f128.
Expanding on pre-Power9 targets and using HW instructions on Power9.

Fixes llvm#92866
Commandeer of:  llvm#97677

---------

Co-authored-by: esmeyi <[email protected]>
(cherry picked from commit ade22fc)
swift-ci pushed a commit to swiftlang/llvm-project that referenced this pull request Mar 27, 2025
Enables conversion between f16 and f128.
Expanding on pre-Power9 targets and using HW instructions on Power9.

Fixes llvm#92866
Commandeer of:  llvm#97677

---------

Co-authored-by: esmeyi <[email protected]>
(cherry picked from commit ade22fc)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging this pull request may close these issues.

LLVM f128 -> f16 conversion selection failure on powerpc64le
6 participants