[LV][AArch64] Add test for fp128 fmuladd reduction. (NFC) #137576
Merged
ElvisWang123 merged 1 commit into llvm:main on Apr 29, 2025
@llvm/pr-subscribers-llvm-transforms

Author: Elvis Wang (ElvisWang123)

Changes

This patch adds a test for the fmuladd reduction to show the test change/failure caused by the cost model change. Note that without the fp128 load and trunc, there is no failure.

Pre-commit test for #113903.

1 Files Affected:
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/f128-fmuladd-reduction.ll b/llvm/test/Transforms/LoopVectorize/AArch64/f128-fmuladd-reduction.ll
new file mode 100644
index 0000000000000..7ae08dd330d24
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/f128-fmuladd-reduction.ll
@@ -0,0 +1,113 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -mtriple=aarch64 -mcpu=neoverse-v2 -p loop-vectorize %s -S | FileCheck %s
+define double @fp128_fmuladd_reduction(ptr %start0, ptr %start1, ptr %end0, ptr %end1, double %x, i64 %n) {
+; CHECK-LABEL: define double @fp128_fmuladd_reduction(
+; CHECK-SAME: ptr [[START0:%.*]], ptr [[START1:%.*]], ptr [[END0:%.*]], ptr [[END1:%.*]], double [[X:%.*]], i64 [[N:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT: [[ENTRY:.*]]:
+; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[N]], 4
+; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label %[[SCALAR_PH:.*]], label %[[VECTOR_PH:.*]]
+; CHECK: [[VECTOR_PH]]:
+; CHECK-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[N]], 4
+; CHECK-NEXT: [[N_VEC:%.*]] = sub i64 [[N]], [[N_MOD_VF]]
+; CHECK-NEXT: [[TMP0:%.*]] = mul i64 [[N_VEC]], 16
+; CHECK-NEXT: [[TMP1:%.*]] = getelementptr i8, ptr [[START0]], i64 [[TMP0]]
+; CHECK-NEXT: [[TMP2:%.*]] = mul i64 [[N_VEC]], 8
+; CHECK-NEXT: [[TMP3:%.*]] = getelementptr i8, ptr [[START1]], i64 [[TMP2]]
+; CHECK-NEXT: br label %[[VECTOR_BODY:.*]]
+; CHECK: [[VECTOR_BODY]]:
+; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT: [[VEC_PHI:%.*]] = phi double [ [[X]], %[[VECTOR_PH]] ], [ [[TMP29:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT: [[OFFSET_IDX:%.*]] = mul i64 [[INDEX]], 16
+; CHECK-NEXT: [[TMP4:%.*]] = add i64 [[OFFSET_IDX]], 16
+; CHECK-NEXT: [[TMP5:%.*]] = add i64 [[OFFSET_IDX]], 32
+; CHECK-NEXT: [[TMP6:%.*]] = add i64 [[OFFSET_IDX]], 48
+; CHECK-NEXT: [[NEXT_GEP:%.*]] = getelementptr i8, ptr [[START0]], i64 [[OFFSET_IDX]]
+; CHECK-NEXT: [[NEXT_GEP1:%.*]] = getelementptr i8, ptr [[START0]], i64 [[TMP4]]
+; CHECK-NEXT: [[NEXT_GEP2:%.*]] = getelementptr i8, ptr [[START0]], i64 [[TMP5]]
+; CHECK-NEXT: [[NEXT_GEP3:%.*]] = getelementptr i8, ptr [[START0]], i64 [[TMP6]]
+; CHECK-NEXT: [[OFFSET_IDX4:%.*]] = mul i64 [[INDEX]], 8
+; CHECK-NEXT: [[TMP7:%.*]] = add i64 [[OFFSET_IDX4]], 8
+; CHECK-NEXT: [[TMP8:%.*]] = add i64 [[OFFSET_IDX4]], 16
+; CHECK-NEXT: [[TMP9:%.*]] = add i64 [[OFFSET_IDX4]], 24
+; CHECK-NEXT: [[NEXT_GEP5:%.*]] = getelementptr i8, ptr [[START1]], i64 [[OFFSET_IDX4]]
+; CHECK-NEXT: [[NEXT_GEP6:%.*]] = getelementptr i8, ptr [[START1]], i64 [[TMP7]]
+; CHECK-NEXT: [[NEXT_GEP7:%.*]] = getelementptr i8, ptr [[START1]], i64 [[TMP8]]
+; CHECK-NEXT: [[NEXT_GEP8:%.*]] = getelementptr i8, ptr [[START1]], i64 [[TMP9]]
+; CHECK-NEXT: [[TMP10:%.*]] = load fp128, ptr [[NEXT_GEP]], align 16
+; CHECK-NEXT: [[TMP11:%.*]] = load fp128, ptr [[NEXT_GEP1]], align 16
+; CHECK-NEXT: [[TMP12:%.*]] = load fp128, ptr [[NEXT_GEP2]], align 16
+; CHECK-NEXT: [[TMP13:%.*]] = load fp128, ptr [[NEXT_GEP3]], align 16
+; CHECK-NEXT: [[TMP14:%.*]] = load double, ptr [[NEXT_GEP5]], align 16
+; CHECK-NEXT: [[TMP15:%.*]] = load double, ptr [[NEXT_GEP6]], align 16
+; CHECK-NEXT: [[TMP16:%.*]] = load double, ptr [[NEXT_GEP7]], align 16
+; CHECK-NEXT: [[TMP17:%.*]] = load double, ptr [[NEXT_GEP8]], align 16
+; CHECK-NEXT: [[TMP18:%.*]] = fptrunc fp128 [[TMP10]] to double
+; CHECK-NEXT: [[TMP19:%.*]] = fptrunc fp128 [[TMP11]] to double
+; CHECK-NEXT: [[TMP20:%.*]] = fptrunc fp128 [[TMP12]] to double
+; CHECK-NEXT: [[TMP21:%.*]] = fptrunc fp128 [[TMP13]] to double
+; CHECK-NEXT: [[TMP22:%.*]] = fmul double [[TMP18]], [[TMP14]]
+; CHECK-NEXT: [[TMP23:%.*]] = fmul double [[TMP19]], [[TMP15]]
+; CHECK-NEXT: [[TMP24:%.*]] = fmul double [[TMP20]], [[TMP16]]
+; CHECK-NEXT: [[TMP25:%.*]] = fmul double [[TMP21]], [[TMP17]]
+; CHECK-NEXT: [[TMP26:%.*]] = fadd double [[VEC_PHI]], [[TMP22]]
+; CHECK-NEXT: [[TMP27:%.*]] = fadd double [[TMP26]], [[TMP23]]
+; CHECK-NEXT: [[TMP28:%.*]] = fadd double [[TMP27]], [[TMP24]]
+; CHECK-NEXT: [[TMP29]] = fadd double [[TMP28]], [[TMP25]]
+; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
+; CHECK-NEXT: [[TMP30:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
+; CHECK-NEXT: br i1 [[TMP30]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
+; CHECK: [[MIDDLE_BLOCK]]:
+; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N]], [[N_VEC]]
+; CHECK-NEXT: br i1 [[CMP_N]], label %[[EXIT:.*]], label %[[SCALAR_PH]]
+; CHECK: [[SCALAR_PH]]:
+; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi ptr [ [[TMP1]], %[[MIDDLE_BLOCK]] ], [ [[START0]], %[[ENTRY]] ]
+; CHECK-NEXT: [[BC_RESUME_VAL9:%.*]] = phi ptr [ [[TMP3]], %[[MIDDLE_BLOCK]] ], [ [[START1]], %[[ENTRY]] ]
+; CHECK-NEXT: [[BC_RESUME_VAL10:%.*]] = phi i64 [ [[N_VEC]], %[[MIDDLE_BLOCK]] ], [ 0, %[[ENTRY]] ]
+; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi double [ [[TMP29]], %[[MIDDLE_BLOCK]] ], [ [[X]], %[[ENTRY]] ]
+; CHECK-NEXT: br label %[[LOOP:.*]]
+; CHECK: [[LOOP]]:
+; CHECK-NEXT: [[PTR0:%.*]] = phi ptr [ [[PTR0_NEXT:%.*]], %[[LOOP]] ], [ [[BC_RESUME_VAL]], %[[SCALAR_PH]] ]
+; CHECK-NEXT: [[PTR1:%.*]] = phi ptr [ [[PTR1_NEXT:%.*]], %[[LOOP]] ], [ [[BC_RESUME_VAL9]], %[[SCALAR_PH]] ]
+; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[IV_NEXT:%.*]], %[[LOOP]] ], [ [[BC_RESUME_VAL10]], %[[SCALAR_PH]] ]
+; CHECK-NEXT: [[RED:%.*]] = phi double [ [[RED_NEXT:%.*]], %[[LOOP]] ], [ [[BC_MERGE_RDX]], %[[SCALAR_PH]] ]
+; CHECK-NEXT: [[PTR0_NEXT]] = getelementptr i8, ptr [[PTR0]], i64 16
+; CHECK-NEXT: [[PTR1_NEXT]] = getelementptr i8, ptr [[PTR1]], i64 8
+; CHECK-NEXT: [[LOAD0:%.*]] = load fp128, ptr [[PTR0]], align 16
+; CHECK-NEXT: [[LOAD1:%.*]] = load double, ptr [[PTR1]], align 16
+; CHECK-NEXT: [[TRUNC:%.*]] = fptrunc fp128 [[LOAD0]] to double
+; CHECK-NEXT: [[RED_NEXT]] = tail call double @llvm.fmuladd.f64(double [[TRUNC]], double [[LOAD1]], double [[RED]])
+; CHECK-NEXT: [[IV_NEXT]] = add i64 [[IV]], 1
+; CHECK-NEXT: [[CMP1_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[N]]
+; CHECK-NEXT: br i1 [[CMP1_NOT]], label %[[EXIT]], label %[[LOOP]], !llvm.loop [[LOOP3:![0-9]+]]
+; CHECK: [[EXIT]]:
+; CHECK-NEXT: [[LCSSA:%.*]] = phi double [ [[RED_NEXT]], %[[LOOP]] ], [ [[TMP29]], %[[MIDDLE_BLOCK]] ]
+; CHECK-NEXT: ret double [[LCSSA]]
+;
+entry:
+ br label %loop
+
+loop:
+ %ptr0 = phi ptr [ %ptr0.next, %loop ], [ %start0, %entry ]
+ %ptr1 = phi ptr [ %ptr1.next, %loop ], [ %start1, %entry ]
+ %iv = phi i64 [ %iv.next, %loop ], [ 0, %entry ]
+ %red = phi double [ %red.next, %loop ], [ %x, %entry ]
+ %ptr0.next = getelementptr i8, ptr %ptr0, i64 16
+ %ptr1.next = getelementptr i8, ptr %ptr1, i64 8
+ %load0 = load fp128, ptr %ptr0, align 16
+ %load1 = load double, ptr %ptr1, align 16
+ %trunc = fptrunc fp128 %load0 to double
+ %red.next = tail call double @llvm.fmuladd.f64(double %trunc, double %load1, double %red)
+ %iv.next = add i64 %iv, 1
+ %cmp1.not = icmp eq i64 %iv.next, %n
+ br i1 %cmp1.not, label %exit, label %loop
+
+exit:
+ %lcssa = phi double [ %red.next, %loop ]
+ ret double %lcssa
+}
+;.
+; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
+; CHECK: [[META1]] = !{!"llvm.loop.isvectorized", i32 1}
+; CHECK: [[META2]] = !{!"llvm.loop.unroll.runtime.disable"}
+; CHECK: [[LOOP3]] = distinct !{[[LOOP3]], [[META1]]}
+;.
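The control flow that the CHECK lines above describe can be modeled outside LLVM. The following is a minimal Python sketch, not part of the patch: `scalar_loop` and `interleaved_loop` are hypothetical names, `Fraction` values stand in for the fp128 inputs, `float()` stands in for `fptrunc`, and an unfused multiply followed by an add stands in for `llvm.fmuladd.f64` (matching the separate `fmul`/`fadd` instructions in the checked output). It assumes at least one iteration, as the do-while shape of the IR loop does.

```python
from fractions import Fraction


def scalar_loop(a, b, x):
    """Model of the original `loop` block: one multiply-add per iteration."""
    red = x
    for wide, narrow in zip(a, b):
        trunc = float(wide)            # stand-in for `fptrunc fp128 ... to double`
        red = red + trunc * narrow     # unfused stand-in for llvm.fmuladd.f64
    return red


def interleaved_loop(a, b, x, width=4):
    """Model of the checked output: 4 scalar copies per step, adds kept in order."""
    n = len(a)
    red = x
    i = 0
    if n >= width:                     # MIN_ITERS_CHECK: n < 4 skips to scalar.ph
        n_vec = n - n % width          # N_VEC = N - N_MOD_VF
        while i != n_vec:              # vector.body
            truncs = [float(a[i + k]) for k in range(width)]
            prods = [truncs[k] * b[i + k] for k in range(width)]
            for p in prods:            # chained fadds starting from VEC_PHI
                red = red + p
            i += width                 # INDEX_NEXT = INDEX + 4
    while i != n:                      # scalar remainder (scalar.ph -> loop)
        red = red + float(a[i]) * b[i]
        i += 1
    return red


if __name__ == "__main__":
    a = [Fraction(1, 3) + k for k in range(10)]
    b = [0.25 * (k + 1) for k in range(10)]
    print(scalar_loop(a, b, 1.5) == interleaved_loop(a, b, 1.5))
```

Because the adds are chained in the original iteration order, both functions perform the same sequence of rounded operations in this model, so the results compare equal for any input length of at least one.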
fhahn approved these changes on Apr 28, 2025
@@ -0,0 +1,113 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
; RUN: opt -mtriple=aarch64 -mcpu=neoverse-v2 -p loop-vectorize %s -S | FileCheck %s
define double @fp128_fmuladd_reduction(ptr %start0, ptr %start1, ptr %end0, ptr %end1, double %x, i64 %n) {
Contributor
Suggested change
define double @fp128_fmuladd_reduction(ptr %start0, ptr %start1, ptr %end0, ptr %end1, double %x, i64 %n) {
define double @fp128_fmuladd_reduction(ptr %start0, ptr %start1, ptr %end0, ptr %end1, double %x, i64 %n) {
This patch adds a test for the fmuladd reduction to show the test change/failure caused by the cost model change. Note that without the fp128 load and trunc, there is no failure.
c8b0987 to 412c1d8
LLVM Buildbot has detected a new failure on builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/59/builds/16831
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request on May 6, 2025
This patch adds a test for the fmuladd reduction to show the test change/failure caused by the cost model change. Note that without the fp128 load and trunc, there is no failure. Pre-commit test for llvm#113903.
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request on May 7, 2025
This patch adds a test for the fmuladd reduction to show the test change/failure caused by the cost model change. Note that without the fp128 load and trunc, there is no failure. Pre-commit test for llvm#113903.