[X86] Lower mathlib call ldexp into scalef when avx512 is enabled#69710
[X86] Lower mathlib call ldexp into scalef when avx512 is enabled#69710
Conversation
|
Similar to #67552 |
There was a problem hiding this comment.
Supplement the check, thanks!
There was a problem hiding this comment.
no need to create a method - just make it static inside X86ISelLowering.cpp
There was a problem hiding this comment.
We'd be much better off adding a X86ISD::SCALEF nodetype rather than individual intrinsic lowering cases.
There was a problem hiding this comment.
Hi, @RKSimon, sorry for the delay in replying your comments, I have made some changes to the patch, please have a check, thanks very much!
7355b45 to
85f59ff
Compare
|
There was a problem hiding this comment.
Use the single line style like (most of) the previous cases.
There was a problem hiding this comment.
It seems that coding style check discourages this change ...
There was a problem hiding this comment.
Why do you need to vectorize to use SCALEFS? I thought SCALEFS was for scalar types and SCALEF was for vector types? (So it should be possible to add vector support here as well).
There was a problem hiding this comment.
Can we use vscalefph if we have AVX512FP16?
There was a problem hiding this comment.
Hi, there may be risk of truncation, as the EXP operand of FLDEXP types i32, e.g., @llvm.ldexp.f16.i32(half, i32) ->@llvm.x86.avx512fp16.mask.scalef.sh(<8 x half>, <8 x half>, <8 x half>. I didn't know how to handle the issue elegantly, so I made an extension here.
There was a problem hiding this comment.
Good point! LangRef doesn't give an example f16 case. Should we define it as @llvm.ldexp.f16.i16(half, i16)? i32 is a too large range to be useful for FP16.
There was a problem hiding this comment.
Could we use value tracking to check the bounds of the EXP operand?
There was a problem hiding this comment.
Value tracking seems to make things very complicated.
There was a problem hiding this comment.
I'd settle for a TODO comment for now
85f59ff to
c365199
Compare
You can test this locally with the following command:git-clang-format --diff e34c35a21ccc215ce507a1e19b4ff2a1ce9906f3 4ac55f1aecf19b4e2ce981ad757c79b0daf46e5e -- llvm/lib/Target/X86/X86ISelLowering.cppView the diff from clang-format here.diff --git a/llvm/lib/Target/X86/X86ISelLowering.cpp b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 19510bbba0..7ac39cea21 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -2406,11 +2406,11 @@ X86TargetLowering::X86TargetLowering(const X86TargetMachine &TM,
}
if (Subtarget.hasAVX512()) {
- for (MVT VT : { MVT::f16, MVT::f32, MVT::f64, MVT::v4f32, MVT::v2f64 })
+ for (MVT VT : {MVT::f16, MVT::f32, MVT::f64, MVT::v4f32, MVT::v2f64})
setOperationAction(ISD::FLDEXP, VT, Custom);
if (Subtarget.hasVLX())
- for (MVT VT : { MVT::v8f32, MVT::v4f64, MVT::v16f32, MVT::v8f64 })
+ for (MVT VT : {MVT::v8f32, MVT::v4f64, MVT::v16f32, MVT::v8f64})
setOperationAction(ISD::FLDEXP, VT, Custom);
}
@@ -32039,7 +32039,8 @@ SDValue X86TargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const {
case ISD::ADDRSPACECAST: return LowerADDRSPACECAST(Op, DAG);
case X86ISD::CVTPS2PH: return LowerCVTPS2PH(Op, DAG);
case ISD::PREFETCH: return LowerPREFETCH(Op, Subtarget, DAG);
- case ISD::FLDEXP: return LowerFLDEXP(Op, Subtarget, DAG);
+ case ISD::FLDEXP:
+ return LowerFLDEXP(Op, Subtarget, DAG);
}
}
|
|
c365199 to
4ac55f1
Compare
|
| } | ||
|
|
||
| if (Subtarget.hasAVX512()) { | ||
| for (MVT VT : { MVT::f16, MVT::f32, MVT::f64, MVT::v4f32, MVT::v2f64 }) |
There was a problem hiding this comment.
Are we sure this is right? I'd expect MVT::v4f32, MVT::v2f64, MVT::v8f32, MVT::v4f64 to be the VLX cases.
| ; AVX512VL-NEXT: vpinsrw $0, %eax, %xmm0, %xmm0 | ||
| ; AVX512VL-NEXT: retq | ||
| entry: | ||
| %r = tail call fast half @llvm.ldexp.f16.i32(half %x, i32 %exp) |
There was a problem hiding this comment.
Drop unnecessary fast flags?
|
@huhu233 reverse-ping |
|
Any progress on this PR? |
|
Closing this abandoned PR and adding a issue (#165694) to track future work - I'll add the test coverage from this patch to trunk so we have something to work from. |
Pulled out of the abandoned patch llvm#69710 to act as a baseline for llvm#165694
Pulled out of the abandoned patch llvm#69710 to act as a baseline for llvm#165694
Pulled out of the abandoned patch llvm#69710 to act as a baseline for llvm#165694
Pulled out of the abandoned patch llvm#69710 to act as a baseline for llvm#165694
No description provided.