[RISCV] Add cost for @llvm.experimental.vp.splat#117313
Conversation
This is split off from llvm#115274. There doesn't seem to be an easy way to share this with getShuffleCost since that requires passing in a real insert_element operand to get it to recognise it's a scalar splat. There's no tests for i1 vectors, since we can't lower vp splats of them yet and currently crash. Co-authored-by: Shih-Po Hung <[email protected]>
|
@llvm/pr-subscribers-llvm-analysis @llvm/pr-subscribers-backend-risc-v Author: Luke Lau (lukel97) ChangesThis is split off from #115274. There doesn't seem to be an easy way to share this with getShuffleCost since that requires passing in a real insert_element operand to get it to recognise it's a scalar splat. There's no tests for i1 vectors, since we can't lower vp splats of them yet and currently crash. 2 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index 2b16dcbcd8695b..ecdecd7edff07c 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -1155,6 +1155,11 @@ RISCVTTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
return getCmpSelInstrCost(Instruction::Select, ICA.getReturnType(),
ICA.getArgTypes()[0], CmpInst::BAD_ICMP_PREDICATE,
CostKind);
+ case Intrinsic::experimental_vp_splat: {
+ auto LT = getTypeLegalizationCost(RetTy);
+ return LT.first *
+ getRISCVInstructionCost(RISCV::VMV_V_X, LT.second, CostKind);
+ }
case Intrinsic::vp_reduce_add:
case Intrinsic::vp_reduce_fadd:
case Intrinsic::vp_reduce_mul:
diff --git a/llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll b/llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll
index 800ea223850d31..a7ce6f660ca4af 100644
--- a/llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll
@@ -2125,6 +2125,112 @@ define void @vp_fdiv(){
ret void
}
+define void @splat() {
+; CHECK-LABEL: 'splat'
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = call <2 x i8> @llvm.experimental.vp.splat.v2i8(i8 undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %2 = call <4 x i8> @llvm.experimental.vp.splat.v4i8(i8 undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %3 = call <8 x i8> @llvm.experimental.vp.splat.v8i8(i8 undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %4 = call <16 x i8> @llvm.experimental.vp.splat.v16i8(i8 undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %5 = call <2 x i16> @llvm.experimental.vp.splat.v2i16(i16 undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %6 = call <4 x i16> @llvm.experimental.vp.splat.v4i16(i16 undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %7 = call <8 x i16> @llvm.experimental.vp.splat.v8i16(i16 undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %8 = call <16 x i16> @llvm.experimental.vp.splat.v16i16(i16 undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %9 = call <2 x i32> @llvm.experimental.vp.splat.v2i32(i32 undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %10 = call <4 x i32> @llvm.experimental.vp.splat.v4i32(i32 undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %11 = call <8 x i32> @llvm.experimental.vp.splat.v8i32(i32 undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %12 = call <16 x i32> @llvm.experimental.vp.splat.v16i32(i32 undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %13 = call <2 x i64> @llvm.experimental.vp.splat.v2i64(i64 undef, <2 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %14 = call <4 x i64> @llvm.experimental.vp.splat.v4i64(i64 undef, <4 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %15 = call <8 x i64> @llvm.experimental.vp.splat.v8i64(i64 undef, <8 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %16 = call <16 x i64> @llvm.experimental.vp.splat.v16i64(i64 undef, <16 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %17 = call <vscale x 2 x i8> @llvm.experimental.vp.splat.nxv2i8(i8 undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %18 = call <vscale x 4 x i8> @llvm.experimental.vp.splat.nxv4i8(i8 undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %19 = call <vscale x 8 x i8> @llvm.experimental.vp.splat.nxv8i8(i8 undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %20 = call <vscale x 16 x i8> @llvm.experimental.vp.splat.nxv16i8(i8 undef, <vscale x 16 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %21 = call <vscale x 2 x i16> @llvm.experimental.vp.splat.nxv2i16(i16 undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %22 = call <vscale x 4 x i16> @llvm.experimental.vp.splat.nxv4i16(i16 undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %23 = call <vscale x 8 x i16> @llvm.experimental.vp.splat.nxv8i16(i16 undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %24 = call <vscale x 16 x i16> @llvm.experimental.vp.splat.nxv16i16(i16 undef, <vscale x 16 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %25 = call <vscale x 2 x i32> @llvm.experimental.vp.splat.nxv2i32(i32 undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %26 = call <vscale x 4 x i32> @llvm.experimental.vp.splat.nxv4i32(i32 undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %27 = call <vscale x 8 x i32> @llvm.experimental.vp.splat.nxv8i32(i32 undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %28 = call <vscale x 16 x i32> @llvm.experimental.vp.splat.nxv16i32(i32 undef, <vscale x 16 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %29 = call <vscale x 2 x i64> @llvm.experimental.vp.splat.nxv2i64(i64 undef, <vscale x 2 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %30 = call <vscale x 4 x i64> @llvm.experimental.vp.splat.nxv4i64(i64 undef, <vscale x 4 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %31 = call <vscale x 8 x i64> @llvm.experimental.vp.splat.nxv8i64(i64 undef, <vscale x 8 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %32 = call <vscale x 16 x i64> @llvm.experimental.vp.splat.nxv16i64(i64 undef, <vscale x 16 x i1> undef, i32 undef)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; TYPEBASED-LABEL: 'splat'
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = call <2 x i8> @llvm.experimental.vp.splat.v2i8(i8 undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %2 = call <4 x i8> @llvm.experimental.vp.splat.v4i8(i8 undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %3 = call <8 x i8> @llvm.experimental.vp.splat.v8i8(i8 undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %4 = call <16 x i8> @llvm.experimental.vp.splat.v16i8(i8 undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %5 = call <2 x i16> @llvm.experimental.vp.splat.v2i16(i16 undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %6 = call <4 x i16> @llvm.experimental.vp.splat.v4i16(i16 undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %7 = call <8 x i16> @llvm.experimental.vp.splat.v8i16(i16 undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %8 = call <16 x i16> @llvm.experimental.vp.splat.v16i16(i16 undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %9 = call <2 x i32> @llvm.experimental.vp.splat.v2i32(i32 undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %10 = call <4 x i32> @llvm.experimental.vp.splat.v4i32(i32 undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %11 = call <8 x i32> @llvm.experimental.vp.splat.v8i32(i32 undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %12 = call <16 x i32> @llvm.experimental.vp.splat.v16i32(i32 undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %13 = call <2 x i64> @llvm.experimental.vp.splat.v2i64(i64 undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %14 = call <4 x i64> @llvm.experimental.vp.splat.v4i64(i64 undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %15 = call <8 x i64> @llvm.experimental.vp.splat.v8i64(i64 undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %16 = call <16 x i64> @llvm.experimental.vp.splat.v16i64(i64 undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %17 = call <vscale x 2 x i8> @llvm.experimental.vp.splat.nxv2i8(i8 undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %18 = call <vscale x 4 x i8> @llvm.experimental.vp.splat.nxv4i8(i8 undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %19 = call <vscale x 8 x i8> @llvm.experimental.vp.splat.nxv8i8(i8 undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %20 = call <vscale x 16 x i8> @llvm.experimental.vp.splat.nxv16i8(i8 undef, <vscale x 16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %21 = call <vscale x 2 x i16> @llvm.experimental.vp.splat.nxv2i16(i16 undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %22 = call <vscale x 4 x i16> @llvm.experimental.vp.splat.nxv4i16(i16 undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %23 = call <vscale x 8 x i16> @llvm.experimental.vp.splat.nxv8i16(i16 undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %24 = call <vscale x 16 x i16> @llvm.experimental.vp.splat.nxv16i16(i16 undef, <vscale x 16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %25 = call <vscale x 2 x i32> @llvm.experimental.vp.splat.nxv2i32(i32 undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %26 = call <vscale x 4 x i32> @llvm.experimental.vp.splat.nxv4i32(i32 undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %27 = call <vscale x 8 x i32> @llvm.experimental.vp.splat.nxv8i32(i32 undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %28 = call <vscale x 16 x i32> @llvm.experimental.vp.splat.nxv16i32(i32 undef, <vscale x 16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %29 = call <vscale x 2 x i64> @llvm.experimental.vp.splat.nxv2i64(i64 undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 4 for instruction: %30 = call <vscale x 4 x i64> @llvm.experimental.vp.splat.nxv4i64(i64 undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 8 for instruction: %31 = call <vscale x 8 x i64> @llvm.experimental.vp.splat.nxv8i64(i64 undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 16 for instruction: %32 = call <vscale x 16 x i64> @llvm.experimental.vp.splat.nxv16i64(i64 undef, <vscale x 16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+ call <2 x i8> @llvm.experimental.vp.splat.v2i8(i8 undef, <2 x i1> undef, i32 undef)
+ call <4 x i8> @llvm.experimental.vp.splat.v4i8(i8 undef, <4 x i1> undef, i32 undef)
+ call <8 x i8> @llvm.experimental.vp.splat.v8i8(i8 undef, <8 x i1> undef, i32 undef)
+ call <16 x i8> @llvm.experimental.vp.splat.v16i8(i8 undef, <16 x i1> undef, i32 undef)
+ call <2 x i16> @llvm.experimental.vp.splat.v2i16(i16 undef, <2 x i1> undef, i32 undef)
+ call <4 x i16> @llvm.experimental.vp.splat.v4i16(i16 undef, <4 x i1> undef, i32 undef)
+ call <8 x i16> @llvm.experimental.vp.splat.v8i16(i16 undef, <8 x i1> undef, i32 undef)
+ call <16 x i16> @llvm.experimental.vp.splat.v16i16(i16 undef, <16 x i1> undef, i32 undef)
+ call <2 x i32> @llvm.experimental.vp.splat.v2i32(i32 undef, <2 x i1> undef, i32 undef)
+ call <4 x i32> @llvm.experimental.vp.splat.v4i32(i32 undef, <4 x i1> undef, i32 undef)
+ call <8 x i32> @llvm.experimental.vp.splat.v8i32(i32 undef, <8 x i1> undef, i32 undef)
+ call <16 x i32> @llvm.experimental.vp.splat.v16i32(i32 undef, <16 x i1> undef, i32 undef)
+ call <2 x i64> @llvm.experimental.vp.splat.v2i64(i64 undef, <2 x i1> undef, i32 undef)
+ call <4 x i64> @llvm.experimental.vp.splat.v4i64(i64 undef, <4 x i1> undef, i32 undef)
+ call <8 x i64> @llvm.experimental.vp.splat.v8i64(i64 undef, <8 x i1> undef, i32 undef)
+ call <16 x i64> @llvm.experimental.vp.splat.v16i64(i64 undef, <16 x i1> undef, i32 undef)
+ call <vscale x 2 x i8> @llvm.experimental.vp.splat.nxv2i8(i8 undef, <vscale x 2 x i1> undef, i32 undef)
+ call <vscale x 4 x i8> @llvm.experimental.vp.splat.nxv4i8(i8 undef, <vscale x 4 x i1> undef, i32 undef)
+ call <vscale x 8 x i8> @llvm.experimental.vp.splat.nxv8i8(i8 undef, <vscale x 8 x i1> undef, i32 undef)
+ call <vscale x 16 x i8> @llvm.experimental.vp.splat.nxv16i8(i8 undef, <vscale x 16 x i1> undef, i32 undef)
+ call <vscale x 2 x i16> @llvm.experimental.vp.splat.nxv2i16(i16 undef, <vscale x 2 x i1> undef, i32 undef)
+ call <vscale x 4 x i16> @llvm.experimental.vp.splat.nxv4i16(i16 undef, <vscale x 4 x i1> undef, i32 undef)
+ call <vscale x 8 x i16> @llvm.experimental.vp.splat.nxv8i16(i16 undef, <vscale x 8 x i1> undef, i32 undef)
+ call <vscale x 16 x i16> @llvm.experimental.vp.splat.nxv16i16(i16 undef, <vscale x 16 x i1> undef, i32 undef)
+ call <vscale x 2 x i32> @llvm.experimental.vp.splat.nxv2i32(i32 undef, <vscale x 2 x i1> undef, i32 undef)
+ call <vscale x 4 x i32> @llvm.experimental.vp.splat.nxv4i32(i32 undef, <vscale x 4 x i1> undef, i32 undef)
+ call <vscale x 8 x i32> @llvm.experimental.vp.splat.nxv8i32(i32 undef, <vscale x 8 x i1> undef, i32 undef)
+ call <vscale x 16 x i32> @llvm.experimental.vp.splat.nxv16i32(i32 undef, <vscale x 16 x i1> undef, i32 undef)
+ call <vscale x 2 x i64> @llvm.experimental.vp.splat.nxv2i64(i64 undef, <vscale x 2 x i1> undef, i32 undef)
+ call <vscale x 4 x i64> @llvm.experimental.vp.splat.nxv4i64(i64 undef, <vscale x 4 x i1> undef, i32 undef)
+ call <vscale x 8 x i64> @llvm.experimental.vp.splat.nxv8i64(i64 undef, <vscale x 8 x i1> undef, i32 undef)
+ call <vscale x 16 x i64> @llvm.experimental.vp.splat.nxv16i64(i64 undef, <vscale x 16 x i1> undef, i32 undef)
+ ret void
+}
+
declare <2 x i8> @llvm.vp.add.v2i8(<2 x i8>, <2 x i8>, <2 x i1>, i32)
declare <4 x i8> @llvm.vp.add.v4i8(<4 x i8>, <4 x i8>, <4 x i1>, i32)
declare <8 x i8> @llvm.vp.add.v8i8(<8 x i8>, <8 x i8>, <8 x i1>, i32)
|
| ICA.getArgTypes()[0], CmpInst::BAD_ICMP_PREDICATE, | ||
| CostKind); | ||
| case Intrinsic::experimental_vp_splat: { | ||
| auto LT = getTypeLegalizationCost(RetTy); |
There was a problem hiding this comment.
You're missing the hasV check, and it looks like this is only handling integer (not FP).
There was a problem hiding this comment.
If there's no V instructions then getTypeLegalizationCost will return an invalid type legalization cost for scalable vectors, or a scalar legalized type for fixed vectors, both of which will end up in invalid cost. We could be more explicit about this though?
There was a problem hiding this comment.
The fixed vector case will probably end up passing a scalar type to the getRISCVInstructionCost routine which I don't think is what we want. Please add the same type of guard we see elsewhere in this switch for the moment, we can come back and revisit in batch.
There was a problem hiding this comment.
Sounds good. I've made it return invalid rather than falling out of the switch since we can't scalarize scalable nor fixed versions of this intrinsic yet
There was a problem hiding this comment.
As a follow up, can you add handling for fixed vectors (only) in BasicTTI via getShuffleCost? We should be modeling the scalarization here.
There was a problem hiding this comment.
I took a look into this, but ended up in a bit of a rabbit hole. The default BasicTTIImpl scalarization implementation calls into the InsertElement/ExtractElement cost, but currently we return a cost of 0 for constant indices which at least isn't true for the inserts generated from a scalarized splat shuffle, for example.
However changing the InsertElement/ExtractElement cost seems somewhat sensitive, see #67334. I think we may want to make the cost "dumber", e.g. just a single scalar load/store for any insert or extract, but we would need to double check this doesn't cause excessive unrolling.
This is more complicated than what I first thought it would be, so leaving a note here to maybe return to later.
|
LGTM. Thanks! |
This is split off from #115274. There doesn't seem to be an easy way to share this with getShuffleCost since that requires passing in a real insert_element operand to get it to recognise it's a scalar splat.
For i1 vectors we can't currently lower them so it returns an invalid cost.