For my custom backend, I've following GMIR input to legalizer:
%1:_(s32) = COPY $r8
%0:_(s16) = G_TRUNC %1:_(s32)
%2:_(s16), %3:_ = G_SDIVREM %0:_, %0:_
%4:_(s16) = G_INTRINSIC intrinsic(@llvm.custom.fma), %2:_(s16), %3:_(s16), %2:_(s16)
%7:_(s32) = G_ANYEXT %4:_(s16)
$r8 = COPY %7:_(s32)
I've following legalization rule for G_SDIVREM:
getActionDefinitionsBuilder(
{G_UDIVREM, G_SDIVREM})
.clampScalar(0, s32, s64)
.scalarize(0)
.customFor({s32, s64});
Therefore, legalizer tries to widen G_SDIVREM operation to s32. Then, it inserts G_TRUNC to truncate %2, %3 so they can be consumed by the following llvm.custom.fma intrinsic.
The problem occurs with placement of G_TRUNC operations in LegalizerHelp.cpp. Here's the existing logic:
case TargetOpcode::G_SDIVREM:
Observer.changingInstr(MI);
widenScalarSrc(MI, WideTy, 2, TargetOpcode::G_SEXT);
widenScalarSrc(MI, WideTy, 3, TargetOpcode::G_SEXT);
widenScalarDst(MI, WideTy);
//MIRBuilder.setInsertPt(MIRBuilder.getMBB(), --MIRBuilder.getInsertPt());
widenScalarDst(MI, WideTy, 1);
Observer.changedInstr(MI);
return Legalized;
After widening, we must insert 2 G_TRUNC operations. First G_TRUNC is inserted at correct position. widenScalarDst() increments next insertion position. Therefore, the second widenScalarDst() inserts 2nd G_TRUNC 1 instruction too late. This leads to following incorrect code:
%2:_(s16) = G_TRUNC %10:_(s32)
%4:_(s16) = G_INTRINSIC intrinsic(@llvm.custom.fma), %2:_(s16), %3:_(s16), %2:_(s16)
%3:_(s16) = G_TRUNC %11:_(s32)
My proposed fix is in the comment above. This applies to G_UDIVREM as well.
To submit the patch, I need a test. But I'm unable to reproduce this on backends other than my custom backend. So what's the right way to submit this test?
For my custom backend, I've following GMIR input to legalizer:
I've following legalization rule for G_SDIVREM:
Therefore, legalizer tries to widen G_SDIVREM operation to s32. Then, it inserts G_TRUNC to truncate %2, %3 so they can be consumed by the following llvm.custom.fma intrinsic.
The problem occurs with placement of G_TRUNC operations in LegalizerHelp.cpp. Here's the existing logic:
After widening, we must insert 2 G_TRUNC operations. First G_TRUNC is inserted at correct position. widenScalarDst() increments next insertion position. Therefore, the second widenScalarDst() inserts 2nd G_TRUNC 1 instruction too late. This leads to following incorrect code:
My proposed fix is in the comment above. This applies to G_UDIVREM as well.
To submit the patch, I need a test. But I'm unable to reproduce this on backends other than my custom backend. So what's the right way to submit this test?