[PowerPC] Fix instruction sizes / branch relaxation by nikic · Pull Request #175556 · llvm/llvm-project

nikic · 2026-01-12T15:10:25Z

For PowerPC, having accurate (or at least not too small) instruction sizes is critical, because the PPCBranchSelector pass relies on them. Underestimating the size of an instruction can result in the wrong branch kind being chosen, which will result in an MC error.

This patch introduces validation that the instruction size reported by TII matches the actually emitted instruction size, and fixes various cases where this was not the case.

Fixes #175190.

llvmbot · 2026-01-12T15:11:02Z

@llvm/pr-subscribers-backend-powerpc

Author: Nikita Popov (nikic)

Changes

For PowerPC, having accurate (or at least not too small) instruction sizes is critical, because the PPCBranchSelector pass relies on them. Underestimating the size of an instruction can result in the wrong branch kind being chosen, which will result in an MC error.

This patch introduces validation that the instruction size reported by TII matches the actually emitted instruction size, and fixes various cases where this was not the case.

Fixes #175190.

Full diff: https://github.com/llvm/llvm-project/pull/175556.diff

4 Files Affected:

(modified) llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp (+27)
(modified) llvm/lib/Target/PowerPC/PPCInstr64Bit.td (+10-6)
(modified) llvm/lib/Target/PowerPC/PPCInstrInfo.cpp (+12-1)
(modified) llvm/lib/Target/PowerPC/PPCInstrInfo.td (+12-4)

diff --git a/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp b/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
index 718d51c5a0673..863f35e6699b3 100644
--- a/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
+++ b/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
@@ -32,6 +32,7 @@
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/Twine.h"
+#include "llvm/ADT/ScopeExit.h"
 #include "llvm/BinaryFormat/ELF.h"
 #include "llvm/CodeGen/AsmPrinter.h"
 #include "llvm/CodeGen/MachineBasicBlock.h"
@@ -916,6 +917,32 @@ void PPCAsmPrinter::emitInstruction(const MachineInstr *MI) {
     return PPC::S_None;
   };
 
+#ifndef NDEBUG
+  // Instruction sizes must be correct for PPCBranchSelector to pick the
+  // right branch kind. Verify that the reported sizes and the actually
+  // emitted sizes match.
+  unsigned ExpectedSize = Subtarget->getInstrInfo()->getInstSizeInBytes(*MI);
+  MCFragment *OldFragment = OutStreamer->getCurrentFragment();
+  size_t OldFragSize = OldFragment->getFixedSize();
+  scope_exit VerifyInstSize([&]() {
+    if (!OutStreamer->isObj())
+      return; // Can only verify size when streaming to object.
+    MCFragment *NewFragment = OutStreamer->getCurrentFragment();
+    if (NewFragment != OldFragment)
+      return; // Don't try to handle fragment splitting cases.
+    unsigned ActualSize = NewFragment->getFixedSize() - OldFragSize;
+    // For now, allow over-estimates here. Some pseudos expand to a variable
+    // number of instructions, and for correctness using the upper bound size
+    // is fine.
+    if (ActualSize > ExpectedSize) {
+      dbgs() << "Size mismatch for: " << *MI << "\n";
+      dbgs() << "Expected size: " << ExpectedSize << "\n";
+      dbgs() << "Actual size: " << ActualSize << "\n";
+      abort();
+    }
+  });
+#endif
+
   // Lower multi-instruction pseudo operations.
   switch (MI->getOpcode()) {
   default: break;
diff --git a/llvm/lib/Target/PowerPC/PPCInstr64Bit.td b/llvm/lib/Target/PowerPC/PPCInstr64Bit.td
index 620dfd4738226..16d24f05525da 100644
--- a/llvm/lib/Target/PowerPC/PPCInstr64Bit.td
+++ b/llvm/lib/Target/PowerPC/PPCInstr64Bit.td
@@ -1507,21 +1507,25 @@ class GETtlsADDRPseudo <string asmstr> : PPCEmitTimePseudo<(outs g8rc:$rD), (ins
                                              asmstr,
                                              [(set i64:$rD,
                                                (PPCgetTlsAddr i64:$reg, tglobaltlsaddr:$sym))]>,
-                                      isPPC64;
+                                      isPPC64 {
+  // May be either 4 or 8 bytes, specify upper bound.
+  let Size = 8;
+}
 class GETtlsldADDRPseudo <string asmstr> : PPCEmitTimePseudo<(outs g8rc:$rD), (ins g8rc:$reg, tlsgd:$sym),
                                              asmstr,
                                              [(set i64:$rD,
                                                (PPCgetTlsldAddr i64:$reg, tglobaltlsaddr:$sym))]>,
-                                      isPPC64;
+                                      isPPC64 {
+  // May be either 4 or 8 bytes, specify upper bound.
+  let Size = 8;
+}
 
 let hasExtraSrcRegAllocReq = 1, hasExtraDefRegAllocReq = 1 in {
 // LR8 is a true define, while the rest of the Defs are clobbers. X3 is
 // explicitly defined when this op is created, so not mentioned here.
-// This is lowered to BL8_NOP_TLS by the assembly printer, so the size must be
-// correct because the branch select pass is relying on it.
-let Defs = [X0,X4,X5,X6,X7,X8,X9,X10,X11,X12,LR8,CTR8,CR0,CR1,CR5,CR6,CR7], Size = 8 in
+let Defs = [X0,X4,X5,X6,X7,X8,X9,X10,X11,X12,LR8,CTR8,CR0,CR1,CR5,CR6,CR7] in
 def GETtlsADDR : GETtlsADDRPseudo <"#GETtlsADDR">;
-let Defs = [X0,X2,X4,X5,X6,X7,X8,X9,X10,X11,X12,LR8,CTR8,CR0,CR1,CR5,CR6,CR7], Size = 8 in
+let Defs = [X0,X2,X4,X5,X6,X7,X8,X9,X10,X11,X12,LR8,CTR8,CR0,CR1,CR5,CR6,CR7] in
 def GETtlsADDRPCREL : GETtlsADDRPseudo <"#GETtlsADDRPCREL">;
 
 // LR8 is a true define, while the rest of the Defs are clobbers. X3 is
diff --git a/llvm/lib/Target/PowerPC/PPCInstrInfo.cpp b/llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
index 58ad12e2ce65a..d6fc9cdfab5b0 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
+++ b/llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
@@ -3013,7 +3013,18 @@ unsigned PPCInstrInfo::getInstSizeInBytes(const MachineInstr &MI) const {
     return Opers.getNumPatchBytes();
   } else if (Opcode == TargetOpcode::PATCHPOINT) {
     PatchPointOpers Opers(&MI);
-    return Opers.getNumPatchBytes();
+    // The call sequence is up to 44 bytes large.
+    // TODO: Per LangRef the client must ensure that the number of bytes is
+    // large enough, but many tests use patchpoint with 40 bytes.
+    return std::max(Opers.getNumPatchBytes(), 44u);
+  } else if (Opcode == TargetOpcode::PATCHABLE_FUNCTION_ENTER) {
+    const MachineFunction *MF = MI.getParent()->getParent();
+    const Function &F = MF->getFunction();
+    unsigned Num = 0;
+    (void)F.getFnAttribute("patchable-function-entry")
+        .getValueAsString()
+        .getAsInteger(10, Num);
+    return Num * 4;
   } else {
     return get(Opcode).getSize();
   }
diff --git a/llvm/lib/Target/PowerPC/PPCInstrInfo.td b/llvm/lib/Target/PowerPC/PPCInstrInfo.td
index fdccddd86b9b7..130ee2a7c4a97 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrInfo.td
+++ b/llvm/lib/Target/PowerPC/PPCInstrInfo.td
@@ -2489,7 +2489,9 @@ def EnforceIEIO : XForm_24_eieio<31, 854, (outs), (ins),
                                  "eieio", IIC_LdStLoad, []>;
 
 def PseudoEIEIO : PPCEmitTimePseudo<(outs), (ins), "#PPCEIEIO",
-                  [(int_ppc_eieio)]>;
+                  [(int_ppc_eieio)]> {
+  let Size = 12;
+}
 
 def : Pat<(int_ppc_sync),   (SYNC 0)>, Requires<[HasSYNC]>;
 def : Pat<(int_ppc_iospace_sync),   (SYNC 0)>, Requires<[HasSYNC]>;
@@ -3448,14 +3450,18 @@ def : Pat<(add i32:$in, (PPChi tblockaddress:$g, 0)),
 
 // Support for thread-local storage.
 def PPC32GOT: PPCEmitTimePseudo<(outs gprc:$rD), (ins), "#PPC32GOT",
-                [(set i32:$rD, (PPCppc32GOT))]>;
+                [(set i32:$rD, (PPCppc32GOT))]> {
+  let Size = 8;
+}
 
 // Get the _GLOBAL_OFFSET_TABLE_ in PIC mode.
 // This uses two output registers, the first as the real output, the second as a
 // temporary register, used internally in code generation. A "bl" also clobbers LR.
 let Defs = [LR] in
 def PPC32PICGOT: PPCEmitTimePseudo<(outs gprc:$rD, gprc:$rT), (ins), "#PPC32PICGOT",
-                []>;
+                []> {
+  let Size = 20;
+}
 
 def LDgotTprelL32: PPCEmitTimePseudo<(outs gprc_nor0:$rD), (ins s16imm:$disp, gprc_nor0:$reg),
                            "#LDgotTprelL32",
@@ -3581,7 +3587,9 @@ def ADDItocL : PPCEmitTimePseudo<(outs gprc:$rD), (ins gprc_nor0:$reg, tocentry3
 
 // Get Global (GOT) Base Register offset, from the word immediately preceding
 // the function label.
-def UpdateGBR : PPCEmitTimePseudo<(outs gprc:$rD, gprc:$rT), (ins gprc:$rI), "#UpdateGBR", []>;
+def UpdateGBR : PPCEmitTimePseudo<(outs gprc:$rD, gprc:$rT), (ins gprc:$rI), "#UpdateGBR", []> {
+  let Size = 8;
+}
 
 // Pseudo-instruction marked for deletion. When deleting the instruction would
 // cause iterator invalidation in MIR transformation passes, this pseudo can be

github-actions · 2026-01-12T15:12:22Z

✅ With the latest revision this PR passed the C/C++ code formatter.

nikic · 2026-01-12T15:45:17Z

Probably worth noting that because most tests don't use -filetype=obj, I tested this by adding -filetype=obj -o /dev/null to all the llc invocations and then seeing which ones crashed due to size mismatch.

nikic · 2026-01-19T08:39:21Z

Ping. I'd appreciate a review on this one as it's a blocker for LLVM 22.

lei137

Would you be able to provide some reasonable sized IR tests for this patch? The test in the original issue is way too big. I tried running the original IR through bug-point but it wasn't able to reduce it.

lei137 · 2026-01-22T15:30:35Z

+  case PPC::GETtlsldADDRPCREL:
+    if (MI.getOperand(2).getTargetFlags() == PPCII::MO_GOT_TLSGD_PCREL_FLAG ||
+        MI.getOperand(2).getTargetFlags() == PPCII::MO_GOT_TLSLD_PCREL_FLAG)
+      return 4;


These opcodes are named so that they indicate whether the pcrel flag is set or not. If you feel the need to double check here I don't have an issue with it. However, I don't think we should remove explicitly setting the size for them in the td definition file.

I find the original test case passes even if I remove these case block and apply the following diff to the td file:

diff --git a/llvm/lib/Target/PowerPC/PPCInstr64Bit.td b/llvm/lib/Target/PowerPC/PPCInstr64Bit.td index 10d090480604..0e075a3415a8 100644 --- a/llvm/lib/Target/PowerPC/PPCInstr64Bit.td +++ b/llvm/lib/Target/PowerPC/PPCInstr64Bit.td @@ -1524,14 +1524,14 @@ class GETtlsldADDRPseudo <string asmstr> : PPCEmitTimePseudo<(outs g8rc:$rD), (i let hasExtraSrcRegAllocReq = 1, hasExtraDefRegAllocReq = 1 in { // LR8 is a true define, while the rest of the Defs are clobbers. X3 is // explicitly defined when this op is created, so not mentioned here. -let Defs = [X0,X4,X5,X6,X7,X8,X9,X10,X11,X12,LR8,CTR8,CR0,CR1,CR5,CR6,CR7] in +let Defs = [X0,X4,X5,X6,X7,X8,X9,X10,X11,X12,LR8,CTR8,CR0,CR1,CR5,CR6,CR7], Size = 8 in def GETtlsADDR : GETtlsADDRPseudo <"#GETtlsADDR">; let Defs = [X0,X2,X4,X5,X6,X7,X8,X9,X10,X11,X12,LR8,CTR8,CR0,CR1,CR5,CR6,CR7] in def GETtlsADDRPCREL : GETtlsADDRPseudo <"#GETtlsADDRPCREL">; // LR8 is a true define, while the rest of the Defs are clobbers. X3 is // explicitly defined when this op is created, so not mentioned here. -let Defs = [X0,X4,X5,X6,X7,X8,X9,X10,X11,X12,LR8,CTR8,CR0,CR1,CR5,CR6,CR7] in +let Defs = [X0,X4,X5,X6,X7,X8,X9,X10,X11,X12,LR8,CTR8,CR0,CR1,CR5,CR6,CR7], Size = 8 in def GETtlsldADDR : GETtlsldADDRPseudo <"#GETtlsldADDR">; let Defs = [X0,X2,X4,X5,X6,X7,X8,X9,X10,X11,X12,LR8,CTR8,CR0,CR1,CR5,CR6,CR7] in def GETtlsldADDRPCREL : GETtlsldADDRPseudo <"#GETtlsldADDRPCREL">;

Hm... I'm not sure what I was doing wrong before, I think I just got confused by different GETtls opcodes.

As you say, defining the sizes in TableGen does work fine. The only thing that was wrong with the original definitions is that the Size = 8 was on GETtlsADDRPCREL instead of GETtlsldADDR.

I've restored the size setting in TableGen and dropped the corresponding code in InstrInfo.

mandlebug · 2026-01-22T20:29:18Z

+    // The call sequence is up to 44 bytes large.
+    // TODO: Per LangRef the client must ensure that the number of bytes is
+    // large enough, but many tests use patchpoint with 40 bytes.
+    return std::max(Opers.getNumPatchBytes(), 44u);


I posted PR #177453 to change the assertion in LowerPATCHPOINT to a fatal error when the bytes requested aren't enough to contain the call. That way we can use the bytes size reported by the operands here without worry its incorrect. FWIW I think the longest call sequence is indeed 40 bytes, however the assertion wouldn't trigger as we were miss-counting the instructions we emitted.

@mandlebug what do you mean by "miss-counting". Does that need to be fixed?

Thanks! I've rebased on top of your change, which allowed dropping the adjustment for PATCHPOINT.

mandlebug

Other then Lei's outstanding comments LGTM.

For PowerPC, having accurate (or at least not too small) instruction sizes is critical, because the PPCBranchSelector pass relies on them. Underestimating the size of an instruction can result in the wrong branch kind being chosen, which will result in an MC error. This patch introduces validation that the instruction size reported by TII matches the actually emitted instruction size, and fixes various cases where this was not the case.

I must have gotten confused with the different opcodes in previous testing. These do have fixed sizes, one of the "Size = 8" ones was just on the wrong opcode.

lei137

This LGTM.
Thanks for fixes!

lei137 · 2026-01-23T15:22:38Z

A test case if you can manage it would be good though.

nikic · 2026-01-23T15:24:24Z

The problem with the test case is that it needs a jump across 2^16 bytes (~2^14 instructions), so it ends up being very large. That's why I didn't include it.

lei137 · 2026-01-23T16:24:55Z

The problem with the test case is that it needs a jump across 2^16 bytes (~2^14 instructions), so it ends up being very large. That's why I didn't include it.

Yes, it's difficult. I spoke to @mandlebug about this too and we couldn't think of a way to generate one either.

For PowerPC, having accurate (or at least not too small) instruction sizes is critical, because the PPCBranchSelector pass relies on them. Underestimating the size of an instruction can result in the wrong branch kind being chosen, which will result in an MC error. This patch introduces validation that the instruction size reported by TII matches the actually emitted instruction size, and fixes various cases where this was not the case. Fixes llvm#175190. (cherry picked from commit 45abb30)

nikic requested review from amy-kwan, lei137 and mandlebug January 12, 2026 15:10

llvmbot added the backend:PowerPC label Jan 12, 2026

nikic force-pushed the ppc-inst-size-fix branch from cf1875b to 5059f6b Compare January 12, 2026 15:13

lei137 reviewed Jan 20, 2026

View reviewed changes

Comment thread llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp

Comment thread llvm/lib/Target/PowerPC/PPCInstr64Bit.td Outdated

nikic force-pushed the ppc-inst-size-fix branch from 5059f6b to 56840e1 Compare January 21, 2026 08:20

lei137 reviewed Jan 22, 2026

View reviewed changes

mandlebug reviewed Jan 22, 2026

View reviewed changes

mandlebug approved these changes Jan 22, 2026

View reviewed changes

nikic added 5 commits January 23, 2026 09:28

Accurate instruction size for GETtlsADDR

32c3531

Remove no longer needed patchpoint workaround

65b1471

Specify GETtls instruction sizes in tablegen

016ec46

I must have gotten confused with the different opcodes in previous testing. These do have fixed sizes, one of the "Size = 8" ones was just on the wrong opcode.

Make the size check stricter

854bb06

nikic force-pushed the ppc-inst-size-fix branch from 56840e1 to 854bb06 Compare January 23, 2026 08:47

lei137 approved these changes Jan 23, 2026

View reviewed changes

lei137 assigned mandlebug and unassigned mandlebug Jan 23, 2026

nikic merged commit 45abb30 into llvm:main Jan 23, 2026
11 checks passed

nikic deleted the ppc-inst-size-fix branch January 23, 2026 16:41

Conversation

nikic commented Jan 12, 2026

Uh oh!

llvmbot commented Jan 12, 2026

Uh oh!

github-actions Bot commented Jan 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

nikic commented Jan 12, 2026

Uh oh!

nikic commented Jan 19, 2026

Uh oh!

Uh oh!

Uh oh!

lei137 left a comment

Choose a reason for hiding this comment

Uh oh!

lei137 Jan 22, 2026

Choose a reason for hiding this comment

Uh oh!

nikic Jan 23, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

mandlebug Jan 22, 2026

Choose a reason for hiding this comment

Uh oh!

lei137 Jan 22, 2026

Choose a reason for hiding this comment

Uh oh!

nikic Jan 23, 2026

Choose a reason for hiding this comment

Uh oh!

mandlebug left a comment

Choose a reason for hiding this comment

Uh oh!

lei137 left a comment

Choose a reason for hiding this comment

Uh oh!

lei137 commented Jan 23, 2026

Uh oh!

nikic commented Jan 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lei137 commented Jan 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

github-actions Bot commented Jan 12, 2026 •

edited

Loading

nikic commented Jan 23, 2026 •

edited

Loading