Skip to content

[PowerPC] Fix instruction sizes / branch relaxation#175556

Merged
nikic merged 5 commits intollvm:mainfrom
nikic:ppc-inst-size-fix
Jan 23, 2026
Merged

[PowerPC] Fix instruction sizes / branch relaxation#175556
nikic merged 5 commits intollvm:mainfrom
nikic:ppc-inst-size-fix

Conversation

@nikic
Copy link
Copy Markdown
Contributor

@nikic nikic commented Jan 12, 2026

For PowerPC, having accurate (or at least not too small) instruction sizes is critical, because the PPCBranchSelector pass relies on them. Underestimating the size of an instruction can result in the wrong branch kind being chosen, which will result in an MC error.

This patch introduces validation that the instruction size reported by TII matches the actually emitted instruction size, and fixes various cases where this was not the case.

Fixes #175190.

@llvmbot
Copy link
Copy Markdown
Member

llvmbot commented Jan 12, 2026

@llvm/pr-subscribers-backend-powerpc

Author: Nikita Popov (nikic)

Changes

For PowerPC, having accurate (or at least not too small) instruction sizes is critical, because the PPCBranchSelector pass relies on them. Underestimating the size of an instruction can result in the wrong branch kind being chosen, which will result in an MC error.

This patch introduces validation that the instruction size reported by TII matches the actually emitted instruction size, and fixes various cases where this was not the case.

Fixes #175190.


Full diff: https://github.com/llvm/llvm-project/pull/175556.diff

4 Files Affected:

  • (modified) llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp (+27)
  • (modified) llvm/lib/Target/PowerPC/PPCInstr64Bit.td (+10-6)
  • (modified) llvm/lib/Target/PowerPC/PPCInstrInfo.cpp (+12-1)
  • (modified) llvm/lib/Target/PowerPC/PPCInstrInfo.td (+12-4)
diff --git a/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp b/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
index 718d51c5a0673..863f35e6699b3 100644
--- a/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
+++ b/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
@@ -32,6 +32,7 @@
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/Twine.h"
+#include "llvm/ADT/ScopeExit.h"
 #include "llvm/BinaryFormat/ELF.h"
 #include "llvm/CodeGen/AsmPrinter.h"
 #include "llvm/CodeGen/MachineBasicBlock.h"
@@ -916,6 +917,32 @@ void PPCAsmPrinter::emitInstruction(const MachineInstr *MI) {
     return PPC::S_None;
   };
 
+#ifndef NDEBUG
+  // Instruction sizes must be correct for PPCBranchSelector to pick the
+  // right branch kind. Verify that the reported sizes and the actually
+  // emitted sizes match.
+  unsigned ExpectedSize = Subtarget->getInstrInfo()->getInstSizeInBytes(*MI);
+  MCFragment *OldFragment = OutStreamer->getCurrentFragment();
+  size_t OldFragSize = OldFragment->getFixedSize();
+  scope_exit VerifyInstSize([&]() {
+    if (!OutStreamer->isObj())
+      return; // Can only verify size when streaming to object.
+    MCFragment *NewFragment = OutStreamer->getCurrentFragment();
+    if (NewFragment != OldFragment)
+      return; // Don't try to handle fragment splitting cases.
+    unsigned ActualSize = NewFragment->getFixedSize() - OldFragSize;
+    // For now, allow over-estimates here. Some pseudos expand to a variable
+    // number of instructions, and for correctness using the upper bound size
+    // is fine.
+    if (ActualSize > ExpectedSize) {
+      dbgs() << "Size mismatch for: " << *MI << "\n";
+      dbgs() << "Expected size: " << ExpectedSize << "\n";
+      dbgs() << "Actual size: " << ActualSize << "\n";
+      abort();
+    }
+  });
+#endif
+
   // Lower multi-instruction pseudo operations.
   switch (MI->getOpcode()) {
   default: break;
diff --git a/llvm/lib/Target/PowerPC/PPCInstr64Bit.td b/llvm/lib/Target/PowerPC/PPCInstr64Bit.td
index 620dfd4738226..16d24f05525da 100644
--- a/llvm/lib/Target/PowerPC/PPCInstr64Bit.td
+++ b/llvm/lib/Target/PowerPC/PPCInstr64Bit.td
@@ -1507,21 +1507,25 @@ class GETtlsADDRPseudo <string asmstr> : PPCEmitTimePseudo<(outs g8rc:$rD), (ins
                                              asmstr,
                                              [(set i64:$rD,
                                                (PPCgetTlsAddr i64:$reg, tglobaltlsaddr:$sym))]>,
-                                      isPPC64;
+                                      isPPC64 {
+  // May be either 4 or 8 bytes, specify upper bound.
+  let Size = 8;
+}
 class GETtlsldADDRPseudo <string asmstr> : PPCEmitTimePseudo<(outs g8rc:$rD), (ins g8rc:$reg, tlsgd:$sym),
                                              asmstr,
                                              [(set i64:$rD,
                                                (PPCgetTlsldAddr i64:$reg, tglobaltlsaddr:$sym))]>,
-                                      isPPC64;
+                                      isPPC64 {
+  // May be either 4 or 8 bytes, specify upper bound.
+  let Size = 8;
+}
 
 let hasExtraSrcRegAllocReq = 1, hasExtraDefRegAllocReq = 1 in {
 // LR8 is a true define, while the rest of the Defs are clobbers. X3 is
 // explicitly defined when this op is created, so not mentioned here.
-// This is lowered to BL8_NOP_TLS by the assembly printer, so the size must be
-// correct because the branch select pass is relying on it.
-let Defs = [X0,X4,X5,X6,X7,X8,X9,X10,X11,X12,LR8,CTR8,CR0,CR1,CR5,CR6,CR7], Size = 8 in
+let Defs = [X0,X4,X5,X6,X7,X8,X9,X10,X11,X12,LR8,CTR8,CR0,CR1,CR5,CR6,CR7] in
 def GETtlsADDR : GETtlsADDRPseudo <"#GETtlsADDR">;
-let Defs = [X0,X2,X4,X5,X6,X7,X8,X9,X10,X11,X12,LR8,CTR8,CR0,CR1,CR5,CR6,CR7], Size = 8 in
+let Defs = [X0,X2,X4,X5,X6,X7,X8,X9,X10,X11,X12,LR8,CTR8,CR0,CR1,CR5,CR6,CR7] in
 def GETtlsADDRPCREL : GETtlsADDRPseudo <"#GETtlsADDRPCREL">;
 
 // LR8 is a true define, while the rest of the Defs are clobbers. X3 is
diff --git a/llvm/lib/Target/PowerPC/PPCInstrInfo.cpp b/llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
index 58ad12e2ce65a..d6fc9cdfab5b0 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
+++ b/llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
@@ -3013,7 +3013,18 @@ unsigned PPCInstrInfo::getInstSizeInBytes(const MachineInstr &MI) const {
     return Opers.getNumPatchBytes();
   } else if (Opcode == TargetOpcode::PATCHPOINT) {
     PatchPointOpers Opers(&MI);
-    return Opers.getNumPatchBytes();
+    // The call sequence is up to 44 bytes large.
+    // TODO: Per LangRef the client must ensure that the number of bytes is
+    // large enough, but many tests use patchpoint with 40 bytes.
+    return std::max(Opers.getNumPatchBytes(), 44u);
+  } else if (Opcode == TargetOpcode::PATCHABLE_FUNCTION_ENTER) {
+    const MachineFunction *MF = MI.getParent()->getParent();
+    const Function &F = MF->getFunction();
+    unsigned Num = 0;
+    (void)F.getFnAttribute("patchable-function-entry")
+        .getValueAsString()
+        .getAsInteger(10, Num);
+    return Num * 4;
   } else {
     return get(Opcode).getSize();
   }
diff --git a/llvm/lib/Target/PowerPC/PPCInstrInfo.td b/llvm/lib/Target/PowerPC/PPCInstrInfo.td
index fdccddd86b9b7..130ee2a7c4a97 100644
--- a/llvm/lib/Target/PowerPC/PPCInstrInfo.td
+++ b/llvm/lib/Target/PowerPC/PPCInstrInfo.td
@@ -2489,7 +2489,9 @@ def EnforceIEIO : XForm_24_eieio<31, 854, (outs), (ins),
                                  "eieio", IIC_LdStLoad, []>;
 
 def PseudoEIEIO : PPCEmitTimePseudo<(outs), (ins), "#PPCEIEIO",
-                  [(int_ppc_eieio)]>;
+                  [(int_ppc_eieio)]> {
+  let Size = 12;
+}
 
 def : Pat<(int_ppc_sync),   (SYNC 0)>, Requires<[HasSYNC]>;
 def : Pat<(int_ppc_iospace_sync),   (SYNC 0)>, Requires<[HasSYNC]>;
@@ -3448,14 +3450,18 @@ def : Pat<(add i32:$in, (PPChi tblockaddress:$g, 0)),
 
 // Support for thread-local storage.
 def PPC32GOT: PPCEmitTimePseudo<(outs gprc:$rD), (ins), "#PPC32GOT",
-                [(set i32:$rD, (PPCppc32GOT))]>;
+                [(set i32:$rD, (PPCppc32GOT))]> {
+  let Size = 8;
+}
 
 // Get the _GLOBAL_OFFSET_TABLE_ in PIC mode.
 // This uses two output registers, the first as the real output, the second as a
 // temporary register, used internally in code generation. A "bl" also clobbers LR.
 let Defs = [LR] in
 def PPC32PICGOT: PPCEmitTimePseudo<(outs gprc:$rD, gprc:$rT), (ins), "#PPC32PICGOT",
-                []>;
+                []> {
+  let Size = 20;
+}
 
 def LDgotTprelL32: PPCEmitTimePseudo<(outs gprc_nor0:$rD), (ins s16imm:$disp, gprc_nor0:$reg),
                            "#LDgotTprelL32",
@@ -3581,7 +3587,9 @@ def ADDItocL : PPCEmitTimePseudo<(outs gprc:$rD), (ins gprc_nor0:$reg, tocentry3
 
 // Get Global (GOT) Base Register offset, from the word immediately preceding
 // the function label.
-def UpdateGBR : PPCEmitTimePseudo<(outs gprc:$rD, gprc:$rT), (ins gprc:$rI), "#UpdateGBR", []>;
+def UpdateGBR : PPCEmitTimePseudo<(outs gprc:$rD, gprc:$rT), (ins gprc:$rI), "#UpdateGBR", []> {
+  let Size = 8;
+}
 
 // Pseudo-instruction marked for deletion. When deleting the instruction would
 // cause iterator invalidation in MIR transformation passes, this pseudo can be

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jan 12, 2026

✅ With the latest revision this PR passed the C/C++ code formatter.

@nikic nikic force-pushed the ppc-inst-size-fix branch from cf1875b to 5059f6b Compare January 12, 2026 15:13
@nikic
Copy link
Copy Markdown
Contributor Author

nikic commented Jan 12, 2026

Probably worth noting that because most tests don't use -filetype=obj, I tested this by adding -filetype=obj -o /dev/null to all the llc invocations and then seeing which ones crashed due to size mismatch.

@nikic
Copy link
Copy Markdown
Contributor Author

nikic commented Jan 19, 2026

Ping. I'd appreciate a review on this one as it's a blocker for LLVM 22.

Comment thread llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
Comment thread llvm/lib/Target/PowerPC/PPCInstr64Bit.td Outdated
@nikic nikic force-pushed the ppc-inst-size-fix branch from 5059f6b to 56840e1 Compare January 21, 2026 08:20
Copy link
Copy Markdown
Contributor

@lei137 lei137 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you be able to provide some reasonable sized IR tests for this patch? The test in the original issue is way too big. I tried running the original IR through bug-point but it wasn't able to reduce it.

case PPC::GETtlsldADDRPCREL:
if (MI.getOperand(2).getTargetFlags() == PPCII::MO_GOT_TLSGD_PCREL_FLAG ||
MI.getOperand(2).getTargetFlags() == PPCII::MO_GOT_TLSLD_PCREL_FLAG)
return 4;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These opcodes are named so that they indicate whether the pcrel flag is set or not. If you feel the need to double check here I don't have an issue with it. However, I don't think we should remove explicitly setting the size for them in the td definition file.

I find the original test case passes even if I remove these case block and apply the following diff to the td file:

diff --git a/llvm/lib/Target/PowerPC/PPCInstr64Bit.td b/llvm/lib/Target/PowerPC/PPCInstr64Bit.td
index 10d090480604..0e075a3415a8 100644
--- a/llvm/lib/Target/PowerPC/PPCInstr64Bit.td
+++ b/llvm/lib/Target/PowerPC/PPCInstr64Bit.td
@@ -1524,14 +1524,14 @@ class GETtlsldADDRPseudo <string asmstr> : PPCEmitTimePseudo<(outs g8rc:$rD), (i
 let hasExtraSrcRegAllocReq = 1, hasExtraDefRegAllocReq = 1 in {
 // LR8 is a true define, while the rest of the Defs are clobbers. X3 is
 // explicitly defined when this op is created, so not mentioned here.
-let Defs = [X0,X4,X5,X6,X7,X8,X9,X10,X11,X12,LR8,CTR8,CR0,CR1,CR5,CR6,CR7] in
+let Defs = [X0,X4,X5,X6,X7,X8,X9,X10,X11,X12,LR8,CTR8,CR0,CR1,CR5,CR6,CR7], Size = 8 in
 def GETtlsADDR : GETtlsADDRPseudo <"#GETtlsADDR">;
 let Defs = [X0,X2,X4,X5,X6,X7,X8,X9,X10,X11,X12,LR8,CTR8,CR0,CR1,CR5,CR6,CR7] in
 def GETtlsADDRPCREL : GETtlsADDRPseudo <"#GETtlsADDRPCREL">;

 // LR8 is a true define, while the rest of the Defs are clobbers. X3 is
 // explicitly defined when this op is created, so not mentioned here.
-let Defs = [X0,X4,X5,X6,X7,X8,X9,X10,X11,X12,LR8,CTR8,CR0,CR1,CR5,CR6,CR7] in
+let Defs = [X0,X4,X5,X6,X7,X8,X9,X10,X11,X12,LR8,CTR8,CR0,CR1,CR5,CR6,CR7], Size = 8 in
 def GETtlsldADDR : GETtlsldADDRPseudo <"#GETtlsldADDR">;
 let Defs = [X0,X2,X4,X5,X6,X7,X8,X9,X10,X11,X12,LR8,CTR8,CR0,CR1,CR5,CR6,CR7] in
 def GETtlsldADDRPCREL : GETtlsldADDRPseudo <"#GETtlsldADDRPCREL">;

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... I'm not sure what I was doing wrong before, I think I just got confused by different GETtls opcodes.

As you say, defining the sizes in TableGen does work fine. The only thing that was wrong with the original definitions is that the Size = 8 was on GETtlsADDRPCREL instead of GETtlsldADDR.

I've restored the size setting in TableGen and dropped the corresponding code in InstrInfo.

Comment thread llvm/lib/Target/PowerPC/PPCInstrInfo.cpp Outdated
// The call sequence is up to 44 bytes large.
// TODO: Per LangRef the client must ensure that the number of bytes is
// large enough, but many tests use patchpoint with 40 bytes.
return std::max(Opers.getNumPatchBytes(), 44u);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I posted PR #177453 to change the assertion in LowerPATCHPOINT to a fatal error when the bytes requested aren't enough to contain the call. That way we can use the bytes size reported by the operands here without worry its incorrect. FWIW I think the longest call sequence is indeed 40 bytes, however the assertion wouldn't trigger as we were miss-counting the instructions we emitted.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mandlebug what do you mean by "miss-counting". Does that need to be fixed?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I've rebased on top of your change, which allowed dropping the adjustment for PATCHPOINT.

Copy link
Copy Markdown
Member

@mandlebug mandlebug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other then Lei's outstanding comments LGTM.

nikic added 5 commits January 23, 2026 09:28
For PowerPC, having accurate (or at least not too small) instruction
sizes is critical, because the PPCBranchSelector pass relies on
them. Underestimating the size of an instruction can result in the
wrong branch kind being chosen, which will result in an MC error.

This patch introduces validation that the instruction size reported
by TII matches the actually emitted instruction size, and fixes
various cases where this was not the case.
I must have gotten confused with the different opcodes in previous
testing. These do have fixed sizes, one of the "Size = 8" ones
was just on the wrong opcode.
@nikic nikic force-pushed the ppc-inst-size-fix branch from 56840e1 to 854bb06 Compare January 23, 2026 08:47
Copy link
Copy Markdown
Contributor

@lei137 lei137 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM.
Thanks for fixes!

@lei137
Copy link
Copy Markdown
Contributor

lei137 commented Jan 23, 2026

A test case if you can manage it would be good though.

@nikic
Copy link
Copy Markdown
Contributor Author

nikic commented Jan 23, 2026

The problem with the test case is that it needs a jump across 2^16 bytes (~2^14 instructions), so it ends up being very large. That's why I didn't include it.

@lei137 lei137 assigned mandlebug and unassigned mandlebug Jan 23, 2026
@lei137
Copy link
Copy Markdown
Contributor

lei137 commented Jan 23, 2026

The problem with the test case is that it needs a jump across 2^16 bytes (~2^14 instructions), so it ends up being very large. That's why I didn't include it.

Yes, it's difficult. I spoke to @mandlebug about this too and we couldn't think of a way to generate one either.

@nikic nikic merged commit 45abb30 into llvm:main Jan 23, 2026
11 checks passed
@nikic nikic deleted the ppc-inst-size-fix branch January 23, 2026 16:41
c-rhodes pushed a commit to llvmbot/llvm-project that referenced this pull request Jan 27, 2026
For PowerPC, having accurate (or at least not too small) instruction
sizes is critical, because the PPCBranchSelector pass relies on them.
Underestimating the size of an instruction can result in the wrong
branch kind being chosen, which will result in an MC error.

This patch introduces validation that the instruction size reported by
TII matches the actually emitted instruction size, and fixes various
cases where this was not the case.

Fixes llvm#175190.

(cherry picked from commit 45abb30)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[PowerPC] Branch target out of range

4 participants