X86: Use LiveRegUnits in findDeadCallerSavedReg#156817
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
|
@llvm/pr-subscribers-backend-x86 Author: Matt Arsenault (arsenm) ChangesMy goal here was to eliminate the use of getGPRsForTailCall. There are a few things I find confusing about this function; the API There are a few codegen test changes. One of them I think is a correct 5 Files Affected:
diff --git a/llvm/lib/Target/X86/X86RegisterInfo.cpp b/llvm/lib/Target/X86/X86RegisterInfo.cpp
index b79e508df7c97..3f4955f28e68b 100644
--- a/llvm/lib/Target/X86/X86RegisterInfo.cpp
+++ b/llvm/lib/Target/X86/X86RegisterInfo.cpp
@@ -1002,8 +1002,6 @@ unsigned X86RegisterInfo::findDeadCallerSavedReg(
if (MF->callsEHReturn())
return 0;
- const TargetRegisterClass &AvailableRegs = *getGPRsForTailCall(*MF);
-
if (MBBI == MBB.end())
return 0;
@@ -1025,20 +1023,20 @@ unsigned X86RegisterInfo::findDeadCallerSavedReg(
case X86::TCRETURNmi64:
case X86::EH_RETURN:
case X86::EH_RETURN64: {
- SmallSet<uint16_t, 8> Uses;
- for (MachineOperand &MO : MBBI->operands()) {
- if (!MO.isReg() || MO.isDef())
- continue;
- Register Reg = MO.getReg();
- if (!Reg)
- continue;
- for (MCRegAliasIterator AI(Reg, this, true); AI.isValid(); ++AI)
- Uses.insert(*AI);
+ LiveRegUnits LRU(*this);
+ LRU.addLiveOuts(MBB);
+ LRU.stepBackward(*MBBI);
+
+ // FIXME: Why do we need to special case this register? Is it missing from
+ // return implicit uses?
+ LRU.removeReg(X86::RIP);
+
+ const TargetRegisterClass &RC =
+ Is64Bit ? X86::GR64_NOSPRegClass : X86::GR32_NOSPRegClass;
+ for (MCRegister Reg : RC) {
+ if (LRU.available(Reg))
+ return Reg;
}
-
- for (auto CS : AvailableRegs)
- if (!Uses.count(CS) && CS != X86::RIP && CS != X86::RSP && CS != X86::ESP)
- return CS;
}
}
diff --git a/llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll b/llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll
index ad24608d338ad..d6d4db3509103 100644
--- a/llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll
+++ b/llvm/test/CodeGen/X86/apx/push2-pop2-cfi-seh.ll
@@ -81,7 +81,7 @@ define i32 @csr6_alloc16(ptr %argv) {
; LIN-NEXT: .cfi_def_cfa_offset 32
; LIN-NEXT: pop2 %rbp, %r15
; LIN-NEXT: .cfi_def_cfa_offset 16
-; LIN-NEXT: popq %rcx
+; LIN-NEXT: popq %rax
; LIN-NEXT: .cfi_def_cfa_offset 8
; LIN-NEXT: retq
;
@@ -116,7 +116,7 @@ define i32 @csr6_alloc16(ptr %argv) {
; LIN-PPX-NEXT: .cfi_def_cfa_offset 32
; LIN-PPX-NEXT: pop2p %rbp, %r15
; LIN-PPX-NEXT: .cfi_def_cfa_offset 16
-; LIN-PPX-NEXT: popq %rcx
+; LIN-PPX-NEXT: popq %rax
; LIN-PPX-NEXT: .cfi_def_cfa_offset 8
; LIN-PPX-NEXT: retq
;
@@ -180,7 +180,7 @@ define i32 @csr6_alloc16(ptr %argv) {
; WIN-NEXT: pop2 %rbp, %rbx
; WIN-NEXT: pop2 %r13, %r12
; WIN-NEXT: pop2 %r15, %r14
-; WIN-NEXT: popq %rcx
+; WIN-NEXT: popq %rax
; WIN-NEXT: .seh_endepilogue
; WIN-NEXT: retq
; WIN-NEXT: .seh_endproc
@@ -211,7 +211,7 @@ define i32 @csr6_alloc16(ptr %argv) {
; WIN-PPX-NEXT: pop2p %rbp, %rbx
; WIN-PPX-NEXT: pop2p %r13, %r12
; WIN-PPX-NEXT: pop2p %r15, %r14
-; WIN-PPX-NEXT: popq %rcx
+; WIN-PPX-NEXT: popq %rax
; WIN-PPX-NEXT: .seh_endepilogue
; WIN-PPX-NEXT: retq
; WIN-PPX-NEXT: .seh_endproc
diff --git a/llvm/test/CodeGen/X86/lvi-hardening-ret.ll b/llvm/test/CodeGen/X86/lvi-hardening-ret.ll
index faa8bff8f0943..954985a3798b7 100644
--- a/llvm/test/CodeGen/X86/lvi-hardening-ret.ll
+++ b/llvm/test/CodeGen/X86/lvi-hardening-ret.ll
@@ -41,9 +41,9 @@ entry:
%add = add nsw i32 %0, %1
ret i32 %add
; CHECK-NOT: retq
-; CHECK: popq %rcx
+; CHECK: popq %rsi
; CHECK-NEXT: lfence
-; CHECK-NEXT: jmpq *%rcx
+; CHECK-NEXT: jmpq *%rsi
}
; Function Attrs: noinline nounwind optnone uwtable
@@ -52,9 +52,9 @@ define dso_local preserve_mostcc void @preserve_most() #0 {
entry:
ret void
; CHECK-NOT: retq
-; CHECK: popq %rax
+; CHECK: popq %r11
; CHECK-NEXT: lfence
-; CHECK-NEXT: jmpq *%rax
+; CHECK-NEXT: jmpq *%r11
}
; Function Attrs: noinline nounwind optnone uwtable
@@ -63,9 +63,9 @@ define dso_local preserve_allcc void @preserve_all() #0 {
entry:
ret void
; CHECK-NOT: retq
-; CHECK: popq %rax
+; CHECK: popq %r11
; CHECK-NEXT: lfence
-; CHECK-NEXT: jmpq *%rax
+; CHECK-NEXT: jmpq *%r11
}
define { i64, i128 } @ret_i64_i128() #0 {
diff --git a/llvm/test/CodeGen/X86/pr40289-64bit.ll b/llvm/test/CodeGen/X86/pr40289-64bit.ll
index 58da5258a6703..a83674ac81998 100644
--- a/llvm/test/CodeGen/X86/pr40289-64bit.ll
+++ b/llvm/test/CodeGen/X86/pr40289-64bit.ll
@@ -6,5 +6,5 @@ define cc 92 < 9 x i64 > @clobber() {
ret < 9 x i64 > undef
; CHECK-LABEL: clobber:
; CHECK-NOT: popq %rsp
- ; CHECK: addq $8, %rsp
+ ; CHECK: pushq %rax
}
diff --git a/llvm/test/CodeGen/X86/pr40289.ll b/llvm/test/CodeGen/X86/pr40289.ll
index 851b23c002bdb..21e50931b40f2 100644
--- a/llvm/test/CodeGen/X86/pr40289.ll
+++ b/llvm/test/CodeGen/X86/pr40289.ll
@@ -6,5 +6,5 @@ define < 3 x i32 > @clobber() {
ret < 3 x i32 > undef
; CHECK-LABEL: clobber:
; CHECK-NOT: popl %esp
- ; CHECK: addl $4, %esp
+ ; CHECK: popl %eax
}
|
My goal here was to eliminate the use of getGPRsForTailCall. I don't think it makes sense to use here, considering this function handles other non-tail call cases. It seems to have been used to find a non-callee saved register in a roundabout way. We can be more precise by letting LiveRegUnits figure out the CSR set for the current function. There are a few things I find confusing about this function; the API isn't right. The callers ideally would be maintaining LiveRegUnits in the context where they need the free register. It also doesn't provide a RegisterClass to use. Also, this should work for any block and this shouldn't need to special case this set of return opcodes. There are a few codegen test changes. One of them I think is a correct improvement since the old code didn't consider undef uses as available. The others I think are just to allocation order changes, since we're now using the broader GPR_*NOSP classes.
73e2257 to
c831a2a
Compare
|
ping |
| // FIXME: Why do we need to special case this register? Is it missing from | ||
| // return implicit uses? | ||
| LRU.removeReg(X86::RIP); |
There was a problem hiding this comment.
Isn't X86::GR64_NOSPRegClass excluding RIP?
There was a problem hiding this comment.
Not sure if that's the class constraint an all of these instructions. This is attempting to reproduce the existing logic
There was a problem hiding this comment.

My goal here was to eliminate the use of getGPRsForTailCall.
I don't think it makes sense to use here, considering this function
handles other non-tail call cases. It seems to have been used to
find a non-callee saved register in a roundabout way. We can be more
precise by letting LiveRegUnits figure out the CSR set for the current
function.
There are a few things I find confusing about this function; the API
isn't right. The callers ideally would be maintaining LiveRegUnits
in the context where they need the free register. It also doesn't
provide a RegisterClass to use. Also, this should work for any
block and this shouldn't need to special case this set of return
opcodes. I'm also not sure why there is a special case for RIP. It was introduced in
fe5e5dc but doesn't appear to
be tested.
There are a few codegen test changes. One of them I think is a correct
improvement since the old code didn't consider undef uses as available.
The others I think are just to allocation order changes, since we're now
using the broader GPR_*NOSP classes.