X86: Stop overriding getRegClass #155128
base: main
Conversation
This function should not be virtual; making it virtual was an AMDGPU hack that should be removed, not spread to other backends.

This does not need to be overridden to reserve registers. The register reservation mechanism is orthogonal to the register class constraints of the instruction; this should report the underlying instruction constraint. The registers are separately reserved, so they will be removed from the allocation order anyway. If the actual class needs to change based on the subtarget, it should probably generalize the LookupPtrRegClass mechanism.

This was added by #70958. The new tests there for the class are probably not useful anymore. These should instead compile to the end and try to stress the allocation behavior.
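For context, a minimal sketch (not the exact upstream code) of the orthogonal reservation mechanism the description refers to: when the subtarget lacks EGPR, R16-R31 are marked reserved, so the allocator drops them from the allocation order regardless of what getRegClass reports for any particular instruction. The structure and helper usage below are assumptions based on the usual LLVM backend pattern.

```cpp
// Sketch only: the real logic lives in X86RegisterInfo::getReservedRegs
// and may differ in detail.
BitVector X86RegisterInfo::getReservedRegs(const MachineFunction &MF) const {
  BitVector Reserved(getNumRegs());
  // ... existing reservations (stack pointer, instruction pointer,
  // frame/base pointers, segment registers, ...) elided ...

  // Without APX/EGPR the extended GPRs are not encodable at all, so they
  // are reserved for every instruction, independent of its MCInstrDesc.
  if (!MF.getSubtarget<X86Subtarget>().hasEGPR()) {
    static const MCPhysReg ExtGPRs[] = {
        X86::R16, X86::R17, X86::R18, X86::R19, X86::R20, X86::R21,
        X86::R22, X86::R23, X86::R24, X86::R25, X86::R26, X86::R27,
        X86::R28, X86::R29, X86::R30, X86::R31};
    for (MCPhysReg Reg : ExtGPRs)
      for (MCRegAliasIterator AI(Reg, this, /*IncludeSelf=*/true);
           AI.isValid(); ++AI)
        Reserved.set(*AI);
  }
  return Reserved;
}
```

The LookupPtrRegClass path mentioned above is the existing hook for subtarget-dependent operand classes: an operand marked that way in the .td files is resolved through TargetRegisterInfo::getPointerRegClass(MF, Kind) at query time rather than baked into the instruction descriptor.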
@llvm/pr-subscribers-backend-x86

Author: Matt Arsenault (arsenm). Changes as described above. Full diff: https://github.com/llvm/llvm-project/pull/155128.diff (6 files affected):
diff --git a/llvm/lib/Target/X86/X86InstrInfo.cpp b/llvm/lib/Target/X86/X86InstrInfo.cpp
index abf365eedec39..75ed096904629 100644
--- a/llvm/lib/Target/X86/X86InstrInfo.cpp
+++ b/llvm/lib/Target/X86/X86InstrInfo.cpp
@@ -91,23 +91,6 @@ X86InstrInfo::X86InstrInfo(X86Subtarget &STI)
X86::CATCHRET, (STI.is64Bit() ? X86::RET64 : X86::RET32)),
Subtarget(STI), RI(STI.getTargetTriple()) {}
-const TargetRegisterClass *
-X86InstrInfo::getRegClass(const MCInstrDesc &MCID, unsigned OpNum,
- const TargetRegisterInfo *TRI,
- const MachineFunction &MF) const {
- auto *RC = TargetInstrInfo::getRegClass(MCID, OpNum, TRI, MF);
- // If the target does not have egpr, then r16-r31 will be resereved for all
- // instructions.
- if (!RC || !Subtarget.hasEGPR())
- return RC;
-
- if (X86II::canUseApxExtendedReg(MCID))
- return RC;
-
- const X86RegisterInfo *RI = Subtarget.getRegisterInfo();
- return RI->constrainRegClassToNonRex2(RC);
-}
-
bool X86InstrInfo::isCoalescableExtInstr(const MachineInstr &MI,
Register &SrcReg, Register &DstReg,
unsigned &SubIdx) const {
diff --git a/llvm/lib/Target/X86/X86InstrInfo.h b/llvm/lib/Target/X86/X86InstrInfo.h
index 9dc5f4b0e086e..b2e7e0e9eea28 100644
--- a/llvm/lib/Target/X86/X86InstrInfo.h
+++ b/llvm/lib/Target/X86/X86InstrInfo.h
@@ -240,17 +240,6 @@ class X86InstrInfo final : public X86GenInstrInfo {
public:
explicit X86InstrInfo(X86Subtarget &STI);
- /// Given a machine instruction descriptor, returns the register
- /// class constraint for OpNum, or NULL. Returned register class
- /// may be different from the definition in the TD file, e.g.
- /// GR*RegClass (definition in TD file)
- /// ->
- /// GR*_NOREX2RegClass (Returned register class)
- const TargetRegisterClass *
- getRegClass(const MCInstrDesc &MCID, unsigned OpNum,
- const TargetRegisterInfo *TRI,
- const MachineFunction &MF) const override;
-
/// getRegisterInfo - TargetInstrInfo is a superset of MRegister info. As
/// such, whenever a client has an instance of instruction info, it should
/// always be able to get register info as well (through this method).
diff --git a/llvm/test/CodeGen/X86/apx/no-rex2-general.ll b/llvm/test/CodeGen/X86/apx/no-rex2-general.ll
index 805fc7ccaab76..74d8fc2d801a8 100644
--- a/llvm/test/CodeGen/X86/apx/no-rex2-general.ll
+++ b/llvm/test/CodeGen/X86/apx/no-rex2-general.ll
@@ -14,6 +14,7 @@ define i32 @map0(ptr nocapture noundef readonly %a, i64 noundef %b) {
; SSE-NEXT: [[MOV32rm:%[0-9]+]]:gr32 = MOV32rm [[COPY1]], 4, [[COPY]], 0, $noreg :: (load (s32) from %ir.add.ptr)
; SSE-NEXT: $eax = COPY [[MOV32rm]]
; SSE-NEXT: RET 0, $eax
+ ;
; AVX-LABEL: name: map0
; AVX: bb.0.entry:
; AVX-NEXT: liveins: $rdi, $rsi
@@ -38,12 +39,13 @@ define i32 @map1_or_vex(<2 x double> noundef %a) {
; SSE-NEXT: [[CVTSD2SIrr_Int:%[0-9]+]]:gr32 = nofpexcept CVTSD2SIrr_Int [[COPY]], implicit $mxcsr
; SSE-NEXT: $eax = COPY [[CVTSD2SIrr_Int]]
; SSE-NEXT: RET 0, $eax
+ ;
; AVX-LABEL: name: map1_or_vex
; AVX: bb.0.entry:
; AVX-NEXT: liveins: $xmm0
; AVX-NEXT: {{ $}}
; AVX-NEXT: [[COPY:%[0-9]+]]:vr128 = COPY $xmm0
- ; AVX-NEXT: [[VCVTSD2SIrr_Int:%[0-9]+]]:gr32_norex2 = nofpexcept VCVTSD2SIrr_Int [[COPY]], implicit $mxcsr
+ ; AVX-NEXT: [[VCVTSD2SIrr_Int:%[0-9]+]]:gr32 = nofpexcept VCVTSD2SIrr_Int [[COPY]], implicit $mxcsr
; AVX-NEXT: $eax = COPY [[VCVTSD2SIrr_Int]]
; AVX-NEXT: RET 0, $eax
entry:
@@ -56,17 +58,18 @@ define <2 x i64> @map2_or_vex(ptr nocapture noundef readonly %b, i64 noundef %c)
; SSE: bb.0.entry:
; SSE-NEXT: liveins: $rdi, $rsi
; SSE-NEXT: {{ $}}
- ; SSE-NEXT: [[COPY:%[0-9]+]]:gr64_norex2_nosp = COPY $rsi
- ; SSE-NEXT: [[COPY1:%[0-9]+]]:gr64_norex2 = COPY $rdi
+ ; SSE-NEXT: [[COPY:%[0-9]+]]:gr64_nosp = COPY $rsi
+ ; SSE-NEXT: [[COPY1:%[0-9]+]]:gr64 = COPY $rdi
; SSE-NEXT: [[PABSBrm:%[0-9]+]]:vr128 = PABSBrm [[COPY1]], 4, [[COPY]], 0, $noreg :: (load (s128) from %ir.add.ptr)
; SSE-NEXT: $xmm0 = COPY [[PABSBrm]]
; SSE-NEXT: RET 0, $xmm0
+ ;
; AVX-LABEL: name: map2_or_vex
; AVX: bb.0.entry:
; AVX-NEXT: liveins: $rdi, $rsi
; AVX-NEXT: {{ $}}
- ; AVX-NEXT: [[COPY:%[0-9]+]]:gr64_norex2_nosp = COPY $rsi
- ; AVX-NEXT: [[COPY1:%[0-9]+]]:gr64_norex2 = COPY $rdi
+ ; AVX-NEXT: [[COPY:%[0-9]+]]:gr64_nosp = COPY $rsi
+ ; AVX-NEXT: [[COPY1:%[0-9]+]]:gr64 = COPY $rdi
; AVX-NEXT: [[VPABSBrm:%[0-9]+]]:vr128 = VPABSBrm [[COPY1]], 4, [[COPY]], 0, $noreg :: (load (s128) from %ir.add.ptr)
; AVX-NEXT: $xmm0 = COPY [[VPABSBrm]]
; AVX-NEXT: RET 0, $xmm0
diff --git a/llvm/test/CodeGen/X86/apx/no-rex2-pseudo-amx.ll b/llvm/test/CodeGen/X86/apx/no-rex2-pseudo-amx.ll
index 5fa4cb4c8826b..812ea87fd0baa 100644
--- a/llvm/test/CodeGen/X86/apx/no-rex2-pseudo-amx.ll
+++ b/llvm/test/CodeGen/X86/apx/no-rex2-pseudo-amx.ll
@@ -7,8 +7,8 @@ define dso_local void @amx(ptr noundef %data) {
; CHECK: bb.0.entry:
; CHECK-NEXT: liveins: $rdi
; CHECK-NEXT: {{ $}}
- ; CHECK-NEXT: [[COPY:%[0-9]+]]:gr64_norex2 = COPY $rdi
- ; CHECK-NEXT: [[MOV32ri64_:%[0-9]+]]:gr64_norex2_nosp = MOV32ri64 8
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:gr64 = COPY $rdi
+ ; CHECK-NEXT: [[MOV32ri64_:%[0-9]+]]:gr64_nosp = MOV32ri64 8
; CHECK-NEXT: PTILELOADD 4, [[COPY]], 1, killed [[MOV32ri64_]], 0, $noreg
; CHECK-NEXT: RET 0
entry:
diff --git a/llvm/test/CodeGen/X86/apx/no-rex2-pseudo-x87.ll b/llvm/test/CodeGen/X86/apx/no-rex2-pseudo-x87.ll
index a9ca591a156c2..5156047dac7b8 100644
--- a/llvm/test/CodeGen/X86/apx/no-rex2-pseudo-x87.ll
+++ b/llvm/test/CodeGen/X86/apx/no-rex2-pseudo-x87.ll
@@ -7,8 +7,8 @@ define void @x87(ptr %0, ptr %1) {
; CHECK: bb.0 (%ir-block.2):
; CHECK-NEXT: liveins: $rdi, $rsi
; CHECK-NEXT: {{ $}}
- ; CHECK-NEXT: [[COPY:%[0-9]+]]:gr64_norex2 = COPY $rsi
- ; CHECK-NEXT: [[COPY1:%[0-9]+]]:gr64_norex2 = COPY $rdi
+ ; CHECK-NEXT: [[COPY:%[0-9]+]]:gr64 = COPY $rsi
+ ; CHECK-NEXT: [[COPY1:%[0-9]+]]:gr64 = COPY $rdi
; CHECK-NEXT: [[LD_Fp32m:%[0-9]+]]:rfp32 = nofpexcept LD_Fp32m [[COPY1]], 1, $noreg, 0, $noreg, implicit-def dead $fpsw, implicit $fpcw :: (load (s32) from %ir.0)
; CHECK-NEXT: nofpexcept ST_Fp32m [[COPY]], 1, $noreg, 0, $noreg, killed [[LD_Fp32m]], implicit-def dead $fpsw, implicit $fpcw :: (store (s32) into %ir.1)
; CHECK-NEXT: RET 0
diff --git a/llvm/test/CodeGen/X86/apx/no-rex2-special.ll b/llvm/test/CodeGen/X86/apx/no-rex2-special.ll
index 86534427a9eae..51b7fe9391cf4 100644
--- a/llvm/test/CodeGen/X86/apx/no-rex2-special.ll
+++ b/llvm/test/CodeGen/X86/apx/no-rex2-special.ll
@@ -9,7 +9,7 @@ define void @test_xsave(ptr %ptr, i32 %hi, i32 %lo) {
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:gr32 = COPY $edx
; CHECK-NEXT: [[COPY1:%[0-9]+]]:gr32 = COPY $esi
- ; CHECK-NEXT: [[COPY2:%[0-9]+]]:gr64_norex2 = COPY $rdi
+ ; CHECK-NEXT: [[COPY2:%[0-9]+]]:gr64 = COPY $rdi
; CHECK-NEXT: $edx = COPY [[COPY1]]
; CHECK-NEXT: $eax = COPY [[COPY]]
; CHECK-NEXT: XSAVE [[COPY2]], 1, $noreg, 0, $noreg, implicit $edx, implicit $eax
@@ -26,7 +26,7 @@ define void @test_xsave64(ptr %ptr, i32 %hi, i32 %lo) {
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:gr32 = COPY $edx
; CHECK-NEXT: [[COPY1:%[0-9]+]]:gr32 = COPY $esi
- ; CHECK-NEXT: [[COPY2:%[0-9]+]]:gr64_norex2 = COPY $rdi
+ ; CHECK-NEXT: [[COPY2:%[0-9]+]]:gr64 = COPY $rdi
; CHECK-NEXT: $edx = COPY [[COPY1]]
; CHECK-NEXT: $eax = COPY [[COPY]]
; CHECK-NEXT: XSAVE64 [[COPY2]], 1, $noreg, 0, $noreg, implicit $edx, implicit $eax
@@ -43,7 +43,7 @@ define void @test_xrstor(ptr %ptr, i32 %hi, i32 %lo) {
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:gr32 = COPY $edx
; CHECK-NEXT: [[COPY1:%[0-9]+]]:gr32 = COPY $esi
- ; CHECK-NEXT: [[COPY2:%[0-9]+]]:gr64_norex2 = COPY $rdi
+ ; CHECK-NEXT: [[COPY2:%[0-9]+]]:gr64 = COPY $rdi
; CHECK-NEXT: $edx = COPY [[COPY1]]
; CHECK-NEXT: $eax = COPY [[COPY]]
; CHECK-NEXT: XRSTOR [[COPY2]], 1, $noreg, 0, $noreg, implicit $edx, implicit $eax
@@ -60,7 +60,7 @@ define void @test_xrstor64(ptr %ptr, i32 %hi, i32 %lo) {
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:gr32 = COPY $edx
; CHECK-NEXT: [[COPY1:%[0-9]+]]:gr32 = COPY $esi
- ; CHECK-NEXT: [[COPY2:%[0-9]+]]:gr64_norex2 = COPY $rdi
+ ; CHECK-NEXT: [[COPY2:%[0-9]+]]:gr64 = COPY $rdi
; CHECK-NEXT: $edx = COPY [[COPY1]]
; CHECK-NEXT: $eax = COPY [[COPY]]
; CHECK-NEXT: XRSTOR64 [[COPY2]], 1, $noreg, 0, $noreg, implicit $edx, implicit $eax
It seems this PR should not only remove the overriding. If there is an alternative/better way to achieve that, it should be included too; otherwise there is a correctness issue.
The original PR already reserved these registers.
That would be necessary if the underlying class fundamentally changed (e.g. 32-bit to 64-bit), but this is just swapping to a different subset, which also aligns with the reserved registers, so I don't think there's anything to do here.
It's different from the logic of reserved registers, which does not depend on the instruction itself. Here we have the X86II::canUseApxExtendedReg(MCID) check because even when the APX/EGPR feature is on, some instructions cannot use R16-R31.
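To make the distinction concrete, an illustrative fragment (TRI, TII, MF, MCID, and OpNum are hypothetical locals; the signatures match the code in the diff above):

```cpp
// Reservation is a per-function query; no instruction descriptor is
// involved, so it cannot express "this instruction cannot use R16-R31":
BitVector Reserved = TRI->getReservedRegs(MF);

// The removed override instead varied the reported class per instruction,
// narrowing GR64 to GR64_NOREX2 when canUseApxExtendedReg(MCID) was false:
const TargetRegisterClass *RC = TII->getRegClass(MCID, OpNum, TRI, MF);
```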
Can you take care of this then? Consider this a post-commit review. That patch was not actually approved, and does not have adequate test coverage for this restriction.