[SelectionDAG][WIP] Move HwMode expansion from tablegen to SelectionISel. #174471
…Sel.

The way HwMode is currently implemented, tablegen duplicates each pattern that is dependent on hardware mode. The HwMode predicate is added as a pattern predicate on the duplicated pattern.

RISC-V uses HwMode on the GPR register class, which means almost every isel pattern is affected by HwMode. This results in the isel table being nearly twice the size it would be if we only had a single GPR size.

This patch proposes to do the expansion at instruction selection time instead. To accomplish this, new opcodes like OPC_CheckTypeByHwMode are added to the isel table. The unique combinations of types and HwMode are converted to an index that is the payload for the new opcodes. TableGen emits a new virtual function getValueTypeByHwMode that uses this index and the current HwMode to look up the type.

This reduces the size of the isel table on RISC-V from ~2.38 million bytes to ~1.38 million bytes.

I did not add an OPC_SwitchTypeByHwMode opcode yet. If the VT requires a hardware mode, we emit an OPC_Scope+OPC_CheckTypeByHwMode instead. I expect adding an OPC_SwitchTypeByHwMode could further reduce the table size. I will investigate this as a follow-up.

I haven't measured yet, but it's possible the new getValueTypeByHwMode may have an effect on compile time. If necessary, we could add a cache on this lookup to mitigate some impact.

Many of the matcher classes in tablegen now use ValueTypeByHwMode instead of MVT. This may have an impact on the memory usage and runtime of tablegen. We can mitigate some of this by splitting the matchers into MVT and ValueTypeByHwMode versions. We can also explore alternate data structures for ValueTypeByHwMode instead of a std::map. Maybe a sorted vector.

I plan to do some cleanup to try to reduce some code duplication, but I wanted to get early feedback on the direction.

Given the scope of this patch and the timing of the LLVM 22 branch, I don't expect to commit this until after the branch.
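As a rough illustration of the lookup described above, the generated getValueTypeByHwMode could be pictured as a switch over the table index that then picks a type from the current HwMode. This is only a sketch under stated assumptions: `SimpleVT` and `getValueTypeByHwModeSketch` are hypothetical stand-ins, not what TableGen actually emits, and treating `m3` in the updated test's `{(*:i32),(m3:i64)}` comment as HwMode number 3 is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for MVT::SimpleValueType.
enum class SimpleVT : uint8_t { Other, i32, i64 };

// One case per unique {HwMode -> type} combination. Index 0 models the
// combination {(*:i32),(m3:i64)}: i64 in mode 3 (assumed numbering),
// i32 in every other mode.
inline SimpleVT getValueTypeByHwModeSketch(unsigned Index, unsigned HwMode) {
  switch (Index) {
  case 0:
    return HwMode == 3 ? SimpleVT::i64 : SimpleVT::i32;
  default:
    assert(false && "Unexpected index");
    return SimpleVT::Other;
  }
}
```

The point of the indirection is that the matcher table stores only the small index, so a single pattern can serve every hardware mode instead of being duplicated once per mode.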
@llvm/pr-subscribers-llvm-selectiondag @llvm/pr-subscribers-tablegen
Author: Craig Topper (topperc)
Patch is 49.07 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/174471.diff
11 Files Affected.
diff --git a/llvm/include/llvm/CodeGen/SelectionDAGISel.h b/llvm/include/llvm/CodeGen/SelectionDAGISel.h
index 569353670c532..c96a7ba97201f 100644
--- a/llvm/include/llvm/CodeGen/SelectionDAGISel.h
+++ b/llvm/include/llvm/CodeGen/SelectionDAGISel.h
@@ -77,6 +77,8 @@ class SelectionDAGISel {
bool MatchFilterFuncName = false;
StringRef FuncName;
+ unsigned HwMode;
+
explicit SelectionDAGISel(TargetMachine &tm,
CodeGenOptLevel OL = CodeGenOptLevel::Default);
virtual ~SelectionDAGISel();
@@ -202,7 +204,9 @@ class SelectionDAGISel {
// Space-optimized forms that implicitly encode VT.
OPC_CheckTypeI32,
OPC_CheckTypeI64,
+ OPC_CheckTypeByHwMode,
OPC_CheckTypeRes,
+ OPC_CheckTypeResByHwMode,
OPC_SwitchType,
OPC_CheckChild0Type,
OPC_CheckChild1Type,
@@ -231,6 +235,15 @@ class SelectionDAGISel {
OPC_CheckChild6TypeI64,
OPC_CheckChild7TypeI64,
+ OPC_CheckChild0TypeByHwMode,
+ OPC_CheckChild1TypeByHwMode,
+ OPC_CheckChild2TypeByHwMode,
+ OPC_CheckChild3TypeByHwMode,
+ OPC_CheckChild4TypeByHwMode,
+ OPC_CheckChild5TypeByHwMode,
+ OPC_CheckChild6TypeByHwMode,
+ OPC_CheckChild7TypeByHwMode,
+
OPC_CheckInteger,
OPC_CheckChild0Integer,
OPC_CheckChild1Integer,
@@ -261,10 +274,13 @@ class SelectionDAGISel {
OPC_EmitIntegerI16,
OPC_EmitIntegerI32,
OPC_EmitIntegerI64,
+ OPC_EmitIntegerByHwMode,
OPC_EmitRegister,
OPC_EmitRegisterI32,
OPC_EmitRegisterI64,
+ OPC_EmitRegisterByHwMode,
OPC_EmitRegister2,
+ OPC_EmitRegisterByHwMode2,
OPC_EmitConvertToTarget,
OPC_EmitConvertToTarget0,
OPC_EmitConvertToTarget1,
@@ -290,6 +306,7 @@ class SelectionDAGISel {
OPC_EmitCopyToRegTwoByte,
OPC_EmitNodeXForm,
OPC_EmitNode,
+ OPC_EmitNodeByHwMode,
// Space-optimized forms that implicitly encode number of result VTs.
OPC_EmitNode0,
OPC_EmitNode1,
@@ -302,6 +319,7 @@ class SelectionDAGISel {
OPC_EmitNode1Chain,
OPC_EmitNode2Chain,
OPC_MorphNodeTo,
+ OPC_MorphNodeToByHwMode,
// Space-optimized forms that implicitly encode number of result VTs.
OPC_MorphNodeTo0,
OPC_MorphNodeTo1,
@@ -448,6 +466,10 @@ class SelectionDAGISel {
llvm_unreachable("Tblgen should generate this!");
}
+ virtual MVT getValueTypeByHwMode(unsigned Index) const {
+ llvm_unreachable("Tblgen should generate the implementation of this!");
+ }
+
void SelectCodeCommon(SDNode *NodeToMatch, const uint8_t *MatcherTable,
unsigned TableSize);
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
index 91c4a37d9885c..0ff95264af417 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
@@ -531,6 +531,8 @@ void SelectionDAGISel::initializeAnalysisResults(
SP = &FAM.getResult<SSPLayoutAnalysis>(Fn);
TTI = &FAM.getResult<TargetIRAnalysis>(Fn);
+
+ HwMode = MF->getSubtarget().getHwMode();
}
void SelectionDAGISel::initializeAnalysisResults(MachineFunctionPass &MFP) {
@@ -588,6 +590,8 @@ void SelectionDAGISel::initializeAnalysisResults(MachineFunctionPass &MFP) {
SP = &MFP.getAnalysis<StackProtector>().getLayoutInfo();
TTI = &MFP.getAnalysis<TargetTransformInfoWrapperPass>().getTTI(Fn);
+
+ HwMode = MF->getSubtarget().getHwMode();
}
bool SelectionDAGISel::runOnMachineFunction(MachineFunction &mf) {
@@ -2747,6 +2751,18 @@ getSimpleVT(const uint8_t *MatcherTable, unsigned &MatcherIndex) {
return static_cast<MVT::SimpleValueType>(SimpleVT);
}
+/// getHwModeVT - Decode a type index from MatcherTable (using GetVBR if it is
+/// VBR encoded) and look up the MVT for the current HwMode.
+LLVM_ATTRIBUTE_ALWAYS_INLINE static MVT
+getHwModeVT(const uint8_t *MatcherTable, unsigned &MatcherIndex,
+ const SelectionDAGISel &SDISel) {
+ unsigned Index = MatcherTable[MatcherIndex++];
+ if (Index & 128)
+ Index = GetVBR(Index, MatcherTable, MatcherIndex);
+
+ return SDISel.getValueTypeByHwMode(Index);
+}
+
void SelectionDAGISel::Select_JUMP_TABLE_DEBUG_INFO(SDNode *N) {
SDLoc dl(N);
CurDAG->SelectNodeTo(N, TargetOpcode::JUMP_TABLE_DEBUG_INFO, MVT::Glue,
@@ -3150,12 +3166,23 @@ static unsigned IsPredicateKnownToFail(
Result = !::CheckType(VT, N, SDISel.TLI, SDISel.CurDAG->getDataLayout());
return Index;
}
+ case SelectionDAGISel::OPC_CheckTypeByHwMode: {
+ MVT VT = getHwModeVT(Table, Index, SDISel);
+ Result = !::CheckType(VT.SimpleTy, N, SDISel.TLI, SDISel.CurDAG->getDataLayout());
+ return Index;
+ }
case SelectionDAGISel::OPC_CheckTypeRes: {
unsigned Res = Table[Index++];
Result = !::CheckType(getSimpleVT(Table, Index), N.getValue(Res),
SDISel.TLI, SDISel.CurDAG->getDataLayout());
return Index;
}
+ case SelectionDAGISel::OPC_CheckTypeResByHwMode: {
+ unsigned Res = Table[Index++];
+ MVT VT = getHwModeVT(Table, Index, SDISel);
+ Result = !::CheckType(VT.SimpleTy, N.getValue(Res), SDISel.TLI, SDISel.CurDAG->getDataLayout());
+ return Index;
+ }
case SelectionDAGISel::OPC_CheckChild0Type:
case SelectionDAGISel::OPC_CheckChild1Type:
case SelectionDAGISel::OPC_CheckChild2Type:
@@ -3198,6 +3225,20 @@ static unsigned IsPredicateKnownToFail(
SDISel.CurDAG->getDataLayout(), ChildNo);
return Index;
}
+ case SelectionDAGISel::OPC_CheckChild0TypeByHwMode:
+ case SelectionDAGISel::OPC_CheckChild1TypeByHwMode:
+ case SelectionDAGISel::OPC_CheckChild2TypeByHwMode:
+ case SelectionDAGISel::OPC_CheckChild3TypeByHwMode:
+ case SelectionDAGISel::OPC_CheckChild4TypeByHwMode:
+ case SelectionDAGISel::OPC_CheckChild5TypeByHwMode:
+ case SelectionDAGISel::OPC_CheckChild6TypeByHwMode:
+ case SelectionDAGISel::OPC_CheckChild7TypeByHwMode: {
+ MVT VT = getHwModeVT(Table, Index, SDISel);
+ unsigned ChildNo = Opcode - SelectionDAGISel::OPC_CheckChild0TypeByHwMode;
+ Result = !::CheckChildType(VT.SimpleTy, N, SDISel.TLI,
+ SDISel.CurDAG->getDataLayout(), ChildNo);
+ return Index;
+ }
case SelectionDAGISel::OPC_CheckCondCode:
Result = !::CheckCondCode(Table, Index, N);
return Index;
@@ -3718,6 +3759,12 @@ void SelectionDAGISel::SelectCodeCommon(SDNode *NodeToMatch,
break;
continue;
}
+ case OPC_CheckTypeByHwMode: {
+ MVT VT = getHwModeVT(MatcherTable, MatcherIndex, *this);
+ if (!::CheckType(VT.SimpleTy, N, TLI, CurDAG->getDataLayout()))
+ break;
+ continue;
+ }
case OPC_CheckTypeRes: {
unsigned Res = MatcherTable[MatcherIndex++];
@@ -3726,6 +3773,13 @@ void SelectionDAGISel::SelectCodeCommon(SDNode *NodeToMatch,
break;
continue;
}
+ case OPC_CheckTypeResByHwMode: {
+ unsigned Res = MatcherTable[MatcherIndex++];
+ MVT VT = getHwModeVT(MatcherTable, MatcherIndex, *this);
+ if (!::CheckType(VT.SimpleTy, N.getValue(Res), TLI, CurDAG->getDataLayout()))
+ break;
+ continue;
+ }
case OPC_SwitchOpcode: {
unsigned CurNodeOpcode = N.getOpcode();
@@ -3832,6 +3886,20 @@ void SelectionDAGISel::SelectCodeCommon(SDNode *NodeToMatch,
break;
continue;
}
+ case OPC_CheckChild0TypeByHwMode:
+ case OPC_CheckChild1TypeByHwMode:
+ case OPC_CheckChild2TypeByHwMode:
+ case OPC_CheckChild3TypeByHwMode:
+ case OPC_CheckChild4TypeByHwMode:
+ case OPC_CheckChild5TypeByHwMode:
+ case OPC_CheckChild6TypeByHwMode:
+ case OPC_CheckChild7TypeByHwMode: {
+ MVT VT = getHwModeVT(MatcherTable, MatcherIndex, *this);
+ unsigned ChildNo = Opcode - OPC_CheckChild0TypeByHwMode;
+ if (!::CheckChildType(VT.SimpleTy, N, TLI, CurDAG->getDataLayout(), ChildNo))
+ break;
+ continue;
+ }
case OPC_CheckCondCode:
if (!::CheckCondCode(MatcherTable, MatcherIndex, N)) break;
continue;
@@ -3920,12 +3988,24 @@ void SelectionDAGISel::SelectCodeCommon(SDNode *NodeToMatch,
break;
}
int64_t Val = GetSignedVBR(MatcherTable, MatcherIndex);
+ Val = SignExtend64(Val, MVT(VT).getFixedSizeInBits());
RecordedNodes.emplace_back(
CurDAG->getSignedConstant(Val, SDLoc(NodeToMatch), VT,
/*isTarget=*/true),
nullptr);
continue;
}
+ case OPC_EmitIntegerByHwMode: {
+ MVT VT = getHwModeVT(MatcherTable, MatcherIndex, *this);
+ int64_t Val = GetSignedVBR(MatcherTable, MatcherIndex);
+ Val = SignExtend64(Val, MVT(VT).getFixedSizeInBits());
+ RecordedNodes.emplace_back(
+ CurDAG->getSignedConstant(Val, SDLoc(NodeToMatch), VT.SimpleTy,
+ /*isTarget=*/true),
+ nullptr);
+ continue;
+ }
+
case OPC_EmitRegister:
case OPC_EmitRegisterI32:
case OPC_EmitRegisterI64: {
@@ -3945,6 +4025,12 @@ void SelectionDAGISel::SelectCodeCommon(SDNode *NodeToMatch,
RecordedNodes.emplace_back(CurDAG->getRegister(RegNo, VT), nullptr);
continue;
}
+ case OPC_EmitRegisterByHwMode: {
+ MVT VT = getHwModeVT(MatcherTable, MatcherIndex, *this);
+ unsigned RegNo = MatcherTable[MatcherIndex++];
+ RecordedNodes.emplace_back(CurDAG->getRegister(RegNo, VT), nullptr);
+ continue;
+ }
case OPC_EmitRegister2: {
// For targets w/ more than 256 register names, the register enum
// values are stored in two bytes in the matcher table (just like
@@ -3955,6 +4041,16 @@ void SelectionDAGISel::SelectCodeCommon(SDNode *NodeToMatch,
RecordedNodes.emplace_back(CurDAG->getRegister(RegNo, VT), nullptr);
continue;
}
+ case OPC_EmitRegisterByHwMode2: {
+ // For targets w/ more than 256 register names, the register enum
+ // values are stored in two bytes in the matcher table (just like
+ // opcodes).
+ MVT VT = getHwModeVT(MatcherTable, MatcherIndex, *this);
+ unsigned RegNo = MatcherTable[MatcherIndex++];
+ RegNo |= MatcherTable[MatcherIndex++] << 8;
+ RecordedNodes.emplace_back(CurDAG->getRegister(RegNo, VT), nullptr);
+ continue;
+ }
case OPC_EmitConvertToTarget:
case OPC_EmitConvertToTarget0:
@@ -4114,6 +4210,7 @@ void SelectionDAGISel::SelectCodeCommon(SDNode *NodeToMatch,
}
case OPC_EmitNode:
+ case OPC_EmitNodeByHwMode:
case OPC_EmitNode0:
case OPC_EmitNode1:
case OPC_EmitNode2:
@@ -4124,6 +4221,7 @@ void SelectionDAGISel::SelectCodeCommon(SDNode *NodeToMatch,
case OPC_EmitNode1Chain:
case OPC_EmitNode2Chain:
case OPC_MorphNodeTo:
+ case OPC_MorphNodeToByHwMode:
case OPC_MorphNodeTo0:
case OPC_MorphNodeTo1:
case OPC_MorphNodeTo2:
@@ -4187,11 +4285,20 @@ void SelectionDAGISel::SelectCodeCommon(SDNode *NodeToMatch,
else
NumVTs = MatcherTable[MatcherIndex++];
SmallVector<EVT, 4> VTs;
- for (unsigned i = 0; i != NumVTs; ++i) {
- MVT::SimpleValueType VT = getSimpleVT(MatcherTable, MatcherIndex);
- if (VT == MVT::iPTR)
- VT = TLI->getPointerTy(CurDAG->getDataLayout()).SimpleTy;
- VTs.push_back(VT);
+ if (Opcode == OPC_EmitNodeByHwMode || Opcode == OPC_MorphNodeToByHwMode) {
+ for (unsigned i = 0; i != NumVTs; ++i) {
+ MVT VT = getHwModeVT(MatcherTable, MatcherIndex, *this);
+ if (VT == MVT::iPTR)
+ VT = TLI->getPointerTy(CurDAG->getDataLayout());
+ VTs.push_back(VT);
+ }
+ } else {
+ for (unsigned i = 0; i != NumVTs; ++i) {
+ MVT::SimpleValueType VT = getSimpleVT(MatcherTable, MatcherIndex);
+ if (VT == MVT::iPTR)
+ VT = TLI->getPointerTy(CurDAG->getDataLayout()).SimpleTy;
+ VTs.push_back(VT);
+ }
}
if (EmitNodeInfo & OPFL_Chain)
@@ -4258,7 +4365,7 @@ void SelectionDAGISel::SelectCodeCommon(SDNode *NodeToMatch,
// Create the node.
MachineSDNode *Res = nullptr;
bool IsMorphNodeTo =
- Opcode == OPC_MorphNodeTo ||
+ Opcode == OPC_MorphNodeTo || Opcode == OPC_MorphNodeToByHwMode ||
(Opcode >= OPC_MorphNodeTo0 && Opcode <= OPC_MorphNodeTo2GlueOutput);
if (!IsMorphNodeTo) {
// If this is a normal EmitNode command, just create the new node and
diff --git a/llvm/test/TableGen/RegClassByHwMode.td b/llvm/test/TableGen/RegClassByHwMode.td
index 193a4c616bb89..fa23932a10ef9 100644
--- a/llvm/test/TableGen/RegClassByHwMode.td
+++ b/llvm/test/TableGen/RegClassByHwMode.td
@@ -193,30 +193,8 @@ include "Common/RegClassByHwModeCommon.td"
// ISEL-SDAG-NEXT: OPC_RecordNode, // #0 = 'st' chained node
// ISEL-SDAG-NEXT: OPC_RecordChild1, // #1 = $val
// ISEL-SDAG-NEXT: OPC_RecordChild2, // #2 = $src
-// ISEL-SDAG-NEXT: OPC_Scope, {{[0-9]+}}, /*->{{[0-9]+}}*/ // 2 children in Scope
-// ISEL-SDAG-NEXT: OPC_CheckChild2TypeI32,
// ISEL-SDAG-NEXT: OPC_CheckPredicate0, // Predicate_unindexedstore
// ISEL-SDAG-NEXT: OPC_CheckPredicate1, // Predicate_store
-// ISEL-SDAG-NEXT: OPC_Scope, {{[0-9]+}}, /*->{{[0-9]+}}*/ // 3 children in Scope
-// ISEL-SDAG-NEXT: OPC_CheckPatternPredicate0, // (Subtarget->hasAlignedRegisters())
-// ISEL-SDAG-NEXT: OPC_EmitMergeInputChains1_0,
-// ISEL-SDAG-NEXT: OPC_MorphNodeTo0, TARGET_VAL(MyTarget::MY_STORE), 0|OPFL_Chain|OPFL_MemRefs,
-
-// ISEL-SDAG: /*Scope*/
-// ISEL-SDAG: OPC_CheckPatternPredicate1, // (Subtarget->hasUnalignedRegisters())
-// ISEL-SDAG-NEXT: OPC_EmitMergeInputChains1_0,
-// ISEL-SDAG-NEXT: OPC_MorphNodeTo0, TARGET_VAL(MyTarget::MY_STORE), 0|OPFL_Chain|OPFL_MemRefs,
-
-// ISEL-SDAG: /*Scope*/
-// ISEL-SDAG: OPC_CheckPatternPredicate2, // !((Subtarget->hasAlignedRegisters())) && !((Subtarget->hasUnalignedRegisters())) && !((Subtarget->isPtr64()))
-// ISEL-SDAG-NEXT: OPC_EmitMergeInputChains1_0,
-// ISEL-SDAG-NEXT: OPC_MorphNodeTo0, TARGET_VAL(MyTarget::MY_STORE), 0|OPFL_Chain|OPFL_MemRefs,
-
-// ISEL-SDAG: /*Scope*/
-// ISEL-SDAG-NEXT: OPC_CheckChild2TypeI64,
-// ISEL-SDAG-NEXT: OPC_CheckPredicate0, // Predicate_unindexedstore
-// ISEL-SDAG-NEXT: OPC_CheckPredicate1, // Predicate_store
-// ISEL-SDAG-NEXT: OPC_CheckPatternPredicate3, // (Subtarget->isPtr64())
// ISEL-SDAG-NEXT: OPC_EmitMergeInputChains1_0,
// ISEL-SDAG-NEXT: OPC_MorphNodeTo0, TARGET_VAL(MyTarget::MY_STORE), 0|OPFL_Chain|OPFL_MemRefs,
@@ -224,33 +202,12 @@ include "Common/RegClassByHwModeCommon.td"
// ISEL-SDAG-NEXT: OPC_RecordMemRef,
// ISEL-SDAG-NEXT: OPC_RecordNode, // #0 = 'ld' chained node
// ISEL-SDAG-NEXT: OPC_RecordChild1, // #1 = $src
-// ISEL-SDAG-NEXT: OPC_CheckTypeI64,
-// ISEL-SDAG-NEXT: OPC_Scope, {{[0-9]+}}, /*->{{[0-9]+}}*/ // 2 children in Scope
-// ISEL-SDAG-NEXT: OPC_CheckChild1TypeI32,
-// ISEL-SDAG-NEXT: OPC_CheckPredicate2, // Predicate_unindexedload
-// ISEL-SDAG-NEXT: OPC_CheckPredicate3, // Predicate_load
-// ISEL-SDAG-NEXT: OPC_Scope, {{[0-9]+}}, /*->{{[0-9]+}}*/ // 3 children in Scope
-// ISEL-SDAG-NEXT: OPC_CheckPatternPredicate0, // (Subtarget->hasAlignedRegisters())
-// ISEL-SDAG-NEXT: OPC_EmitMergeInputChains1_0,
-// ISEL-SDAG-NEXT: OPC_MorphNodeTo1, TARGET_VAL(MyTarget::MY_LOAD), 0|OPFL_Chain|OPFL_MemRefs,
-
-// ISEL-SDAG: /*Scope*/
-// ISEL-SDAG: OPC_CheckPatternPredicate1, // (Subtarget->hasUnalignedRegisters())
-// ISEL-SDAG-NEXT: OPC_EmitMergeInputChains1_0,
-// ISEL-SDAG-NEXT: OPC_MorphNodeTo1, TARGET_VAL(MyTarget::MY_LOAD), 0|OPFL_Chain|OPFL_MemRefs,
-
-// ISEL-SDAG: /*Scope*/
-// ISEL-SDAG: OPC_CheckPatternPredicate2, // !((Subtarget->hasAlignedRegisters())) && !((Subtarget->hasUnalignedRegisters())) && !((Subtarget->isPtr64()))
-// ISEL-SDAG-NEXT: OPC_EmitMergeInputChains1_0,
-// ISEL-SDAG-NEXT: OPC_MorphNodeTo1, TARGET_VAL(MyTarget::MY_LOAD), 0|OPFL_Chain|OPFL_MemRefs,
-
-// ISEL-SDAG: /*Scope*/
-// ISEL-SDAG-NEXT: OPC_CheckChild1TypeI64,
+// ISEL-SDAG-NEXT: OPC_CheckChild1TypeByHwMode, /*{(*:i32),(m3:i64)}*/0,
// ISEL-SDAG-NEXT: OPC_CheckPredicate2, // Predicate_unindexedload
// ISEL-SDAG-NEXT: OPC_CheckPredicate3, // Predicate_load
-// ISEL-SDAG-NEXT: OPC_CheckPatternPredicate3, // (Subtarget->isPtr64())
+// ISEL-SDAG-NEXT: OPC_CheckTypeI64,
// ISEL-SDAG-NEXT: OPC_EmitMergeInputChains1_0,
-// ISEL-SDAG-NEXT: OPC_MorphNodeTo1, TARGET_VAL(MyTarget::MY_LOAD), 0|OPFL_Chain|OPFL_MemRefs,
+// ISEL-SDAG-NEXT: OPC_MorphNodeToByHwMode, TARGET_VAL(MyTarget::MY_LOAD), 0|OPFL_Chain|OPFL_MemRefs,
diff --git a/llvm/utils/TableGen/Common/CodeGenDAGPatterns.cpp b/llvm/utils/TableGen/Common/CodeGenDAGPatterns.cpp
index 35f8a06916298..af1b5592e56c9 100644
--- a/llvm/utils/TableGen/Common/CodeGenDAGPatterns.cpp
+++ b/llvm/utils/TableGen/Common/CodeGenDAGPatterns.cpp
@@ -94,13 +94,15 @@ bool TypeSetByHwMode::isValueTypeByHwMode(bool AllowEmpty) const {
return true;
}
-ValueTypeByHwMode TypeSetByHwMode::getValueTypeByHwMode() const {
+ValueTypeByHwMode TypeSetByHwMode::getValueTypeByHwMode(bool SkipEmpty) const {
assert(isValueTypeByHwMode(true) &&
"The type set has multiple types for at least one HW mode");
ValueTypeByHwMode VVT;
VVT.PtrAddrSpace = AddrSpace;
for (const auto &I : *this) {
+ if (SkipEmpty && I.second.empty())
+ continue;
MVT T = I.second.empty() ? MVT::Other : *I.second.begin();
VVT.insertTypeForMode(I.first, T);
}
@@ -1480,10 +1482,9 @@ static unsigned getPatternSize(const TreePatternNode &P,
// Count children in the count if they are also nodes.
for (const TreePatternNode &Child : P.children()) {
if (!Child.isLeaf() && Child.getNumTypes()) {
- const TypeSetByHwMode &T0 = Child.getExtType(0);
- // At this point, all variable type sets should be simple, i.e. only
- // have a default mode.
- if (T0.getMachineValueType() != MVT::Other) {
+ // FIXME: Can we assume non-simple VTs should be counted?
+ auto VVT = Child.getType(0);
+ if (llvm::any_of(VVT, [](auto &P) { return P.second != MVT::Other; })) {
Size += getPatternSize(Child, CGP);
continue;
}
@@ -3319,7 +3320,7 @@ void TreePattern::dump() const { print(errs()); }
// CodeGenDAGPatterns implementation
//
-CodeGenDAGPatterns::CodeGenDAGPatterns(const RecordKeeper &R)
+CodeGenDAGPatterns::CodeGenDAGPatterns(const RecordKeeper &R, bool ExpandHwMode)
: Records(R), Target(R), Intrinsics(R),
LegalVTS(Target.getLegalValueTypes()),
LegalPtrVTS(ComputeLegalPtrTypes()) {
@@ -3339,7 +3340,8 @@ CodeGenDAGPatterns::CodeGenDAGPatterns(const RecordKeeper &R)
// Break patterns with parameterized types into a series of patterns,
// where each one has a fixed type and is predicated on the conditions
// of the associated HW mode.
- ExpandHwModeBasedTypes();
+ if (ExpandHwMode)
+ ExpandHwModeBasedTypes();
// Infer instruction flags. For example, we can detect loads,
// stores, and side effects in many cases by examining an
diff --git a/llvm/utils/TableGen/Common/CodeGenDAGPatterns.h b/llvm/utils/TableGen/Common/CodeGenDAGPatterns.h
index 220fa43bf5037..7d93e9ce126d5 100644
--- a/llvm/utils/TableGen/Common/CodeGenDAGPatterns.h
+++ b/llvm/utils/TableGen/Common/CodeGenDAGPatterns.h
@@ -190,7 +190,7 @@ struct TypeSetByHwMode : public InfoByHwMode<MachineValueTypeSet> {
SetType &getOrCreate(unsigned Mode) { return Map[Mode]; }
bool isValueTypeByHwMode(bool AllowEmpty) const;
- ValueTypeByHwMode getValueTypeByHwMode() const;
+ ValueTypeByHwMode getValueTypeByHwMode(bool SkipEmpty = false) const;
LLVM_ATTRIBUTE_ALWAYS_INLINE
bool isMachineValueType() const {
@@ -672,6 +672,9 @@ class TreePatternNode : public RefCountedBase<TreePatternNode> {
// Type accessors.
unsigned getNumTypes() const { return Types.size(); }
+ ValueTypeByHwMode getType(unsigned ResNo) const {
+ return Types[ResNo].getValueTypeByHwMode(/*SkipEmpty=*/true);
+ }
const std::vector<TypeSetByHwMode> &getExtTypes() const { return Types; }
const TypeSetByHwMode &getExtType(unsigned ResNo) const {
return Types[ResNo];
@@ -1123,7 +1126,7 @@ class CodeGenDAGPatterns {
unsigned NumScopes = 0;
public:
- CodeGenDAGPatterns(const RecordKeeper &R);
+ CodeGenDAGPatterns(const RecordKeeper &R, bool ExpandHwMode = true);
CodeGenTarget &getTargetInfo() { return Target; }
const CodeGenTarget &getTargetInfo() const { return Target; }
diff --git a/llvm/utils/TableGen/Common/DAGI...
[truncated]
✅ With the latest revision this PR passed the C/C++ code formatter.
  TTI = &FAM.getResult<TargetIRAnalysis>(Fn);

  HwMode = MF->getSubtarget().getHwMode();
This is supposed to be queried with a specific hw mode kind (which seems to be a stalled project from who knows when)
Unfortunately, ValueTypeByHwMode and RegClassInfo/RegClassByHwMode use different hw mode kinds, but they both affect the VT in CodeGenDAGPatterns, so I don't know how to make that work.
@arichardson has been working on an initial port of the Y base to use HwMode, so it might be worth verifying that it's possible to base that on this change.
}

if (RT.getNumTypes() != 0) {
  for (auto VT : RT.getType(0)) {
perhaps we could use for (auto [_, VT] : RT.getType(0)) here?
OS << " switch (Index) {\n";
OS << " default: llvm_unreachable(\"Unexpected index\");\n";

for (const auto &P : ValueTypeMap) {
ditto for (const auto &[..., ...] : ValueTypeMap)?
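The two suggestions above both refer to C++17 structured bindings, which name the key and value of each map entry directly instead of going through `P.first` and `P.second`. A minimal sketch, assuming a plain `std::map` as a stand-in for the HwMode-to-type maps in the patch (`ModeToType` and `joinTypes` are hypothetical names used only for illustration):

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a ValueTypeByHwMode-like map: HwMode -> type name.
using ModeToType = std::map<unsigned, std::string>;

// Render the map as "mode:type,mode:type" in key order.
inline std::string joinTypes(const ModeToType &M) {
  std::string Out;
  // Structured bindings give each half of the pair its own name.
  for (const auto &[Mode, VT] : M) {
    if (!Out.empty())
      Out += ',';
    Out += std::to_string(Mode) + ':' + VT;
  }
  return Out;
}
```

Whether an unused binding such as `_` triggers a warning depends on the compiler and its flags, so that part of the suggestion may need checking against the project's build settings.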
  SmallVector<ValueTypeByHwMode, 4> ResultVTs;
  for (unsigned i = 0, e = N.getNumTypes(); i != e; ++i)
-   ResultVTs.push_back(N.getSimpleType(i));
+   ResultVTs.push_back(N.getType(i));
nit: emplace_back?
What's the advantage here?
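For context on the question above: when the argument is already a temporary of the element type (the `getType` accessor in the diff returns `ValueTypeByHwMode` by value), `push_back` and `emplace_back` are expected to do the same work, namely one move into the container. The sketch below counts moves with a hypothetical `Tracked` type and uses `std::vector` rather than `SmallVector`, so it only illustrates the general C++ behaviour and does not measure the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical element type that counts how often it is move-constructed.
struct Tracked {
  int V;
  static int &moves() {
    static int N = 0;
    return N;
  }
  explicit Tracked(int V) : V(V) {}
  Tracked(const Tracked &O) : V(O.V) {}
  Tracked(Tracked &&O) noexcept : V(O.V) { ++moves(); }
};

// Append a temporary with push_back and report the number of moves.
inline int movesForPushBack() {
  Tracked::moves() = 0;
  std::vector<Tracked> Vec;
  Vec.reserve(1); // avoid reallocation so only the insertion is counted
  Vec.push_back(Tracked(1));
  return Tracked::moves();
}

// Same, but with emplace_back given an already-constructed temporary.
inline int movesForEmplaceBack() {
  Tracked::moves() = 0;
  std::vector<Tracked> Vec;
  Vec.reserve(1);
  Vec.emplace_back(Tracked(1));
  return Tracked::moves();
}
```

`emplace_back` only saves work when it is handed constructor arguments rather than a finished object, which is not the case at this call site.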
wangpc-pp left a comment:
> This reduces the size of the isel table on RISC-V from ~2.38 million bytes
> to ~1.38 million bytes.
This seems to be promising! Just added some rough comments.
  OPC_EmitNode1Chain,
  OPC_EmitNode2Chain,
  OPC_MorphNodeTo,
  OPC_MorphNodeToByHwMode,
How many opcodes do we have after this PR?
186
getHwModeVT(const uint8_t *MatcherTable, unsigned &MatcherIndex,
            const SelectionDAGISel &SDISel) {
  unsigned Index = MatcherTable[MatcherIndex++];
  if (Index & 128)
Is it possible that we have such a large Index?
I doubt it. I can make it a fatal error in tablegen if you want.
If it is always small, I prefer to make it fixed to save some compile time.
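The `Index & 128` check discussed in this thread is the usual variable-bit-rate (VBR) test: each byte carries 7 payload bits, and a set top bit means another byte follows. The sketch below shows that decoding in a self-contained form. `decodeVBRTail` and `readIndex` are hypothetical names modelled on the shape of the code in the diff, not LLVM's own helpers.

```cpp
#include <cassert>
#include <cstdint>

// Continue decoding a VBR value whose first byte had bit 7 set. Payload
// groups are 7 bits wide, least-significant group first.
inline uint64_t decodeVBRTail(uint64_t First, const uint8_t *Table,
                              unsigned &Idx) {
  uint64_t Val = First & 127; // drop the continuation bit of the first byte
  unsigned Shift = 7;
  uint64_t Next;
  do {
    Next = Table[Idx++];
    Val |= (Next & 127) << Shift;
    Shift += 7;
  } while (Next & 128);
  return Val;
}

// Read one index: a single byte on the fast path, VBR otherwise.
inline unsigned readIndex(const uint8_t *Table, unsigned &Idx) {
  unsigned Index = Table[Idx++];
  if (Index & 128)
    Index = static_cast<unsigned>(decodeVBRTail(Index, Table, Idx));
  return Index;
}
```

With a fixed one-byte encoding, as suggested above, the branch and the tail loop would disappear, at the cost of limiting the number of distinct type combinations to what fits in a byte.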
The way HwMode is currently implemented, tablegen duplicates each
pattern that is dependent on hardware mode. The HwMode predicate is
added as a pattern predicate on the duplicated pattern.
RISC-V uses HwMode on the GPR register class which means almost every
isel pattern is affected by HwMode. This results in the isel table
being nearly twice the size it would be if we only had a single GPR size.
This patch proposes to do the expansion at instruction selection time
instead. To accomplish this, new opcodes like OPC_CheckTypeByHwMode
are added to the isel table. The unique combinations of types and HwMode
are converted to an index that is the payload for the new opcodes.
TableGen emits a new virtual function getValueTypeByHwMode that uses
this index and the current HwMode to look up the type.
This reduces the size of the isel table on RISC-V from ~2.38 million bytes
to ~1.38 million bytes.
I did not add an OPC_SwitchTypeByHwMode opcode yet. If the VT requires a
hardware mode, we emit an OPC_Scope+OPC_CheckTypeByHwMode instead. I
expect adding an OPC_SwitchTypeByHwMode could further reduce the table
size. I will investigate this as a follow-up.
I haven't measured yet, but it's possible the new getValueTypeByHwMode
may have an effect on compile time. If necessary we could add a cache on
this lookup to mitigate some impact.
Many of the matcher classes in tablegen now use ValueTypeByHwMode instead
of MVT. This may have an impact on the memory usage and runtime of tablegen.
We can mitigate some of this by splitting the matchers into MVT and
ValueTypeByHwMode versions. We can also explore alternate data structures
for ValueTypeByHwMode instead of a std::map. Maybe a sorted vector.
I plan to do some cleanup to try to reduce some code duplication, but
I wanted to get early feedback on the direction.
Given the scope of this patch and the timing of the LLVM 22 branch,
I don't expect to commit this until after the branch.
A similar change can be made to GlobalISel as a follow-up.
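The commit message floats a sorted vector as a possible replacement for the std::map inside ValueTypeByHwMode. A minimal sketch of that idea follows, assuming a hypothetical `ModeTypeVec` class with integer type ids. The fallback to mode 0 as the default mode is also an assumption made for illustration and is not taken from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical sorted-vector map from HwMode to a type id.
class ModeTypeVec {
  using Entry = std::pair<unsigned, int>;
  std::vector<Entry> Entries; // kept sorted by mode

  // Binary search for the first entry whose mode is >= Mode.
  std::vector<Entry>::const_iterator lowerBound(unsigned Mode) const {
    return std::lower_bound(
        Entries.begin(), Entries.end(), Mode,
        [](const Entry &E, unsigned M) { return E.first < M; });
  }

public:
  // Insert or overwrite the type for Mode, preserving sort order.
  void insert(unsigned Mode, int Type) {
    auto It = lowerBound(Mode);
    if (It != Entries.end() && It->first == Mode)
      Entries[It - Entries.begin()].second = Type;
    else
      Entries.emplace(It, Mode, Type);
  }

  // Return the type for Mode, else the type for mode 0 (assumed default),
  // else -1 when neither is present.
  int lookup(unsigned Mode) const {
    auto It = lowerBound(Mode);
    if (It != Entries.end() && It->first == Mode)
      return It->second;
    It = lowerBound(0);
    if (It != Entries.end() && It->first == 0)
      return It->second;
    return -1;
  }
};
```

For the handful of modes a target typically defines, contiguous storage avoids the per-node allocations of std::map, though whether that matters for tablegen's runtime would need measuring.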