[AIEX] Extend Staged 2D/3D regalloc to avoid spills #685
Thank you very much, @andcarminati!
Also, do you think the following commit will help?
Maybe yes! As mentioned before, I prefer to keep just the minimal necessary changes. We can test afterwards, on top of this PR.
const AIEBaseRegisterInfo &TRI,
std::set<Register> &VisitedVRegs);

SmallSet<int, 8> getRewritableSubRegs(Register Reg,
nit: can we have a comment describing what this function does?
Yes, it is a refactor but it is always good to document.
Could we swap the documentation here?
I think the high-level function should get the full documentation, and the actual implementation function (the one above this) should get the slimmed-down documentation.
}

/// Rewrite a full copy into multiple copies using the subregs in \p CopySubRegs
void rewriteFullCopy(MachineInstr &CopyMI, LiveIntervals &LIS,
note: in the current state we will never rewrite subreg copies, e.g. copies of d0 or d4 in the case of a 3d_0 register.
Do you know why they are not relevant for the Superregrewriter?
I guess this will never occur because it will be prevented by getRewritableSubRegs.
++VRegIdx) {
const Register Reg = Register::index2VirtReg(VRegIdx);

// Ignore un-used od already allocated registers.
nit: "od" should be "or"
//
// (c) Copyright 2025 Advanced Micro Devices, Inc. or its affiliates
//
//===----------------------------------------------------------------------===//
Which registers aren't assigned after the previous 2D/3D reg allocs?
Are we changing the copies and moves here, but not the padds?
Could you add a comment explaining why this pass is necessary and why the previous passes do not pick this up?
I will expand on this part.
LGTM. @martien-de-jong, what do you say?
int SubReg = RegOp.getSubReg();
assert(SubReg);
RegOp.setReg(SubRegToVReg[SubReg]);
RegOp.setSubReg(0);
Note: 3D and 2D reg alloc will have already allocated the 3D and 2D users, except for the copies.
We spill them in line 178, so that we only work on subregs.
The users we now encounter are the subreg copies from rewriteFullCopy.
Please note that rewriteFullCopy is there to prevent failures during the rewrite. We can also rewrite operands in load and move (also immediate) instructions. See the instructions covered by ItinRegClassPair.
Now we filter by register class and usage. Basically, we exclude instructions like copies and non-2D/3D ones here. Co-Authored-By: Krishnam Tibrewala <[email protected]>
…gisters Co-Authored-By: Krishnam Tibrewala <[email protected]>
Co-Authored-By: Krishnam Tibrewala <[email protected]>
The goal of this test is to check that we properly insert the undef flag on the def side of an expanded full copy. On a sub-register def operand, the flag refers to the part of the register that isn't written. A sub-register def implicitly reads the other parts of the register being redefined unless the <undef> flag is set, and a missing flag can force the related register into the live-out set of the predecessor blocks, causing dominance problems. Co-Authored-By: Krishnam Tibrewala <[email protected]>
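The rule in this commit message can be modelled outside LLVM. The sketch below is a simplified, hypothetical model: the `Operand` struct and the `readsReg`/`isLiveIntoBlock` helpers are illustrative names, not code from this PR. It takes a whole-register view of a single block and treats a sub-register def without `<undef>` as a read of the register, so a block whose first access is such a def needs the value to be live-in.

```cpp
#include <vector>

// Hypothetical, simplified stand-in for an operand on one virtual register.
// This is NOT LLVM's MachineOperand; it only keeps the fields needed here.
struct Operand {
  bool IsDef;   // true for a def, false for a use
  int SubReg;   // 0 means the full register, non-zero names a lane
  bool IsUndef; // models the <undef> flag
};

// A use reads the register unless it is undef. A sub-register def also reads
// it (the lanes it does not write) unless it is undef.
bool readsReg(const Operand &Op) {
  return !Op.IsUndef && (!Op.IsDef || Op.SubReg != 0);
}

// Whole-register view of one basic block: the register must be live-in if
// some operand reads it before any def that does not itself read it.
bool isLiveIntoBlock(const std::vector<Operand> &Ops) {
  for (const Operand &Op : Ops) {
    if (readsReg(Op))
      return true;  // value is needed from a predecessor
    if (Op.IsDef)
      return false; // a non-reading def starts a fresh value
  }
  return false;
}
```

In this model, two lane defs where the first carries the flag do not make the register live-in, while the same sequence without the flag does.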
This will properly handle the use of non-dominating definitions. We also change the handling of the destination registers in two parts: *Copy expansion: we replace the original index by the index of the first lane copy to avoid creating LRs with just one instruction; in this way we keep the LI correct. *Rewrite: reset dead flags if necessary. Co-Authored-By: Krishnam Tibrewala <[email protected]>
…reedy run Co-Authored-By: Krishnam Tibrewala <[email protected]>
If we don't need a full register, we can expand to individual lanes. Co-Authored-By: Krishnam Tibrewala <[email protected]>
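As an illustration of this kind of expansion, here is a minimal, self-contained sketch. All names (`LaneCopy`, `expandFullCopy`) are hypothetical, and the model ignores LLVM's actual data structures. It also assumes that only the first lane copy marks its def as `<undef>`, which is one possible way to avoid an implicit read of the untouched lanes and not necessarily what this PR does.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record for one lane copy: Dst.SubIdx = COPY Src.SubIdx
struct LaneCopy {
  unsigned Dst;  // destination virtual register (illustrative id)
  unsigned Src;  // source virtual register (illustrative id)
  int SubIdx;    // lane being copied, never 0
  bool DefUndef; // assumption: set on the first lane def only
};

// Expand "Dst = COPY Src" into one copy per lane that is actually needed.
std::vector<LaneCopy> expandFullCopy(unsigned Dst, unsigned Src,
                                     const std::vector<int> &UsedLanes) {
  std::vector<LaneCopy> Copies;
  for (int SubIdx : UsedLanes) {
    assert(SubIdx != 0 && "0 denotes the full register in this model");
    const bool IsFirst = Copies.empty();
    Copies.push_back({Dst, Src, SubIdx, IsFirst});
  }
  return Copies;
}
```

Calling `expandFullCopy(10, 20, {1, 3})` yields two lane copies, and lanes that are not listed are simply never copied.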
Co-Authored-By: Krishnam Tibrewala <[email protected]>
This avoids cycles in bundles that appear in VirtRegRewriter. We also update LIs related to src and dst operands of those expanded copies. Co-Authored-By: Krishnam Tibrewala <[email protected]>



This work is intended to avoid 2D/3D register spills when possible.
The idea and rationale behind this work are described in a previous draft PR: #442.
I recommend reviewing this PR commit by commit.
Credit also goes to the co-author, @krishnamtibrewala.