-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AMDGPU] Allow unaligned VGPR for ds_read_b96_tr_b6 #125169
[AMDGPU] Allow unaligned VGPR for ds_read_b96_tr_b6 #125169
Conversation
@llvm/pr-subscribers-mc @llvm/pr-subscribers-backend-amdgpu Author: Pravin Jagtap (pravinjagtap) ChangesAll load transpose instructions follow gfx950 standard of even aligned VGPR except ds_read_b96_tr_b6, which allows unaligned VGPR. Full diff: https://github.com/llvm/llvm-project/pull/125169.diff 4 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
index d8f441d1ccfe44..ed922245b3e2ba 100644
--- a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+++ b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
@@ -4853,7 +4853,10 @@ bool AMDGPUAsmParser::validateAGPRLdSt(const MCInst &Inst) const {
bool AMDGPUAsmParser::validateVGPRAlign(const MCInst &Inst) const {
auto FB = getFeatureBits();
- if (!FB[AMDGPU::FeatureGFX90AInsts])
+ unsigned Opc = Inst.getOpcode();
+ // DS_READ_B96_TR_B6 is the only DS instruction in GFX950, that allows
+ // unaligned VGPR. All others only allow even aligned VGPRs.
+ if (!(FB[AMDGPU::FeatureGFX90AInsts]) || Opc == AMDGPU::DS_READ_B96_TR_B6_vi)
return true;
const MCRegisterInfo *MRI = getMRI();
diff --git a/llvm/test/MC/AMDGPU/gfx950-unsupported.s b/llvm/test/MC/AMDGPU/gfx950-unsupported.s
index 225784177ae185..8bdab2da2394cc 100644
--- a/llvm/test/MC/AMDGPU/gfx950-unsupported.s
+++ b/llvm/test/MC/AMDGPU/gfx950-unsupported.s
@@ -239,10 +239,6 @@ ds_read_b64_tr_b16 v[2:3], v2 offset:-64
//===----------------------------------------------------------------------===//
// ds_read_b96_tr_b6
//===----------------------------------------------------------------------===//
-ds_read_b96_tr_b6 v[1:3], v0
-// ERR: :[[@LINE-1]]:{{[0-9]+}}: error: invalid register class: vgpr tuples must be 64 bit aligned
-// W32-ERR: :[[@LINE-2]]:{{[0-9]+}}: error: instruction not supported on this GPU
-
ds_read_b96_tr_b6 v1, v0
// ERR: :[[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
// W32-ERR: :[[@LINE-2]]:{{[0-9]+}}: error: instruction not supported on this GPU
diff --git a/llvm/test/MC/AMDGPU/gfx950_asm_read_tr.s b/llvm/test/MC/AMDGPU/gfx950_asm_read_tr.s
index a6907caafcbb61..b9ceb143ecf8e8 100644
--- a/llvm/test/MC/AMDGPU/gfx950_asm_read_tr.s
+++ b/llvm/test/MC/AMDGPU/gfx950_asm_read_tr.s
@@ -32,3 +32,11 @@ ds_read_b96_tr_b6 v[0:2], v0
ds_read_b96_tr_b6 v[2:4], v2 offset:64
// GFX940-ERR: [[@LINE-1]]:{{[0-9]+}}: error: instruction not supported on this GPU
// GFX950: encoding: [0x40,0x00,0xc2,0xd9,0x02,0x00,0x00,0x02]
+
+ds_read_b96_tr_b6 v[1:3], v0
+// GFX940-ERR: [[@LINE-1]]:{{[0-9]+}}: error: instruction not supported on this GPU
+// GFX950: encoding: [0x00,0x00,0xc2,0xd9,0x00,0x00,0x00,0x01]
+
+ds_read_b96_tr_b6 v[1:3], v2 offset:64
+// GFX940-ERR: [[@LINE-1]]:{{[0-9]+}}: error: instruction not supported on this GPU
+// GFX950: encoding: [0x40,0x00,0xc2,0xd9,0x02,0x00,0x00,0x01]
\ No newline at end of file
diff --git a/llvm/test/MC/Disassembler/AMDGPU/gfx950_dasm_ds_read_tr.txt b/llvm/test/MC/Disassembler/AMDGPU/gfx950_dasm_ds_read_tr.txt
index 10310f7ad1f3de..0b5dba24612ca5 100644
--- a/llvm/test/MC/Disassembler/AMDGPU/gfx950_dasm_ds_read_tr.txt
+++ b/llvm/test/MC/Disassembler/AMDGPU/gfx950_dasm_ds_read_tr.txt
@@ -35,3 +35,9 @@
# GFX950: ds_read_b96_tr_b6 v[2:4], v2 offset:64 ; encoding: [0x40,0x00,0xc2,0xd9,0x02,0x00,0x00,0x02]
0x40,0x00,0xc2,0xd9,0x02,0x00,0x00,0x02
+
+# GFX950: ds_read_b96_tr_b6 v[1:3], v0 ; encoding: [0x00,0x00,0xc2,0xd9,0x00,0x00,0x00,0x01]
+0x00,0x00,0xc2,0xd9,0x00,0x00,0x00,0x01
+
+# GFX950: ds_read_b96_tr_b6 v[1:3], v2 offset:64 ; encoding: [0x40,0x00,0xc2,0xd9,0x02,0x00,0x00,0x01]
+0x40,0x00,0xc2,0xd9,0x02,0x00,0x00,0x01
|
All load transpose instructions follow gfx950 standard of even aligned VGPR except ds_read_b96_tr_b6, which allows unaligned VGPR.
035fa4f
to
6c10533
Compare
unsigned Opc = Inst.getOpcode(); | ||
// DS_READ_B96_TR_B6 is the only DS instruction in GFX950, that allows | ||
// unaligned VGPR. All others only allow even aligned VGPRs. | ||
if (!(FB[AMDGPU::FeatureGFX90AInsts]) || Opc == AMDGPU::DS_READ_B96_TR_B6_vi) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra parentheses. Also we probably should not directly special case the opcode here. This should come from the register class constraint of the operand
…vm#400) All load transpose instructions follow gfx950 standard of even aligned VGPR except ds_read_b96_tr_b6, which allows unaligned VGPR. Co-authored-by: Sirish Pande [[email protected]](mailto:[email protected])
All load transpose instructions follow gfx950 standard of even aligned VGPR except ds_read_b96_tr_b6, which allows unaligned VGPR.
Co-authored-by: Sirish Pande [email protected]