Skip to content

[PowerPC] Add DMF basic builtins #145372

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

RolandF77
Copy link
Collaborator

Add support for PPC Dense Math basic builtins dmsetdmrz, dmmr, dmxor.

@RolandF77 RolandF77 self-assigned this Jun 23, 2025
@RolandF77 RolandF77 requested review from lei137 and maryammo June 23, 2025 19:02
@RolandF77 RolandF77 marked this pull request as ready for review June 23, 2025 19:02
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:PowerPC clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. labels Jun 23, 2025
@llvmbot
Copy link
Member

llvmbot commented Jun 23, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-backend-powerpc

Author: None (RolandF77)

Changes

Add support for PPC Dense Math basic builtins dmsetdmrz, dmmr, dmxor.


Full diff: https://github.com/llvm/llvm-project/pull/145372.diff

3 Files Affected:

  • (modified) clang/include/clang/Basic/BuiltinsPPC.def (+6)
  • (modified) clang/lib/CodeGen/TargetBuiltins/PPC.cpp (+5)
  • (modified) clang/test/CodeGen/PowerPC/builtins-ppc-dmf.c (+16)
diff --git a/clang/include/clang/Basic/BuiltinsPPC.def b/clang/include/clang/Basic/BuiltinsPPC.def
index 099500754a0e0..7f4dc9fe4f719 100644
--- a/clang/include/clang/Basic/BuiltinsPPC.def
+++ b/clang/include/clang/Basic/BuiltinsPPC.def
@@ -1146,6 +1146,12 @@ UNALIASED_CUSTOM_BUILTIN(mma_dmxvi8gerx4spp,  "vW1024*W256V", true,
                          "mma,paired-vector-memops")
 UNALIASED_CUSTOM_BUILTIN(mma_pmdmxvi8gerx4spp, "vW1024*W256Vi255i15i15", true,
                          "mma,paired-vector-memops")
+UNALIASED_CUSTOM_BUILTIN(mma_dmsetdmrz, "vW1024*", false,
+                         "mma,paired-vector-memops")
+UNALIASED_CUSTOM_BUILTIN(mma_dmmr, "vW1024*W1024*", false,
+                         "mma,paired-vector-memops")
+UNALIASED_CUSTOM_BUILTIN(mma_dmxor, "vW1024*W1024*", true,
+                         "mma,paired-vector-memops")
 
 // FIXME: Obviously incomplete.
 
diff --git a/clang/lib/CodeGen/TargetBuiltins/PPC.cpp b/clang/lib/CodeGen/TargetBuiltins/PPC.cpp
index f9890285f0aab..270e9fc976f23 100644
--- a/clang/lib/CodeGen/TargetBuiltins/PPC.cpp
+++ b/clang/lib/CodeGen/TargetBuiltins/PPC.cpp
@@ -1151,6 +1151,11 @@ Value *CodeGenFunction::EmitPPCBuiltinExpr(unsigned BuiltinID,
       Value *Acc = Builder.CreateLoad(Addr);
       CallOps.push_back(Acc);
     }
+    if (BuiltinID == PPC::BI__builtin_mma_dmmr ||
+        BuiltinID == PPC::BI__builtin_mma_dmxor) {
+      Address Addr = EmitPointerWithAlignment(E->getArg(1));
+      Ops[1] = Builder.CreateLoad(Addr);
+    }
     for (unsigned i=1; i<Ops.size(); i++)
       CallOps.push_back(Ops[i]);
     llvm::Function *F = CGM.getIntrinsic(ID);
diff --git a/clang/test/CodeGen/PowerPC/builtins-ppc-dmf.c b/clang/test/CodeGen/PowerPC/builtins-ppc-dmf.c
index 41f13155847ba..4aafc09602228 100644
--- a/clang/test/CodeGen/PowerPC/builtins-ppc-dmf.c
+++ b/clang/test/CodeGen/PowerPC/builtins-ppc-dmf.c
@@ -92,3 +92,19 @@ void test_pmdmxvi8gerx4spp(unsigned char *vdmrp, unsigned char *vpp, vector unsi
   __builtin_mma_pmdmxvi8gerx4spp(&vdmr, vp, vc, 0, 0, 0);
   *((__dmr1024 *)resp) = vdmr;
 }
+
+// CHECK-LABEL: @test_dmf_basic
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[TMP0:%.*]] = tail call <1024 x i1> @llvm.ppc.mma.dmsetdmrz()
+// CHECK-NEXT: [[TMP1:%.*]] = tail call <1024 x i1> @llvm.ppc.mma.dmmr(<1024 x i1> [[TMP0]])
+// CHECK-NEXT: store <1024 x i1> [[TMP1]], ptr %res1, align 128
+// CHECK-NEXT: [[TMP2:%.*]] = load <1024 x i1>, ptr %res2, align 128
+// CHECK-NEXT: [[TMP3:%.*]] = load <1024 x i1>, ptr %p, align 128
+// CHECK-NEXT: [[TMP4:%.*]] = tail call <1024 x i1> @llvm.ppc.mma.dmxor(<1024 x i1> [[TMP2]], <1024 x i1> [[TMP3]])
+// CHECK-NEXT: store <1024 x i1> [[TMP4]], ptr %res2, align 128
+void test_dmf_basic(char *p, char *res1, char *res2) {
+  __dmr1024 x[2];
+  __builtin_mma_dmsetdmrz(&x[0]);
+  __builtin_mma_dmmr((__dmr1024*)res1, &x[0]);
+  __builtin_mma_dmxor((__dmr1024*)res2, (__dmr1024*)p);
+}

@llvmbot
Copy link
Member

llvmbot commented Jun 23, 2025

@llvm/pr-subscribers-clang-codegen

Author: None (RolandF77)

Changes

Add support for PPC Dense Math basic builtins dmsetdmrz, dmmr, dmxor.


Full diff: https://github.com/llvm/llvm-project/pull/145372.diff

3 Files Affected:

  • (modified) clang/include/clang/Basic/BuiltinsPPC.def (+6)
  • (modified) clang/lib/CodeGen/TargetBuiltins/PPC.cpp (+5)
  • (modified) clang/test/CodeGen/PowerPC/builtins-ppc-dmf.c (+16)
diff --git a/clang/include/clang/Basic/BuiltinsPPC.def b/clang/include/clang/Basic/BuiltinsPPC.def
index 099500754a0e0..7f4dc9fe4f719 100644
--- a/clang/include/clang/Basic/BuiltinsPPC.def
+++ b/clang/include/clang/Basic/BuiltinsPPC.def
@@ -1146,6 +1146,12 @@ UNALIASED_CUSTOM_BUILTIN(mma_dmxvi8gerx4spp,  "vW1024*W256V", true,
                          "mma,paired-vector-memops")
 UNALIASED_CUSTOM_BUILTIN(mma_pmdmxvi8gerx4spp, "vW1024*W256Vi255i15i15", true,
                          "mma,paired-vector-memops")
+UNALIASED_CUSTOM_BUILTIN(mma_dmsetdmrz, "vW1024*", false,
+                         "mma,paired-vector-memops")
+UNALIASED_CUSTOM_BUILTIN(mma_dmmr, "vW1024*W1024*", false,
+                         "mma,paired-vector-memops")
+UNALIASED_CUSTOM_BUILTIN(mma_dmxor, "vW1024*W1024*", true,
+                         "mma,paired-vector-memops")
 
 // FIXME: Obviously incomplete.
 
diff --git a/clang/lib/CodeGen/TargetBuiltins/PPC.cpp b/clang/lib/CodeGen/TargetBuiltins/PPC.cpp
index f9890285f0aab..270e9fc976f23 100644
--- a/clang/lib/CodeGen/TargetBuiltins/PPC.cpp
+++ b/clang/lib/CodeGen/TargetBuiltins/PPC.cpp
@@ -1151,6 +1151,11 @@ Value *CodeGenFunction::EmitPPCBuiltinExpr(unsigned BuiltinID,
       Value *Acc = Builder.CreateLoad(Addr);
       CallOps.push_back(Acc);
     }
+    if (BuiltinID == PPC::BI__builtin_mma_dmmr ||
+        BuiltinID == PPC::BI__builtin_mma_dmxor) {
+      Address Addr = EmitPointerWithAlignment(E->getArg(1));
+      Ops[1] = Builder.CreateLoad(Addr);
+    }
     for (unsigned i=1; i<Ops.size(); i++)
       CallOps.push_back(Ops[i]);
     llvm::Function *F = CGM.getIntrinsic(ID);
diff --git a/clang/test/CodeGen/PowerPC/builtins-ppc-dmf.c b/clang/test/CodeGen/PowerPC/builtins-ppc-dmf.c
index 41f13155847ba..4aafc09602228 100644
--- a/clang/test/CodeGen/PowerPC/builtins-ppc-dmf.c
+++ b/clang/test/CodeGen/PowerPC/builtins-ppc-dmf.c
@@ -92,3 +92,19 @@ void test_pmdmxvi8gerx4spp(unsigned char *vdmrp, unsigned char *vpp, vector unsi
   __builtin_mma_pmdmxvi8gerx4spp(&vdmr, vp, vc, 0, 0, 0);
   *((__dmr1024 *)resp) = vdmr;
 }
+
+// CHECK-LABEL: @test_dmf_basic
+// CHECK-NEXT: entry:
+// CHECK-NEXT: [[TMP0:%.*]] = tail call <1024 x i1> @llvm.ppc.mma.dmsetdmrz()
+// CHECK-NEXT: [[TMP1:%.*]] = tail call <1024 x i1> @llvm.ppc.mma.dmmr(<1024 x i1> [[TMP0]])
+// CHECK-NEXT: store <1024 x i1> [[TMP1]], ptr %res1, align 128
+// CHECK-NEXT: [[TMP2:%.*]] = load <1024 x i1>, ptr %res2, align 128
+// CHECK-NEXT: [[TMP3:%.*]] = load <1024 x i1>, ptr %p, align 128
+// CHECK-NEXT: [[TMP4:%.*]] = tail call <1024 x i1> @llvm.ppc.mma.dmxor(<1024 x i1> [[TMP2]], <1024 x i1> [[TMP3]])
+// CHECK-NEXT: store <1024 x i1> [[TMP4]], ptr %res2, align 128
+void test_dmf_basic(char *p, char *res1, char *res2) {
+  __dmr1024 x[2];
+  __builtin_mma_dmsetdmrz(&x[0]);
+  __builtin_mma_dmmr((__dmr1024*)res1, &x[0]);
+  __builtin_mma_dmxor((__dmr1024*)res2, (__dmr1024*)p);
+}

UNALIASED_CUSTOM_BUILTIN(mma_dmmr, "vW1024*W1024*", false,
"mma,paired-vector-memops")
UNALIASED_CUSTOM_BUILTIN(mma_dmxor, "vW1024*W1024*", true,
"mma,paired-vector-memops")
Copy link
Contributor

@lei137 lei137 Jun 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

paired-vector-memops is specifically defined for P10+ to deal with mma instructions. Since these builtins deal with _dmr1024 types for cpu=future, should a new feature type be created so we can diag within PCTargetInfo::initFeatureMap()?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, there is already a sub target feature called isa-future-instructions. For the dmr Integer builtins I did not use it as the #144594 (comment) did not land and without this -target-cpu future did not imply isa-future-instructions without extra hard coding in clang, maybe we can see if that works now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:PowerPC clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants