[DAG] Add generic expansion for ISD::FCANONICALIZE nodes #142105

dominik-steenken · 2025-05-30T08:16:00Z

This PR takes the work previously done by @pawan-nirpal-031 on X86 in #106370, and makes it available in common code. This should enable all targets to use __builtin_canonicalize at least for f32 and f64 data types.

Canonicalization is implemented here as multiplication by 1.0, as suggested in the docs.

Moving the X86 implementation to common code was suggested in the reviews of #106370 by @arsenm .

@pawan-nirpal-031 I hope you don't mind that i went ahead with this to get it working on SystemZ.

I don't have a lot of experience with this part of the code base yet, so reviews are very welcome.

dominik-steenken · 2025-05-30T08:16:13Z

@uweigand FYI

llvmbot · 2025-05-30T08:16:35Z

@llvm/pr-subscribers-backend-mips
@llvm/pr-subscribers-backend-loongarch
@llvm/pr-subscribers-backend-powerpc
@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-llvm-selectiondag

Author: Dominik Steenken (dominik-steenken)

Changes

This PR takes the work previously done by @pawan-nirpal-031 on X86 in #106370, and makes it available in common code. This should enable all targets to use __builtin_canonicalize at least for f32 and f64 data types.

Canonicalization is implemented here as multiplication by 1.0, as suggested in the docs.

Moving the X86 implementation to common code was suggested in the reviews of #106370 by @arsenm .

@pawan-nirpal-031 I hope you don't mind that i went ahead with this to getit working on SystemZ.

I don't have a lot of experience with this part of the code base yet, so reviews are very welcome.

Full diff: https://github.com/llvm/llvm-project/pull/142105.diff

3 Files Affected:

(modified) llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp (+17)
(modified) llvm/lib/CodeGen/TargetLoweringBase.cpp (+3)
(added) llvm/test/CodeGen/SystemZ/canonicalize-vars.ll (+141)

diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
index 528c07cc5549d..6599704aaf290 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
@@ -3356,6 +3356,23 @@ bool SelectionDAGLegalize::ExpandNode(SDNode *Node) {
     Results.push_back(Op);
     break;
   }
+  case ISD::FCANONICALIZE: {
+    // This implements llvm.canonicalize.f* by multiplication with 1.0,
+    // as suggested in https://llvm.org/docs/LangRef.html#id2335.
+    // Get operand x.
+    SDValue Operand = Node->getOperand(0);
+    // Get fp value type used.
+    EVT VT = Operand.getValueType();
+    // Produce appropriately-typed 1.0 constant.
+    SDValue One = DAG.getConstantFP(1.0, dl, VT);
+    // Produce multiplication node x * 1.0.
+    SDValue Chain = DAG.getEntryNode();
+    SDValue Mul = DAG.getNode(ISD::STRICT_FMUL, dl, {VT, MVT::Other},
+                              {Chain, Operand, One});
+
+    Results.push_back(Mul);
+    break;
+  }
   case ISD::SIGN_EXTEND_INREG: {
     EVT ExtraVT = cast<VTSDNode>(Node->getOperand(1))->getVT();
     EVT VT = Node->getValueType(0);
diff --git a/llvm/lib/CodeGen/TargetLoweringBase.cpp b/llvm/lib/CodeGen/TargetLoweringBase.cpp
index 935afaf9dd550..ba765afd80bf1 100644
--- a/llvm/lib/CodeGen/TargetLoweringBase.cpp
+++ b/llvm/lib/CodeGen/TargetLoweringBase.cpp
@@ -870,6 +870,9 @@ void TargetLoweringBase::initActions() {
                       ISD::FATAN2},
                      {MVT::f32, MVT::f64, MVT::f128}, Expand);
 
+  // Insert custom handling default for llvm.canonicalize.*.
+  setOperationAction(ISD::FCANONICALIZE, {MVT::f32, MVT::f64}, Expand);
+
   // FIXME: Query RuntimeLibCalls to make the decision.
   setOperationAction({ISD::LRINT, ISD::LLRINT, ISD::LROUND, ISD::LLROUND},
                      {MVT::f32, MVT::f64, MVT::f128}, LibCall);
diff --git a/llvm/test/CodeGen/SystemZ/canonicalize-vars.ll b/llvm/test/CodeGen/SystemZ/canonicalize-vars.ll
new file mode 100644
index 0000000000000..477f5e1547567
--- /dev/null
+++ b/llvm/test/CodeGen/SystemZ/canonicalize-vars.ll
@@ -0,0 +1,141 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --default-march s390x-unknown-linux-gnu --version 5
+; RUN: llc -mtriple=s390x-linux-gnu -mcpu=z16 < %s | FileCheck %s -check-prefixes=Z16
+
+define float @canonicalize_fp32(float %a) {
+; Z16-LABEL: canonicalize_fp32:
+; Z16:       # %bb.0:
+; Z16-NEXT:    vgmf %v1, 2, 8
+; Z16-NEXT:    meebr %f0, %f1
+; Z16-NEXT:    br %r14
+  %canonicalized = call float @llvm.canonicalize.f32(float %a)
+  ret float %canonicalized
+}
+
+define double @canonicalize_fp64(double %a) {
+; Z16-LABEL: canonicalize_fp64:
+; Z16:       # %bb.0:
+; Z16-NEXT:    vgmg %v1, 2, 11
+; Z16-NEXT:    mdbr %f0, %f1
+; Z16-NEXT:    br %r14
+  %canonicalized = call double @llvm.canonicalize.f64(double %a)
+  ret double %canonicalized
+}
+
+define void @canonicalize_ptr_f32(float * %out) {
+; Z16-LABEL: canonicalize_ptr_f32:
+; Z16:       # %bb.0:
+; Z16-NEXT:    vgmf %v0, 2, 8
+; Z16-NEXT:    meeb %f0, 0(%r2)
+; Z16-NEXT:    ste %f0, 0(%r2)
+; Z16-NEXT:    br %r14
+  %val = load float, float * %out
+  %canonicalized = call float @llvm.canonicalize.f32(float %val)
+  store float %canonicalized, float * %out
+  ret void
+}
+
+define void @canonicalize_ptr_f64(double * %out) {
+; Z16-LABEL: canonicalize_ptr_f64:
+; Z16:       # %bb.0:
+; Z16-NEXT:    vgmg %v0, 2, 11
+; Z16-NEXT:    mdb %f0, 0(%r2)
+; Z16-NEXT:    std %f0, 0(%r2)
+; Z16-NEXT:    br %r14
+  %val = load double, double * %out
+  %canonicalized = call double @llvm.canonicalize.f64(double %val)
+  store double %canonicalized, double * %out
+  ret void
+}
+
+define <4 x float> @canonicalize_v4f32(<4 x float> %a) {
+; Z16-LABEL: canonicalize_v4f32:
+; Z16:       # %bb.0:
+; Z16-NEXT:    vrepf %v0, %v24, 3
+; Z16-NEXT:    vgmf %v1, 2, 8
+; Z16-NEXT:    vrepf %v2, %v24, 2
+; Z16-NEXT:    meebr %f0, %f1
+; Z16-NEXT:    meebr %f2, %f1
+; Z16-NEXT:    vrepf %v3, %v24, 1
+; Z16-NEXT:    vmrhf %v0, %v2, %v0
+; Z16-NEXT:    wfmsb %f2, %v24, %f1
+; Z16-NEXT:    wfmsb %f1, %f3, %f1
+; Z16-NEXT:    vmrhf %v1, %v2, %v1
+; Z16-NEXT:    vmrhg %v24, %v1, %v0
+; Z16-NEXT:    br %r14
+  %canonicalized = call <4 x float> @llvm.canonicalize.v4f32(<4 x float> %a)
+  ret <4 x float> %canonicalized
+}
+
+define <4 x double> @canonicalize_v4f64(<4 x double> %a) {
+; Z16-LABEL: canonicalize_v4f64:
+; Z16:       # %bb.0:
+; Z16-NEXT:    vgmg %v0, 2, 11
+; Z16-NEXT:    vrepg %v2, %v24, 1
+; Z16-NEXT:    wfmdb %f1, %v24, %f0
+; Z16-NEXT:    mdbr %f2, %f0
+; Z16-NEXT:    vmrhg %v24, %v1, %v2
+; Z16-NEXT:    vrepg %v2, %v26, 1
+; Z16-NEXT:    wfmdb %f1, %v26, %f0
+; Z16-NEXT:    wfmdb %f0, %f2, %f0
+; Z16-NEXT:    vmrhg %v26, %v1, %v0
+; Z16-NEXT:    br %r14
+  %canonicalized = call <4 x double> @llvm.canonicalize.v4f64(<4 x double> %a)
+  ret <4 x double> %canonicalized
+}
+
+define void @canonicalize_ptr_v4f32(<4 x float> * %out) {
+; Z16-LABEL: canonicalize_ptr_v4f32:
+; Z16:       # %bb.0:
+; Z16-NEXT:    vl %v0, 0(%r2), 3
+; Z16-NEXT:    vrepf %v1, %v0, 3
+; Z16-NEXT:    vgmf %v2, 2, 8
+; Z16-NEXT:    vrepf %v3, %v0, 2
+; Z16-NEXT:    meebr %f1, %f2
+; Z16-NEXT:    meebr %f3, %f2
+; Z16-NEXT:    vmrhf %v1, %v3, %v1
+; Z16-NEXT:    wfmsb %f3, %f0, %f2
+; Z16-NEXT:    vrepf %v0, %v0, 1
+; Z16-NEXT:    meebr %f0, %f2
+; Z16-NEXT:    vmrhf %v0, %v3, %v0
+; Z16-NEXT:    vmrhg %v0, %v0, %v1
+; Z16-NEXT:    vst %v0, 0(%r2), 3
+; Z16-NEXT:    br %r14
+  %val = load <4 x float>, <4 x float> * %out
+  %canonicalized = call <4 x float> @llvm.canonicalize.v4f32(<4 x float> %val)
+  store <4 x float> %canonicalized, <4 x float> * %out
+  ret void
+}
+
+define void @canonicalize_ptr_v4f64(<4 x double> * %out) {
+; Z16-LABEL: canonicalize_ptr_v4f64:
+; Z16:       # %bb.0:
+; Z16-NEXT:    vl %v1, 16(%r2), 4
+; Z16-NEXT:    vgmg %v2, 2, 11
+; Z16-NEXT:    wfmdb %f3, %f1, %f2
+; Z16-NEXT:    vrepg %v1, %v1, 1
+; Z16-NEXT:    mdbr %f1, %f2
+; Z16-NEXT:    vl %v0, 0(%r2), 4
+; Z16-NEXT:    vmrhg %v1, %v3, %v1
+; Z16-NEXT:    wfmdb %f3, %f0, %f2
+; Z16-NEXT:    vrepg %v0, %v0, 1
+; Z16-NEXT:    mdbr %f0, %f2
+; Z16-NEXT:    vmrhg %v0, %v3, %v0
+; Z16-NEXT:    vst %v0, 0(%r2), 4
+; Z16-NEXT:    vst %v1, 16(%r2), 4
+; Z16-NEXT:    br %r14
+  %val = load <4 x double>, <4 x double> * %out
+  %canonicalized = call <4 x double> @llvm.canonicalize.v4f64(<4 x double> %val)
+  store <4 x double> %canonicalized, <4 x double> * %out
+  ret void
+}
+
+define void @canonicalize_undef(double * %out) {
+; Z16-LABEL: canonicalize_undef:
+; Z16:       # %bb.0:
+; Z16-NEXT:    llihh %r0, 32760
+; Z16-NEXT:    stg %r0, 0(%r2)
+; Z16-NEXT:    br %r14
+  %canonicalized = call double @llvm.canonicalize.f64(double undef)
+  store double %canonicalized, double * %out
+  ret void
+}

llvmbot · 2025-05-30T08:16:36Z

@llvm/pr-subscribers-backend-systemz

Author: Dominik Steenken (dominik-steenken)

Changes

This PR takes the work previously done by @pawan-nirpal-031 on X86 in #106370, and makes it available in common code. This should enable all targets to use __builtin_canonicalize at least for f32 and f64 data types.

Canonicalization is implemented here as multiplication by 1.0, as suggested in the docs.

Moving the X86 implementation to common code was suggested in the reviews of #106370 by @arsenm .

@pawan-nirpal-031 I hope you don't mind that i went ahead with this to getit working on SystemZ.

I don't have a lot of experience with this part of the code base yet, so reviews are very welcome.

Full diff: https://github.com/llvm/llvm-project/pull/142105.diff

3 Files Affected:

(modified) llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp (+17)
(modified) llvm/lib/CodeGen/TargetLoweringBase.cpp (+3)
(added) llvm/test/CodeGen/SystemZ/canonicalize-vars.ll (+141)

diff --git a/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
index 528c07cc5549d..6599704aaf290 100644
--- a/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
@@ -3356,6 +3356,23 @@ bool SelectionDAGLegalize::ExpandNode(SDNode *Node) {
     Results.push_back(Op);
     break;
   }
+  case ISD::FCANONICALIZE: {
+    // This implements llvm.canonicalize.f* by multiplication with 1.0,
+    // as suggested in https://llvm.org/docs/LangRef.html#id2335.
+    // Get operand x.
+    SDValue Operand = Node->getOperand(0);
+    // Get fp value type used.
+    EVT VT = Operand.getValueType();
+    // Produce appropriately-typed 1.0 constant.
+    SDValue One = DAG.getConstantFP(1.0, dl, VT);
+    // Produce multiplication node x * 1.0.
+    SDValue Chain = DAG.getEntryNode();
+    SDValue Mul = DAG.getNode(ISD::STRICT_FMUL, dl, {VT, MVT::Other},
+                              {Chain, Operand, One});
+
+    Results.push_back(Mul);
+    break;
+  }
   case ISD::SIGN_EXTEND_INREG: {
     EVT ExtraVT = cast<VTSDNode>(Node->getOperand(1))->getVT();
     EVT VT = Node->getValueType(0);
diff --git a/llvm/lib/CodeGen/TargetLoweringBase.cpp b/llvm/lib/CodeGen/TargetLoweringBase.cpp
index 935afaf9dd550..ba765afd80bf1 100644
--- a/llvm/lib/CodeGen/TargetLoweringBase.cpp
+++ b/llvm/lib/CodeGen/TargetLoweringBase.cpp
@@ -870,6 +870,9 @@ void TargetLoweringBase::initActions() {
                       ISD::FATAN2},
                      {MVT::f32, MVT::f64, MVT::f128}, Expand);
 
+  // Insert custom handling default for llvm.canonicalize.*.
+  setOperationAction(ISD::FCANONICALIZE, {MVT::f32, MVT::f64}, Expand);
+
   // FIXME: Query RuntimeLibCalls to make the decision.
   setOperationAction({ISD::LRINT, ISD::LLRINT, ISD::LROUND, ISD::LLROUND},
                      {MVT::f32, MVT::f64, MVT::f128}, LibCall);
diff --git a/llvm/test/CodeGen/SystemZ/canonicalize-vars.ll b/llvm/test/CodeGen/SystemZ/canonicalize-vars.ll
new file mode 100644
index 0000000000000..477f5e1547567
--- /dev/null
+++ b/llvm/test/CodeGen/SystemZ/canonicalize-vars.ll
@@ -0,0 +1,141 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --default-march s390x-unknown-linux-gnu --version 5
+; RUN: llc -mtriple=s390x-linux-gnu -mcpu=z16 < %s | FileCheck %s -check-prefixes=Z16
+
+define float @canonicalize_fp32(float %a) {
+; Z16-LABEL: canonicalize_fp32:
+; Z16:       # %bb.0:
+; Z16-NEXT:    vgmf %v1, 2, 8
+; Z16-NEXT:    meebr %f0, %f1
+; Z16-NEXT:    br %r14
+  %canonicalized = call float @llvm.canonicalize.f32(float %a)
+  ret float %canonicalized
+}
+
+define double @canonicalize_fp64(double %a) {
+; Z16-LABEL: canonicalize_fp64:
+; Z16:       # %bb.0:
+; Z16-NEXT:    vgmg %v1, 2, 11
+; Z16-NEXT:    mdbr %f0, %f1
+; Z16-NEXT:    br %r14
+  %canonicalized = call double @llvm.canonicalize.f64(double %a)
+  ret double %canonicalized
+}
+
+define void @canonicalize_ptr_f32(float * %out) {
+; Z16-LABEL: canonicalize_ptr_f32:
+; Z16:       # %bb.0:
+; Z16-NEXT:    vgmf %v0, 2, 8
+; Z16-NEXT:    meeb %f0, 0(%r2)
+; Z16-NEXT:    ste %f0, 0(%r2)
+; Z16-NEXT:    br %r14
+  %val = load float, float * %out
+  %canonicalized = call float @llvm.canonicalize.f32(float %val)
+  store float %canonicalized, float * %out
+  ret void
+}
+
+define void @canonicalize_ptr_f64(double * %out) {
+; Z16-LABEL: canonicalize_ptr_f64:
+; Z16:       # %bb.0:
+; Z16-NEXT:    vgmg %v0, 2, 11
+; Z16-NEXT:    mdb %f0, 0(%r2)
+; Z16-NEXT:    std %f0, 0(%r2)
+; Z16-NEXT:    br %r14
+  %val = load double, double * %out
+  %canonicalized = call double @llvm.canonicalize.f64(double %val)
+  store double %canonicalized, double * %out
+  ret void
+}
+
+define <4 x float> @canonicalize_v4f32(<4 x float> %a) {
+; Z16-LABEL: canonicalize_v4f32:
+; Z16:       # %bb.0:
+; Z16-NEXT:    vrepf %v0, %v24, 3
+; Z16-NEXT:    vgmf %v1, 2, 8
+; Z16-NEXT:    vrepf %v2, %v24, 2
+; Z16-NEXT:    meebr %f0, %f1
+; Z16-NEXT:    meebr %f2, %f1
+; Z16-NEXT:    vrepf %v3, %v24, 1
+; Z16-NEXT:    vmrhf %v0, %v2, %v0
+; Z16-NEXT:    wfmsb %f2, %v24, %f1
+; Z16-NEXT:    wfmsb %f1, %f3, %f1
+; Z16-NEXT:    vmrhf %v1, %v2, %v1
+; Z16-NEXT:    vmrhg %v24, %v1, %v0
+; Z16-NEXT:    br %r14
+  %canonicalized = call <4 x float> @llvm.canonicalize.v4f32(<4 x float> %a)
+  ret <4 x float> %canonicalized
+}
+
+define <4 x double> @canonicalize_v4f64(<4 x double> %a) {
+; Z16-LABEL: canonicalize_v4f64:
+; Z16:       # %bb.0:
+; Z16-NEXT:    vgmg %v0, 2, 11
+; Z16-NEXT:    vrepg %v2, %v24, 1
+; Z16-NEXT:    wfmdb %f1, %v24, %f0
+; Z16-NEXT:    mdbr %f2, %f0
+; Z16-NEXT:    vmrhg %v24, %v1, %v2
+; Z16-NEXT:    vrepg %v2, %v26, 1
+; Z16-NEXT:    wfmdb %f1, %v26, %f0
+; Z16-NEXT:    wfmdb %f0, %f2, %f0
+; Z16-NEXT:    vmrhg %v26, %v1, %v0
+; Z16-NEXT:    br %r14
+  %canonicalized = call <4 x double> @llvm.canonicalize.v4f64(<4 x double> %a)
+  ret <4 x double> %canonicalized
+}
+
+define void @canonicalize_ptr_v4f32(<4 x float> * %out) {
+; Z16-LABEL: canonicalize_ptr_v4f32:
+; Z16:       # %bb.0:
+; Z16-NEXT:    vl %v0, 0(%r2), 3
+; Z16-NEXT:    vrepf %v1, %v0, 3
+; Z16-NEXT:    vgmf %v2, 2, 8
+; Z16-NEXT:    vrepf %v3, %v0, 2
+; Z16-NEXT:    meebr %f1, %f2
+; Z16-NEXT:    meebr %f3, %f2
+; Z16-NEXT:    vmrhf %v1, %v3, %v1
+; Z16-NEXT:    wfmsb %f3, %f0, %f2
+; Z16-NEXT:    vrepf %v0, %v0, 1
+; Z16-NEXT:    meebr %f0, %f2
+; Z16-NEXT:    vmrhf %v0, %v3, %v0
+; Z16-NEXT:    vmrhg %v0, %v0, %v1
+; Z16-NEXT:    vst %v0, 0(%r2), 3
+; Z16-NEXT:    br %r14
+  %val = load <4 x float>, <4 x float> * %out
+  %canonicalized = call <4 x float> @llvm.canonicalize.v4f32(<4 x float> %val)
+  store <4 x float> %canonicalized, <4 x float> * %out
+  ret void
+}
+
+define void @canonicalize_ptr_v4f64(<4 x double> * %out) {
+; Z16-LABEL: canonicalize_ptr_v4f64:
+; Z16:       # %bb.0:
+; Z16-NEXT:    vl %v1, 16(%r2), 4
+; Z16-NEXT:    vgmg %v2, 2, 11
+; Z16-NEXT:    wfmdb %f3, %f1, %f2
+; Z16-NEXT:    vrepg %v1, %v1, 1
+; Z16-NEXT:    mdbr %f1, %f2
+; Z16-NEXT:    vl %v0, 0(%r2), 4
+; Z16-NEXT:    vmrhg %v1, %v3, %v1
+; Z16-NEXT:    wfmdb %f3, %f0, %f2
+; Z16-NEXT:    vrepg %v0, %v0, 1
+; Z16-NEXT:    mdbr %f0, %f2
+; Z16-NEXT:    vmrhg %v0, %v3, %v0
+; Z16-NEXT:    vst %v0, 0(%r2), 4
+; Z16-NEXT:    vst %v1, 16(%r2), 4
+; Z16-NEXT:    br %r14
+  %val = load <4 x double>, <4 x double> * %out
+  %canonicalized = call <4 x double> @llvm.canonicalize.v4f64(<4 x double> %val)
+  store <4 x double> %canonicalized, <4 x double> * %out
+  ret void
+}
+
+define void @canonicalize_undef(double * %out) {
+; Z16-LABEL: canonicalize_undef:
+; Z16:       # %bb.0:
+; Z16-NEXT:    llihh %r0, 32760
+; Z16-NEXT:    stg %r0, 0(%r2)
+; Z16-NEXT:    br %r14
+  %canonicalized = call double @llvm.canonicalize.f64(double undef)
+  store double %canonicalized, double * %out
+  ret void
+}

github-actions · 2025-05-30T08:18:22Z

✅ With the latest revision this PR passed the undef deprecator.

arsenm · 2025-05-30T09:04:34Z

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

+    // This implements llvm.canonicalize.f* by multiplication with 1.0,
+    // as suggested in https://llvm.org/docs/LangRef.html#id2335.


This should probably elaborate on how this is hacking in strictfp operations in non-strict functions

From what i understand, the point of the strictfp operation here is to prevent any optimization that would, for instance, recognize that 1.0 * x can be converted to just x. So would this suffice as an addition to the comment?

// We use `STRICT_FMUL` here to prevent this operation from being optimized out.

pawan-nirpal-031 · 2025-05-30T13:54:46Z

Hi @dominik-steenken absolutely, Please go ahead with this, I have been wanting to get back and finish on the commitments as following PRs, this is much welcome. I will make some time to support canonicalization of denormals ( I have a PR just haven't been able to attend to it, if you need it sooner please let me know, I'll give you the patch and you may go ahead with it ) Thanks for your PR.

dominik-steenken · 2025-06-05T11:35:02Z

Ok, so i updated the comment as requested, got rid of the undef check in order to remove the error from the undef deprecator and added a bunch of setOperationAction(..., Legal) to various backends that had independently handled fcanonicalize via their Tablegen files in order to get the tests clean again. Would appreciate another review, especially from folks of the backends that i had to touch in order to fix the tests :).

jayfoad · 2025-06-05T12:19:09Z

__builtin.canonicalize in common code

This patch is about the LLVM intrinsic, not the Clang builtin, so the title should say llvm.canonicalize

dominik-steenken · 2025-06-12T07:24:46Z

any further concerns?

arsenm · 2025-06-12T07:30:39Z

Patch title is now less clear, this is DAG codegen support

dominik-steenken · 2025-06-25T12:42:41Z

Patch title is now less clear, this is DAG codegen support

So, something like "DAG codegen support for llvm.canonicalize"?

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

RKSimon · 2025-06-25T15:29:45Z

Patch title is now less clear, this is DAG codegen support

So, something like "DAG codegen support for llvm.canonicalize"?

How about "[DAG] Add generic expansion for ISD::FCANONICALIZE nodes"?

The summary description still needs work

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

arsenm · 2025-06-27T08:08:49Z

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

+    SDValue Mul = DAG.getNode(ISD::STRICT_FMUL, dl, {VT, MVT::Other},
+                              {Chain, Operand, One});
+
+    Results.push_back(Mul);


Ignoring the new result chain feels wrong but it's probably correct in this context

@dominik-steenken Add a comment mentioning that ignoring the output chain was intended.

arsenm · 2025-06-27T08:09:23Z

llvm/test/CodeGen/SystemZ/canonicalize-vars.ll

+  ret double %canonicalized
+}
+
+define void @canonicalize_ptr_f32(float * %out) {


Test should be updated to use opaque pointers

ok, i've replaced float *, double *, <4 x float> * and <4 x double> *withptr`.

arsenm · 2025-06-27T08:10:39Z

llvm/lib/CodeGen/TargetLoweringBase.cpp

+  // Insert custom handling default for llvm.canonicalize.*.
+  setOperationAction(ISD::FCANONICALIZE, {MVT::f32, MVT::f64}, Expand);
+


If we'r going to expand by default, it should do it for all types, not just f32/f64

Ok, i have added expansion for f16 and f128. Should i also add other types that are specific to certain platforms, such as bfloat, x86_fp80, or ppc_fp128?

arsenm · 2025-06-27T08:10:59Z

llvm/test/CodeGen/SystemZ/canonicalize-vars.ll

+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --default-march s390x-unknown-linux-gnu --version 5
+; RUN: llc -mtriple=s390x-linux-gnu -mcpu=z16 < %s | FileCheck %s -check-prefixes=Z16
+
+define float @canonicalize_fp32(float %a) {


No half tests? Also are we missing tests in the other touched targets?

I have added half and extended fp tests. From what i understand, the other touched targets had already implemented their own legalization of fcanonicalize, and would have added tests for all of their supported floating point types then. This PR should not change anything there - the targets only needed to be touched since they assumed that common code would not expand by default.

arsenm · 2025-06-27T08:11:18Z

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

+    // Produce multiplication node x * 1.0.
+    SDValue Chain = DAG.getEntryNode();
+    SDValue Mul = DAG.getNode(ISD::STRICT_FMUL, dl, {VT, MVT::Other},
+                              {Chain, Operand, One});


Losing the fast math flags. You can propagate the existing flags, and additionally add NoFPExcept

I'm now propagating the flags and have added NoFPExcept.

dominik-steenken · 2025-06-30T10:10:46Z

Patch title is now less clear, this is DAG codegen support

So, something like "DAG codegen support for llvm.canonicalize"?

How about "[DAG] Add generic expansion for ISD::FCANONICALIZE nodes"?

The summary description still needs work

I have rebased and reworded my commits. Does that suffice? I can also reword/edit the PR description if needed.

Previously, this expansion for the intrinsic has been implemented in the X86 backend. As suggested in the comments for the PR that merges that implementation (llvm#106370), this commit replicates that implementation (at least for `f(16|32|64|128)`) in common code. Canonicalization is thus implemented here as multiplication with 1.0. This commit also touches the following backends: AArch64, AMDGPU, Mips, PowerPC, and LoongArch. These backends previously added their own legalization for `ISD::FCANONICALIZE`, but did not set the OperationAction accordingly, since there was no default OperationAction in common code. Thus, this commit adds the required `setOperationAction` calls and is otherwise intended as NFC for these backends.

This new test file essentially checks the code generation for the common code expansion of `ISD::FCANONICALIZE`, since SystemZ has no custom handling for it at this time. The tests are heavily based on the equivalent tests for X86.

dominik-steenken · 2025-07-02T07:59:22Z

I believe the CI check fails were issues introduced into main by #146048 and #146247, but which should be fixed by now. I have rebased and the branch, CI is running.

RKSimon

A couple of minors but otherwise LGTM @arsenm ?

RKSimon · 2025-07-04T07:48:29Z

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

+    SDValue Mul = DAG.getNode(ISD::STRICT_FMUL, dl, {VT, MVT::Other},
+                              {Chain, Operand, One});
+
+    Results.push_back(Mul);


@dominik-steenken Add a comment mentioning that ignoring the output chain was intended.

RKSimon · 2025-07-04T07:50:22Z

llvm/test/CodeGen/SystemZ/canonicalize-vars.ll

+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --default-march s390x-unknown-linux-gnu --version 5
+; RUN: llc -mtriple=s390x-linux-gnu -mcpu=z16 < %s | FileCheck %s -check-prefixes=Z16
+
+define half @canonicalize_fp16(half %a) {


use ) nounwind { to hide all the cfi noise from all the tests

llvmbot added backend:SystemZ llvm:SelectionDAG SelectionDAGISel as well labels May 30, 2025

arsenm reviewed May 30, 2025

View reviewed changes

llvmbot added backend:AArch64 backend:AMDGPU labels Jun 3, 2025

dominik-steenken force-pushed the generic-canonicalize branch from f08a196 to ce1fea8 Compare June 5, 2025 08:41

llvmbot added backend:PowerPC llvm:codegen labels Jun 5, 2025

dominik-steenken force-pushed the generic-canonicalize branch from ce1fea8 to 524580c Compare June 5, 2025 09:11

llvmbot added the backend:loongarch label Jun 5, 2025

dominik-steenken force-pushed the generic-canonicalize branch from 524580c to 07f5741 Compare June 5, 2025 09:13

dominik-steenken changed the title ~~__builtin.canonicalize in common code~~ llvm.canonicalize in common code Jun 6, 2025

RKSimon mentioned this pull request Jun 12, 2025

[X86] Remove LowerFCanonicalize and use generic expansion #143862

Open

RKSimon reviewed Jun 25, 2025

View reviewed changes

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Outdated Show resolved Hide resolved

llvmbot added the backend:MIPS label Jun 25, 2025

dominik-steenken changed the title ~~llvm.canonicalize in common code~~ [DAG] Add generic expansion for ISD::FCANONICALIZE nodes Jun 25, 2025

RKSimon reviewed Jun 26, 2025

View reviewed changes

llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Outdated Show resolved Hide resolved

arsenm reviewed Jun 27, 2025

View reviewed changes

dominik-steenken force-pushed the generic-canonicalize branch from badd5c1 to fdb8651 Compare June 30, 2025 10:09

dominik-steenken added 2 commits July 2, 2025 09:57

dominik-steenken force-pushed the generic-canonicalize branch from fdb8651 to 4ea37a3 Compare July 2, 2025 07:57

RKSimon reviewed Jul 4, 2025

View reviewed changes

		// This implements llvm.canonicalize.f* by multiplication with 1.0,
		// as suggested in https://llvm.org/docs/LangRef.html#id2335.

		// Insert custom handling default for llvm.canonicalize.*.
		setOperationAction(ISD::FCANONICALIZE, {MVT::f32, MVT::f64}, Expand);

[DAG] Add generic expansion for ISD::FCANONICALIZE nodes #142105

Are you sure you want to change the base?

[DAG] Add generic expansion for ISD::FCANONICALIZE nodes #142105

Conversation

dominik-steenken commented May 30, 2025 • edited by RKSimon Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dominik-steenken commented May 30, 2025

Uh oh!

llvmbot commented May 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented May 30, 2025

Uh oh!

github-actions bot commented May 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pawan-nirpal-031 commented May 30, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dominik-steenken commented Jun 5, 2025

Uh oh!

jayfoad commented Jun 5, 2025

Uh oh!

dominik-steenken commented Jun 12, 2025

Uh oh!

arsenm commented Jun 12, 2025

Uh oh!

dominik-steenken commented Jun 25, 2025

Uh oh!

Uh oh!

RKSimon commented Jun 25, 2025

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dominik-steenken commented Jun 30, 2025

Uh oh!

dominik-steenken commented Jul 2, 2025

Uh oh!

RKSimon left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

dominik-steenken commented May 30, 2025 •

edited by RKSimon

Loading

llvmbot commented May 30, 2025 •

edited

Loading

github-actions bot commented May 30, 2025 •

edited

Loading

pawan-nirpal-031 commented May 30, 2025 •

edited

Loading