Conversation

@artagnon
Contributor

Directly unroll VectorEndPointerRecipe, following commit 0636225 ([VPlan] Directly unroll VectorPointerRecipe). This allows us to leverage existing VPlan simplifications to optimize the generated plan.

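For illustration only (not part of the patch): a minimal standalone C++ sketch of the per-part end-pointer offset that direct unrolling makes explicit, using the formulas that appear in the diff below — part 0 uses Stride * (VF - 1), and each later part adds Stride * VF to the previous part's offset. The helper name and scalar signature are hypothetical.

// Hypothetical helper, for illustration only: the scalar GEP offset a
// VPVectorEndPointerRecipe computes for a given unroll part, assuming a
// (negative) element Stride and a runtime VF, per the formulas in this patch.
static int64_t endPointerOffsetForPart(int64_t Stride, int64_t VF, unsigned Part) {
  // Part 0: point at the last lane of the first vector part.
  int64_t Offset = Stride * (VF - 1);
  // Each later part steps a further Stride * VF elements past the previous one.
  for (unsigned K = 1; K <= Part; ++K)
    Offset += Stride * VF;
  return Offset;
}

This matches the previous VPVectorEndPointerRecipe::execute lowering (Stride * Part * VF + Stride * (VF - 1)); expressing it recipe-by-recipe is what lets the existing VPlan simplifications fold and reuse the intermediate terms.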
@llvmbot
Member

llvmbot commented Dec 15, 2025

@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-backend-powerpc

Author: Ramkumar Ramachandra (artagnon)

Changes

Directly unroll VectorEndPointerRecipe, following commit 0636225 ([VPlan] Directly unroll VectorPointerRecipe). This allows us to leverage existing VPlan simplifications to optimize the generated plan.


Patch is 183.28 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/172372.diff

43 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+22-3)
  • (modified) llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h (+9)
  • (modified) llvm/lib/Transforms/Vectorize/VPlan.h (+13-13)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+4-22)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp (+45-16)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp (+20-5)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll (+6-10)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-vector-reverse.ll (+12-24)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/vector-reverse.ll (+4-6)
  • (modified) llvm/test/Transforms/LoopVectorize/ARM/mve-gather-scatter-tailpred.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/ARM/tail-folding-not-allowed.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/PowerPC/optimal-epilog-vectorization.ll (+18-36)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll (+84-168)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-reverse-load-store.ll (+30-68)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-uniform-store.ll (+5-6)
  • (modified) llvm/test/Transforms/LoopVectorize/RISCV/vplan-riscv-vector-reverse.ll (+15-2)
  • (modified) llvm/test/Transforms/LoopVectorize/WebAssembly/induction-branch-cost.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/drop-inbounds-flags-for-reverse-vector-pointer.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/masked_load_store.ll (+46-70)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/multi-exit-cost.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/pr81872.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/consecutive-ptr-uniforms.ll (+3-6)
  • (modified) llvm/test/Transforms/LoopVectorize/if-reduction.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/induction-wrapflags.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/interleave-with-i65-induction.ll (+2-4)
  • (modified) llvm/test/Transforms/LoopVectorize/interleaved-accesses-gep-nowrap-flags.ll (+6-12)
  • (modified) llvm/test/Transforms/LoopVectorize/iv-select-cmp-decreasing.ll (+12-24)
  • (modified) llvm/test/Transforms/LoopVectorize/iv_outside_user.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/load-deref-pred-align.ll (+4-8)
  • (modified) llvm/test/Transforms/LoopVectorize/optimal-epilog-vectorization.ll (+2-4)
  • (modified) llvm/test/Transforms/LoopVectorize/pointer-induction.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/pr37248.ll (+2-4)
  • (modified) llvm/test/Transforms/LoopVectorize/reuse-lcssa-phi-scev-expansion.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/reverse-induction-gep-nowrap-flags.ll (+3-6)
  • (modified) llvm/test/Transforms/LoopVectorize/reverse_induction.ll (+10-20)
  • (modified) llvm/test/Transforms/LoopVectorize/runtime-check-known-true.ll (+2-4)
  • (modified) llvm/test/Transforms/LoopVectorize/runtime-check-needed-but-empty.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/runtime-checks-hoist.ll (+18-20)
  • (modified) llvm/test/Transforms/LoopVectorize/single-early-exit-interleave.ll (+16-32)
  • (modified) llvm/test/Transforms/LoopVectorize/single_early_exit_live_outs.ll (+2-4)
  • (modified) llvm/test/Transforms/LoopVectorize/skeleton-lcssa-crash.ll (+1-2)
  • (modified) llvm/test/Transforms/LoopVectorize/vplan-sink-scalars-and-merge.ll (+3-1)
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 082489f70f1c6..05e8aa81ecf46 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -7660,9 +7660,9 @@ VPWidenMemoryRecipe *VPRecipeBuilder::tryToWidenMemory(VPInstruction *VPI,
           CM.foldTailByMasking() || !GEP
               ? GEPNoWrapFlags::none()
               : GEP->getNoWrapFlags().withoutNoUnsignedWrap();
-      VectorPtr = new VPVectorEndPointerRecipe(
-          Ptr, &Plan.getVF(), getLoadStoreType(I),
-          /*Stride*/ -1, Flags, VPI->getDebugLoc());
+      VectorPtr = createVectorEndPointerRecipe(Ptr, getLoadStoreType(I),
+                                               /*Stride*/ -1, Flags,
+                                               VPI->getDebugLoc());
     } else {
       VectorPtr = new VPVectorPointerRecipe(Ptr, getLoadStoreType(I),
                                             GEP ? GEP->getNoWrapFlags()
@@ -8351,6 +8351,25 @@ VPRecipeBuilder::tryToCreatePartialReduction(VPInstruction *Reduction,
       RdxUnordered{/*VFScaleFactor=*/ScaleFactor}, ReductionI->getDebugLoc());
 }
 
+VPVectorEndPointerRecipe *VPRecipeBuilder::createVectorEndPointerRecipe(
+    VPValue *Ptr, Type *SourceElementType, int64_t Stride,
+    GEPNoWrapFlags GEPFlags, DebugLoc DbgLoc) {
+  // Offset for Part 0 = Stride * (VF - 1).
+  VPTypeAnalysis TypeInfo(Plan);
+  const DataLayout &DL =
+      Plan.getScalarHeader()->getIRBasicBlock()->getDataLayout();
+  Type *IndexTy = DL.getIndexType(TypeInfo.inferScalarType(Ptr));
+  Type *VFTy = TypeInfo.inferScalarType(&Plan.getVF());
+  VPValue *VF = Builder.createScalarZExtOrTrunc(&Plan.getVF(), IndexTy, VFTy,
+                                                DebugLoc::getUnknown());
+  VPValue *VFMinusOne = Builder.createOverflowingOp(
+      Instruction::Sub, {VF, Plan.getConstantInt(IndexTy, 1)}, {true, true});
+  VPValue *StridexVFMinusOne = Builder.createOverflowingOp(
+      Instruction::Mul, {VFMinusOne, Plan.getConstantInt(IndexTy, Stride)});
+  return new VPVectorEndPointerRecipe(Ptr, StridexVFMinusOne, SourceElementType,
+                                      Stride, GEPFlags, DbgLoc);
+}
+
 void LoopVectorizationPlanner::buildVPlansWithVPRecipes(ElementCount MinVF,
                                                         ElementCount MaxVF) {
   if (ElementCount::isKnownGT(MinVF, MaxVF))
diff --git a/llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h b/llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
index 1808be118cd2a..54d21ec00a932 100644
--- a/llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
+++ b/llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
@@ -149,6 +149,9 @@ class VPRecipeBuilder {
                                           : std::make_optional(It->second);
   }
 
+  void setInsertPoint(VPRecipeBase *IP) { Builder.setInsertPoint(IP); }
+  VPBuilder &getBuilder() const { return Builder; }
+
   /// Find all possible partial reductions in the loop and track all of those
   /// that are valid so recipes can be formed later.
   void collectScaledReductions(VFRange &Range);
@@ -162,6 +165,12 @@ class VPRecipeBuilder {
   VPRecipeBase *tryToCreatePartialReduction(VPInstruction *Reduction,
                                             unsigned ScaleFactor);
 
+  /// Create and return a VectorEndPointer recipe.
+  VPVectorEndPointerRecipe *
+  createVectorEndPointerRecipe(VPValue *Ptr, Type *SourceElementType,
+                               int64_t Stride, GEPNoWrapFlags GEPFlags,
+                               DebugLoc DbgLoc);
+
   /// Set the recipe created for given ingredient.
   void setRecipe(Instruction *I, VPRecipeBase *R) {
     assert(!Ingredient2Recipe.contains(I) &&
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index 7023a5d4af176..c4010afbe3ef1 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -1907,27 +1907,27 @@ class LLVM_ABI_FOR_TEST VPWidenGEPRecipe : public VPRecipeWithIRFlags {
 /// A recipe to compute a pointer to the last element of each part of a widened
 /// memory access for widened memory accesses of IndexedTy. Used for
 /// VPWidenMemoryRecipes or VPInterleaveRecipes that are reversed.
-class VPVectorEndPointerRecipe : public VPRecipeWithIRFlags,
-                                 public VPUnrollPartAccessor<2> {
-  Type *IndexedTy;
+class VPVectorEndPointerRecipe : public VPRecipeWithIRFlags {
+  Type *SourceElementTy;
 
   /// The constant stride of the pointer computed by this recipe, expressed in
-  /// units of IndexedTy.
+  /// units of SourceElementTy.
   int64_t Stride;
 
 public:
-  VPVectorEndPointerRecipe(VPValue *Ptr, VPValue *VF, Type *IndexedTy,
+  VPVectorEndPointerRecipe(VPValue *Ptr, VPValue *Offset, Type *SourceElementTy,
                            int64_t Stride, GEPNoWrapFlags GEPFlags, DebugLoc DL)
-      : VPRecipeWithIRFlags(VPDef::VPVectorEndPointerSC,
-                            ArrayRef<VPValue *>({Ptr, VF}), GEPFlags, DL),
-        IndexedTy(IndexedTy), Stride(Stride) {
+      : VPRecipeWithIRFlags(VPDef::VPVectorEndPointerSC, {Ptr, Offset},
+                            GEPFlags, DL),
+        SourceElementTy(SourceElementTy), Stride(Stride) {
     assert(Stride < 0 && "Stride must be negative");
   }
 
   VP_CLASSOF_IMPL(VPDef::VPVectorEndPointerSC)
 
-  VPValue *getVFValue() { return getOperand(1); }
-  const VPValue *getVFValue() const { return getOperand(1); }
+  int64_t getStride() const { return Stride; }
+  Type *getSourceElementType() const { return SourceElementTy; }
+  VPValue *getOffset() const { return getOperand(1); }
 
   void execute(VPTransformState &State) override;
 
@@ -1953,9 +1953,9 @@ class VPVectorEndPointerRecipe : public VPRecipeWithIRFlags,
   }
 
   VPVectorEndPointerRecipe *clone() override {
-    return new VPVectorEndPointerRecipe(getOperand(0), getVFValue(), IndexedTy,
-                                        Stride, getGEPNoWrapFlags(),
-                                        getDebugLoc());
+    return new VPVectorEndPointerRecipe(getOperand(0), getOffset(),
+                                        getSourceElementType(), getStride(),
+                                        getGEPNoWrapFlags(), getDebugLoc());
   }
 
 protected:
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index fb7aaec4e93a9..443a95104e722 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -2488,28 +2488,10 @@ void VPWidenGEPRecipe::printRecipe(raw_ostream &O, const Twine &Indent,
 
 void VPVectorEndPointerRecipe::execute(VPTransformState &State) {
   auto &Builder = State.Builder;
-  unsigned CurrentPart = getUnrollPart(*this);
-  const DataLayout &DL = Builder.GetInsertBlock()->getDataLayout();
-  Type *IndexTy = DL.getIndexType(State.TypeAnalysis.inferScalarType(this));
-
-  // The wide store needs to start at the last vector element.
-  Value *RunTimeVF = State.get(getVFValue(), VPLane(0));
-  if (IndexTy != RunTimeVF->getType())
-    RunTimeVF = Builder.CreateZExtOrTrunc(RunTimeVF, IndexTy);
-  // NumElt = Stride * CurrentPart * RunTimeVF
-  Value *NumElt = Builder.CreateMul(
-      ConstantInt::get(IndexTy, Stride * (int64_t)CurrentPart), RunTimeVF);
-  // LastLane = Stride * (RunTimeVF - 1)
-  Value *LastLane = Builder.CreateSub(RunTimeVF, ConstantInt::get(IndexTy, 1));
-  if (Stride != 1)
-    LastLane =
-        Builder.CreateMul(ConstantInt::getSigned(IndexTy, Stride), LastLane);
-  Value *Ptr = State.get(getOperand(0), VPLane(0));
-  Value *ResultPtr =
-      Builder.CreateGEP(IndexedTy, Ptr, NumElt, "", getGEPNoWrapFlags());
-  ResultPtr = Builder.CreateGEP(IndexedTy, ResultPtr, LastLane, "",
-                                getGEPNoWrapFlags());
-
+  Value *Ptr = State.get(getOperand(0), true);
+  Value *Offset = State.get(getOffset(), true);
+  Value *ResultPtr = Builder.CreateGEP(getSourceElementType(), Ptr, Offset, "",
+                                       getGEPNoWrapFlags());
   State.set(this, ResultPtr, /*IsScalar*/ true);
 }
 
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index 9c4f4246deb41..45238d21996a0 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -783,6 +783,15 @@ static SmallVector<VPUser *> collectUsersRecursively(VPValue *V) {
   return Users.takeVector();
 }
 
+static SmallVector<VPValue *> collectOperandsRecursively(VPRecipeBase *R) {
+  SetVector<VPValue *> Operands(llvm::from_range, R->operands());
+  for (unsigned I = 0; I != Operands.size(); ++I) {
+    if (VPRecipeBase *Cur = Operands[I]->getDefiningRecipe())
+      Operands.insert_range(Cur->operands());
+  }
+  return Operands.takeVector();
+}
+
 /// Scalarize a VPWidenPointerInductionRecipe by replacing it with a PtrAdd
 /// (IndStart, ScalarIVSteps (0, Step)). This is used when the recipe only
 /// generates scalar values.
@@ -2843,11 +2852,30 @@ static VPRecipeBase *optimizeMaskToEVL(VPValue *HeaderMask,
   VPValue *Addr, *Mask, *EndPtr;
 
   /// Adjust any end pointers so that they point to the end of EVL lanes not VF.
-  auto AdjustEndPtr = [&CurRecipe, &EVL](VPValue *EndPtr) {
-    auto *EVLEndPtr = cast<VPVectorEndPointerRecipe>(EndPtr)->clone();
-    EVLEndPtr->insertBefore(&CurRecipe);
-    EVLEndPtr->setOperand(1, &EVL);
-    return EVLEndPtr;
+  auto AdjustEndPtr = [&EVL, &Plan, &TypeInfo](VPValue *EndPtr) {
+    auto *VEPR = cast<VPVectorEndPointerRecipe>(EndPtr);
+    SmallVector<VPValue *> Operands({VEPR->getOffset()});
+    VPRecipeBase *OffsetR = VEPR->getOffset()->getDefiningRecipe();
+    append_range(Operands, collectOperandsRecursively(OffsetR));
+    for (VPValue *Op : reverse(Operands)) {
+      if (!Op->hasDefiningRecipe())
+        continue;
+      VPRecipeBase *Clone = Op->getDefiningRecipe()->clone();
+      Clone->insertBefore(VEPR);
+      VPRecipeBase *EVLR = EVL.getDefiningRecipe();
+      VPBuilder Builder(EVLR->getParent(), std::next(EVLR->getIterator()));
+      Type *EVLType = TypeInfo.inferScalarType(&EVL);
+      VPValue *EVLCast = Builder.createScalarZExtOrTrunc(
+          &EVL, TypeInfo.inferScalarType(&Plan->getVF()), EVLType,
+          EVLR->getDebugLoc());
+      Clone->replaceUsesOfWith(&Plan->getVF(), EVLCast);
+      Op->replaceUsesWithIf(
+          Clone->getVPSingleValue(), [&Operands](VPUser &U, unsigned Idx) {
+            return !is_contained(Operands,
+                                 cast<VPRecipeBase>(U).getVPSingleValue());
+          });
+    }
+    return VEPR;
   };
 
   if (match(&CurRecipe,
@@ -2858,7 +2886,7 @@ static VPRecipeBase *optimizeMaskToEVL(VPValue *HeaderMask,
 
   if (match(&CurRecipe,
             m_MaskedLoad(m_VPValue(EndPtr), m_RemoveMask(HeaderMask, Mask))) &&
-      match(EndPtr, m_VecEndPtr(m_VPValue(Addr), m_Specific(&Plan->getVF()))) &&
+      match(EndPtr, m_VecEndPtr(m_VPValue(Addr), m_VPValue())) &&
       cast<VPWidenLoadRecipe>(CurRecipe).isReverse())
     return new VPWidenLoadEVLRecipe(cast<VPWidenLoadRecipe>(CurRecipe),
                                     AdjustEndPtr(EndPtr), EVL, Mask);
@@ -2871,7 +2899,7 @@ static VPRecipeBase *optimizeMaskToEVL(VPValue *HeaderMask,
 
   if (match(&CurRecipe, m_MaskedStore(m_VPValue(EndPtr), m_VPValue(),
                                       m_RemoveMask(HeaderMask, Mask))) &&
-      match(EndPtr, m_VecEndPtr(m_VPValue(Addr), m_Specific(&Plan->getVF()))) &&
+      match(EndPtr, m_VecEndPtr(m_VPValue(Addr), m_VPValue())) &&
       cast<VPWidenStoreRecipe>(CurRecipe).isReverse())
     return new VPWidenStoreEVLRecipe(cast<VPWidenStoreRecipe>(CurRecipe),
                                      AdjustEndPtr(EndPtr), EVL, Mask);
@@ -2916,10 +2944,10 @@ static void transformRecipestoEVLRecipes(VPlan &Plan, VPValue &EVL) {
   VPRegionBlock *LoopRegion = Plan.getVectorLoopRegion();
   VPBasicBlock *Header = LoopRegion->getEntryBasicBlock();
 
-  assert(all_of(Plan.getVF().users(),
-                IsaPred<VPVectorEndPointerRecipe, VPScalarIVStepsRecipe,
-                        VPWidenIntOrFpInductionRecipe>) &&
-         "User of VF that we can't transform to EVL.");
+  assert(
+      all_of(Plan.getVF().users(), IsaPred<VPInstruction, VPScalarIVStepsRecipe,
+                                           VPWidenIntOrFpInductionRecipe>) &&
+      "User of VF that we can't transform to EVL.");
   Plan.getVF().replaceUsesWithIf(&EVL, [](VPUser &U, unsigned Idx) {
     return isa<VPWidenIntOrFpInductionRecipe, VPScalarIVStepsRecipe>(U);
   });
@@ -3412,6 +3440,7 @@ void VPlanTransforms::createInterleaveGroups(
     Instruction *IRInsertPos = IG->getInsertPos();
     auto *InsertPos =
         cast<VPWidenMemoryRecipe>(RecipeBuilder.getRecipe(IRInsertPos));
+    RecipeBuilder.setInsertPoint(InsertPos);
 
     GEPNoWrapFlags NW = GEPNoWrapFlags::none();
     if (auto *Gep = dyn_cast<GetElementPtrInst>(
@@ -3421,6 +3450,7 @@ void VPlanTransforms::createInterleaveGroups(
     // Get or create the start address for the interleave group.
     VPValue *Addr = Start->getAddr();
     VPRecipeBase *AddrDef = Addr->getDefiningRecipe();
+    VPBuilder &B = RecipeBuilder.getBuilder();
     if (AddrDef && !VPDT.properlyDominates(AddrDef, InsertPos)) {
       // We cannot re-use the address of member zero because it does not
       // dominate the insert position. Instead, use the address of the insert
@@ -3436,7 +3466,6 @@ void VPlanTransforms::createInterleaveGroups(
                        IG->getIndex(IRInsertPos),
                    /*IsSigned=*/true);
       VPValue *OffsetVPV = Plan.getConstantInt(-Offset);
-      VPBuilder B(InsertPos);
       Addr = B.createNoWrapPtrAdd(InsertPos->getAddr(), OffsetVPV, NW);
     }
     // If the group is reverse, adjust the index to refer to the last vector
@@ -3444,10 +3473,10 @@ void VPlanTransforms::createInterleaveGroups(
     // lane, rather than directly getting the pointer for lane VF - 1, because
     // the pointer operand of the interleaved access is supposed to be uniform.
     if (IG->isReverse()) {
-      auto *ReversePtr = new VPVectorEndPointerRecipe(
-          Addr, &Plan.getVF(), getLoadStoreType(IRInsertPos),
-          -(int64_t)IG->getFactor(), NW, InsertPos->getDebugLoc());
-      ReversePtr->insertBefore(InsertPos);
+      auto *ReversePtr = RecipeBuilder.createVectorEndPointerRecipe(
+          Addr, getLoadStoreType(IRInsertPos), -(int64_t)IG->getFactor(), NW,
+          InsertPos->getDebugLoc());
+      B.insert(ReversePtr);
       Addr = ReversePtr;
     }
     auto *VPIG = new VPInterleaveRecipe(IG, Addr, StoredValues,
diff --git a/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp b/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
index 8198945764936..d35171929dc5f 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
@@ -313,6 +313,25 @@ void UnrollState::unrollRecipeByUF(VPRecipeBase &R) {
       Copy->addOperand(VFxPart);
       continue;
     }
+    if (auto *VEPR = dyn_cast<VPVectorEndPointerRecipe>(&R)) {
+      VPBuilder Builder(VEPR);
+      VPValue *PrevOffset =
+          cast<VPVectorEndPointerRecipe>(getValueForPart(VEPR, Part - 1))
+              ->getOffset();
+      Type *IndexTy = TypeInfo.inferScalarType(PrevOffset);
+      Type *VFTy = TypeInfo.inferScalarType(&Plan.getVF());
+      VPValue *VF = Builder.createScalarZExtOrTrunc(
+          &Plan.getVF(), IndexTy, VFTy, DebugLoc::getUnknown());
+      // Offset = PrevOffset + Stride * VF.
+      VPValue *VFxStride = Builder.createOverflowingOp(
+          Instruction::Mul,
+          {VF, Plan.getConstantInt(IndexTy, VEPR->getStride())});
+      VPValue *Offset = Builder.createOverflowingOp(Instruction::Add,
+                                                    {PrevOffset, VFxStride});
+      Copy->setOperand(0, VEPR->getOperand(0));
+      Copy->setOperand(1, Offset);
+      continue;
+    }
     if (auto *Red = dyn_cast<VPReductionRecipe>(&R)) {
       auto *Phi = dyn_cast<VPReductionPHIRecipe>(R.getOperand(0));
       if (Phi && Phi->isOrdered()) {
@@ -329,14 +348,10 @@ void UnrollState::unrollRecipeByUF(VPRecipeBase &R) {
 
     // Add operand indicating the part to generate code for, to recipes still
     // requiring it.
-    if (isa<VPScalarIVStepsRecipe, VPWidenCanonicalIVRecipe,
-            VPVectorEndPointerRecipe>(Copy) ||
+    if (isa<VPScalarIVStepsRecipe, VPWidenCanonicalIVRecipe>(Copy) ||
         match(Copy,
               m_VPInstruction<VPInstruction::CanonicalIVIncrementForPart>()))
       Copy->addOperand(getConstantInt(Part));
-
-    if (isa<VPVectorEndPointerRecipe>(R))
-      Copy->setOperand(0, R.getOperand(0));
   }
 }
 
diff --git a/llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp b/llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp
index 2d63d2a787f88..86c08a9fc2be2 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp
@@ -171,8 +171,7 @@ bool VPlanVerifier::verifyEVLRecipe(const VPInstruction &EVL) const {
           }
           return VerifyEVLUse(*R, 2);
         })
-        .Case<VPWidenLoadEVLRecipe, VPVectorEndPointerRecipe,
-              VPInterleaveEVLRecipe>(
+        .Case<VPWidenLoadEVLRecipe, VPInterleaveEVLRecipe>(
             [&](const VPRecipeBase *R) { return VerifyEVLUse(*R, 1); })
         .Case<VPInstructionWithType>(
             [&](const VPInstructionWithType *S) { return VerifyEVLUse(*S, 0); })
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll b/llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll
index 8935010e71676..cd0edf92622bc 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll
@@ -359,6 +359,8 @@ define void @test_reversed_load2_store2(ptr noalias nocapture readonly %A, ptr n
 ; CHECK:       vector.ph:
 ; CHECK-NEXT:    [[TMP0:%.*]] = call i64 @llvm.vscale.i64()
 ; CHECK-NEXT:    [[TMP1:%.*]] = shl nuw nsw i64 [[TMP0]], 2
+; CHECK-NEXT:    [[TMP5:%.*]] = shl nuw nsw i64 [[TMP0]], 3
+; CHECK-NEXT:    [[TMP8:%.*]] = sub nsw i64 2, [[TMP5]]
 ; CHECK-NEXT:    [[TMP2:%.*]] = call <vscale x 4 x i32> @llvm.stepvector.nxv4i32()
 ; CHECK-NEXT:    [[INDUCTION:%.*]] = sub <vscale x 4 x i32> splat (i32 1023), [[TMP2]]
 ; CHECK-NEXT:    [[TMP3:%.*]] = trunc nuw nsw i64 [[TMP1]] to i32
@@ -371,8 +373,6 @@ define void @test_reversed_load2_store2(ptr noalias nocapture readonly %A, ptr n
 ; CHECK-NEXT:    [[VEC_IND:%.*]] = phi <vscale x 4 x i32> [ [[INDUCTION]], [[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], [[VECTOR_BODY]] ]
 ; CHECK-NEXT:    [[OFFSET_IDX:%.*]] = sub i64 1023, [[INDEX]]
 ; CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_ST2:%.*]], ptr [[A:%.*]], i64 [[OFFSET_IDX]]
-; CHECK-NEXT:    [[TMP6:%.*]] = shl nuw nsw i64 [[TMP0]], 3
-; CHECK-NEXT:    [[TMP8:%.*]] = sub nsw i64 2, [[TMP6]]
 ; CHECK-NEXT:    [[TMP9:%.*]] = getelementptr inbounds i32, ptr [[TMP4]], i64 [[TMP8]]
 ; CHECK-NEXT:    [[WIDE_VEC:%.*]] = load <vscale x 8 x i32>, ptr [[TMP9]], align 4
 ; CHECK-NEXT:    [[STRIDED_VEC:%.*]] = call { <vscale x 4 x i32>, <vscale x 4 x i32> } @llvm.vector.deinterleave2.nxv8i32(<vscale x 8 x i32> [[WIDE_VEC]])
@@ -383,9 +383,7 @@ define void @test_reversed_load2_store2(ptr noalias nocapture readonly %A, ptr n
 ; CHECK-NEXT:    [[TMP12:%.*]] = add nsw <vscale x 4 x i32> [[REVERSE]], [[VEC_IND]]
 ; CHECK-NEXT:    [[TMP13:%.*]] = sub nsw <vscale x 4 x i32> [[REVERSE1]], [[VEC_IND]]
 ; CHECK-NEXT:    [[TMP14:%.*]] = getelementptr inbounds [[STRUCT_ST2]], ptr [[B:%.*]], i64 [[OFFSET_IDX]]
-; CHECK-NEXT:    [[TMP15:%.*]] = shl nuw nsw i64 [[TMP0]], 3
-; CHECK-NEXT:    [[TMP18:%.*]] = sub nsw i64 2, [[TMP15]]
-; CHECK-NEXT:    [[TMP19:%.*]] = getelementptr inbounds i32, ptr [[TMP14]], i64 [[TMP18]]
+; CHECK-NEXT:    [[TMP19:%.*]] = getelementptr inbounds i32, ptr [[TMP14]], i64 [[TMP8]]
 ; CHECK-NEXT:    [[REVERSE2:%.*]] = call <vscale x 4 x i32> @llvm.vector.reverse.nxv4i32(<vscale x 4 x i32> [[TMP12]])
 ; CHECK-NEXT:    [[REVERSE3:%.*]] = call <vscale x 4 x i32> @llvm.vector.reverse.nxv4i32(<vscale x 4 x i32> [[TMP13]])
 ; CHECK-NEXT:    [[INTERLEAVED_VEC:%.*]] = call <vscale x 8 x i32> @llvm.v...
[truncated]
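Also for illustration (again not the VPlan API), tied to the AdjustEndPtr change in the VPlanTransforms.cpp hunk above: when a reversed masked load or store is converted to an EVL recipe, the cloned offset expression has its VF operand replaced with a cast of EVL, so the end pointer lands on the last EVL lane rather than the last VF lane. In scalar terms, with a hypothetical helper:

// Hypothetical helper, for illustration only: the part-0 end-pointer offset
// after the EVL adjustment, i.e. the VF-based formula with EVL substituted.
static int64_t evlAdjustedEndPointerOffset(int64_t Stride, int64_t EVL) {
  return Stride * (EVL - 1);
}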

@llvmbot
Member

llvmbot commented Dec 15, 2025

@llvm/pr-subscribers-llvm-transforms


@llvmbot
Member

llvmbot commented Dec 15, 2025

@llvm/pr-subscribers-backend-risc-v

   auto &Builder = State.Builder;
-  unsigned CurrentPart = getUnrollPart(*this);
-  const DataLayout &DL = Builder.GetInsertBlock()->getDataLayout();
-  Type *IndexTy = DL.getIndexType(State.TypeAnalysis.inferScalarType(this));
-
-  // The wide store needs to start at the last vector element.
-  Value *RunTimeVF = State.get(getVFValue(), VPLane(0));
-  if (IndexTy != RunTimeVF->getType())
-    RunTimeVF = Builder.CreateZExtOrTrunc(RunTimeVF, IndexTy);
-  // NumElt = Stride * CurrentPart * RunTimeVF
-  Value *NumElt = Builder.CreateMul(
-      ConstantInt::get(IndexTy, Stride * (int64_t)CurrentPart), RunTimeVF);
-  // LastLane = Stride * (RunTimeVF - 1)
-  Value *LastLane = Builder.CreateSub(RunTimeVF, ConstantInt::get(IndexTy, 1));
-  if (Stride != 1)
-    LastLane =
-        Builder.CreateMul(ConstantInt::getSigned(IndexTy, Stride), LastLane);
-  Value *Ptr = State.get(getOperand(0), VPLane(0));
-  Value *ResultPtr =
-      Builder.CreateGEP(IndexedTy, Ptr, NumElt, "", getGEPNoWrapFlags());
-  ResultPtr = Builder.CreateGEP(IndexedTy, ResultPtr, LastLane, "",
-                                getGEPNoWrapFlags());
-
+  Value *Ptr = State.get(getOperand(0), true);
+  Value *Offset = State.get(getOffset(), true);
+  Value *ResultPtr = Builder.CreateGEP(getSourceElementType(), Ptr, Offset, "",
+                                       getGEPNoWrapFlags());
   State.set(this, ResultPtr, /*IsScalar*/ true);
 }
 
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index 9c4f4246deb41..45238d21996a0 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -783,6 +783,15 @@ static SmallVector<VPUser *> collectUsersRecursively(VPValue *V) {
   return Users.takeVector();
 }
 
+static SmallVector<VPValue *> collectOperandsRecursively(VPRecipeBase *R) {
+  SetVector<VPValue *> Operands(llvm::from_range, R->operands());
+  for (unsigned I = 0; I != Operands.size(); ++I) {
+    if (VPRecipeBase *Cur = Operands[I]->getDefiningRecipe())
+      Operands.insert_range(Cur->operands());
+  }
+  return Operands.takeVector();
+}
+
 /// Scalarize a VPWidenPointerInductionRecipe by replacing it with a PtrAdd
 /// (IndStart, ScalarIVSteps (0, Step)). This is used when the recipe only
 /// generates scalar values.
@@ -2843,11 +2852,30 @@ static VPRecipeBase *optimizeMaskToEVL(VPValue *HeaderMask,
   VPValue *Addr, *Mask, *EndPtr;
 
   /// Adjust any end pointers so that they point to the end of EVL lanes not VF.
-  auto AdjustEndPtr = [&CurRecipe, &EVL](VPValue *EndPtr) {
-    auto *EVLEndPtr = cast<VPVectorEndPointerRecipe>(EndPtr)->clone();
-    EVLEndPtr->insertBefore(&CurRecipe);
-    EVLEndPtr->setOperand(1, &EVL);
-    return EVLEndPtr;
+  auto AdjustEndPtr = [&EVL, &Plan, &TypeInfo](VPValue *EndPtr) {
+    auto *VEPR = cast<VPVectorEndPointerRecipe>(EndPtr);
+    SmallVector<VPValue *> Operands({VEPR->getOffset()});
+    VPRecipeBase *OffsetR = VEPR->getOffset()->getDefiningRecipe();
+    append_range(Operands, collectOperandsRecursively(OffsetR));
+    for (VPValue *Op : reverse(Operands)) {
+      if (!Op->hasDefiningRecipe())
+        continue;
+      VPRecipeBase *Clone = Op->getDefiningRecipe()->clone();
+      Clone->insertBefore(VEPR);
+      VPRecipeBase *EVLR = EVL.getDefiningRecipe();
+      VPBuilder Builder(EVLR->getParent(), std::next(EVLR->getIterator()));
+      Type *EVLType = TypeInfo.inferScalarType(&EVL);
+      VPValue *EVLCast = Builder.createScalarZExtOrTrunc(
+          &EVL, TypeInfo.inferScalarType(&Plan->getVF()), EVLType,
+          EVLR->getDebugLoc());
+      Clone->replaceUsesOfWith(&Plan->getVF(), EVLCast);
+      Op->replaceUsesWithIf(
+          Clone->getVPSingleValue(), [&Operands](VPUser &U, unsigned Idx) {
+            return !is_contained(Operands,
+                                 cast<VPRecipeBase>(U).getVPSingleValue());
+          });
+    }
+    return VEPR;
   };
 
   if (match(&CurRecipe,
@@ -2858,7 +2886,7 @@ static VPRecipeBase *optimizeMaskToEVL(VPValue *HeaderMask,
 
   if (match(&CurRecipe,
             m_MaskedLoad(m_VPValue(EndPtr), m_RemoveMask(HeaderMask, Mask))) &&
-      match(EndPtr, m_VecEndPtr(m_VPValue(Addr), m_Specific(&Plan->getVF()))) &&
+      match(EndPtr, m_VecEndPtr(m_VPValue(Addr), m_VPValue())) &&
       cast<VPWidenLoadRecipe>(CurRecipe).isReverse())
     return new VPWidenLoadEVLRecipe(cast<VPWidenLoadRecipe>(CurRecipe),
                                     AdjustEndPtr(EndPtr), EVL, Mask);
@@ -2871,7 +2899,7 @@ static VPRecipeBase *optimizeMaskToEVL(VPValue *HeaderMask,
 
   if (match(&CurRecipe, m_MaskedStore(m_VPValue(EndPtr), m_VPValue(),
                                       m_RemoveMask(HeaderMask, Mask))) &&
-      match(EndPtr, m_VecEndPtr(m_VPValue(Addr), m_Specific(&Plan->getVF()))) &&
+      match(EndPtr, m_VecEndPtr(m_VPValue(Addr), m_VPValue())) &&
       cast<VPWidenStoreRecipe>(CurRecipe).isReverse())
     return new VPWidenStoreEVLRecipe(cast<VPWidenStoreRecipe>(CurRecipe),
                                      AdjustEndPtr(EndPtr), EVL, Mask);
@@ -2916,10 +2944,10 @@ static void transformRecipestoEVLRecipes(VPlan &Plan, VPValue &EVL) {
   VPRegionBlock *LoopRegion = Plan.getVectorLoopRegion();
   VPBasicBlock *Header = LoopRegion->getEntryBasicBlock();
 
-  assert(all_of(Plan.getVF().users(),
-                IsaPred<VPVectorEndPointerRecipe, VPScalarIVStepsRecipe,
-                        VPWidenIntOrFpInductionRecipe>) &&
-         "User of VF that we can't transform to EVL.");
+  assert(
+      all_of(Plan.getVF().users(), IsaPred<VPInstruction, VPScalarIVStepsRecipe,
+                                           VPWidenIntOrFpInductionRecipe>) &&
+      "User of VF that we can't transform to EVL.");
   Plan.getVF().replaceUsesWithIf(&EVL, [](VPUser &U, unsigned Idx) {
     return isa<VPWidenIntOrFpInductionRecipe, VPScalarIVStepsRecipe>(U);
   });
@@ -3412,6 +3440,7 @@ void VPlanTransforms::createInterleaveGroups(
     Instruction *IRInsertPos = IG->getInsertPos();
     auto *InsertPos =
         cast<VPWidenMemoryRecipe>(RecipeBuilder.getRecipe(IRInsertPos));
+    RecipeBuilder.setInsertPoint(InsertPos);
 
     GEPNoWrapFlags NW = GEPNoWrapFlags::none();
     if (auto *Gep = dyn_cast<GetElementPtrInst>(
@@ -3421,6 +3450,7 @@ void VPlanTransforms::createInterleaveGroups(
     // Get or create the start address for the interleave group.
     VPValue *Addr = Start->getAddr();
     VPRecipeBase *AddrDef = Addr->getDefiningRecipe();
+    VPBuilder &B = RecipeBuilder.getBuilder();
     if (AddrDef && !VPDT.properlyDominates(AddrDef, InsertPos)) {
       // We cannot re-use the address of member zero because it does not
       // dominate the insert position. Instead, use the address of the insert
@@ -3436,7 +3466,6 @@ void VPlanTransforms::createInterleaveGroups(
                        IG->getIndex(IRInsertPos),
                    /*IsSigned=*/true);
       VPValue *OffsetVPV = Plan.getConstantInt(-Offset);
-      VPBuilder B(InsertPos);
       Addr = B.createNoWrapPtrAdd(InsertPos->getAddr(), OffsetVPV, NW);
     }
     // If the group is reverse, adjust the index to refer to the last vector
@@ -3444,10 +3473,10 @@ void VPlanTransforms::createInterleaveGroups(
     // lane, rather than directly getting the pointer for lane VF - 1, because
     // the pointer operand of the interleaved access is supposed to be uniform.
     if (IG->isReverse()) {
-      auto *ReversePtr = new VPVectorEndPointerRecipe(
-          Addr, &Plan.getVF(), getLoadStoreType(IRInsertPos),
-          -(int64_t)IG->getFactor(), NW, InsertPos->getDebugLoc());
-      ReversePtr->insertBefore(InsertPos);
+      auto *ReversePtr = RecipeBuilder.createVectorEndPointerRecipe(
+          Addr, getLoadStoreType(IRInsertPos), -(int64_t)IG->getFactor(), NW,
+          InsertPos->getDebugLoc());
+      B.insert(ReversePtr);
       Addr = ReversePtr;
     }
     auto *VPIG = new VPInterleaveRecipe(IG, Addr, StoredValues,
diff --git a/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp b/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
index 8198945764936..d35171929dc5f 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
@@ -313,6 +313,25 @@ void UnrollState::unrollRecipeByUF(VPRecipeBase &R) {
       Copy->addOperand(VFxPart);
       continue;
     }
+    if (auto *VEPR = dyn_cast<VPVectorEndPointerRecipe>(&R)) {
+      VPBuilder Builder(VEPR);
+      VPValue *PrevOffset =
+          cast<VPVectorEndPointerRecipe>(getValueForPart(VEPR, Part - 1))
+              ->getOffset();
+      Type *IndexTy = TypeInfo.inferScalarType(PrevOffset);
+      Type *VFTy = TypeInfo.inferScalarType(&Plan.getVF());
+      VPValue *VF = Builder.createScalarZExtOrTrunc(
+          &Plan.getVF(), IndexTy, VFTy, DebugLoc::getUnknown());
+      // Offset = PrevOffset + Stride * VF.
+      VPValue *VFxStride = Builder.createOverflowingOp(
+          Instruction::Mul,
+          {VF, Plan.getConstantInt(IndexTy, VEPR->getStride())});
+      VPValue *Offset = Builder.createOverflowingOp(Instruction::Add,
+                                                    {PrevOffset, VFxStride});
+      Copy->setOperand(0, VEPR->getOperand(0));
+      Copy->setOperand(1, Offset);
+      continue;
+    }
     if (auto *Red = dyn_cast<VPReductionRecipe>(&R)) {
       auto *Phi = dyn_cast<VPReductionPHIRecipe>(R.getOperand(0));
       if (Phi && Phi->isOrdered()) {
@@ -329,14 +348,10 @@ void UnrollState::unrollRecipeByUF(VPRecipeBase &R) {
 
     // Add operand indicating the part to generate code for, to recipes still
     // requiring it.
-    if (isa<VPScalarIVStepsRecipe, VPWidenCanonicalIVRecipe,
-            VPVectorEndPointerRecipe>(Copy) ||
+    if (isa<VPScalarIVStepsRecipe, VPWidenCanonicalIVRecipe>(Copy) ||
         match(Copy,
               m_VPInstruction<VPInstruction::CanonicalIVIncrementForPart>()))
       Copy->addOperand(getConstantInt(Part));
-
-    if (isa<VPVectorEndPointerRecipe>(R))
-      Copy->setOperand(0, R.getOperand(0));
   }
 }
 
diff --git a/llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp b/llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp
index 2d63d2a787f88..86c08a9fc2be2 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp
@@ -171,8 +171,7 @@ bool VPlanVerifier::verifyEVLRecipe(const VPInstruction &EVL) const {
           }
           return VerifyEVLUse(*R, 2);
         })
-        .Case<VPWidenLoadEVLRecipe, VPVectorEndPointerRecipe,
-              VPInterleaveEVLRecipe>(
+        .Case<VPWidenLoadEVLRecipe, VPInterleaveEVLRecipe>(
             [&](const VPRecipeBase *R) { return VerifyEVLUse(*R, 1); })
         .Case<VPInstructionWithType>(
             [&](const VPInstructionWithType *S) { return VerifyEVLUse(*S, 0); })
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll b/llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll
index 8935010e71676..cd0edf92622bc 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll
@@ -359,6 +359,8 @@ define void @test_reversed_load2_store2(ptr noalias nocapture readonly %A, ptr n
 ; CHECK:       vector.ph:
 ; CHECK-NEXT:    [[TMP0:%.*]] = call i64 @llvm.vscale.i64()
 ; CHECK-NEXT:    [[TMP1:%.*]] = shl nuw nsw i64 [[TMP0]], 2
+; CHECK-NEXT:    [[TMP5:%.*]] = shl nuw nsw i64 [[TMP0]], 3
+; CHECK-NEXT:    [[TMP8:%.*]] = sub nsw i64 2, [[TMP5]]
 ; CHECK-NEXT:    [[TMP2:%.*]] = call <vscale x 4 x i32> @llvm.stepvector.nxv4i32()
 ; CHECK-NEXT:    [[INDUCTION:%.*]] = sub <vscale x 4 x i32> splat (i32 1023), [[TMP2]]
 ; CHECK-NEXT:    [[TMP3:%.*]] = trunc nuw nsw i64 [[TMP1]] to i32
@@ -371,8 +373,6 @@ define void @test_reversed_load2_store2(ptr noalias nocapture readonly %A, ptr n
 ; CHECK-NEXT:    [[VEC_IND:%.*]] = phi <vscale x 4 x i32> [ [[INDUCTION]], [[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], [[VECTOR_BODY]] ]
 ; CHECK-NEXT:    [[OFFSET_IDX:%.*]] = sub i64 1023, [[INDEX]]
 ; CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds [[STRUCT_ST2:%.*]], ptr [[A:%.*]], i64 [[OFFSET_IDX]]
-; CHECK-NEXT:    [[TMP6:%.*]] = shl nuw nsw i64 [[TMP0]], 3
-; CHECK-NEXT:    [[TMP8:%.*]] = sub nsw i64 2, [[TMP6]]
 ; CHECK-NEXT:    [[TMP9:%.*]] = getelementptr inbounds i32, ptr [[TMP4]], i64 [[TMP8]]
 ; CHECK-NEXT:    [[WIDE_VEC:%.*]] = load <vscale x 8 x i32>, ptr [[TMP9]], align 4
 ; CHECK-NEXT:    [[STRIDED_VEC:%.*]] = call { <vscale x 4 x i32>, <vscale x 4 x i32> } @llvm.vector.deinterleave2.nxv8i32(<vscale x 8 x i32> [[WIDE_VEC]])
@@ -383,9 +383,7 @@ define void @test_reversed_load2_store2(ptr noalias nocapture readonly %A, ptr n
 ; CHECK-NEXT:    [[TMP12:%.*]] = add nsw <vscale x 4 x i32> [[REVERSE]], [[VEC_IND]]
 ; CHECK-NEXT:    [[TMP13:%.*]] = sub nsw <vscale x 4 x i32> [[REVERSE1]], [[VEC_IND]]
 ; CHECK-NEXT:    [[TMP14:%.*]] = getelementptr inbounds [[STRUCT_ST2]], ptr [[B:%.*]], i64 [[OFFSET_IDX]]
-; CHECK-NEXT:    [[TMP15:%.*]] = shl nuw nsw i64 [[TMP0]], 3
-; CHECK-NEXT:    [[TMP18:%.*]] = sub nsw i64 2, [[TMP15]]
-; CHECK-NEXT:    [[TMP19:%.*]] = getelementptr inbounds i32, ptr [[TMP14]], i64 [[TMP18]]
+; CHECK-NEXT:    [[TMP19:%.*]] = getelementptr inbounds i32, ptr [[TMP14]], i64 [[TMP8]]
 ; CHECK-NEXT:    [[REVERSE2:%.*]] = call <vscale x 4 x i32> @llvm.vector.reverse.nxv4i32(<vscale x 4 x i32> [[TMP12]])
 ; CHECK-NEXT:    [[REVERSE3:%.*]] = call <vscale x 4 x i32> @llvm.vector.reverse.nxv4i32(<vscale x 4 x i32> [[TMP13]])
 ; CHECK-NEXT:    [[INTERLEAVED_VEC:%.*]] = call <vscale x 8 x i32> @llvm.v...
[truncated]
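For readers following the new offset arithmetic: createVectorEndPointerRecipe emits Offset(part 0) = Stride * (VF - 1), and unrollRecipeByUF adds Stride * VF for each subsequent part. The small standalone program below (illustrative only; the VF, Stride, and UF values are assumed, not taken from any test in the patch) reproduces that computation directly:

#include <cstdint>
#include <cstdio>

// Illustrative only: VF = 4, Stride = -1, UF = 3 are assumed values.
// The patch materializes these offsets as VPlan recipes; here they are
// computed directly to show the per-part GEP offsets.
int main() {
  const int64_t VF = 4, Stride = -1, UF = 3;
  int64_t Offset = Stride * (VF - 1); // part 0: -3
  for (int64_t Part = 0; Part < UF; ++Part) {
    std::printf("part %lld: getelementptr <elt>, ptr, i64 %lld\n",
                static_cast<long long>(Part), static_cast<long long>(Offset));
    Offset += Stride * VF; // parts 1 and 2: -7, -11
  }
  return 0;
}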

  Value *Ptr = State.get(getOperand(0), true);
  Value *Offset = State.get(getOffset(), true);
  Value *ResultPtr = Builder.CreateGEP(getSourceElementType(), Ptr, Offset, "",
                                       getGEPNoWrapFlags());
Contributor
Is this now identical to VPVectorPointerRecipe? Not for this PR, but maybe we can unify them at some point.
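If they do converge, a rough sketch of what a unified execute() could look like, under a hypothetical class name and reusing only the GEP emission shown in this hunk (not an existing LLVM API):

// Hypothetical unified recipe; the name VPVectorPointerLikeRecipe is invented
// for illustration. Both recipes would reduce to a single scalar GEP of the
// source element type by a precomputed scalar offset operand.
void VPVectorPointerLikeRecipe::execute(VPTransformState &State) {
  auto &Builder = State.Builder;
  Value *Ptr = State.get(getOperand(0), /*IsScalar=*/true);
  Value *Offset = State.get(getOperand(1), /*IsScalar=*/true);
  Value *ResultPtr = Builder.CreateGEP(getSourceElementType(), Ptr, Offset, "",
                                       getGEPNoWrapFlags());
  State.set(this, ResultPtr, /*IsScalar=*/true);
}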

          Clone->getVPSingleValue(), [&Operands](VPUser &U, unsigned Idx) {
            return !is_contained(Operands,
                                 cast<VPRecipeBase>(U).getVPSingleValue());
          });
Contributor
It's pretty unfortunate that changing the VF is a lot more complicated now. If it makes it any easier, we should never have an ExplicitVectorLength with UF != 1. Maybe you can just assert here instead of trying to handle it fully generally.
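A minimal sketch of that suggestion, assuming the EVL transform only ever runs on un-unrolled plans so the offset is exactly Stride * (VF - 1) and can simply be rebuilt from EVL; the builder calls mirror the hunk above, and the placement of the shape check is illustrative:

  // Sketch only: rebuild the offset as Stride * (EVL - 1) instead of cloning
  // and rewriting the whole offset chain.
  auto AdjustEndPtr = [&EVL, &Plan, &TypeInfo, &CurRecipe](VPValue *EndPtr) {
    auto *VEPR = cast<VPVectorEndPointerRecipe>(EndPtr);
    // An assert that VEPR's offset still has the single-part
    // Stride * (VF - 1) shape would go here.
    VPBuilder Builder(VEPR);
    Type *IndexTy = TypeInfo.inferScalarType(VEPR->getOffset());
    VPValue *EVLCast = Builder.createScalarZExtOrTrunc(
        &EVL, IndexTy, TypeInfo.inferScalarType(&EVL), DebugLoc::getUnknown());
    VPValue *EVLMinusOne = Builder.createOverflowingOp(
        Instruction::Sub, {EVLCast, Plan->getConstantInt(IndexTy, 1)},
        {true, true});
    VPValue *NewOffset = Builder.createOverflowingOp(
        Instruction::Mul,
        {EVLMinusOne, Plan->getConstantInt(IndexTy, VEPR->getStride())});
    auto *Clone = VEPR->clone();
    Clone->insertBefore(&CurRecipe);
    Clone->setOperand(1, NewOffset);
    return Clone;
  };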

   if (match(&CurRecipe,
             m_MaskedLoad(m_VPValue(EndPtr), m_RemoveMask(HeaderMask, Mask))) &&
-      match(EndPtr, m_VecEndPtr(m_VPValue(Addr), m_Specific(&Plan->getVF()))) &&
+      match(EndPtr, m_VecEndPtr(m_VPValue(Addr), m_VPValue())) &&
Contributor
You still need to match here for -1 * (VF - 1), otherwise it's not a correct transform. I guess you could add a helper matcher for that?
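One possible shape for such a helper, sketched on the assumption that VPlanPatternMatch provides m_Mul, m_Sub, m_Specific, and m_SpecificInt style combinators (exact names, operand order, and the zext/trunc that createVectorEndPointerRecipe may wrap around VF would need checking against the real header):

// Sketch only: true if Offset is Stride * (VF - 1), i.e. the single-part
// end-pointer offset this transform expects. Does not look through a
// zext/trunc of VF, which would need an extra case.
static bool isEndPointerOffset(VPValue *Offset, VPValue *VF, int64_t Stride) {
  using namespace llvm::VPlanPatternMatch;
  return match(Offset, m_Mul(m_Sub(m_Specific(VF), m_SpecificInt(1)),
                             m_SpecificInt(Stride)));
}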
