Skip to content

Conversation

abidh
Copy link
Contributor

@abidh abidh commented Sep 5, 2025

Consider the following small OpenMP target region:

!$omp target map(tofrom: x)
  x = x + 1
!$omp end target

Currently, when compiled with flang, it will generate an outlined function like below (with irrelevant bits removed).

void @__omp_offloading_10303_14e8afc__QQmain_l13(ptr %0, ptr %1) { entry:
  %2 = alloca ptr, align 8, addrspace(5)
  %3 = addrspacecast ptr addrspace(5) %2 to ptr
...
  br i1 %exec_user_code, label %user_code.entry, label %worker.exit

user_code.entry:
  %5 = load ptr, ptr %3, align 8, !align !19
  br label %omp.region.after_alloca

omp.region.after_alloca:
  br label %outlined.body

outlined.body:
  br label %omp.target

omp.target:
    #dbg_declare(ptr addrspace(5) %2, !20, !DIExpression(), !21)
  ...
  br label %omp.region.cont, !dbg !23

omp.region.cont:
  call void @__kmpc_target_deinit()
  ret void

worker.exit:
  ret void
}

Due to how various basic blocks are generated to implement target region for device, the debug record for variable x end up in a different block to the location of the variable (%5 in this case). The backend can drop such debug records.

This PR moves such records in the correct block. If the location field of the debug record is an Instruction, it will ensure that debug record is in same block as the instruction. If the location is an Argument, the debug record will be moved to the entry block.

Some minor cleanup in the UpdateDebugRecord as that brings it to use the similar logic as in MoveDebugRecordToCorrectBlock.

Consider the following small OpenMP target region:

!$omp target map(tofrom: x)
  x = x + 1
!$omp end target

Currently, when compiled with flang, it will generate an outlined
function like below (with irrelevent bits removed).

void @__omp_offloading_10303_14e8afc__QQmain_l13(ptr %0, ptr %1) {
entry:
  %2 = alloca ptr, align 8, addrspace(5)
  %3 = addrspacecast ptr addrspace(5) %2 to ptr
...
  br i1 %exec_user_code, label %user_code.entry, label %worker.exit

user_code.entry:
  %5 = load ptr, ptr %3, align 8, !align !19
  br label %omp.region.after_alloca

omp.region.after_alloca:
  br label %outlined.body

outlined.body:
  br label %omp.target

omp.target:
    #dbg_declare(ptr addrspace(5) %2, !20, !DIExpression(), !21)
  ...
  br label %omp.region.cont, !dbg !23

omp.region.cont:
  call void @__kmpc_target_deinit()
  ret void

worker.exit:                                      ; preds = %entry
  ret void
}

Due to how various basic blocks are generated, the debug record for
variable x end up in a different block to the location of the variable
(%5 in this case). The backend drops such debug records.

This PR moves such records in the correct block. If the location field
of the debug record is an instruction, it will ensure that debug record
is in same block as the instruction. If the location is an argument, the
debug record will be moved to the entry block.
@llvmbot llvmbot added mlir:llvm mlir flang:openmp clang:openmp OpenMP related changes to Clang labels Sep 5, 2025
@llvmbot
Copy link
Member

llvmbot commented Sep 5, 2025

@llvm/pr-subscribers-mlir-llvm
@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-flang-openmp

Author: Abid Qadeer (abidh)

Changes

Consider the following small OpenMP target region:

!$omp target map(tofrom: x)
  x = x + 1
!$omp end target

Currently, when compiled with flang, it will generate an outlined function like below (with irrelevant bits removed).

void @<!-- -->__omp_offloading_10303_14e8afc__QQmain_l13(ptr %0, ptr %1) { entry:
  %2 = alloca ptr, align 8, addrspace(5)
  %3 = addrspacecast ptr addrspace(5) %2 to ptr
...
  br i1 %exec_user_code, label %user_code.entry, label %worker.exit

user_code.entry:
  %5 = load ptr, ptr %3, align 8, !align !19
  br label %omp.region.after_alloca

omp.region.after_alloca:
  br label %outlined.body

outlined.body:
  br label %omp.target

omp.target:
    #dbg_declare(ptr addrspace(5) %2, !20, !DIExpression(), !21)
  ...
  br label %omp.region.cont, !dbg !23

omp.region.cont:
  call void @<!-- -->__kmpc_target_deinit()
  ret void

worker.exit:
  ret void
}

Due to how various basic blocks are generated to implement target region for device, the debug record for variable x end up in a different block to the location of the variable (%5 in this case). The backend can drop such debug records.

This PR moves such records in the correct block. If the location field of the debug record is an Instruction, it will ensure that debug record is in same block as the instruction. If the location is an Argument, the debug record will be moved to the entry block.

Some minor cleanup in the UpdateDebugRecord as that brings it to use the similar logic as in MoveDebugRecordToCorrectBlock.


Full diff: https://github.com/llvm/llvm-project/pull/157125.diff

4 Files Affected:

  • (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+34-9)
  • (modified) llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp (+122)
  • (added) mlir/test/Target/LLVMIR/omptarget-debug-record-pos.mlir (+47)
  • (modified) mlir/test/Target/LLVMIR/omptarget-debug-var-1.mlir (+4-4)
diff --git a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
index 03ea58318d4a9..4a9fa382f9638 100644
--- a/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+++ b/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
@@ -7364,29 +7364,54 @@ static void FixupDebugInfoForOutlinedFunction(
         OldVar->getFlags(), OldVar->getAlignInBits(), OldVar->getAnnotations());
     return NewVar;
   };
-
-  auto UpdateDebugRecord = [&](auto *DR) {
+  auto UpdateDebugRecord = [&](DbgVariableRecord *DR) {
     DILocalVariable *OldVar = DR->getVariable();
     unsigned ArgNo = 0;
-    for (auto Loc : DR->location_ops()) {
-      auto Iter = ValueReplacementMap.find(Loc);
-      if (Iter != ValueReplacementMap.end()) {
-        DR->replaceVariableLocationOp(Loc, std::get<0>(Iter->second));
-        ArgNo = std::get<1>(Iter->second) + 1;
-      }
+    if (DR->getNumVariableLocationOps() != 1u)
+      return;
+    auto Loc = DR->getVariableLocationOp(0u);
+    auto Iter = ValueReplacementMap.find(Loc);
+    if (Iter != ValueReplacementMap.end()) {
+      DR->replaceVariableLocationOp(Loc, std::get<0>(Iter->second));
+      ArgNo = std::get<1>(Iter->second) + 1;
     }
     if (ArgNo != 0)
       DR->setVariable(GetUpdatedDIVariable(OldVar, ArgNo));
   };
 
+  SmallVector<DbgVariableRecord *, 4> DVRsToDelete;
+  auto MoveDebugRecordToCorrectBlock = [&](DbgVariableRecord *DVR) {
+    if (DVR->getNumVariableLocationOps() != 1u)
+      return;
+    auto Loc = DVR->getVariableLocationOp(0u);
+    BasicBlock *CurBB = DVR->getParent();
+    BasicBlock *RequiredBB = nullptr;
+
+    if (Instruction *LocInst = dyn_cast<Instruction>(Loc))
+      RequiredBB = LocInst->getParent();
+    else if (isa<llvm::Argument>(Loc))
+      RequiredBB = &(DVR->getFunction()->getEntryBlock());
+
+    if (RequiredBB && RequiredBB != CurBB) {
+      assert(!RequiredBB->empty());
+      RequiredBB->insertDbgRecordBefore(DVR->clone(),
+                                        RequiredBB->back().getIterator());
+      DVRsToDelete.push_back(DVR);
+    }
+  };
+
   // The location and scope of variable intrinsics and records still point to
   // the parent function of the target region. Update them.
   for (Instruction &I : instructions(Func)) {
     assert(!isa<llvm::DbgVariableIntrinsic>(&I) &&
            "Unexpected debug intrinsic");
-    for (DbgVariableRecord &DVR : filterDbgVars(I.getDbgRecordRange()))
+    for (DbgVariableRecord &DVR : filterDbgVars(I.getDbgRecordRange())) {
       UpdateDebugRecord(&DVR);
+      MoveDebugRecordToCorrectBlock(&DVR);
+    }
   }
+  for (auto *DVR : DVRsToDelete)
+    DVR->getMarker()->MarkedInstr->dropOneDbgRecord(DVR);
   // An extra argument is passed to the device. Create the debug data for it.
   if (OMPBuilder.Config.isTargetDevice()) {
     DICompileUnit *CU = NewSP->getUnit();
diff --git a/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp b/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
index c13570dc803b3..01a4383b089f2 100644
--- a/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
+++ b/llvm/unittests/Frontend/OpenMPIRBuilderTest.cpp
@@ -7078,6 +7078,128 @@ TEST_F(OpenMPIRBuilderTest, ConstantAllocaRaise) {
   EXPECT_TRUE(isa<ReturnInst>(ExitBlock->getFirstNonPHIIt()));
 }
 
+TEST_F(OpenMPIRBuilderTest, DebugRecordLoc) {
+  OpenMPIRBuilder OMPBuilder(*M);
+  OMPBuilder.setConfig(
+      OpenMPIRBuilderConfig(true, false, false, false, false, false, false));
+  OMPBuilder.initialize();
+
+  Function *OutlinedFn = nullptr;
+  F->setName("func");
+  IRBuilder<> Builder(BB);
+  OpenMPIRBuilder::LocationDescription Loc({Builder.saveIP(), DL});
+  Builder.SetCurrentDebugLocation(DL);
+  auto *Alloca = Builder.CreateAlloca(Builder.getInt32Ty());
+
+  llvm::SmallVector<llvm::Value *, 1> CapturedArgs = {
+      Alloca, Constant::getNullValue(PointerType::get(Ctx, 0))};
+
+  auto SimpleArgAccessorCB =
+      [&](llvm::Argument &Arg, llvm::Value *Input, llvm::Value *&RetVal,
+          llvm::OpenMPIRBuilder::InsertPointTy AllocaIP,
+          llvm::OpenMPIRBuilder::InsertPointTy CodeGenIP) {
+        IRBuilderBase::InsertPointGuard guard(Builder);
+        Builder.SetCurrentDebugLocation(llvm::DebugLoc());
+        if (!OMPBuilder.Config.isTargetDevice()) {
+          RetVal = cast<llvm::Value>(&Arg);
+          return CodeGenIP;
+        }
+        Builder.restoreIP(AllocaIP);
+        llvm::Value *Addr = Builder.CreateAlloca(
+            Arg.getType()->isPointerTy()
+                ? Arg.getType()
+                : Type::getInt64Ty(Builder.getContext()),
+            OMPBuilder.M.getDataLayout().getAllocaAddrSpace());
+        llvm::Value *AddrAscast =
+            Builder.CreatePointerBitCastOrAddrSpaceCast(Addr, Input->getType());
+        Builder.CreateStore(&Arg, AddrAscast);
+        Builder.restoreIP(CodeGenIP);
+        RetVal = Builder.CreateLoad(Arg.getType(), AddrAscast);
+        return Builder.saveIP();
+      };
+
+  llvm::OpenMPIRBuilder::MapInfosTy CombinedInfos;
+  auto GenMapInfoCB = [&](llvm::OpenMPIRBuilder::InsertPointTy codeGenIP)
+      -> llvm::OpenMPIRBuilder::MapInfosTy & {
+    CreateDefaultMapInfos(OMPBuilder, CapturedArgs, CombinedInfos);
+    return CombinedInfos;
+  };
+
+  auto CustomMapperCB = [&](unsigned int I) { return nullptr; };
+  auto BodyGenCB = [&](OpenMPIRBuilder::InsertPointTy AllocaIP,
+                       OpenMPIRBuilder::InsertPointTy CodeGenIP)
+      -> OpenMPIRBuilder::InsertPointTy {
+    IRBuilderBase::InsertPointGuard guard(Builder);
+    Builder.SetCurrentDebugLocation(llvm::DebugLoc());
+    Builder.restoreIP(CodeGenIP);
+    auto *mainSP = F->getSubprogram();
+    DIBuilder DIB(*M, false, mainSP->getUnit());
+    auto Type = DIB.createSubroutineType(DIB.getOrCreateTypeArray({}));
+    auto SP = DIB.createFunction(
+        mainSP->getScope(), "target", "", mainSP->getFile(), 2, Type, 2,
+        DINode::FlagZero,
+        DISubprogram::SPFlagDefinition | DISubprogram::SPFlagOptimized);
+    OutlinedFn = CodeGenIP.getBlock()->getParent();
+    OutlinedFn->setSubprogram(SP);
+    DebugLoc Loc = DILocation::get(Ctx, 3, 7, SP);
+    DIType *VoidPtrTy =
+        DIB.createQualifiedType(dwarf::DW_TAG_pointer_type, nullptr);
+    // The location of this variable is in the CapturedArgs so it will get the
+    // alloca/load/store chain in the auto SimpleArgAccessorCB and the location
+    // will change to the load instruction.
+    DILocalVariable *Var1 = DIB.createParameterVariable(
+        SP, "test1", /*ArgNo*/ 1, SP->getFile(), /*LineNo=*/1, VoidPtrTy);
+    DIB.insertDeclare(Alloca, Var1, DIB.createExpression(), Loc,
+                      Builder.GetInsertPoint());
+    // This variable will point directly to the function argument.
+    DILocalVariable *Var2 = DIB.createParameterVariable(
+        SP, "test2", /*ArgNo*/ 3, SP->getFile(), /*LineNo=*/1, VoidPtrTy);
+    DIB.insertDeclare(OutlinedFn->getArg(2), Var2, DIB.createExpression(), Loc,
+                      Builder.GetInsertPoint());
+
+    return Builder.saveIP();
+  };
+
+  IRBuilder<>::InsertPoint EntryIP(&F->getEntryBlock(),
+                                   F->getEntryBlock().end());
+  TargetRegionEntryInfo EntryInfo("parent", /*DeviceID=*/1, /*FileID=*/2,
+                                  /*Line=*/3, /*Count=*/0);
+  OpenMPIRBuilder::TargetKernelRuntimeAttrs RuntimeAttrs;
+  OpenMPIRBuilder::TargetKernelDefaultAttrs DefaultAttrs = {
+      /*ExecFlags=*/omp::OMPTgtExecModeFlags::OMP_TGT_EXEC_MODE_GENERIC,
+      /*MaxTeams=*/{-1}, /*MinTeams=*/0, /*MaxThreads=*/{0}, /*MinThreads=*/0};
+  llvm::OpenMPIRBuilder::TargetDataInfo Info(
+      /*RequiresDevicePointerInfo=*/false,
+      /*SeparateBeginEndCalls=*/true);
+
+  ASSERT_EXPECTED_INIT(
+      OpenMPIRBuilder::InsertPointTy, AfterIP,
+      OMPBuilder.createTarget(Loc, /*IsOffloadEntry=*/true, EntryIP, EntryIP,
+                              Info, EntryInfo, DefaultAttrs, RuntimeAttrs,
+                              /*IfCond=*/nullptr, CapturedArgs, GenMapInfoCB,
+                              BodyGenCB, SimpleArgAccessorCB, CustomMapperCB,
+                              {}, false));
+  EXPECT_EQ(DL, Builder.getCurrentDebugLocation());
+  Builder.restoreIP(AfterIP);
+
+  Builder.CreateRetVoid();
+  OMPBuilder.finalize();
+
+  // Check outlined function
+  EXPECT_FALSE(verifyModule(*M, &errs()));
+  EXPECT_NE(OutlinedFn, nullptr);
+  for (Instruction &I : instructions(OutlinedFn)) {
+    for (DbgVariableRecord &DVR : filterDbgVars(I.getDbgRecordRange())) {
+      EXPECT_EQ(DVR.getNumVariableLocationOps(), 1u);
+      auto Loc = DVR.getVariableLocationOp(0u);
+      if (Instruction *LocInst = dyn_cast<Instruction>(Loc))
+        EXPECT_EQ(DVR.getParent(), LocInst->getParent());
+      else if (isa<llvm::Argument>(Loc))
+        EXPECT_EQ(DVR.getParent(), &(OutlinedFn->getEntryBlock()));
+    }
+  }
+}
+
 TEST_F(OpenMPIRBuilderTest, CreateTask) {
   using InsertPointTy = OpenMPIRBuilder::InsertPointTy;
   OpenMPIRBuilder OMPBuilder(*M);
diff --git a/mlir/test/Target/LLVMIR/omptarget-debug-record-pos.mlir b/mlir/test/Target/LLVMIR/omptarget-debug-record-pos.mlir
new file mode 100644
index 0000000000000..bd89a5c1cdfeb
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/omptarget-debug-record-pos.mlir
@@ -0,0 +1,47 @@
+// RUN: mlir-translate -mlir-to-llvmir %s
+
+#di_basic_type = #llvm.di_basic_type<tag = DW_TAG_base_type, name = "real",
+ sizeInBits = 32, encoding = DW_ATE_float>
+#file = #llvm.di_file<"test.f90" in "">
+#di_null_type = #llvm.di_null_type
+#cu = #llvm.di_compile_unit<id = distinct[0]<>,
+ sourceLanguage = DW_LANG_Fortran95, file = #file, isOptimized = false,
+ emissionKind = Full>
+#sp_ty = #llvm.di_subroutine_type<callingConvention = DW_CC_program,
+ types = #di_null_type>
+#sp = #llvm.di_subprogram<compileUnit = #cu, name = "main", file=#file,
+ subprogramFlags = "Definition", type = #sp_ty>
+#sp1 = #llvm.di_subprogram<compileUnit = #cu, name = "target", file=#file,
+ subprogramFlags = "Definition", type = #sp_ty>
+#var1 = #llvm.di_local_variable<scope = #sp, name = "x", file = #file, line = 2,
+ type = #di_basic_type>
+#var2 = #llvm.di_local_variable<scope = #sp1, name = "x", file = #file,
+ line = 2, type = #di_basic_type>
+
+module attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry<"dlti.alloca_memory_space", 5 : ui32>>, llvm.target_triple = "amdgcn-amd-amdhsa", omp.is_target_device = true} {
+  llvm.func @main() {
+    %0 = llvm.mlir.constant(1 : i64) : i64 loc(#loc2)
+    %1 = llvm.alloca %0 x i1 : (i64) -> !llvm.ptr<5> loc(#loc2)
+    %2 = llvm.addrspacecast %1 : !llvm.ptr<5> to !llvm.ptr loc(#loc2)
+    llvm.intr.dbg.declare #var1 = %2 : !llvm.ptr loc(#loc2)
+    %4 = omp.map.info var_ptr(%2 : !llvm.ptr, f32) map_clauses(tofrom) capture(ByRef) -> !llvm.ptr {name = "x"} loc(#loc2)
+    omp.target map_entries(%4 -> %arg0 : !llvm.ptr) {
+      %5 = llvm.mlir.constant(1.000000e+00 : f32) : f32 loc(#loc3)
+      llvm.intr.dbg.declare #var2 = %arg0 : !llvm.ptr loc(#loc3)
+      %6 = llvm.load %arg0 : !llvm.ptr -> f32 loc(#loc3)
+      %7 = llvm.fadd %6, %5 {fastmathFlags = #llvm.fastmath<contract>} : f32 loc(#loc3)
+      llvm.store %7, %arg0 : f32, !llvm.ptr loc(#loc3)
+      omp.terminator loc(#loc3)
+    } loc(#loc4)
+    llvm.return loc(#loc2)
+  } loc(#loc5)
+}
+
+#loc2 = loc("test.f90":6:7)
+#loc3 = loc("test.f90":8:7)
+#loc4 = loc(fused<#sp1>[#loc3])
+#loc5 = loc(fused<#sp>[#loc2])
+
+// CHECK-LABEL: user_code.entry
+// CHECK: %[[LOAD:.*]] = load ptr
+// CHECK-NEXT:     #dbg_declare(ptr %[[LOAD]]{{.*}})
diff --git a/mlir/test/Target/LLVMIR/omptarget-debug-var-1.mlir b/mlir/test/Target/LLVMIR/omptarget-debug-var-1.mlir
index 8f42995af23a8..24d8d01a396b9 100644
--- a/mlir/test/Target/LLVMIR/omptarget-debug-var-1.mlir
+++ b/mlir/test/Target/LLVMIR/omptarget-debug-var-1.mlir
@@ -62,7 +62,7 @@ module attributes {dlti.dl_spec = #dlti.dl_spec<#dlti.dl_entry<"dlti.alloca_memo
 #loc5 = loc(fused<#sp1>[#loc2])
 
 // CHECK: ![[SP:[0-9]+]] = distinct !DISubprogram(name: "target"{{.*}})
-// CHECK: !DILocalVariable(name: "dyn_ptr", arg: 1, scope: ![[SP]]{{.*}}flags: DIFlagArtificial)
-// CHECK: !DILocalVariable(name: "x", arg: 2, scope: ![[SP]]{{.*}})
-// CHECK: !DILocalVariable(name: "arr", arg: 3, scope: ![[SP]]{{.*}})
-// CHECK: !DILocalVariable(name: "i", arg: 4, scope: ![[SP]]{{.*}})
+// CHECK-DAG: !DILocalVariable(name: "dyn_ptr", arg: 1, scope: ![[SP]]{{.*}}flags: DIFlagArtificial)
+// CHECK-DAG: !DILocalVariable(name: "x", arg: 2, scope: ![[SP]]{{.*}})
+// CHECK-DAG: !DILocalVariable(name: "arr", arg: 3, scope: ![[SP]]{{.*}})
+// CHECK-DAG: !DILocalVariable(name: "i", arg: 4, scope: ![[SP]]{{.*}})

@abidh abidh closed this Sep 11, 2025
@abidh abidh deleted the dbg_record_move branch September 11, 2025 11:03
@abidh abidh restored the dbg_record_move branch September 11, 2025 16:48
@abidh abidh reopened this Sep 11, 2025
@abidh
Copy link
Contributor Author

abidh commented Sep 11, 2025

This was accidentally closed. It is re-opened now. I would appreciate any review comments.

@abidh
Copy link
Contributor Author

abidh commented Sep 19, 2025

Polite ping.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants