Skip to content

[LinkerWrapper] Fix -fsave-optimization-record default file #149003

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

jdenny-ornl
Copy link
Collaborator

@jdenny-ornl jdenny-ornl commented Jul 16, 2025

As discussed in PR #145603, the following command seems to fail to produce a YAML remarks file for offload LTO passes and thus for kernel-info:

clang -O2 -g -fopenmp --offload-arch=native test.c -foffload-lto \
  -Rpass=kernel-info -fsave-optimization-record

The problem is that, in clang-linker-wrapper's clang call, clang names the file based on clang's main output file (from -o). That is a temporary file, so the YAML file becomes a temporary file, which the user never sees.

This patch:

  • Makes clang honor -dumpdir for the default YAML remarks file in the case of LTO.
  • Extends clang-linker-wrapper to specify that option to clang.

To demonstrate the appeal of the generality of -dumpdir (as opposed to a one-off -fsave-optimization-record solution in clang-linker-wrapper), this patch also fixes -gsplit-dwarf. Without this patch, when using -gsplit-dwarf and later debugging using rocgdb, the dwo directory for offload is a temporary file, so temporary file cleanup causes rocgdb to lose debug symbols for offload code.

TODO: The clang driver passes -dumpdir to various clang frontend calls. For LTO, that was previously being ignored, and now it's not. That changes some auxiliary file names, as revealed by changes in some existing tests' expected output: clang/test/Driver/opt-record.c and clang/test/Driver/lto-dwo.c. Are there backward compatibility concerns here?

@jdenny-ornl jdenny-ornl requested review from jdoerfert and jhuber6 July 16, 2025 01:08
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Jul 16, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 16, 2025

@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-clang

Author: Joel E. Denny (jdenny-ornl)

Changes

As discussed in PR #145603, the following command fails to produce a YAML remarks file for offload LTO passes and thus for kernel-info:

clang -O2 -g -fopenmp --offload-arch=native test.c -foffload-lto \
  -Rpass=kernel-info -fsave-optimization-record

The problem is that, in clang-linker-wrapper's clang call, clang names the file based on clang's main output file (from -o). That is a temporary file, so the YAML file becomes a temporary file, which the user never sees.

This patch:

  • Extends clang with a hidden -foutput-file-base=BASE option that overrides the main output file as the base for other output files.
  • Makes clang honor that option only for the default YAML remarks file, but future patches could use it for other output files too.
  • Extends clang-linker-wrapper to specify that option to clang.

Full diff: https://github.com/llvm/llvm-project/pull/149003.diff

6 Files Affected:

  • (modified) clang/include/clang/Driver/Options.td (+6)
  • (modified) clang/include/clang/Frontend/FrontendOptions.h (+3)
  • (modified) clang/lib/Driver/ToolChains/CommonArgs.cpp (+6-2)
  • (modified) clang/test/Driver/linker-wrapper.c (+16-16)
  • (modified) clang/test/Driver/opt-record.c (+11)
  • (modified) clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp (+7-4)
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index d0b54a446309b..d96e1437f0f40 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -5880,6 +5880,12 @@ def o : JoinedOrSeparate<["-"], "o">,
   Visibility<[ClangOption, CC1Option, CC1AsOption, FC1Option, FlangOption]>,
   HelpText<"Write output to <file>">, MetaVarName<"<file>">,
   MarshallingInfoString<FrontendOpts<"OutputFile">>;
+def foutput_file_base : Joined<["-"], "foutput-file-base=">,
+  Flags<[NoXarchOption, HelpHidden]>,
+  Visibility<[ClangOption]>,
+  HelpText<"Name extra output files after <base> not the main output file">,
+  MetaVarName<"<base>">,
+  MarshallingInfoString<FrontendOpts<"OutputFileBase">>;
 def object_file_name_EQ : Joined<["-"], "object-file-name=">,
   Visibility<[ClangOption, CC1Option, CC1AsOption, CLOption, DXCOption]>,
   HelpText<"Set the output <file> for debug infos">, MetaVarName<"<file>">,
diff --git a/clang/include/clang/Frontend/FrontendOptions.h b/clang/include/clang/Frontend/FrontendOptions.h
index c919a53ae089e..a03165e05962c 100644
--- a/clang/include/clang/Frontend/FrontendOptions.h
+++ b/clang/include/clang/Frontend/FrontendOptions.h
@@ -439,6 +439,9 @@ class FrontendOptions {
   /// The output file, if any.
   std::string OutputFile;
 
+  /// The base, if any, to use instead of OutputFile for extra output files.
+  std::string OutputFileBase;
+
   /// If given, the new suffix for fix-it rewritten files.
   std::string FixItSuffix;
 
diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index 097d186ad8ea4..513425a855511 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -294,8 +294,9 @@ static void renderRemarksOptions(const ArgList &Args, ArgStringList &CmdArgs,
     Format = A->getValue();
 
   SmallString<128> F;
-  const Arg *A = Args.getLastArg(options::OPT_foptimization_record_file_EQ);
-  if (A)
+  if (const Arg *A = Args.getLastArg(options::OPT_foptimization_record_file_EQ))
+    F = A->getValue();
+  else if (const Arg *A = Args.getLastArg(options::OPT_foutput_file_base))
     F = A->getValue();
   else if (Output.isFilename())
     F = Output.getFilename();
@@ -1320,6 +1321,9 @@ void tools::addLTOOptions(const ToolChain &ToolChain, const ArgList &Args,
   if (Args.hasArg(options::OPT_ftime_report))
     CmdArgs.push_back(
         Args.MakeArgString(Twine(PluginOptPrefix) + "-time-passes"));
+
+  // clang-linker-wrapper adds this without checking if it is needed.
+  Args.ClaimAllArgs(options::OPT_foutput_file_base);
 }
 
 void tools::addOpenMPRuntimeLibraryPath(const ToolChain &TC,
diff --git a/clang/test/Driver/linker-wrapper.c b/clang/test/Driver/linker-wrapper.c
index 80b1a5745a123..d97328b6600a8 100644
--- a/clang/test/Driver/linker-wrapper.c
+++ b/clang/test/Driver/linker-wrapper.c
@@ -22,7 +22,7 @@ __attribute__((visibility("protected"), used)) int x;
 // RUN: clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --dry-run \
 // RUN:   --linker-path=/usr/bin/ld %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=NVPTX-LINK
 
-// NVPTX-LINK: clang{{.*}} -o {{.*}}.img --target=nvptx64-nvidia-cuda -march=sm_70 {{.*}}.o {{.*}}.o
+// NVPTX-LINK: clang{{.*}} -o {{.*}}.img -foutput-file-base=a.out.nvptx64.sm_70.img --target=nvptx64-nvidia-cuda -march=sm_70 {{.*}}.o {{.*}}.o
 
 // RUN: clang-offload-packager -o %t.out \
 // RUN:   --image=file=%t.elf.o,kind=openmp,triple=nvptx64-nvidia-cuda,arch=sm_70 \
@@ -40,7 +40,7 @@ __attribute__((visibility("protected"), used)) int x;
 // RUN: clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --dry-run \
 // RUN:   --linker-path=/usr/bin/ld %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=AMDGPU-LINK
 
-// AMDGPU-LINK: clang{{.*}} -o {{.*}}.img --target=amdgcn-amd-amdhsa -mcpu=gfx908 -flto -Wl,--no-undefined {{.*}}.o {{.*}}.o
+// AMDGPU-LINK: clang{{.*}} -o {{.*}}.img -foutput-file-base=a.out.amdgcn.gfx908.img --target=amdgcn-amd-amdhsa -mcpu=gfx908 -flto -Wl,--no-undefined {{.*}}.o {{.*}}.o
 
 // RUN: clang-offload-packager -o %t.out \
 // RUN:   --image=file=%t.amdgpu.bc,kind=openmp,triple=amdgcn-amd-amdhsa,arch=gfx1030 \
@@ -57,7 +57,7 @@ __attribute__((visibility("protected"), used)) int x;
 // RUN: not clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --dry-run \
 // RUN:   --linker-path=/usr/bin/ld %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=SPIRV-LINK
 
-// SPIRV-LINK: clang{{.*}} -o {{.*}}.img --target=spirv64-unknown-unknown {{.*}}.o --sycl-link -Xlinker -triple=spirv64-unknown-unknown -Xlinker -arch=
+// SPIRV-LINK: clang{{.*}} -o {{.*}}.img -foutput-file-base=a.out.spirv64..img --target=spirv64-unknown-unknown {{.*}}.o --sycl-link -Xlinker -triple=spirv64-unknown-unknown -Xlinker -arch=
 
 // RUN: clang-offload-packager -o %t.out \
 // RUN:   --image=file=%t.elf.o,kind=openmp,triple=x86_64-unknown-linux-gnu \
@@ -68,7 +68,7 @@ __attribute__((visibility("protected"), used)) int x;
 // RUN:   --linker-path=/usr/bin/ld.lld --whole-archive %t.a --no-whole-archive \
 // RUN:   %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=CPU-LINK
 
-// CPU-LINK: clang{{.*}} -o {{.*}}.img --target=x86_64-unknown-linux-gnu -Wl,--no-undefined {{.*}}.o {{.*}}.o -Wl,-Bsymbolic -shared -Wl,--whole-archive {{.*}}.a -Wl,--no-whole-archive
+// CPU-LINK: clang{{.*}} -o {{.*}}.img -foutput-file-base=a.out.x86_64..img --target=x86_64-unknown-linux-gnu -Wl,--no-undefined {{.*}}.o {{.*}}.o -Wl,-Bsymbolic -shared -Wl,--whole-archive {{.*}}.a -Wl,--no-whole-archive
 
 // RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o
 // RUN: clang-linker-wrapper --dry-run --host-triple=x86_64-unknown-linux-gnu -mllvm -openmp-opt-disable \
@@ -100,8 +100,8 @@ __attribute__((visibility("protected"), used)) int x;
 // RUN: clang-linker-wrapper --dry-run --host-triple=x86_64-unknown-linux-gnu \
 // RUN: --linker-path=/usr/bin/ld %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=CUDA
 
-// CUDA: clang{{.*}} -o [[IMG_SM70:.+]] --target=nvptx64-nvidia-cuda -march=sm_70
-// CUDA: clang{{.*}} -o [[IMG_SM52:.+]] --target=nvptx64-nvidia-cuda -march=sm_52
+// CUDA: clang{{.*}} -o [[IMG_SM70:.+]] -foutput-file-base=a.out.nvptx64.sm_70.img --target=nvptx64-nvidia-cuda -march=sm_70
+// CUDA: clang{{.*}} -o [[IMG_SM52:.+]] -foutput-file-base=a.out.nvptx64.sm_52.img --target=nvptx64-nvidia-cuda -march=sm_52
 // CUDA: fatbinary{{.*}}-64 --create {{.*}}.fatbin --image=profile=sm_70,file=[[IMG_SM70]] --image=profile=sm_52,file=[[IMG_SM52]]
 // CUDA: usr/bin/ld{{.*}} {{.*}}.openmp.image.{{.*}}.o {{.*}}.cuda.image.{{.*}}.o
 
@@ -127,8 +127,8 @@ __attribute__((visibility("protected"), used)) int x;
 // RUN:   --compress --compression-level=6 \
 // RUN:   --linker-path=/usr/bin/ld %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=HIP
 
-// HIP: clang{{.*}} -o [[IMG_GFX90A:.+]] --target=amdgcn-amd-amdhsa -mcpu=gfx90a
-// HIP: clang{{.*}} -o [[IMG_GFX908:.+]] --target=amdgcn-amd-amdhsa -mcpu=gfx908
+// HIP: clang{{.*}} -o [[IMG_GFX90A:.+]] -foutput-file-base=a.out.amdgcn.gfx90a.img --target=amdgcn-amd-amdhsa -mcpu=gfx90a
+// HIP: clang{{.*}} -o [[IMG_GFX908:.+]] -foutput-file-base=a.out.amdgcn.gfx908.img --target=amdgcn-amd-amdhsa -mcpu=gfx908
 // HIP: clang-offload-bundler{{.*}}-type=o -bundle-align=4096 -compress -compression-level=6 -targets=host-x86_64-unknown-linux-gnu,hip-amdgcn-amd-amdhsa--gfx90a,hip-amdgcn-amd-amdhsa--gfx908 -input={{/dev/null|NUL}} -input=[[IMG_GFX90A]] -input=[[IMG_GFX908]] -output={{.*}}.hipfb
 
 // RUN: clang-offload-packager -o %t.out \
@@ -157,7 +157,7 @@ __attribute__((visibility("protected"), used)) int x;
 // RUN: clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --dry-run --clang-backend \
 // RUN:   --linker-path=/usr/bin/ld %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=CLANG-BACKEND
 
-// CLANG-BACKEND: clang{{.*}} -o {{.*}}.img --target=amdgcn-amd-amdhsa -mcpu=gfx908 -flto -Wl,--no-undefined {{.*}}.o
+// CLANG-BACKEND: clang{{.*}} -o {{.*}}.img -foutput-file-base=a.out.amdgcn.gfx908.img --target=amdgcn-amd-amdhsa -mcpu=gfx908 -flto -Wl,--no-undefined {{.*}}.o
 
 // RUN: clang-offload-packager -o %t.out \
 // RUN:   --image=file=%t.elf.o,kind=openmp,triple=nvptx64-nvidia-cuda,arch=sm_70
@@ -180,8 +180,8 @@ __attribute__((visibility("protected"), used)) int x;
 // RUN: clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --dry-run \
 // RUN:   --linker-path=/usr/bin/ld %t-on.o %t-off.o %t.a -o a.out 2>&1 | FileCheck %s --check-prefix=AMD-TARGET-ID
 
-// AMD-TARGET-ID: clang{{.*}} -o {{.*}}.img --target=amdgcn-amd-amdhsa -mcpu=gfx90a:xnack+ -flto -Wl,--no-undefined {{.*}}.o {{.*}}.o
-// AMD-TARGET-ID: clang{{.*}} -o {{.*}}.img --target=amdgcn-amd-amdhsa -mcpu=gfx90a:xnack- -flto -Wl,--no-undefined {{.*}}.o {{.*}}.o
+// AMD-TARGET-ID: clang{{.*}} -o {{.*}}.img -foutput-file-base=a.out.amdgcn.gfx90a:xnack+.img --target=amdgcn-amd-amdhsa -mcpu=gfx90a:xnack+ -flto -Wl,--no-undefined {{.*}}.o {{.*}}.o
+// AMD-TARGET-ID: clang{{.*}} -o {{.*}}.img -foutput-file-base=a.out.amdgcn.gfx90a:xnack-.img --target=amdgcn-amd-amdhsa -mcpu=gfx90a:xnack- -flto -Wl,--no-undefined {{.*}}.o {{.*}}.o
 
 // RUN: clang-offload-packager -o %t-lib.out \
 // RUN:   --image=file=%t.elf.o,kind=openmp,triple=amdgcn-amd-amdhsa,arch=generic
@@ -196,8 +196,8 @@ __attribute__((visibility("protected"), used)) int x;
 // RUN: clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --dry-run \
 // RUN:   --linker-path=/usr/bin/ld %t1.o %t2.o %t.a -o a.out 2>&1 | FileCheck %s --check-prefix=ARCH-ALL
 
-// ARCH-ALL: clang{{.*}} -o {{.*}}.img --target=amdgcn-amd-amdhsa -mcpu=gfx90a -flto -Wl,--no-undefined {{.*}}.o {{.*}}.o
-// ARCH-ALL: clang{{.*}} -o {{.*}}.img --target=amdgcn-amd-amdhsa -mcpu=gfx908 -flto -Wl,--no-undefined {{.*}}.o {{.*}}.o
+// ARCH-ALL: clang{{.*}} -o {{.*}}.img -foutput-file-base=a.out.amdgcn.gfx90a.img --target=amdgcn-amd-amdhsa -mcpu=gfx90a -flto -Wl,--no-undefined {{.*}}.o {{.*}}.o
+// ARCH-ALL: clang{{.*}} -o {{.*}}.img -foutput-file-base=a.out.amdgcn.gfx908.img --target=amdgcn-amd-amdhsa -mcpu=gfx908 -flto -Wl,--no-undefined {{.*}}.o {{.*}}.o
 
 // RUN: clang-offload-packager -o %t.out \
 // RUN:   --image=file=%t.elf.o,kind=openmp,triple=x86_64-unknown-linux-gnu \
@@ -207,7 +207,7 @@ __attribute__((visibility("protected"), used)) int x;
 // RUN:   --linker-path=/usr/bin/ld.lld -r %t.o \
 // RUN:   %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=RELOCATABLE-LINK
 
-// RELOCATABLE-LINK: clang{{.*}} -o {{.*}}.img --target=x86_64-unknown-linux-gnu
+// RELOCATABLE-LINK: clang{{.*}} -o {{.*}}.img -foutput-file-base=a.out.x86_64..img --target=x86_64-unknown-linux-gnu
 // RELOCATABLE-LINK: /usr/bin/ld.lld{{.*}}-r
 // RELOCATABLE-LINK: llvm-objcopy{{.*}}a.out --remove-section .llvm.offloading
 
@@ -219,7 +219,7 @@ __attribute__((visibility("protected"), used)) int x;
 // RUN:   --linker-path=/usr/bin/ld.lld -r %t.o \
 // RUN:   %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=RELOCATABLE-LINK-HIP
 
-// RELOCATABLE-LINK-HIP: clang{{.*}} -o {{.*}}.img --target=amdgcn-amd-amdhsa
+// RELOCATABLE-LINK-HIP: clang{{.*}} -o {{.*}}.img -foutput-file-base=a.out.amdgcn.gfx90a.img --target=amdgcn-amd-amdhsa
 // RELOCATABLE-LINK-HIP: clang-offload-bundler{{.*}} -type=o -bundle-align=4096 -targets=host-x86_64-unknown-linux-gnu,hip-amdgcn-amd-amdhsa--gfx90a -input={{/dev/null|NUL}} -input={{.*}} -output={{.*}}
 // RELOCATABLE-LINK-HIP: /usr/bin/ld.lld{{.*}}-r
 // RELOCATABLE-LINK-HIP: llvm-objcopy{{.*}}a.out --remove-section .llvm.offloading
@@ -233,7 +233,7 @@ __attribute__((visibility("protected"), used)) int x;
 // RUN:   --linker-path=/usr/bin/ld.lld -r %t.o \
 // RUN:   %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=RELOCATABLE-LINK-CUDA
 
-// RELOCATABLE-LINK-CUDA: clang{{.*}} -o {{.*}}.img --target=nvptx64-nvidia-cuda
+// RELOCATABLE-LINK-CUDA: clang{{.*}} -o {{.*}}.img -foutput-file-base=a.out.nvptx64.sm_89.img --target=nvptx64-nvidia-cuda
 // RELOCATABLE-LINK-CUDA: fatbinary{{.*}} -64 --create {{.*}}.fatbin --image=profile=sm_89,file={{.*}}.img
 // RELOCATABLE-LINK-CUDA: /usr/bin/ld.lld{{.*}}-r
 // RELOCATABLE-LINK-CUDA: llvm-objcopy{{.*}}a.out --remove-section .llvm.offloading
diff --git a/clang/test/Driver/opt-record.c b/clang/test/Driver/opt-record.c
index 220f5db7fbca2..afbc2a012abb1 100644
--- a/clang/test/Driver/opt-record.c
+++ b/clang/test/Driver/opt-record.c
@@ -78,3 +78,14 @@
 // CHECK-PASS-RPASS-SAME: "-plugin-opt=opt-remarks-hotness-threshold=100"
 
 // CHECK-PASS-AUTO:   "-plugin-opt=opt-remarks-hotness-threshold=auto"
+
+// Check -foutput-file-base effect on -foptimization-record-file.
+// RUN: %clang --target=x86_64-linux -### -fuse-ld=lld -B%S/Inputs/lld -flto -fsave-optimization-record -foutput-file-base=/dir/file.ext %s 2>&1 | FileCheck %s -check-prefix=CHECK-BASE
+// RUN: %clang --target=x86_64-linux -### -o FOO -fuse-ld=lld -B%S/Inputs/lld -flto -fsave-optimization-record -foutput-file-base=/dir/file.ext %s 2>&1 | FileCheck %s -check-prefix=CHECK-BASE
+// RUN: %clang --target=x86_64-linux -### -fuse-ld=lld -B%S/Inputs/lld -flto -fsave-optimization-record -foptimization-record-file=user-file.ext -foutput-file-base=/dir/file.ext %s 2>&1 | FileCheck %s -check-prefix=CHECK-IGNORE-BASE
+
+// CHECK-BASE:      "-plugin-opt=opt-remarks-filename=/dir/file.ext.opt.ld.yaml"
+// CHECK-BASE-SAME: "-plugin-opt=opt-remarks-format=yaml"
+
+// CHECK-IGNORE-BASE:      "-plugin-opt=opt-remarks-filename=user-file.ext.opt.ld.yaml"
+// CHECK-IGNORE-BASE-SAME: "-plugin-opt=opt-remarks-format=yaml"
diff --git a/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp b/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
index 9d34b62da20f5..65555d8608882 100644
--- a/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ b/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -16,6 +16,7 @@
 
 #include "clang/Basic/TargetID.h"
 #include "clang/Basic/Version.h"
+#include "clang/Driver/Options.h"
 #include "llvm/ADT/MapVector.h"
 #include "llvm/BinaryFormat/Magic.h"
 #include "llvm/Bitcode/BitcodeWriter.h"
@@ -474,10 +475,10 @@ Expected<StringRef> clang(ArrayRef<StringRef> InputFiles, const ArgList &Args,
   StringRef Arch = Args.getLastArgValue(OPT_arch_EQ);
   // Create a new file to write the linked device image to. Assume that the
   // input filename already has the device and architecture.
-  auto TempFileOrErr =
-      createOutputFile(sys::path::filename(ExecutableName) + "." +
-                           Triple.getArchName() + "." + Arch,
-                       "img");
+  std::string OutputFileBase = "." + Triple.getArchName().str() + "." +
+                               Arch.str();
+  auto TempFileOrErr = createOutputFile(
+    sys::path::filename(ExecutableName) + OutputFileBase, "img");
   if (!TempFileOrErr)
     return TempFileOrErr.takeError();
 
@@ -486,6 +487,8 @@ Expected<StringRef> clang(ArrayRef<StringRef> InputFiles, const ArgList &Args,
       "--no-default-config",
       "-o",
       *TempFileOrErr,
+      Args.MakeArgString("-foutput-file-base=" + ExecutableName +
+                         OutputFileBase + ".img"),
       Args.MakeArgString("--target=" + Triple.getTriple()),
   };
 

As discussed in PR llvm#145603, the following command fails to produce a
YAML remarks file for offload LTO passes and thus for kernel-info:

```
clang -O2 -g -fopenmp --offload-arch=native test.c -foffload-lto \
  -Rpass=kernel-info -fsave-optimization-record
```

The problem is that, in clang-linker-wrapper's clang call, clang names
the file based on clang's main output file (from `-o`).  That is a
temporary file, so the YAML file becomes a temporary file, which the
user never sees.

This patch:
- Extends clang with a hidden `-foutput-file-base=BASE` option that
  overrides the main output file as the base for other output files.
- Makes clang honor that option only for the default YAML remarks
  file, but future patches could use it for other output files too.
- Extends clang-linker-wrapper to specify that option to clang.
@jdenny-ornl jdenny-ornl force-pushed the fix-clang-linker-wrapper-remarks-yaml branch from b5dd483 to 670d2ab Compare July 16, 2025 01:10
Copy link
Contributor

@jhuber6 jhuber6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the problem is the names getting overwritten between the host and the device? Could we append the filename with the target / offloading kind when we're in offloading mode? If it's in the embedded clang job, I'm wondering why the names conflict since it's supposed to use a generated name that includes the triple.

@jdenny-ornl
Copy link
Collaborator Author

So the problem is the names getting overwritten between the host and the device?

No. The problem is the default name is temporary file because the main output file (-o) is a temporary file.

@jhuber6
Copy link
Contributor

jhuber6 commented Jul 16, 2025

Could we not just pass the filename manually with OPT_foptimization_record_file_EQ?

@jdenny-ornl
Copy link
Collaborator Author

We = the user? Then you run into the clobbering problem you mentioned for other yaml files. Moreover, why shouldn't -fsave-optimization-record just work like it does elsewhere?

@jhuber6
Copy link
Contributor

jhuber6 commented Jul 16, 2025

We = the user? Then you run into the clobbering problem you mentioned for other yaml files. Moreover, why shouldn't -fsave-optimization-record just work like it does elsewhere?

The issue is the embedded job, I figured we could just create the appropriate filename in the linker wrapper.

@jdenny-ornl
Copy link
Collaborator Author

So we = clang-linker-wrapper? Yes, we could. But I figured this problem might prove not to be a one-off, and -foutput-file-base is a more general solution that could be used elsewhere.

@jhuber6
Copy link
Contributor

jhuber6 commented Jul 16, 2025

So we = clang-linker-wrapper? Yes, we could. But I figured this problem might prove not to be a one-off, and -foutput-file-base is a more general solution that could be used elsewhere.

I'm just hesitant to add a new flag that's used just for this (Though I've done similar hacks in the past.) I'll add some more clang people to get their opinion.

@jdenny-ornl
Copy link
Collaborator Author

So we = clang-linker-wrapper? Yes, we could. But I figured this problem might prove not to be a one-off, and -foutput-file-base is a more general solution that could be used elsewhere.

I'm just hesitant to add a new flag that's used just for this (Though I've done similar hacks in the past.) I'll add some more clang people to get their opinion.

Thanks. Looking through clang/lib/Driver/ToolChains/CommonArgs.cpp, I see other places where the main output file appears to be used to compute new file names (OPT_gsplit_dwarf, getStatsFileName). I'm not familiar with those features, and I haven't investigated them specifically yet to prove the same problem arises. Maybe someone else already knows.

Copy link

github-actions bot commented Jul 16, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@jdenny-ornl
Copy link
Collaborator Author

jdenny-ornl commented Jul 16, 2025

I was able to extend the patch to benefit -gsplit-dwarf when using rocgdb. Without this patch, the dwo directory for offload is placed in /tmp, so /tmp cleanup causes rocgdb to lose debug symbols for offload code.

Hopefully 2 use cases instead of 1 increases the appeal of -foutput-file-base.

@@ -5895,6 +5895,11 @@ def o : JoinedOrSeparate<["-"], "o">,
Visibility<[ClangOption, CC1Option, CC1AsOption, FC1Option, FlangOption]>,
HelpText<"Write output to <file>">, MetaVarName<"<file>">,
MarshallingInfoString<FrontendOpts<"OutputFile">>;
def foutput_file_base : Joined<["-"], "foutput-file-base=">,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a new driver option? For extra output files we can use -dumpdir https://maskray.me/blog/2023-04-25-compiler-output-files

Copy link
Collaborator Author

@jdenny-ornl jdenny-ornl Jul 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I didn't know about -dumpdir. It would indeed be better to use an existing option.

However, I ran into a couple of issues:

  1. The primary output file name has an arbitrary hash because it goes in /tmp. For example: /tmp/a.out.amdgcn.gfx906-78e4d4.img. With the -dumpdir solution, the auxiliary output files, which the user actually wants, would end up with those too, making their names unpredictable. Would implementing gcc's -dumpbase be the right way to avoid that?
  2. Because -foutput-file-base is an internal option not really meant for users, I took the liberty of having Clang always disable any warnings about it being unused. Is there some way for clang-linker-wrapper to tell Clang not to warn if -dumpdir is unused without suppressing warnings about all other options? Of course, we could add another option for that, but I'm hoping there's some existing way.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For 2, I just discovered --start-no-unused-arguments, so never mind that one.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For 2, I later discovered that unused warnings don't happen for -dumpdir in this case anyway due to another use of it.

For 1, I had misunderstood the behavior of -dumpdir: it replaces the output file name too, not just its directory. So, I've replaced my -foutput-file-base with -dumpdir.

I ran into a potential issue: The clang driver passes -dumpdir to various clang frontend calls. For LTO, that was previously being ignored, and now it's not. That changes some auxiliary file names, as revealed by changes in some existing tests' expected output: clang/test/Driver/opt-record.c and clang/test/Driver/lto-dwo.c. Are there backward compatibility concerns here?

Copy link
Contributor

@jhuber6 jhuber6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't leave garbage in the output directory right?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants