Skip to content

[Driver][SYCL] Remove object upon failure #18190

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: sycl
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 5 additions & 1 deletion clang/lib/Driver/Driver.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2708,7 +2708,11 @@ int Driver::ExecuteCompilation(
// Remove result files if we're not saving temps.
if (!isSaveTempsEnabled()) {
const JobAction *JA = cast<JobAction>(&FailingCommand->getSource());
C.CleanupFileMap(C.getResultFiles(), JA, true);
// When performing offload compilations, the result files may not match
// the JobAction that fails. In that case, do not pass in the JobAction
Copy link
Contributor

@srividya-sundaram srividya-sundaram Apr 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// When performing offload compilations, the result files may not match the JobAction that fails.

Can you add some details on why they don't match?
And also give an example Offload JobAction for this scenario?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generated result files are associated with a specific JobAction. One example here is with the old offloading model. Here, the output object from clang++ -fsycl file.cpp -o file.o is generated from the clang-offload-bundler call. Both the host and device compilations have temporary output files, so the JobAction associated with the actual fail (if one fails) including the output file do not match.

Here, the failing action is a BackendCompileJobAction but the file that needs to be removed is from the OffloadBundlingJobAction

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, the output object from clang++ -fsycl file.cpp -o file.o is generated from the clang-offload-bundler call.

clang++ -fsycl file.cpp -o file.o will trigger both compilation and linking at the same invocation and will not call clang-offload-bundler.

AFAIK, when compilation is separated from linking, the compilation step ends with engaging the offload bundler to generate "fat object" (<host object, device code IR> pair).

Both the host and device compilations have temporary output files, so the JobAction associated with the actual fail (if one fails) including the output file do not match

Can you clarify this?
If host or device compilation fails, there will be no host object or device object file generated right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh - yeah -c is needed in that compilation line. Thanks for pointing that out.

Can you clarify this?
If host or device compilation fails, there will be no host object or device object file generated right?

Yes, that is right, no object will be generated. The issue is an existing object that needs to be cleaned up. During the object compilation, the compiler is responsible for cleaning up the output file upon failure. Since the output file in those instances are temporary - there is no association with the actual output file pointed to with -o. The driver knows what the final output object file is, so we need to take the steps to clean up upon failure.

Copy link
Contributor

@srividya-sundaram srividya-sundaram Apr 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did I understand it correctly that if the host and device object files have been created, but the clang-offload-bundler fails to produce the fat object, we should still proceed with deleting the host and device object files?
In this case, the failing JobAction is OffloadBundlingJobAction but the host and device object files were created by a different JobAction.

Is it not possible to pass the JobAction associated with generating the host and device object files instead of not passing any JA at all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, the failing JobAction is OffloadBundlingJobAction but the host and device object files were created by a different JobAction.

Is it not possible to pass the JobAction associated with generating the host and device object files instead of not passing any JA at all?

The JobAction associated with the host/device object generation is the the BackendCompileJobAction. The JobAction associated with the clang-offload-bundler call is the OffloadBundlingJobAction. Passing in the JobAction associated with the host and device compilation will not match the JobAction that is associated with the actual output file we want to remove, so ultimately the file will not be removed when we want it to be.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the actual output file we want to remove

What is the actual output file we want to remove in the case where OffloadBundlingJobAction fails?
My understanding is we want to remove the host and device object files, since those will be the only ones created.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the OffloadBundlingJobAction fails, we still want to remove the resulting output file, which is the 'fat' object. This is the named object file.o from clang++ -c -fsycl file.cpp -o file.o

// to allow for the proper resulting file to be removed upon failure.
C.CleanupFileMap(C.getResultFiles(),
C.getActiveOffloadKinds() ? nullptr : JA, true);

// Failure result files are valid unless we crashed.
if (CommandRes < 0)
Expand Down
26 changes: 26 additions & 0 deletions clang/test/Driver/sycl-obj-remove.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
/// Verify object removal when the offload compilation fails.

// REQUIRES: system-linux

/// Force failure during device compilation
// RUN: touch %t.o
// RUN: not %clangxx -fsycl -Xsycl-target-frontend -DCOMPILE_FAIL=1 -c -o %t.o %s
// RUN: not ls %t.o

// RUN: touch %t.o
// RUN: not %clangxx --offload-new-driver -fsycl -Xsycl-target-frontend -DCOMPILE_FAIL=1 -c -o %t.o %s
// RUN: not ls %t.o

/// Force failure during compilation
// RUN: touch %t.o
// RUN: not %clangxx -fsycl -DCOMPILE_FAIL=1 -c -o %t.o %s
// RUN: not ls %t.o

// RUN: touch %t.o
// RUN: not %clangxx --offload-new-driver -fsycl -DCOMPILE_FAIL=1 -c -o %t.o %s
// RUN: not ls %t.o

void func(){};
#ifdef COMPILE_FAIL
#error FAIL
#endif // COMPILE_FAIL