@aelovikov-intel aelovikov-intel commented Sep 26, 2025

Compiling #include <sycl/sycl.hpp> is slow, which is especially problematic for SYCL RTC (run-time compilation). One way to address this is fine-grained includes, which are being pursued separately. Another is clang's precompiled header (PCH) support, which is what this PR uses. The two approaches can be combined, and this PR adds test-e2e/PerformanceTests/KernelCompiler/auto-pch.cpp to give some idea of the PCH impact. The test shows the PCH benefit when compiling some of the fine-grained includes on top of the absolute minimum required to compile SYCL RTC's "Hello world". From one of the CI runs:

| Extra Headers | Without PCH | With auto-PCH |
|---|---|---|
| (none) | 176ms 137ms 136ms 136ms 136ms | 226ms 64ms 64ms 64ms 64ms |
| sycl/half_type.hpp | 165ms 165ms 165ms 165ms 165ms | 267ms 71ms 72ms 72ms 72ms |
| sycl/ext/oneapi/bfloat16.hpp | 174ms 173ms 173ms 173ms 173ms | 279ms 76ms 73ms 73ms 74ms |
| sycl/marray.hpp | 142ms 143ms 142ms 142ms 143ms | 235ms 66ms 66ms 66ms 66ms |
| sycl/vector.hpp | 296ms 290ms 290ms 290ms 290ms | 487ms 124ms 125ms 125ms 125ms |
| sycl/multi_ptr.hpp | 278ms 278ms 276ms 275ms 274ms | 441ms 125ms 125ms 125ms 125ms |
| sycl/builtins.hpp | 537ms 533ms 531ms 531ms 531ms | 883ms 218ms 218ms 219ms 218ms |

The table omits a sycl/sycl.hpp row because including it currently crashes the FE when reading the generated PCH; the crash is being investigated and fixed separately.
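To make the numbers above easier to compare, here is a small sketch that computes the steady-state speedup and a rough break-even point per row. It assumes (this is my reading of the data, not something stated in the PR) that the first "With auto-PCH" sample includes the one-time PCH generation cost and that the last sample in each group is representative of steady state. The `Row` struct and the helper names are hypothetical.

```cpp
#include <string>
#include <vector>

// Timings in milliseconds, copied from the table above.
struct Row {
  std::string header;    // extra header ("" = minimal "Hello world" only)
  double steady_no_pch;  // last "Without PCH" sample
  double first_pch;      // first "With auto-PCH" sample (assumed to include PCH generation)
  double steady_pch;     // last "With auto-PCH" sample
};

// How many times faster a steady-state compile is when the PCH is reused.
double speedup(const Row &r) { return r.steady_no_pch / r.steady_pch; }

// Smallest number of PCH-reusing compiles whose savings cover the extra
// cost of the slower first run.
int break_even_compiles(const Row &r) {
  double extra = r.first_pch - r.steady_no_pch;   // one-time extra cost
  double saved = r.steady_no_pch - r.steady_pch;  // saving per later compile
  int n = 0;
  while (n * saved < extra) ++n;
  return n;
}

const std::vector<Row> rows = {
    {"", 136, 226, 64},
    {"sycl/half_type.hpp", 165, 267, 72},
    {"sycl/ext/oneapi/bfloat16.hpp", 173, 279, 74},
    {"sycl/marray.hpp", 143, 235, 66},
    {"sycl/vector.hpp", 290, 487, 125},
    {"sycl/multi_ptr.hpp", 274, 441, 125},
    {"sycl/builtins.hpp", 531, 883, 218},
};
```

Under those assumptions, the baseline row gives `speedup(rows[0]) == 2.125`, and `break_even_compiles` returns 2 for every row, i.e. the slower first run is repaid after two compiles that reuse the PCH.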

Implementation-wise, I'm reusing the existing upstream clang::PrecompiledPreamble with one minor modification. PrecompiledPreamble seems to be used mainly by tools like clangd, so it ignores errors in the code. I've modified it so that such errors break PCH generation the same way they would break a normal compilation. I'm not sure we'd want that long-term, because making such an "auto-pch" persistent would seem to diverge from the upstream version of PrecompiledPreamble even more. I can imagine that in the near future we'd need to "fork" it into a separate utility. Still, this seems fine as a first step.
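For readers unfamiliar with the "preamble" idea, the following is a toy approximation of what gets precompiled: the leading run of preprocessor directives (plus any blank or `//` comment lines between them) at the top of the source. This is only an illustration; clang's real logic is lexer-based and handles many more cases, and `preamble_line_count` is a hypothetical helper, not part of clang or of this PR.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Toy model: count the leading lines that belong to the preamble.
// The preamble ends at the last preprocessor directive that appears
// before the first line of ordinary code.
std::size_t preamble_line_count(const std::string &source) {
  std::istringstream in(source);
  std::string line;
  std::size_t seen = 0;            // lines scanned so far
  std::size_t last_directive = 0;  // 1-based index of the last directive seen
  while (std::getline(in, line)) {
    std::size_t first = line.find_first_not_of(" \t");
    bool blank = first == std::string::npos;
    bool directive = !blank && line[first] == '#';
    bool comment = !blank && line.compare(first, 2, "//") == 0;
    if (!blank && !directive && !comment)
      break;  // first line of ordinary code ends the scan
    ++seen;
    if (directive)
      last_directive = seen;
  }
  return last_directive;
}
```

For a source that starts with two `#include` lines followed by code, the function returns 2; for a source that starts with a declaration, it returns 0 and there is nothing to precompile.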

The driver modifications add support for the --auto-pch option, which should only be available on the SYCL RTC path and not for regular clang invocations from the command line. I'm relatively confident those will stay in the future.

@aelovikov-intel changed the title from "[DRAFT][SYCL RTC] Implement --auto-pch support" to "[SYCL RTC] Introduce --auto-pch support" on Sep 30, 2025

@gmlueck gmlueck left a comment

Cool! This seems like a great feature. I have some comments / questions on the documentation and exact behavior, though. See below.

gmlueck commented Oct 1, 2025

> It misses sycl/sycl.hpp line because that currently crashes FE when reading the generated PCH, the crash is being investigated/fixed separately.

@tahonermann I assume your team is the one investigating this? I think this needs to be fixed before we can make a release with the --auto-pch option because the main use case (and the example in the documentation) does #include <sycl/sycl.hpp>. How should we handle this dependency? Is it safe to merge this PR with the expectation that the FE will be fixed before the next release? Or, should we hold this PR until there is a fix?
