[SYCL][NVPTX] Optimize ID queries when they fit in int #18999
Merged
+35
−0
The NVPTX target was unable to properly optimize the global ID query, despite the user specifying the -fsycl-id-queries-fit-in-int flag.
This is because, once linked, the compiler sees the global ID builtin as (i64 add (i64 mul (i64 zext i32 A), (i64 zext i32 B)), (i64 zext i32 C)). Even though each of A, B and C is a 32-bit value and the final result fits in 32 bits, the compiler cannot legally replace this sequence with (i64 zext (add i32 (mul i32 A, B), C)), which is the ideal code here.
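The difference between the two forms can be illustrated with a small host-side C++ sketch. This is not code from the patch: the function names `global_id_wide` and `global_id_narrow` are hypothetical, and `a`, `b`, `c` simply stand in for the 32-bit values A, B and C above. The two forms agree whenever the 32-bit arithmetic does not wrap, and diverge when it does.

```cpp
#include <cstdint>

// Form seen after linking: each 32-bit value is zero-extended first,
// then the multiply and add are done in 64 bits.
constexpr std::uint64_t global_id_wide(std::uint32_t a, std::uint32_t b,
                                       std::uint32_t c) {
  return std::uint64_t{a} * std::uint64_t{b} + std::uint64_t{c};
}

// Ideal form: multiply and add in 32 bits (wrapping modulo 2^32),
// then a single zero-extension of the result.
constexpr std::uint64_t global_id_narrow(std::uint32_t a, std::uint32_t b,
                                         std::uint32_t c) {
  return std::uint64_t{static_cast<std::uint32_t>(a * b + c)};
}

// When nothing wraps, both forms give the same value.
static_assert(global_id_wide(4, 256, 17) == global_id_narrow(4, 256, 17));

// When the 32-bit multiply wraps, the results differ, so the two forms
// are not interchangeable in general.
static_assert(global_id_wide(65536, 65536, 1) !=
              global_id_narrow(65536, 65536, 1));
```

The narrow form is only a valid substitute under the guarantee that -fsycl-id-queries-fit-in-int expresses, namely that the ID values fit in an int.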
The solution is a new opt-in 'reflection' in the NVPTX implementation of the global ID builtin, which selects a better-optimized version. The driver enables this reflection only when the user passes -fsycl-id-queries-fit-in-int.
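As a rough illustration of the selection idea, here is a C++17 sketch in which a compile-time boolean plays the role of the reflection value. This is an assumption-laden model, not the patch itself: the template parameter `IdQueriesFitInInt` and the function `global_id` are hypothetical names, and the real builtin lives in the NVPTX device library, where reflection is typically expressed as an `__nvvm_reflect` query that the compiler folds to a constant.

```cpp
#include <cstdint>

// IdQueriesFitInInt is a hypothetical stand-in for the reflection value
// that the driver would enable only for -fsycl-id-queries-fit-in-int.
template <bool IdQueriesFitInInt>
constexpr std::uint64_t global_id(std::uint32_t a, std::uint32_t b,
                                  std::uint32_t c) {
  if constexpr (IdQueriesFitInInt) {
    // Opt-in path: 32-bit multiply and add, one widening at the end.
    return static_cast<std::uint64_t>(a * b + c);
  } else {
    // Default path: widen first, then do 64-bit arithmetic.
    return std::uint64_t{a} * b + c;
  }
}

// Both paths agree for values that fit in 32 bits.
static_assert(global_id<true>(4, 256, 17) == global_id<false>(4, 256, 17));
```

Because the choice is resolved at compile time, only the selected path remains in the generated code, which is what lets the opt-in version avoid the 64-bit arithmetic.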