-
Notifications
You must be signed in to change notification settings - Fork 392
batch: suppress warning for SIMD loop vectorization failure with clang #2032
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| OSL_OMP_COMPLEX_SIMD_LOOP(simdlen(__OSL_WIDTH)) | ||
| for (int lane = 0; lane < __OSL_WIDTH; ++lane) { | ||
| Vec3 V = wV[lane]; | ||
| if (wr.mask()[lane]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These hardly seem complex, just turning around and calling Vec3::length(), for your build which Vec3 implementation is being used?
Does this version of clang not have a vectorized version of sqrt?
Perhaps there is a required clang compiler flag to allow the SIMD version of sqrt?
If not, perhaps better to conditional dependency around those specific usages for this specific clang version.
Can you reproduce in a godbolt.org?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed strategy as noted in my PR update. Try to vectorize, but suppress the warning for those loops that were problematic.
When building with a newer version of clang, I uncovered a few more SIMD pragma loops fail to vectorize on clang. Since it seems to depend on the compiler version, and I don't want to disable attempted vectorization, I thought a better strategy was just to disable the warnings when it fails to vectorize in those specific places. Add a "latest dependencies with clang" test, that's how I stumbled across these in the first place. Signed-off-by: Larry Gritz <[email protected]>
|
I've amended the PR to change strategy -- instead of not trying to vectorize those loops, instead I just suppress the warning that was breaking the build when vectorization failed. The new description of the PR is as follows: Background: We knew all along that some of the omp SIMD loops that worked So we made a After adding a "latest dependencies with clang" test that test against clang 18 First, I tried using |
|
Note: the OptiX test failure is happening across all branch and is not related to this PR. |
|
This approach is probably fine. Just worry compilers will not get feedback they need improvement, but I suppose we still get full SIMD benefit of all the JIT'd code. |
|
This is only affecting clang. People interested in squeezing all possible benefit from the batch shading are advised to use icx or gcc. If somebody wants to try to chase down exactly which of these loops are or aren't vectorizing for a particular version of clang, probably the best approach is to make a way to control whether that warning is or isn't disabled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Background: We knew all along that some of the omp SIMD loops that worked
fine with icc, icx, and even gcc, failed to parallelize the loop for clang,
and the warnings issued as a result would break the build in CI where we
use compile options to turn warnings into hard errors.
So we made a
OSL_OMP_COMPLEX_SIMD_LOOPmacro that would issue the ompsimd pragma for other compilers, but not clang. These were set up by Alex
Wells, who through trial and error found the loops that clang wouldn't vectorize
and so marked them. This was a few years ago, so probably involved clang 14
or so? We hoped that in the future, new clang releases would be better at
vectorizing those loops and we could eventually reduce the number of places
we used the special "anywhere but clang" macro.
After adding a "latest dependencies with clang" test that test against clang 18
(sorry, that's not exactly state of the art, but still several releases newer
than when Alex did this work), it sure seems that FEWER such loops vectorize
properly, so it breaks the build with errors that look like this
First, I tried using
OSL_OMP_COMPLEX_SIMD_LOOPin more places, but eventuallydecided that rather than not even try to vectorize (and somehow have to discern
exactly which clang versions could or could not do it with each loop), a
better strategy was to try to vectorize, but simply disable the warning that is
issued when we can't vectorize.