-
Notifications
You must be signed in to change notification settings - Fork 73
abi: Make glam
types abi layout match CPU
#380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: tool_rename
Are you sure you want to change the base?
Conversation
a881c58
to
acf572d
Compare
acf572d
to
5c6932a
Compare
0c2df59
to
23fe024
Compare
d8b1b46
to
11c141a
Compare
Hard for me to grasp the why. What are the pros and cons here? Will this break projects like https://github.com/LegNeato/rust-gpu-chimera or will it align spirv with what glam w/ cuda feature and/or rust-cuda does? |
4669fa0
to
e77ed1c
Compare
f4af86b
to
1b4b2d9
Compare
e77ed1c
to
c8e6c60
Compare
#[repr(transparent)] | ||
#[derive(Copy, Clone, Default, Eq, PartialEq)] | ||
#[cfg_attr(feature = "bytemuck", derive(bytemuck::Zeroable, bytemuck::Pod))] | ||
pub struct SubgroupMask(pub glam::UVec4); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a fan of changing it from a separate struct to a typedef. The #[repr(transparent)]
stopped working since Vec4
from rustc's perspective is just a regular struct, which causes us to emit a struct wrapping OpTypeVector f32 4
instead of the OpTypeVector
directly. In glsl you explicitly use UVec4
for the bitmasks, which I'm not a big fan of since they represent something quite different than just a UVec4
. So I've opted for a wrapper struct that you may manually construct or destruct if you need to.
Alternative to keep the struct: I have a branch where I've adjusted all the intrinsics to unwrap the struct, which works flawlessly. The problem are entry point params like these:
rust-gpu/tests/compiletests/ui/arch/subgroup/subgroup_builtins.rs
Lines 11 to 15 in d9bb8aa
#[spirv(subgroup_eq_mask)] subgroup_eq_mask: SubgroupMask, | |
#[spirv(subgroup_ge_mask)] subgroup_ge_mask: SubgroupMask, | |
#[spirv(subgroup_gt_mask)] subgroup_gt_mask: SubgroupMask, | |
#[spirv(subgroup_le_mask)] subgroup_le_mask: SubgroupMask, | |
#[spirv(subgroup_lt_mask)] subgroup_lt_mask: SubgroupMask, |
I'd have to refactor our entry point logic to allow us to not just pass a plain BuiltIn
, but also modify it aka. wrap it with a SubgroupMask
. But since these are rarely used (like even more rare than any of the subgroup intrinsics themselves), I could see us changing their type to UVec4
and having the end user manually convert them, for now. Whenever we decide to revisit BuiltIns we can fix that. (I really want to change most read-only builtins to global functions in spirv-std, so they're not just magic values you have to know / research in the code)
aa8292c
to
f14db4e
Compare
Why is Why are |
glam is explicitly hiding the alignment specifiers on our spirv target. The table is a bit outdated, I already removed the cfg's in the glam branch this repo links to. I'm also on like the 3rd implementation, each time solving it slightly differently, but I'm happy with what I have right now. Though it'll block the release until glam has released a minor #[cfg_attr(not(target_arch = "spirv"), repr(align(16)))]
#[cfg_attr(not(target_arch = "spirv"), repr(C))]
#[cfg_attr(target_arch = "spirv", repr(simd))]
pub struct Vec3A {
pub x: f32,
pub y: f32,
pub z: f32,
} #[cfg_attr(
any(
not(any(feature = "scalar-math", target_arch = "spirv")),
feature = "cuda"
),
repr(align(16))
)]
#[cfg_attr(not(target_arch = "spirv"), repr(C))]
#[cfg_attr(target_arch = "spirv", repr(simd))]
pub struct Vec4 {
pub x: f32,
pub y: f32,
pub z: f32,
pub w: f32,
} |
With the glam patch will it only match on default settings, or if you use the |
I assumed there's going to me more issues with glam's |
f14db4e
to
a727c08
Compare
1b4b2d9
to
24a9bc9
Compare
a727c08
to
b17666f
Compare
38649e4
to
b91eda9
Compare
b91eda9
to
bbb81a2
Compare
This comment was marked as off-topic.
This comment was marked as off-topic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, I think this is a better way than continuing to sorta bastardize #[repr(simd)]
like my draft pr did. Thanks for pushing this forward and getting it done!
I'm not qualified to review this, so I will let others look at it |
d37bf4e
to
febdddf
Compare
febdddf
to
bdc2048
Compare
9950574
to
a769043
Compare
bdc2048
to
3cfe38a
Compare
(T, U) pairs in entrypoints and regular functions are now supported.
a769043
to
0bae80e
Compare
Old embark PR trying to do the same, though I made this PR is made from scratch due to the branches being too different: https://github.com/EmbarkStudios/rust-gpu/pull/1158/files
Objective
Currently, rust-gpu has special layout rules for glam types, which gives them a higher alignment than necessary. This causes structs using them to have a different layout (size, alignment, member offsets) in rust-gpu than they do on the CPU, causing a lot of UB when sending data between host and device. Worst of them is
Vec3
, which has a size of 12 on the CPU but 16 in rust-gpu, which is why we always recommended not to use them in structs shared between targets.Another common workaround was to use
glam
's featurecuda
, which would give all glam vectors a higher than necessary alignment. These alignments seem (not sure!) match those of rust-gpu, though they are defined in very different places and could drift out of sync.This is a mess, and this PR tries to clean it up by:
SpirvType::Vector
size and alignment to match layout of the type#[repr(SIMD)]
with a new custom#[rust_gpu::vector::v1]
#[rust_gpu::vector::v1]
is a new "stable" attribute, allows us to offer backwards compat in the futureglam/cuda
feature will continue to work, though must be turned on on both targets!glam::BVecN
(bool vecs) for freeSize and Alignment chart
(rust-gpu chooses chaotic evil)
difftest
SIMD hacks
feature gates
Notes:
#[repr(SIMD)]
hacks and replace them with a custom#[spirv(vector)]
that we control (and rustc can't change on a whim)abi::vector_layout*
difftest tests every single glam type, not just vecsglam/cuda
inabi::vector_layout_cuda
andglam/scalar-math
inabi::vector_layout_scalar_math