abi: Make `glam` types abi layout match CPU #380

Firestar99 · 2025-09-08T15:17:19Z

Requires glam spirv PRs:

spirv: derive Hash and impl AsRef bitshifter/glam-rs#675

spirv: vector attribute bitshifter/glam-rs#676

Revert: Don't use bytemuck derive on spirv. bitshifter/glam-rs#677

Requires rust-gpu PRs:

No longer required: ~~Support scalar pair ABI #381~~

Replaced by: abi: use PassMode::Direct even for data types that can be passed as scalar pairs. #437

versionize rust_gpu tool to ensure codegen and spirv-std versions match #401

Old embark PR trying to do the same, though I made this PR from scratch because the branches were too different: https://github.com/EmbarkStudios/rust-gpu/pull/1158/files

Objective

Currently, rust-gpu has special layout rules for glam types, which give them a higher alignment than necessary. This causes structs using them to have a different layout (size, alignment, member offsets) in rust-gpu than they do on the CPU, causing a lot of UB when sending data between host and device. The worst offender is Vec3, which has a size of 12 on the CPU but 16 in rust-gpu, which is why we have always recommended not using it in structs shared between targets.
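To make the mismatch concrete, here is a minimal host-only sketch. `CpuVec3`, `OldGpuVec3`, `HostParams` and `OldDeviceParams` are hypothetical stand-ins, not glam or rust-gpu types: the first mimics the 12-byte, 4-aligned CPU `Vec3`, the second mimics the 16-byte, 16-aligned layout described above.

```rust
use std::mem::{align_of, offset_of, size_of};

/// Hypothetical stand-in for `glam::Vec3` on the CPU: 12 bytes, align 4.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct CpuVec3 {
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

/// Hypothetical stand-in for the old device-side layout: 16 bytes, align 16.
#[repr(C, align(16))]
#[derive(Clone, Copy)]
pub struct OldGpuVec3 {
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

/// A shared struct as the host lays it out.
#[repr(C)]
pub struct HostParams {
    pub scale: f32,
    pub dir: CpuVec3,
    pub bias: f32,
}

/// The same fields as the device used to lay them out.
#[repr(C)]
pub struct OldDeviceParams {
    pub scale: f32,
    pub dir: OldGpuVec3,
    pub bias: f32,
}

/// Returns (size, align, offset of `dir`, offset of `bias`) on the host.
pub fn host_layout() -> (usize, usize, usize, usize) {
    (
        size_of::<HostParams>(),
        align_of::<HostParams>(),
        offset_of!(HostParams, dir),
        offset_of!(HostParams, bias),
    )
}

/// Same tuple for the emulated old device layout.
pub fn old_device_layout() -> (usize, usize, usize, usize) {
    (
        size_of::<OldDeviceParams>(),
        align_of::<OldDeviceParams>(),
        offset_of!(OldDeviceParams, dir),
        offset_of!(OldDeviceParams, bias),
    )
}
```

With these stand-ins the host struct is 20 bytes with `bias` at offset 16, while the emulated device struct is 48 bytes with `bias` at offset 32, so a buffer written by one side is misread by the other.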

Another common workaround was to use glam's cuda feature, which gives all glam vectors a higher than necessary alignment. These alignments seem (not sure!) to match those of rust-gpu, though they are defined in very different places and could drift out of sync.

This is a mess, and this PR tries to clean it up by:

- making SpirvType::Vector size and alignment match the layout of the type
- removing all spirv-specific layout hacks in glam
- replacing #[repr(SIMD)] with a new custom #[rust_gpu::vector::v1]
  - #[rust_gpu::vector::v1] is a new "stable" attribute, which allows us to offer backwards compat in the future
  - we can remove our hacks around SIMD repr
  - allows glam to revert Don't use bytemuck derive on spirv. bitshifter/glam-rs#663
- glam/cuda feature will continue to work, though it must be turned on on both targets!
- we get support for glam::BVecN (bool vecs) for free

Size and Alignment chart

(rust-gpu chooses chaotic evil)

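| type | offset in difftest | CPU size | CPU align | current size | current align | removing SIMD hacks size | removing SIMD hacks align | removing glam feature gates size | removing glam feature gates align |
|---|---|---|---|---|---|---|---|---|---|
| UVec2 | 0x400 | 8 | 4 | 8 | 8 | 8 | 4 | 8 | 4 |
| IVec2 | 0x500 | 8 | 4 | 8 | 8 | 8 | 4 | 8 | 4 |
| Vec2 | 0x600 | 8 | 4 | 8 | 8 | 8 | 4 | 8 | 4 |
| UVec3 | 0x800 | 12 | 4 | 16 | 16 | 12 | 4 | 12 | 4 |
| IVec3 | 0x900 | 12 | 4 | 16 | 16 | 12 | 4 | 12 | 4 |
| Vec3 | 0xa00 | 12 | 4 | 16 | 16 | 12 | 4 | 12 | 4 |
| Vec3A | 0xb00 | 16 | 16 | 16 | 16 | 12 | 4 | 16 | 16 |
| UVec4 | 0xc00 | 16 | 4 | 16 | 16 | 16 | 4 | 16 | 4 |
| IVec4 | 0xd00 | 16 | 4 | 16 | 16 | 16 | 4 | 16 | 4 |
| Vec4 | 0xe00 | 16 | 16 | 16 | 16 | 16 | 4 | 16 | 16 |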

Notes:

- removing SIMD hacks: misalignment caused by glam hiding alignment from spirv, see comment below
- current implementation matches CPU exactly, but requires modifications in glam to remove these feature gates hiding alignment
- if we have to update glam anyway, we may as well remove our and glam's #[repr(SIMD)] hacks and replace them with a custom #[spirv(vector)] that we control (and rustc can't change on a whim)
- the new abi::vector_layout* difftest tests every single glam type, not just vecs
  - we also test with glam/cuda in abi::vector_layout_cuda and glam/scalar-math in abi::vector_layout_scalar_math
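The general idea behind a layout difftest can be sketched as follows. This is not the actual abi::vector_layout code; `LayoutRecord`, `record`, `collect`, `mismatches` and the `*Like` types are hypothetical names used only for illustration. Each side records size and alignment per type, and the two recordings are compared.

```rust
use std::mem::{align_of, size_of};

/// Size and alignment of one type, stored as plain `u32`s so that both
/// sides could write it into the same kind of output buffer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LayoutRecord {
    pub size: u32,
    pub align: u32,
}

/// Records the layout of `T` on whatever target this is compiled for.
pub fn record<T>() -> LayoutRecord {
    LayoutRecord {
        size: size_of::<T>() as u32,
        align: align_of::<T>() as u32,
    }
}

/// Hypothetical stand-in for a 12-byte, 4-aligned vector.
#[repr(C)]
pub struct Vec3Like {
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

/// Hypothetical stand-in for a 16-byte, 16-aligned vector.
#[repr(C, align(16))]
pub struct Vec3ALike {
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

/// Collects one record per tested type, in a fixed order.
pub fn collect() -> Vec<(&'static str, LayoutRecord)> {
    vec![
        ("Vec3Like", record::<Vec3Like>()),
        ("Vec3ALike", record::<Vec3ALike>()),
    ]
}

/// Returns the names of all types whose records differ between two runs.
pub fn mismatches(
    cpu: &[(&'static str, LayoutRecord)],
    gpu: &[(&'static str, LayoutRecord)],
) -> Vec<&'static str> {
    cpu.iter()
        .zip(gpu)
        .filter(|(c, g)| c.1 != g.1)
        .map(|(c, _)| c.0)
        .collect()
}
```

In a real setup the second slice would come from the device run; here both slices can only be produced on the host, so the comparison is shown purely as a pattern.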

LegNeato · 2025-09-14T03:47:23Z

Hard for me to grasp the why. What are the pros and cons here? Will this break projects like https://github.com/LegNeato/rust-gpu-chimera or will it align spirv with what glam w/ cuda feature and/or rust-cuda does?

Firestar99 · 2025-09-17T10:25:29Z

crates/spirv-std/src/arch/subgroup.rs

-#[repr(transparent)]
-#[derive(Copy, Clone, Default, Eq, PartialEq)]
-#[cfg_attr(feature = "bytemuck", derive(bytemuck::Zeroable, bytemuck::Pod))]
-pub struct SubgroupMask(pub glam::UVec4);


I'm not a fan of changing it from a separate struct to a typedef. The #[repr(transparent)] stopped working since Vec4 from rustc's perspective is just a regular struct, which causes us to emit a struct wrapping OpTypeVector f32 4 instead of the OpTypeVector directly. In glsl you explicitly use UVec4 for the bitmasks, which I'm not a big fan of since they represent something quite different than just a UVec4. So I've opted for a wrapper struct that you may manually construct or destruct if you need to.

Alternative to keep the struct: I have a branch where I've adjusted all the intrinsics to unwrap the struct, which works flawlessly. The problem is entry point params like these:

rust-gpu/tests/compiletests/ui/arch/subgroup/subgroup_builtins.rs

Lines 11 to 15 in d9bb8aa

#[spirv(subgroup_eq_mask)] subgroup_eq_mask: SubgroupMask,
#[spirv(subgroup_ge_mask)] subgroup_ge_mask: SubgroupMask,
#[spirv(subgroup_gt_mask)] subgroup_gt_mask: SubgroupMask,
#[spirv(subgroup_le_mask)] subgroup_le_mask: SubgroupMask,
#[spirv(subgroup_lt_mask)] subgroup_lt_mask: SubgroupMask,

I'd have to refactor our entry point logic to allow us to not just pass a plain BuiltIn, but also modify it aka. wrap it with a SubgroupMask. But since these are rarely used (like even more rare than any of the subgroup intrinsics themselves), I could see us changing their type to UVec4 and having the end user manually convert them, for now. Whenever we decide to revisit BuiltIns we can fix that. (I really want to change most read-only builtins to global functions in spirv-std, so they're not just magic values you have to know / research in the code)
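The two options discussed in this comment can be sketched on the host as below. `UVec4` here is a hypothetical stand-in for `glam::UVec4`, and `SubgroupMaskAlias` and `wrap_builtin` are illustrative names only; nothing here is the spirv-std API.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical stand-in for `glam::UVec4`.
#[repr(C)]
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct UVec4 {
    pub x: u32,
    pub y: u32,
    pub z: u32,
    pub w: u32,
}

/// Option A: a plain typedef. The mask *is* a `UVec4`.
pub type SubgroupMaskAlias = UVec4;

/// Option B: a distinct wrapper type that users construct and
/// destructure by hand.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct SubgroupMask(pub UVec4);

impl From<UVec4> for SubgroupMask {
    fn from(v: UVec4) -> Self {
        SubgroupMask(v)
    }
}

impl From<SubgroupMask> for UVec4 {
    fn from(m: SubgroupMask) -> Self {
        m.0
    }
}

/// What user code might do if a builtin arrives as a raw `UVec4` and
/// has to be converted manually.
pub fn wrap_builtin(raw: UVec4) -> SubgroupMask {
    SubgroupMask::from(raw)
}

/// On the host, the wrapper has the same size and alignment as its field.
pub fn same_layout() -> bool {
    size_of::<SubgroupMask>() == size_of::<UVec4>()
        && align_of::<SubgroupMask>() == align_of::<UVec4>()
}
```

The wrapper keeps the mask a separate type at the cost of one explicit conversion per builtin, while the typedef needs no conversion but loses that distinction.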

nnethercote · 2025-09-18T02:50:18Z

Vec3A 0xb00 16 16 16 16 *12* *4*
Vec4 0xe00 16 16 16 16 16 *4*

Why is Vec3A's size only 12 bytes?

Why is the alignment of Vec3A and Vec4 only 4 bytes?

Firestar99 · 2025-09-18T08:19:10Z

glam is explicitly hiding the alignment specifiers on our spirv target. The table is a bit outdated, I already removed the cfg's in the glam branch this repo links to.

I'm also on like the 3rd implementation, each time solving it slightly differently, but I'm happy with what I have right now. Though it'll block the release until glam has released a minor 0.30.6 patch for us fixing this on glam's side, and we'll lose backwards compat.

#[cfg_attr(not(target_arch = "spirv"), repr(align(16)))]
#[cfg_attr(not(target_arch = "spirv"), repr(C))]
#[cfg_attr(target_arch = "spirv", repr(simd))]
pub struct Vec3A {
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

https://github.com/bitshifter/glam-rs/blob/9a992a8bacba784053368bc0b7000c95aa5895b6/src/f32/scalar/vec3a.rs#L30-L37

#[cfg_attr(
    any(
        not(any(feature = "scalar-math", target_arch = "spirv")),
        feature = "cuda"
    ),
    repr(align(16))
)]
#[cfg_attr(not(target_arch = "spirv"), repr(C))]
#[cfg_attr(target_arch = "spirv", repr(simd))]
pub struct Vec4 {
    pub x: f32,
    pub y: f32,
    pub z: f32,
    pub w: f32,
}

https://github.com/bitshifter/glam-rs/blob/9a992a8bacba784053368bc0b7000c95aa5895b6/src/f32/scalar/vec4.rs#L27-L41
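One way to catch a gate like this hiding alignment is a compile-time layout assertion next to the type. The sketch below uses a hypothetical `Vec3ALike` stand-in rather than glam's real `Vec3A`, and omits `repr(simd)` because it is unstable; it assumes the struct is expected to be 16 bytes with 16-byte alignment on every target.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical stand-in mirroring the gated alignment shown above:
/// the `align(16)` only applies when NOT compiling for spirv.
#[cfg_attr(not(target_arch = "spirv"), repr(align(16)))]
#[repr(C)]
#[derive(Clone, Copy)]
pub struct Vec3ALike {
    pub x: f32,
    pub y: f32,
    pub z: f32,
}

// Compile-time checks. On a target where the `cfg_attr` above is inactive
// the struct falls back to 12 bytes with 4-byte alignment, so these
// assertions fail the build instead of letting the layouts silently diverge.
const _: () = assert!(size_of::<Vec3ALike>() == 16);
const _: () = assert!(align_of::<Vec3ALike>() == 16);

/// Returns (size, align) as seen by the current target.
pub fn vec3a_like_layout() -> (usize, usize) {
    (size_of::<Vec3ALike>(), align_of::<Vec3ALike>())
}
```

On an ordinary host target both assertions hold and `vec3a_like_layout()` reports 16 and 16.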

FacelessTiger · 2025-09-18T14:06:34Z

With the glam patch, will it only match on default settings, or will Vec4 have 4-byte alignment if you use the scalar-math feature, for example?

Firestar99 · 2025-09-18T14:10:25Z

I assumed there are going to be more issues with glam's cuda and scalar-math features, since both of them modify alignments of structs. I'm currently modifying the abi::vector_layout difftest to also test with these features and verifying that feature unification didn't mess with the results.

Firestar99 · 2025-10-13T14:57:13Z

With all dependent PRs merged (#381 has been replaced by #437), this is ready to review and merge!

Firestar99 force-pushed the vec3-12-bytes branch from a881c58 to acf572d Compare September 8, 2025 15:19

Firestar99 changed the base branch from main to difftest_refactor September 8, 2025 15:20

Firestar99 force-pushed the vec3-12-bytes branch from acf572d to 5c6932a Compare September 8, 2025 16:09

Firestar99 force-pushed the difftest_refactor branch from 0c2df59 to 23fe024 Compare September 10, 2025 09:11

Firestar99 force-pushed the vec3-12-bytes branch 3 times, most recently from d8b1b46 to 11c141a Compare September 11, 2025 17:25

Firestar99 mentioned this pull request Sep 16, 2025

Support scalar pair ABI #381

Draft

Firestar99 force-pushed the vec3-12-bytes branch 2 times, most recently from 4669fa0 to e77ed1c Compare September 16, 2025 16:58

Firestar99 force-pushed the difftest_refactor branch from f4af86b to 1b4b2d9 Compare September 17, 2025 09:45

Firestar99 force-pushed the vec3-12-bytes branch from e77ed1c to c8e6c60 Compare September 17, 2025 09:56

Firestar99 commented Sep 17, 2025

Firestar99 mentioned this pull request Sep 17, 2025

asm: deny OpTypeVector, always infer type from asm params #392

Merged

Firestar99 force-pushed the vec3-12-bytes branch 2 times, most recently from aa8292c to f14db4e Compare September 17, 2025 16:08

Firestar99 mentioned this pull request Sep 18, 2025

RFC: glam-ification of spirv-std #393

Closed

Firestar99 force-pushed the vec3-12-bytes branch from f14db4e to a727c08 Compare September 18, 2025 14:54

Firestar99 force-pushed the difftest_refactor branch from 1b4b2d9 to 24a9bc9 Compare September 18, 2025 14:55

Firestar99 force-pushed the vec3-12-bytes branch from a727c08 to b17666f Compare September 18, 2025 15:13

Firestar99 changed the base branch from difftest_refactor to main September 18, 2025 15:14

Firestar99 force-pushed the vec3-12-bytes branch 2 times, most recently from 38649e4 to b91eda9 Compare September 18, 2025 17:17

Firestar99 mentioned this pull request Sep 22, 2025

versionize rust_gpu tool to ensure codegen and spirv-std versions match #401

Merged

Firestar99 force-pushed the vec3-12-bytes branch from b91eda9 to bbb81a2 Compare September 22, 2025 16:01

Firestar99 added 13 commits October 13, 2025 16:18

abi layout compiletest: fix invalid-matrix-type

e9828e0

abi layout: change Subgroup from transparent struct to typedef

977fe49

abi layout compiletest: bless complex_image_sample_inst

bf899ed

abi layout: remove #[repr(SIMD)] hack

807ea93

abi layout difftest: add all remaining glam types

3a1b41e

abi layout difftest: cuda and scalar-math feature forwarding

d4c2113

abi layout difftest: cuda and scalar-math feature testing

c414591

abi layout: minor code cleanups

dbfef21

abi layout: improve documentation on Scalar and Vector

fbb36aa

abi layout: limit vectors to at most 4 components, as spec states

c58aaec

abi layout: add some compiletests

582df86

abi layout: assert member offsets of vectors are as expected

8c715a2

compiletest: update readme normalize examples

ce5d860

Firestar99 force-pushed the vec3-12-bytes branch from db60a40 to ce5d860 Compare October 13, 2025 14:18

Firestar99 marked this pull request as ready for review October 13, 2025 14:52

Firestar99 requested review from LegNeato, eddyb and schell as code owners October 13, 2025 14:52

abi layout: remove pqp_cg_ssa patch for #[repr(SIMD)]

f81832c

This was referenced Oct 14, 2025

Compiletest normalization #439

Closed

Scalar and Vector types refactor #440

Merged

add trait ScalarOrVectorComposite #441

Open

eddyb approved these changes Oct 15, 2025

eddyb added this pull request to the merge queue Oct 15, 2025

Merged via the queue into main with commit 2aa4d4f Oct 15, 2025
13 checks passed

eddyb deleted the vec3-12-bytes branch October 15, 2025 10:09

This was referenced Oct 16, 2025

difftest: nextest support and speedups #334

Open

Add TestEnv for Nextest support Rust-GPU/cargo-gpu#120

Open
