Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 13 additions & 2 deletions compiler/rustc_codegen_ssa/src/size_of_val.rs
Original file line number Diff line number Diff line change
Expand Up @@ -159,7 +159,10 @@ pub fn size_and_align_of_dst<'a, 'tcx, Bx: BuilderMethods<'a, 'tcx>>(
// Furthermore, `align >= unsized_align`, and therefore we only need to do:
// let full_size = (unsized_offset_unadjusted + unsized_size).align_to(full_align);

let full_size = bx.add(unsized_offset_unadjusted, unsized_size);
// This is the unrounded size before alignment padding. It cannot exceed the
// rounded size, which itself cannot exceed `isize::MAX`. Thus the addition
// cannot overflow `isize::MAX`, let alone `usize::MAX`.
let unrounded_size = bx.unchecked_suadd(unsized_offset_unadjusted, unsized_size);

// Issue #27023: must add any necessary padding to `size`
// (to make it a multiple of `align`) before returning it.
Expand All @@ -173,10 +176,18 @@ pub fn size_and_align_of_dst<'a, 'tcx, Bx: BuilderMethods<'a, 'tcx>>(
// `(size + (align-1)) & -align`
let one = bx.const_usize(1);
let addend = bx.sub(full_align, one);
let add = bx.add(full_size, addend);
let add = bx.add(unrounded_size, addend);
let neg = bx.neg(full_align);
let full_size = bx.and(add, neg);

// Alignment rounding can only increase the size, never decrease it:
// `round_up(x, a) >= x` for power-of-two `a`. With the `nuw` on the
// addition above, LLVM can therefore deduce
// `full_size >= unrounded_size >= offset`, which proves `full_size > 0`
// for types with a non-zero-sized prefix (#152788).
let size_ge = bx.icmp(IntPredicate::IntUGE, full_size, unrounded_size);
bx.assume(size_ge);
Comment on lines +183 to +189
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate on which things you tried and why this is the best one? Was it not enough to say that the alignment is a power-of-two? Or...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ask because most of the text in the OP is just useless LLM slop, and the updates to the tests make me suspicious.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@scottmcm

Can you elaborate on which things you tried and why this is the best one? Was it not enough to say that the alignment is a power-of-two? Or...

Tried nuw-only (unchecked_uadd) first. That gives LLVM unrounded >= offset > 0 but it stops at the rounding — LLVM can't prove (x + a-1) & -a >= x. Also checked whether feeding ctpop(align) == 1 would help, but there's no fold for "round-up is monotonic when alignment is pow2" in InstCombine/ValueTracking. So the assume tells LLVM the conclusion directly.

nsw (making it unchecked_suadd) is because unrounded ≤ rounded ≤ isize::MAX. Same reasoning as your #152867.

I ask because most of the text in the OP is just useless LLM slop, and the updates to the tests make me suspicious.

Sorry about the OP — English isn't my native language, I overwrite when trying to be precise. Will clean it up.

For the tests: CHECK-NOT: icmp broke because assume itself emits an icmp. The !range checks on the first two functions were dropped because the assume keeps the size computation alive, so there's now a size load before the alignment load — FileCheck hits the wrong one. Range metadata is still verified in align_load_from_align_of_val below. RANGE_METAALIGN_RANGE since it only covers alignment loads now. Range value {1, 0}{1, 0x20000001} is Align::max_for_target (same change as #152929).

Happy to close this if you'd rather land it as part of #152867.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Landing this separately is great -- I opened the issue because this particular bit about what LLVM can prove is different enough from the point of layout_of_val that it's better to have the changes separated. (That's why I pulled out #152929 too 🙂 )

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yeah, I experimented a bit https://llvm.godbolt.org/z/haGYz7aax and even getting lots of annotations on everything and assume it's still not able to understand what's happening properly.

(Also it's so annoying to see add nsw i64 %4, -1 since that used to be sub nuw nsw i64 %4, 1 but LLVM just insists on throwing that information away.)


(full_size, full_align)
}
_ => bug!("size_and_align_of_dst: {t} not supported"),
Expand Down
36 changes: 36 additions & 0 deletions tests/codegen-llvm/dst-size-of-val-not-zst.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
//@ compile-flags: -Copt-level=3 -Z merge-functions=disabled
//@ needs-deterministic-layouts

#![crate_type = "lib"]

// Regression test for #152788: `size_of_val(p) == 0` should optimize to `false`
// for types whose statically-known prefix makes them clearly not ZSTs.
//
// This works because:
// 1. The `offset + unsized_size` addition has NUW+NSW, so LLVM knows
// `unrounded_size >= offset`
// 2. An `llvm.assume` tells LLVM `aligned_size >= unrounded_size`
// 3. Together: `aligned_size >= unrounded_size >= offset > 0`

pub struct Foo<T: ?Sized>(pub [u32; 3], pub T);

// CHECK-LABEL: @size_of_val_dyn_not_zero
#[no_mangle]
pub fn size_of_val_dyn_not_zero(p: &Foo<dyn std::fmt::Debug>) -> bool {
// CHECK: ret i1 false
std::mem::size_of_val(p) == 0
}

// CHECK-LABEL: @size_of_val_slice_u8_not_zero
#[no_mangle]
pub fn size_of_val_slice_u8_not_zero(p: &Foo<[u8]>) -> bool {
// CHECK: ret i1 false
std::mem::size_of_val(p) == 0
}

// CHECK-LABEL: @size_of_val_slice_i32_not_zero
#[no_mangle]
pub fn size_of_val_slice_i32_not_zero(p: &Foo<[i32]>) -> bool {
// CHECK: ret i1 false
std::mem::size_of_val(p) == 0
}
7 changes: 3 additions & 4 deletions tests/codegen-llvm/dst-vtable-align-nonzero.rs
Original file line number Diff line number Diff line change
Expand Up @@ -30,9 +30,8 @@ pub struct Struct<W: ?Sized> {
pub fn eliminates_runtime_check_when_align_1(
x: &Struct<WrapperWithAlign1<dyn Trait>>,
) -> &WrapperWithAlign1<dyn Trait> {
// CHECK: load [[USIZE:i[0-9]+]], {{.+}} !range [[RANGE_META:![0-9]+]]
// CHECK: load [[USIZE:i[0-9]+]]
// CHECK-NOT: llvm.umax
// CHECK-NOT: icmp
// CHECK-NOT: select
// CHECK: ret
&x.dst
Expand All @@ -43,7 +42,7 @@ pub fn eliminates_runtime_check_when_align_1(
pub fn does_not_eliminate_runtime_check_when_align_2(
x: &Struct<WrapperWithAlign2<dyn Trait>>,
) -> &WrapperWithAlign2<dyn Trait> {
// CHECK: [[X0:%[0-9]+]] = load [[USIZE]], {{.+}} !range [[RANGE_META]]
// CHECK: load [[USIZE]]
// CHECK: {{icmp|llvm.umax}}
// CHECK: ret
&x.dst
Expand All @@ -52,7 +51,7 @@ pub fn does_not_eliminate_runtime_check_when_align_2(
// CHECK-LABEL: @align_load_from_align_of_val
#[no_mangle]
pub fn align_load_from_align_of_val(x: &dyn Trait) -> usize {
// CHECK: {{%[0-9]+}} = load [[USIZE]], {{.+}} !range [[RANGE_META]]
// CHECK: {{%[0-9]+}} = load [[USIZE]], {{.+}} !range [[RANGE_META:![0-9]+]]
core::mem::align_of_val(x)
}

Expand Down
Loading