size_of_val(p) == 0 doesn't optimize out for clearly-not-ZST values #152788

@scottmcm

The following Rust code

pub struct Foo<T: ?Sized>(pub [u32; 3], pub T);

#[unsafe(no_mangle)]
pub fn demo(p: &Foo<dyn std::fmt::Debug>) -> bool {
    std::mem::size_of_val(p) == 0
}

currently optimizes to (https://rust.godbolt.org/z/r1d5n6Phe)

define noundef zeroext i1 @demo(ptr noundef nonnull readnone align 4 captures(none) %p.0, ptr noalias noundef readonly align 8 captures(none) dereferenceable(32) %p.1) unnamed_addr #0 {
start:
  %0 = getelementptr inbounds nuw i8, ptr %p.1, i64 8
  %1 = load i64, ptr %0, align 8, !range !3, !invariant.load !4
  %2 = getelementptr inbounds nuw i8, ptr %p.1, i64 16
  %3 = load i64, ptr %2, align 8, !range !5, !invariant.load !4
  %4 = tail call i64 @llvm.umax.i64(i64 %3, i64 4)
  %5 = add nuw i64 %1, 11
  %6 = add i64 %5, %4
  %7 = sub i64 0, %4
  %8 = and i64 %6, %7
  %_0 = icmp eq i64 %8, 0
  ret i1 %_0
}

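but clearly that type has to be at least 12 bytes no matter what, since the leading [u32; 3] field alone takes 12 bytes. The comparison should therefore fold to false.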
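As a quick illustration of the 12-byte lower bound, here is a minimal sketch that measures a few concrete values behind &Foo<dyn Debug>. The helper name dyn_size is hypothetical, and the expected sizes assume the usual layout where the unsized tail follows the [u32; 3] prefix.

```rust
use std::fmt::Debug;
use std::mem::size_of_val;

pub struct Foo<T: ?Sized>(pub [u32; 3], pub T);

/// Hypothetical helper: builds a `Foo<T>`, views it as `Foo<dyn Debug>`,
/// and reports the dynamically computed size.
fn dyn_size<T: Debug + 'static>(tail: T) -> usize {
    let value = Foo([0; 3], tail);
    let erased: &Foo<dyn Debug> = &value; // unsizing coercion
    size_of_val(erased)
}

fn main() {
    // Even with a zero-sized tail, the 12-byte prefix is still there.
    assert_eq!(dyn_size(()), 12);
    // A one-byte tail sits at offset 12; the total is rounded up to align 4.
    assert_eq!(dyn_size(0u8), 16);
    // An 8-aligned tail starts at offset 16, giving 24 bytes in total.
    assert_eq!(dyn_size(0u64), 24);
}
```

No choice of tail type can bring the result below 12, which is the fact the optimizer is not currently exploiting.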

Why does that matter? Well, whenever you drop a Box it needs to check for ZSTs before calling the allocator. That check should optimize out in cases like this one, but it doesn't today.
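To make the Box connection concrete, the guard has roughly the following shape. This is only a sketch of the idea, not the actual liballoc code; needs_dealloc is a hypothetical name, and the real implementation works on raw pointers rather than references.

```rust
use std::alloc::Layout;
use std::fmt::Debug;

pub struct Foo<T: ?Sized>(pub [u32; 3], pub T);

/// Hypothetical stand-in for the check made when a box is dropped:
/// zero-sized values were never allocated, so only non-zero layouts
/// get handed back to the allocator.
fn needs_dealloc<T: ?Sized>(value: &T) -> bool {
    Layout::for_value(value).size() != 0
}

fn main() {
    // A true ZST: nothing to free.
    assert!(!needs_dealloc(&()));

    // `Foo<dyn Debug>` can never be zero-sized, so ideally this branch
    // would be resolved at compile time instead of reading the vtable.
    let p: &Foo<dyn Debug> = &Foo([0; 3], ());
    assert!(needs_dealloc(p));
}
```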

Filed here because we're probably not telling LLVM enough to be able to do that itself. #152786 will give it a bit more information, but probably not enough to solve this.


For some DSTs it does optimize, notably

pub fn demo_simpleslice(p: &Foo<[i32]>) -> bool {
    std::mem::size_of_val(p) == 0
}

optimizes to false.

However, this one

pub fn demo_lessalignedslice(p: &Foo<[u8]>) -> bool {
    std::mem::size_of_val(p) == 0
}

doesn't today (https://rust.godbolt.org/z/nnhGKz767)

define noundef zeroext i1 @demo_lessalignedslice(ptr noalias noundef readonly align 4 captures(none) dereferenceable(12) %p.0, i64 noundef %p.1) unnamed_addr #0 {
start:
  %0 = add i64 %p.1, 15
  %_0 = icmp ult i64 %0, 4
  ret i1 %_0
}

(That last one might get fixed with #152786, though, because the icmp can only be true if it wraps, which definitely can't happen.)
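The wrapping argument can be checked with a small model of the arithmetic. This is a sketch under the assumption that the size of Foo<[u8]> is the 12-byte prefix plus len bytes, rounded up to a multiple of 4; the function names are hypothetical and only mirror the IR above.

```rust
use std::mem::size_of_val;

pub struct Foo<T: ?Sized>(pub [u32; 3], pub T);

/// Model of the size computation: (12 + len) rounded up to a multiple of 4,
/// using wrapping arithmetic like the `add i64` in the IR.
fn foo_u8_slice_size(len: u64) -> u64 {
    len.wrapping_add(12 + 3) & !3
}

/// The form the `== 0` comparison was reduced to: `icmp ult (len + 15), 4`.
fn llvm_check(len: u64) -> bool {
    len.wrapping_add(15) < 4
}

fn main() {
    // The model agrees with the real size for a concrete value.
    let value = Foo([0; 3], [0u8; 5]);
    let p: &Foo<[u8]> = &value;
    assert_eq!(size_of_val(p) as u64, foo_u8_slice_size(5));

    // For any valid length the sum cannot wrap, so the check is false.
    assert!(!llvm_check(0));
    assert!(!llvm_check(isize::MAX as u64));

    // Only an impossible length that wraps the addition makes it true.
    assert!(llvm_check(u64::MAX - 14));
    assert_eq!(foo_u8_slice_size(u64::MAX - 14), 0);
}
```

Since a slice length can never get anywhere near u64::MAX - 14, telling LLVM that the addition does not wrap would be enough for it to fold this case to false.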

Labels

A-LLVM (Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.)
A-codegen (Area: Code generation)
C-optimization (Category: An issue highlighting optimization opportunities or PRs implementing such)
T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
