We require that `&UnsafeCell` references be "dereferenceable on function entry" in the sense of pointing to allocated memory. This is believed to allow the introduction of spurious loads, if the compiler can prove that the memory has not been deallocated yet. But that is far from obvious...
Consider:
```rust
use std::cell::UnsafeCell;

fn internal(x: &UnsafeCell<i32>, y: &mut i32, choice: impl FnOnce() -> bool) {
    if choice() {
        *y = 0;
    } else {
        let v = unsafe { x.get().read() };
        println!("{}", v);
    }
}

pub fn public(choice: impl FnOnce() -> bool) {
    let x = &UnsafeCell::new(0);
    let y = unsafe { &mut *x.get() };
    internal(x, y, choice);
}
```
Under LLVM `noalias` and under Tree Borrows, `public` is a sound function: if `choice()` is true, we write to `y` and nothing is even strange; if `choice()` is false, then the read from `x` means `y` can no longer be written, but `y` is still valid for reads, so the protector does not kick in. (In LLVM terms, `noalias` is completely fine with arbitrary aliasing as long as all accesses are reads, and if `choice()` is false then there are no writes.) Stacked Borrows (SB) is unhappy with this example, but SB is often too strict.
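For concreteness, here is a runnable variant of the example (adapted by me: it returns the read value instead of printing it, and the cell starts at 1 instead of 0, so the two outcomes are easy to observe). Both branches are defined behavior:

```rust
use std::cell::UnsafeCell;

// Same shape as the example above, adapted to return the value read
// so the two paths can be distinguished by the caller.
fn internal(x: &UnsafeCell<i32>, y: &mut i32, choice: impl FnOnce() -> bool) -> Option<i32> {
    if choice() {
        *y = 0; // write through `y`; no read through `x` happens on this path
        None
    } else {
        // read through `x`; `y` is never written on this path,
        // so under Tree Borrows the protector never fires
        Some(unsafe { x.get().read() })
    }
}

pub fn public(choice: impl FnOnce() -> bool) -> Option<i32> {
    let x = &UnsafeCell::new(1);
    let y = unsafe { &mut *x.get() };
    internal(x, y, choice)
}

fn main() {
    assert_eq!(public(|| true), None);     // write path: nothing to observe
    assert_eq!(public(|| false), Some(1)); // read path observes the initial value
}
```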
Let's focus on `internal`, and let's imagine we introduce a spurious load from `x` at the beginning of the function. If any kind of spurious load is allowed, this one definitely is. Then we observe that in the `else` case, `x` is being read twice, and let's say we know that the closure will not mutate `x` (it cannot, since `x` is privately allocated in `public` and its provenance is never leaked to outside code) -- let's say the `choice` function is marked "readonly". We arrive at:
```rust
fn internal(x: &UnsafeCell<i32>, y: &mut i32, choice: impl FnOnce() -> bool) {
    let v = unsafe { x.get().read() };
    if choice() {
        *y = 0;
    } else {
        println!("{}", v);
    }
}
```
Interpreted as Rust code, this has obvious UB if `choice()` returns true, but okay, maybe our IR has a different semantics. But what could those semantics be?
- If `choice()` is true, the read must somehow be considered "invalid"; maybe we make it return `poison` or so (similar to what would happen according to LLVM semantics if this read introduced a data race). Certainly the read must not observe the actual memory contents, because if it did, it would not be reorderable around the aliasing write.
- If `choice()` is false, the read must not return `poison`, as printing a poison value is UB! `choice()` might well read from stdin and use that information to determine what it should return.
In other words, the semantics of the read must predict the future to be able to decide whether it should return poison and keep the aliasing state as-is, or return the actual data and mark the memory as "must not be mutated through other pointers". Equivalently, the read is making an angelic choice between these two options.
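To make the "predict the future" point concrete, here is a toy model (all names are mine, purely illustrative; this is not a proposed semantics). The result of the spurious read can only be resolved once we know what `choice()` will return, so the model is handed that future outcome as an explicit oracle:

```rust
// What the spurious read may evaluate to in the hypothetical IR semantics.
#[derive(Debug, PartialEq)]
enum ReadResult {
    Poison,     // the read is "invalid"; the aliasing state is left as-is
    Value(i32), // the read observed memory; memory is now "must not be mutated"
}

// The angelic read: its result depends on the *future* behavior of `choice()`.
fn angelic_read(memory: i32, future_choice: bool) -> ReadResult {
    if future_choice {
        // The aliasing write through `y` will happen later, so the read must
        // not have observed memory -- otherwise it could not be reordered
        // around that write.
        ReadResult::Poison
    } else {
        // No write will happen; the read must return the real data, since
        // printing poison would be UB.
        ReadResult::Value(memory)
    }
}

fn main() {
    assert_eq!(angelic_read(42, true), ReadResult::Poison);
    assert_eq!(angelic_read(42, false), ReadResult::Value(42));
}
```

Of course, a real semantics has no such oracle; that is exactly why this has to be an angelic choice, resolved in whichever way makes the execution defined.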
So... any semantics that wants to allow introducing spurious loads for `&UnsafeCell` must at least use angelic choice, and deal with the consequences of that (e.g., angelic choice cannot in general be freely reordered down across demonic choice). We should hold off on having MIR transformations introduce such spurious loads until that is clarified.
The status quo is that we do not emit `dereferenceable` for these references, so LLVM shouldn't be introducing any spurious loads, so we are not at risk of being affected by this. Even if LLVM gains a "dereferenceable on entry" attribute, I think we shouldn't use it on `!Freeze` (and `!Unpin`) types.