Minimum guarantees regarding `UnsafeCell`

As part of https://github.com/google/zerocopy/issues/251, we're trying to teach zerocopy to reason in more precise ways about `UnsafeCell`s. I'm making this issue as sort of an omnibus issue to clarify a number of questions regarding `UnsafeCell`s. As always with zerocopy, I'm only interested in what can currently be promised, not what might be promised in the future.

In particular, my questions concern when it's sound for different code to *disagree* on whether a given memory region is covered by an `UnsafeCell`. 

# Constraints

Based on Miri's behavior and APIs that already exist in the standard library, we can observe some constraints on what code is sound/unsound. In particular:
- Any time Miri treats code as unsound, I take that to mean that we cannot guarantee that it is sound even if it might be decided to be sound at some point in the future
- Just because Miri treats code as sound, it may not be *guaranteed* to be sound
- If a stable API exists which implies that code is sound, then it probably *is* guaranteed to be sound

## It's unsound (under stacked borrows) to have multiple *active* references which "disagree" about whether a memory region is covered by an `UnsafeCell`

**Status: can't assume sound**

*Prior art: https://github.com/rust-lang/unsafe-code-guidelines/issues/455, https://github.com/rust-lang/unsafe-code-guidelines/issues/303*

As soon as there are multiple references which are "active"\* or "not shadowed" or whatever the proper terminology is, and these references disagree about whether a given memory region is covered by an `UnsafeCell`, this is insta-UB according to stacked borrows even if the references are never used for anything. For example:

```rust
let u = 1usize;
let u_ref = &u;
let _c_ref = unsafe { &*(u_ref as *const usize as *const Cell<usize>) }; // Insta-UB
```

\*Previously, this post used the term "live", but @CAD97 [pointed out](https://github.com/rust-lang/unsafe-code-guidelines/issues/495#issuecomment-1979872804) that "active" is the more accurate term.

<details>
<summary>Miri output</summary>

```text
error: Undefined Behavior: trying to retag from <1716> for SharedReadWrite permission at alloc809[0x0], but that tag only grants SharedReadOnly permission for this location
 --> src/main.rs:6:27
  |
6 |     let _c_ref = unsafe { &*(u_ref as *const usize as *const Cell<usize>) };
  |                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |                           |
  |                           trying to retag from <1716> for SharedReadWrite permission at alloc809[0x0], but that tag only grants SharedReadOnly permission for this location
  |                           this error occurs as part of retag at alloc809[0x0..0x8]
  |
  = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
  = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <1716> was created by a SharedReadOnly retag at offsets [0x0..0x8]
 --> src/main.rs:6:30
  |
6 |     let _c_ref = unsafe { &*(u_ref as *const usize as *const Cell<usize>) };
  |                              ^^^^^
  = note: BACKTRACE (of the first span):
  = note: inside `main` at src/main.rs:6:27: 6:74
```

</details>

## It's still unsound even if the overlap is only "partial"

**Status: can't assume sound**

In this example, only *some* of the bytes go from covered by an `UnsafeCell` to not covered.

```rust
#[repr(C)]
struct Foo(Cell<u32>, u32);

let u = 1u64;
let u_ref = &u;
let _c_ref = unsafe { &*(u_ref as *const u64 as *const Foo) }; // Insta-UB
```

<details>
<summary>Miri output</summary>

```text
error: Undefined Behavior: trying to retag from <1716> for SharedReadWrite permission at alloc809[0x0], but that tag only grants SharedReadOnly permission for this location
 --> src/main.rs:9:27
  |
9 |     let _c_ref = unsafe { &*(u_ref as *const u64 as *const Foo) };
  |                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |                           |
  |                           trying to retag from <1716> for SharedReadWrite permission at alloc809[0x0], but that tag only grants SharedReadOnly permission for this location
  |                           this error occurs as part of retag at alloc809[0x0..0x8]
  |
  = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
  = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <1716> was created by a SharedReadOnly retag at offsets [0x0..0x8]
 --> src/main.rs:9:30
  |
9 |     let _c_ref = unsafe { &*(u_ref as *const u64 as *const Foo) };
  |                              ^^^^^
  = note: BACKTRACE (of the first span):
  = note: inside `main` at src/main.rs:9:27: 9:64
```

</details>

## It's no longer unsound if the `UnsafeCell` region is of zero size

**Status: *might* be able to guarantee sound**

In other words, it's not the mere presence of `UnsafeCell` that matters, but rather the bytes that are "covered" by `UnsafeCell`s. For this reason, ZST `UnsafeCell`s have no effect; this code is accepted by Miri:

```rust
// `Cell<()>` is a ZST
#[repr(C)]
struct Foo(Cell<()>, u64);

let u = 1u64;
let u_ref = &u;
let _c_ref = unsafe { &*(u_ref as *const u64 as *const Foo) };
```

## It *is* sound to have multiple references disagree about what *type* a memory region has so long as those references agree on `UnsafeCell`s

**Status: probably guaranteed sound**

For example, in both of the following cases, while the types are not the same, the byte ranges covered by `UnsafeCell`s are the same.

```rust
let u = 1usize;
let u_ref = &u;
// Convert from `&usize` to `&isize`. Neither type has `UnsafeCell`s.
let _i_ref = unsafe { &*(u_ref as *const usize as *const isize) };
    
let cu = Cell::new(1usize);
let cu_ref = &cu;
// Convert from `&Cell<usize>` to `&Cell<isize>`. Both types have
// `UnsafeCell`s covering the same byte ranges.
let _ci_ref = unsafe { &*(cu_ref as *const Cell<usize> as *const Cell<isize>) };
```

## It *is* sound to have multiple references disagree about `UnsafeCell`s so long as only one of them is "active"

**Status: guaranteed sound**

In this example, I believe that what's happening is that `u_mut` becomes inaccessible so long as `_c_mut` exists, so there's no way to exercise the `UnsafeCell` disagreement. Not only is this accepted by Miri in practice, it's also codified in the `UnsafeCell` methods [`get_mut`](https://doc.rust-lang.org/std/cell/struct.UnsafeCell.html#method.get_mut) (stable) and [`from_mut`](https://doc.rust-lang.org/std/cell/struct.UnsafeCell.html#method.from_mut) (unstable nightly feature).

```rust
let mut u = 1usize;
let u_mut = &mut u;
let _c_mut = unsafe { &mut *(u_mut as *mut usize as *mut Cell<usize>) };
```

## It *is* sound to read a non-`UnsafeCell` location as an `UnsafeCell` via `ptr::read`

**Status: unclear**

I'm not sure how to think about why this is sound, and whether it's guaranteed to remain sound.

```rust
let u = 1usize;
let u_ref = &u;
let _c = unsafe { ptr::read(u_ref as *const usize as *const Cell<usize>) };
```

## It *is* sound to read an `UnsafeCell` location as a non-`UnsafeCell` via `ptr::read`

**Status: unclear**

I'm even less sure how to think about this one - my intuition tells me that Miri should reject this, but it doesn't.

```rust
let c = Cell::new(1usize);
let c_ref = &c;
let _u = unsafe { ptr::read(c_ref as *const Cell<usize> as *const usize) };
```

---

# Guarantees

I would like to be able to rely on the following guarantees. **For each guarantee, my questions are**:
- Is this already documented somewhere as being guaranteed?
- If not, would it be acceptable (at least to T-opsem) to add documentation to guarantee it?

## Sound given `UnsafeCell` agreement

In particular: it is sound to create references to the same memory which disagree about type so long as they agree about which bytes are covered by `UnsafeCell`. It is sound to use such references to perform loads and stores (assuming those loads and stores do not violate the referent types' bit validity).



## These analyses only consider active references

For the purposes of the previous soundness guarantee, only "active" (i.e., non-shadowed) references are considered.\* *This allows us to prove that code which reborrows a `&mut T` (with no `UnsafeCell`s) as a `&mut U` (with `UnsafeCell`s) or vice versa (e.g., [`UnsafeCell::get_mut`](https://doc.rust-lang.org/std/cell/struct.UnsafeCell.html#method.get_mut)) is sound.*

*I am aware that the formal definition of "active-ness" is up in the air, and subject to models like stacked borrows and tree borrows. I am assuming that there is **some** way we can formally define borrows so that we can get some minimum guarantees about this section despite not having completely agreed on the full model.*

## It is sound to read a non-`UnsafeCell` location as an `UnsafeCell` via `ptr::read`

Given `&T` where `T` doesn't contain any `UnsafeCell`s, it is sound to convert `&T` to `*const U` (where `U` does contain `UnsafeCell`s) and use `ptr::read` to read a `U` out of that memory location (assuming that this doesn't violate `U`'s bit validity, read uninitialized bytes, read unaligned memory, etc etc).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Minimum guarantees regarding `UnsafeCell` #495

Constraints

It's unsound (under stacked borrows) to have multiple active references which "disagree" about whether a memory region is covered by an `UnsafeCell`

It's still unsound even if the overlap is only "partial"

It's no longer unsound if the `UnsafeCell` region is of zero size

It is sound to have multiple references disagree about what type a memory region has so long as those references agree on `UnsafeCell`s

It is sound to have multiple references disagree about `UnsafeCell`s so long as only one of them is "active"

It is sound to read a non-`UnsafeCell` location as an `UnsafeCell` via `ptr::read`

It is sound to read an `UnsafeCell` location as a non-`UnsafeCell` via `ptr::read`

Guarantees

Sound given `UnsafeCell` agreement

These analyses only consider active references

It is sound to read a non-`UnsafeCell` location as an `UnsafeCell` via `ptr::read`

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Participants

Minimum guarantees regarding UnsafeCell #495

Description

Constraints

It's unsound (under stacked borrows) to have multiple active references which "disagree" about whether a memory region is covered by an UnsafeCell

It's still unsound even if the overlap is only "partial"

It's no longer unsound if the UnsafeCell region is of zero size

It is sound to have multiple references disagree about what type a memory region has so long as those references agree on UnsafeCells

It is sound to have multiple references disagree about UnsafeCells so long as only one of them is "active"

It is sound to read a non-UnsafeCell location as an UnsafeCell via ptr::read

It is sound to read an UnsafeCell location as a non-UnsafeCell via ptr::read

Guarantees

Sound given UnsafeCell agreement

These analyses only consider active references

It is sound to read a non-UnsafeCell location as an UnsafeCell via ptr::read

Activity

CAD97 commented on Mar 6, 2024

joshlf commented on Mar 6, 2024

CAD97 commented on Mar 6, 2024

joshlf commented on Mar 6, 2024

CAD97 commented on Mar 6, 2024

joshlf commented on Mar 6, 2024

joshlf commented on Apr 8, 2024

RalfJung commented on Apr 9, 2024

RalfJung commented on Apr 21, 2024

CAD97 commented on Apr 21, 2024

RalfJung commented on Apr 22, 2024

joshlf commented on Apr 22, 2024

RalfJung commented on Apr 23, 2024

joshlf commented on Apr 23, 2024

saethlin commented on Apr 23, 2024

RalfJung commented on Apr 23, 2024

joshlf commented on Apr 23, 2024

RalfJung commented on Apr 23, 2024

RalfJung commented on Apr 23, 2024

joshlf commented on Sep 30, 2024

RalfJung commented on Sep 30, 2024

joshlf commented on Sep 30, 2024

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Participants

Issue actions

Minimum guarantees regarding `UnsafeCell` #495

It's unsound (under stacked borrows) to have multiple active references which "disagree" about whether a memory region is covered by an `UnsafeCell`

It's no longer unsound if the `UnsafeCell` region is of zero size

It is sound to have multiple references disagree about what type a memory region has so long as those references agree on `UnsafeCell`s

It is sound to have multiple references disagree about `UnsafeCell`s so long as only one of them is "active"

It is sound to read a non-`UnsafeCell` location as an `UnsafeCell` via `ptr::read`

It is sound to read an `UnsafeCell` location as a non-`UnsafeCell` via `ptr::read`

Sound given `UnsafeCell` agreement

It is sound to read a non-`UnsafeCell` location as an `UnsafeCell` via `ptr::read`