-
Notifications
You must be signed in to change notification settings - Fork 62
Description
As part of google/zerocopy#251, we're trying to teach zerocopy to reason in more precise ways about UnsafeCell
s. I'm making this issue as sort of an omnibus issue to clarify a number of questions regarding UnsafeCell
s. As always with zerocopy, I'm only interested in what can currently be promised, not what might be promised in the future.
In particular, my questions concern when it's sound for different code to disagree on whether a given memory region is covered by an UnsafeCell
.
Constraints
Based on Miri's behavior and APIs that already exist in the standard library, we can observe some constraints on what code is sound/unsound. In particular:
- Any time Miri treats code as unsound, I take that to mean that we cannot guarantee that it is sound even if it might be decided to be sound at some point in the future
- Just because Miri treats code as sound, it may not be guaranteed to be sound
- If a stable API exists which implies that code is sound, then it probably is guaranteed to be sound
It's unsound (under stacked borrows) to have multiple active references which "disagree" about whether a memory region is covered by an UnsafeCell
Status: can't assume sound
As soon as there are multiple references which are "active"* or "not shadowed" or whatever the proper terminology is, and these references disagree about whether a given memory region is covered by an UnsafeCell
, this is insta-UB according to stacked borrows even if the references are never used for anything. For example:
let u = 1usize;
let u_ref = &u;
let _c_ref = unsafe { &*(u_ref as *const usize as *const Cell<usize>) }; // Insta-UB
*Previously, this post used the term "live", but @CAD97 pointed out that "active" is the more accurate term.
Miri output
error: Undefined Behavior: trying to retag from <1716> for SharedReadWrite permission at alloc809[0x0], but that tag only grants SharedReadOnly permission for this location
--> src/main.rs:6:27
|
6 | let _c_ref = unsafe { &*(u_ref as *const usize as *const Cell<usize>) };
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| |
| trying to retag from <1716> for SharedReadWrite permission at alloc809[0x0], but that tag only grants SharedReadOnly permission for this location
| this error occurs as part of retag at alloc809[0x0..0x8]
|
= help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
= help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <1716> was created by a SharedReadOnly retag at offsets [0x0..0x8]
--> src/main.rs:6:30
|
6 | let _c_ref = unsafe { &*(u_ref as *const usize as *const Cell<usize>) };
| ^^^^^
= note: BACKTRACE (of the first span):
= note: inside `main` at src/main.rs:6:27: 6:74
It's still unsound even if the overlap is only "partial"
Status: can't assume sound
In this example, only some of the bytes go from covered by an UnsafeCell
to not covered.
#[repr(C)]
struct Foo(Cell<u32>, u32);
let u = 1u64;
let u_ref = &u;
let _c_ref = unsafe { &*(u_ref as *const u64 as *const Foo) }; // Insta-UB
Miri output
error: Undefined Behavior: trying to retag from <1716> for SharedReadWrite permission at alloc809[0x0], but that tag only grants SharedReadOnly permission for this location
--> src/main.rs:9:27
|
9 | let _c_ref = unsafe { &*(u_ref as *const u64 as *const Foo) };
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| |
| trying to retag from <1716> for SharedReadWrite permission at alloc809[0x0], but that tag only grants SharedReadOnly permission for this location
| this error occurs as part of retag at alloc809[0x0..0x8]
|
= help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
= help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <1716> was created by a SharedReadOnly retag at offsets [0x0..0x8]
--> src/main.rs:9:30
|
9 | let _c_ref = unsafe { &*(u_ref as *const u64 as *const Foo) };
| ^^^^^
= note: BACKTRACE (of the first span):
= note: inside `main` at src/main.rs:9:27: 9:64
It's no longer unsound if the UnsafeCell
region is of zero size
Status: might be able to guarantee sound
In other words, it's not the mere presence of UnsafeCell
that matters, but rather the bytes that are "covered" by UnsafeCell
s. For this reason, ZST UnsafeCell
s have no effect; this code is accepted by Miri:
// `Cell<()>` is a ZST
#[repr(C)]
struct Foo(Cell<()>, u64);
let u = 1u64;
let u_ref = &u;
let _c_ref = unsafe { &*(u_ref as *const u64 as *const Foo) };
It is sound to have multiple references disagree about what type a memory region has so long as those references agree on UnsafeCell
s
Status: probably guaranteed sound
For example, in both of the following cases, while the types are not the same, the byte ranges covered by UnsafeCell
s are the same.
let u = 1usize;
let u_ref = &u;
// Convert from `&usize` to `&isize`. Neither type has `UnsafeCell`s.
let _i_ref = unsafe { &*(u_ref as *const usize as *const isize) };
let cu = Cell::new(1usize);
let cu_ref = &cu;
// Convert from `&Cell<usize>` to `&Cell<isize>`. Both types have
// `UnsafeCell`s covering the same byte ranges.
let _ci_ref = unsafe { &*(cu_ref as *const Cell<usize> as *const Cell<isize>) };
It is sound to have multiple references disagree about UnsafeCell
s so long as only one of them is "active"
Status: guaranteed sound
In this example, I believe that what's happening is that u_mut
becomes inaccessible so long as _c_mut
exists, so there's no way to exercise the UnsafeCell
disagreement. Not only is this accepted by Miri in practice, it's also codified in the UnsafeCell
methods get_mut
(stable) and from_mut
(unstable nightly feature).
let mut u = 1usize;
let u_mut = &mut u;
let _c_mut = unsafe { &mut *(u_mut as *mut usize as *mut Cell<usize>) };
It is sound to read a non-UnsafeCell
location as an UnsafeCell
via ptr::read
Status: unclear
I'm not sure how to think about why this is sound, and whether it's guaranteed to remain sound.
let u = 1usize;
let u_ref = &u;
let _c = unsafe { ptr::read(u_ref as *const usize as *const Cell<usize>) };
It is sound to read an UnsafeCell
location as a non-UnsafeCell
via ptr::read
Status: unclear
I'm even less sure how to think about this one - my intuition tells me that Miri should reject this, but it doesn't.
let c = Cell::new(1usize);
let c_ref = &c;
let _u = unsafe { ptr::read(c_ref as *const Cell<usize> as *const usize) };
Guarantees
I would like to be able to rely on the following guarantees. For each guarantee, my questions are:
- Is this already documented somewhere as being guaranteed?
- If not, would it be acceptable (at least to T-opsem) to add documentation to guarantee it?
Sound given UnsafeCell
agreement
In particular: it is sound to create references to the same memory which disagree about type so long as they agree about which bytes are covered by UnsafeCell
. It is sound to use such references to perform loads and stores (assuming those loads and stores do not violate the referent types' bit validity).
These analyses only consider active references
For the purposes of the previous soundness guarantee, only "active" (i.e., non-shadowed) references are considered.* This allows us to prove that code which reborrows a &mut T
(with no UnsafeCell
s) as a &mut U
(with UnsafeCell
s) or vice versa (e.g., UnsafeCell::get_mut
) is sound.
I am aware that the formal definition of "active-ness" is up in the air, and subject to models like stacked borrows and tree borrows. I am assuming that there is some way we can formally define borrows so that we can get some minimum guarantees about this section despite not having completely agreed on the full model.
It is sound to read a non-UnsafeCell
location as an UnsafeCell
via ptr::read
Given &T
where T
doesn't contain any UnsafeCell
s, it is sound to convert &T
to *const U
(where U
does contain UnsafeCell
s) and use ptr::read
to read a U
out of that memory location (assuming that this doesn't violate U
's bit validity, read uninitialized bytes, read unaligned memory, etc etc).
Activity
CAD97 commentedon Mar 6, 2024
Note: I think you're using "live" when you actually mean something closer to "active," which obfuscates the point you're trying to make. "Live" has a specific meaning used by the NLL RFC; specifically, that the reference is potentially read from read through in the future; that the borrow region/lifetime is live.
Thus in both the
&T as &Cell<T>
and&mut T as &Cell<T>
cases, the parent borrow is still "live".&T
remains "active" while derived borrows are "live" (dereferencing it does not end the liveness of derived borrows), whereas&mut T
is not "active" while derived borrows are "live" (dereferencing it ends the liveness of derived borrows).The unifying rule is that a byte may be borrowed as
UnsafeCell
only if the borrow of the byte is permitted to mutate the byte.As for reads, I'm reasonably sure we've documented somewhere that
transmute
/transmute_copy
/ptr.cast().read()
only requires layout/value compatibility (thusUnsafeCell
is completely irrelevant).I do think all cases of ref casting used in std amount to
#[repr(transparent)]
compatibility (if we treatstr
as a transparent newtype around[u8]
, which it essentially is except for compile perf benefit). Ref casting between#[repr(transparent)]
compatible types is essentially in the guaranteed sound category, as it's how zero copy conversions between&str
/&CStr
/&Path
/&[u8]
work. Other layout compatibility doesn't have a "soundness witness" for ref casting in the std API/documentation, as far as I'm aware.joshlf commentedon Mar 6, 2024
Yep, sounds like I mean "active"! I've updated the original text to use that term.
Do you know if a rule like this is documented anywhere normative (ie, not in not-yet-accepted proposals like stacked borrows/tree borrows)? IIUC, you're using the term "permitted" in the same sense that Miri uses it when tracking "permissions" on an allocation (e.g., treating
&UnsafeCell<T>
as havingSharedReadWrite
permission). I'm not aware of a term like that being used in official Rust documentation.Also, is this an "if and only if" or merely an "only if"? In other words, is it the case that "I am permitted to mutate this byte" is sufficient to guarantee that "I am allowed to borrow this byte as
UnsafeCell
"?Finally, one concern with this rule: Doesn't this imply that going from
&UnsafeCell<T>
to&T
is sound since the initial&UnsafeCell<T>
is permitted to mutate its referent (no problems so far), and the subsequent re-borrow as&T
doesn't have anyUnsafeCell
s, and so your rule doesn't have anything to say about it? This seems problematic - it would mean that I could, for example, synthesize a&usize
that refers to the inner contents of anAtomicUsize
and then access it while another thread performs atomic store operations on it.My concern with this case is that it seems possible to make this unsound. For example, imagine that I
ptr::read
from theUnsafeCell
inside anAtomicUsize
while another thread is performing atomic store operations on it. Isn't that UB? Should I read your statement as implying that this shouldn't be UB?CAD97 commentedon Mar 6, 2024
Here I'm use the term mostly informally to include both mutable references and shared references to what the
UnsafeCell
docs call "types that allow internal mutability." I'm also trying to be careful enough that I don't mis-stateUnsafeCell
-adjacency as one way or the other, since that's still formally undecided.I carefully worded my rule to only apply to .
as &UnsafeCell<T>
cases, and only to define a set where it is unambiguously unsound, i.e. the only logical implication I'm asserting as demonstrably true isThe informal implication is only that
&UnsafeCell<T> -> &T
can be sound, andRefCell
is an API witness to that being true. (Strictly speaking,Mutex
isn't, since it could utilize allocated indirection instead ofUnsafeCell
.) If external synchronization of anUnsafeCell
isn't consistent, that's obviously still unsound.That's defined as UB, because it causes a data race, and data races are defined as UB. The exact same as if there were never a reference created, one thread did
ptr::read
, and another didptr::write
. That the only way to stably do atomic writes is via&AtomicX
is irrelevant, as all that matters for this to be UB is that it's a data race.joshlf commentedon Mar 6, 2024
Ah I see. Unfortunately what we need is some set of behavior which is guaranteed sound (so that we can use it as the precondition of a safety proof).
To make this more concrete: If someone could demonstrate a proof that
get_mut
andfrom_mut
are sound, that would be very helpful. Our use case is basically just doing the same thing that those methods do, except with raw pointers (and so we can't literally use those methods). Given a proof of soundness for those methods, we could nearly copy-paste it into our code.Alternatively, if it's not possible to write such a proof (because the necessary preconditions aren't guaranteed by the Rust Reference or by the standard library documentation), we'd like to know what the gaps are so we can work on adding them as guarantees to the language.
CAD97 commentedon Mar 6, 2024
Note that the std docs explicitly call out that it is considered unsound to project from shared
UnsafeCell<T>
to uniqueT
by pointer cast in user code, and the use ofUnsafeCell::raw_get
is required.That could imply that the same holds for
from_mut
.More generally speaking, std API can only be witness to that something is possible, but std could always have magical instructions not replicable by user code.
joshlf commentedon Mar 6, 2024
Definitely agreed. But do you believe that it's not possible to prove sound given any of the other guarantees provided by the language? If so, would opsem be comfortable adding the necessary guarantees to make it possible to complete such a proof?
joshlf commentedon Apr 8, 2024
cc @RalfJung ; I assume you may have thoughts here
RalfJung commentedon Apr 9, 2024
Yeah I may; I also have quite a big backlog of Rust things so I can't say when I'll have time to take a look at this thread.
NoCell
bound ontransmute_mut!
google/zerocopy#1049RalfJung commentedon Apr 21, 2024
Note sure if that was a good idea, it's usually better to keep one-topic-per-thread/issue...
I consider that a Stacked Borrows bug that must be fixed for the official aliasing model. Mixing
&T
and&UnsafeCell<T>
references to the same memory should be entirely fine, as long as there is no mutation while any&T
is live/active (orT
itself has interior mutability).Rust does not do C-style TBAA. So yes I consider that pretty much settled. (Not just for UnsafeCell but for all references.)
Note on that example:
_c_mut
is derived fromu_mut
. That matters a lot. Having a mutable reference and any other reference active at the same time is unsound unless they are in a parent-child relationship. (In Tree Borrows terms: mutable references get invalidated on foreign writes, but not on child writes.)You're only using the UnsafeCell by-value here, so
T
andUnsafeCell<T>
are entirely equivalent.Same as above.
That's basically "Rust has no TBAA except for its reference guarantees". I don't know if this is officially documented somewhere. I don't even know how to document this. A list of "all the UB Rust does not have" could become quite long. ;)
Note that on top of what you said here, it is obviously also crucial to not mutate any of the parts that an active shared reference points to outside of an UnsafeCell! And you cannot violate uniqueness of
&mut
references. And data race rules apply as well. Your accesses also still have to be in-bounds and the allocation cannot be freed -- all the usual. (Some of your later questions here seemed related to a lack of stating all these requirements here).I am confused by your / @CAD97's use of the term "active" here. I would say a mutable reference that has been reborrowed is still active. It just has a child now. It becomes inactive when it gets invalidated y a conflicting access.
I would say that falls out of the fact that
UnsafeCell
isrepr(transparent)
.They are trivially sound because
UnsafeCell
only makes a difference under&
. So&mut UnsafeCell<T>
and&mut T
are identical types -- same safety invariant, same validity invariant. (They are only identical in owned position though, not in shared position.)CAD97 commentedon Apr 21, 2024
On "active:" terms are hard, and there aren't enough words. NLL uses "live" for roughly "potential access later in the CFG" and TB uses "active" for roughly "hasn't been invalidated, would be DB to write through." There isn't a good word for the stronger property of "leaf active," roughly "spurious operations are acceptable and don't change the tag state."
In any case, while the disclaimer stating that
UnsafeCell::raw_get
is required remains, std is saying that you mustn't "project into"*const UnsafeCell<T>
except by calling theUnsafeCell
API. The other direction of&mut T
to&mut UnsafeCell<T>
is explicitly allowed by that same documentation.RalfJung commentedon Apr 22, 2024
I also didn't really need such a word before, so this never seemed like a problem to me.
For the "disagree about where the UnsafeCell are" question, I think "active (including possibly with children)" is the notion we want, isn't it?
Though I think I'd put it more precisely as: it is UB to derive
&U
from&T
whenU
hasUnsafeCell
anywhere thatT
does not. That better reflects the fact that this is asymmetric -- deriving&T
from&UnsafeCell<T>
is fine (though only viaUnsafeCell::get
/get_mut
), but the opposite direction is not. IOW, when discussing these aliasing questions we must also always consider the parent/child relationships of the pointers involved. (Tree Borrows makes this very explicit, but Stacked Borrows largely works on that principle as well.)Also worth noting that this is a form of library UB -- it currently seems very likely that there will be no language UB from side-stepping
get
/raw_get
.joshlf commentedon Apr 22, 2024
Amazing, thank you for this! It'll take me some time to wrap my head around everything, but this is very helpful.
One thing that I would like to get clarity on, since it seems like the answer is "this is sound", but I'm not entirely sure that we all agree on why it's sound, or on how to prove that it's sound: See the following code:
Three questions:
// SAFETY: TODO
safety proof in such a way that it only relies on axioms documented by the Reference or standard library documentation?At the end of the day, what we're really after is the text of that safety proof. If we can't write it, then we can't commit the relevant code to zerocopy.
RalfJung commentedon Apr 23, 2024
This is sufficient to avoid immediate UB, but not sufficient for soundness. Types with the same layout and bit validity can still have different safety invariants (e.g.,
String
andVec<u8>
, or[u8]
andstr
).But other than that, isn't your function a straight-forward composition of
UnsafeCell::get_mut
and a "transmute between&mut T
and&mut U
when their types are equivalent"? The only tricky part is how to express when that transmute is allowed, but that question seems entirely unrelated toUnsafeCell
.You don't have to justify the soundness of
UnsafeCell::get_mut
, it is given to you as an axiom from the standard library. The justification isn't complicated, though --UnsafeCell
only makes a difference below&
, but there's no&
here, therefore the presence ofUnsafeCell
is entirely moot.joshlf commentedon Apr 23, 2024
Yeah, I'm ignoring safety invariants here to keep the question simple.
I agree that this is in principle straightforward, but:
UnsafeCell::get_mut
exists isn't a guarantee that we're allowed to do the same thing.UnsafeCell
only makes a difference below&
"?I know I'm being incredibly pedantic, but my point with this exercise is that we're trying to write a soundness proof that is correct only in virtue of what's documented to the public, not in virtue of what is intended to be documented in the future, what's commonly understood to be true, etc. Often, when we go through these exercises, we discover that things that are "obviously true" are not actually documented anywhere.
If you were writing that safety comment, what would you write? My hope in asking is to discover whether sufficient axioms are already provided by the documentation, or whether we need to add things.
saethlin commentedon Apr 23, 2024
It's not lots, and this is being fixed rust-lang/rust#124251. Considering how often this example is trotted out, I actually wonder if people will have another similar example after that PR is merged.
There's reason to think we will eliminate as many of these as possible over time. The sort of implicit reliance makes it hard to detect when third-party code is accidentally relying on such details.
Note that before fat pointer layout, the example everyone used was uninitialized arrays of
u8
, and that was also fixed.RalfJung commentedon Apr 23, 2024
joshlf commentedon Apr 23, 2024
Since the function is
unsafe
, presumably it's on the caller to not violate safety invariants after calling the function? IIUC we can still talk coherently about the soundness of the function itself without needing to think about the safety invariants ofT
andU
.RalfJung commentedon Apr 23, 2024
RalfJung commentedon Apr 23, 2024
Immutable
semantics google/zerocopy#1155joshlf commentedon Sep 30, 2024
Hey, sorry for such long radio silence on this! I was traveling to a work conference and had to go into the conference right as you sent you last message, then lost all mental context and it fell off my radar.
I think that I understand the model better now, specifically that (in the absence of TBAA):
Ordering
), but is agnostic to typesUnsafeCells
weaken shared references to not assume immutabilityUnsafeCell
agreement" has nothing to do with TBAA*. Instead, it’s just about whether we can guarantee that code isn’t going to violate the "won't be modified out from under me" invariant of shared references; this is entirely a runtime property, and so exclusive references definitionally don’t care aboutUnsafeCell
s - their semantics is completely unaffected by it*Except for in Stacked Borrows, where it’s considered a bug
RalfJung commentedon Sep 30, 2024
That does sound right.
So... what is the remaining thing to do in this issue? It began with a list of questions, some subset of which has been answered. I don't know if there is anything actionable that's left to do.
joshlf commentedon Sep 30, 2024
I believe all of the outstanding questions have been resolved! Thanks so much for spending the time to help me work through all of this. It's been incredibly helpful!