Mutable references vs self-referential structs #148

Open
Description

@RalfJung (Member)

Turns out that stacked borrows and self-referential generators don't go well together. This code fails in Miri:

#![feature(generators, generator_trait)]

use std::{
    ops::{Generator, GeneratorState},
    pin::Pin,
};

fn firstn() -> impl Generator<Yield = u64, Return = ()> {
    static move || {
        let mut num = 0;
        let num = &mut num;

        yield *num;
        *num += 1; //~ ERROR: borrow stack

        yield *num;
        *num += 1;

        yield *num;
        *num += 1;
    }
}

fn main() {
    let mut generator_iterator = firstn();
    let mut pin = unsafe { Pin::new_unchecked(&mut generator_iterator) };
    let mut sum = 0;
    while let GeneratorState::Yielded(x) = pin.as_mut().resume() {
        sum += x;
    }
    println!("{}", sum);
}

The reason it fails is that each time through the loop, we call Pin::as_mut, which creates a fresh mutable reference to the generator. Since mutable references are unique, that invalidates the pointer that points from one field of the generator to another.

This is basically a particularly bad instance of #133.

Cc @cramertj @tmandry @nikomatsakis @arielb1

Activity

added label A-aliasing-model (Topic: Related to the aliasing model, e.g. Stacked/Tree Borrows) on Jun 21, 2019
RalfJung (Member, Author) commented on Jun 23, 2019

In some sense there are two problems here:

  • When a Pin<&mut T> gets passed around, it gets retagged the same way as an &mut T, asserting uniqueness of this pointer for the entire memory range. This is probably not something we want. So maybe we need a "no retagging for private fields" kind of thing, similar to what we might need for protectors. This is the (relatively) easy part.
  • However, even just calling Pin::as_mut creates a mutable reference internally (before wrapping it in Pin again). So even if we fix the above, we'd still assert uniqueness here. In some sense it would be more correct if a Pin<&mut T> were actually represented as a NonNull<T>, as that's a more honest reflection of the aliasing situation: it's not a unique pointer, and there can be pointers inside this data structure that alias. This is the hard part.

If we still had PinMut as a separate type, this would be fixable, but with Pin<impl Deref>, that's hard. And even then, map_unchecked_mut still creates a mutable reference and even passes it as a function argument, which asserts uniqueness about as strongly as we can -- and which we don't want, since aliasing pointers are allowed to exist.

RalfJung (Member, Author) commented on Jun 25, 2019

If we ignore the fact that references are wrapped inside the Pin struct, self-referential generators do the equivalent of:

fn main() {
    let mut local = 42;
    let raw_ptr = &mut local as *mut i32; // create raw pointer
    let safe_ref = &mut local; // create a reference, which is unique, and hence invalidates the raw pointer
    println!("{}", unsafe { *raw_ptr }); // UB
}

I explicitly wanted that code to be UB. While raw pointers aliasing with each other is allowed, having a single raw pointer and a mutable reference acts pretty much like having two mutable references -- after all, between the two of them, it's enough if one side does not allow aliasing, since "aliases-with" is a symmetric relation. If we replace raw_ptr by another reference above, the code is rejected by both the borrow checker and Stacked Borrows.
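For contrast, here is a minimal sketch of an ordering that stays within the rule described above: the raw pointer is derived from the reference and is used only before that reference is used again. The function name `bump_twice` is made up for illustration and does not come from the original discussion.

```rust
/// Hypothetical helper: increments `*local` once through a raw pointer and
/// once through the reference that the raw pointer was derived from.
fn bump_twice(local: &mut i32) -> i32 {
    // Derive the raw pointer from the reference, so the reference is its parent.
    let raw_ptr = &mut *local as *mut i32;
    // The raw pointer is the most recently derived pointer, so using it is fine.
    unsafe { *raw_ptr += 1 };
    // Using the parent reference again re-asserts its uniqueness. From here on
    // `raw_ptr` must be considered invalid and is not used again.
    *local += 1;
    *local
}

fn main() {
    let mut local = 42;
    println!("{}", bump_twice(&mut local)); // prints 44
}
```

The difference from the UB example is only the order of derivation and use: here nothing touches the raw pointer after the reference has been used.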

So to fix this problem, I see two general options:

  1. "Hide" references in struct fields from Stacked Borrows. Then the implementation of Pin methods would have to do lots of transmute to avoid ever creating a "bare" reference. And map_unchecked_mut has a problem: it should probably use raw pointers (and the operator proposed in rfcs#2582, "RFC for an operator to take a raw reference", to get a field address). On the other hand, at least we have a plan.
  2. Allow the example code above. We would have to somehow "lazily" activate mutable references. It's a bit like two-phase borrows but worse. I don't have a very concrete idea what this would look like, but I think Stacked Borrows would have to become Tree-shaped Borrows or so -- a stack just does not have enough structure.
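To make the "raw pointers plus a field address" part of option 1 concrete, here is a small sketch using the standard library's `std::ptr::addr_of_mut!` macro, which takes a field address without creating an intermediate reference. The `Pair` type and the `bump_second` function are hypothetical and only illustrate the shape of such a projection; this is not how Pin is implemented.

```rust
use std::ptr::addr_of_mut;

// Hypothetical struct used only for this illustration.
struct Pair {
    first: u32,
    second: u32,
}

/// Hypothetical projection helper: bumps `second` given only a raw pointer to
/// the whole struct, without ever creating a `&mut Pair`.
///
/// # Safety
/// `this` must point to a valid, initialized `Pair`, and no reference may be
/// used to access that `Pair` while this call runs.
unsafe fn bump_second(this: *mut Pair) -> u32 {
    unsafe {
        // Compute the field address directly from the raw place expression.
        let second: *mut u32 = addr_of_mut!((*this).second);
        *second += 1;
        *second
    }
}

fn main() {
    let mut pair = Pair { first: 1, second: 2 };
    let raw: *mut Pair = &mut pair;
    let updated = unsafe { bump_second(raw) };
    println!("{} {}", pair.first, updated); // prints "1 3"
}
```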
HadrienG2 commented on Jun 29, 2019

@RalfJung: Could this be resolved by turning Pin<&'a mut T> into some abstraction around (&'a UnsafeCell<T>; PhantomData<&'a mut T>) that does not actually hold a reference, but only behaves like one and spawns actual references on demand?

This would likely require turning Pin into Compiler Magic(tm) that is hard to implement for user-defined pointer types, though, and I can imagine backwards-incompatible implications in other areas such as changing the safety contract for extracting mutable references from Pin...

EDIT: Ah, I see that you explored a NonNull-based variant of this strategy above.

RalfJung (Member, Author) commented on Jun 29, 2019

You don't need UnsafeCell, *mut T would do it. And yes that's basically what I had in mind with option (1).
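A rough sketch of what such a raw-pointer-backed wrapper could look like, assuming a made-up type `RawPinMut` (it is not a standard-library type and ignores the pinning guarantees entirely). It stores a `NonNull<T>` plus a `PhantomData<&'a mut T>` for the lifetime, and only spawns a real reference on demand.

```rust
use std::marker::PhantomData;
use std::ptr::NonNull;

/// Hypothetical stand-in for `Pin<&'a mut T>`; not a real standard-library type.
struct RawPinMut<'a, T> {
    ptr: NonNull<T>,
    // Carries the lifetime of `&'a mut T` without storing an actual reference.
    _borrow: PhantomData<&'a mut T>,
}

impl<'a, T> RawPinMut<'a, T> {
    fn new(target: &'a mut T) -> Self {
        RawPinMut { ptr: NonNull::from(target), _borrow: PhantomData }
    }

    /// Reborrow for a shorter lifetime, loosely analogous to `Pin::as_mut`.
    /// Only the raw pointer is copied; no `&mut T` is created here.
    fn as_mut(&mut self) -> RawPinMut<'_, T> {
        RawPinMut { ptr: self.ptr, _borrow: PhantomData }
    }

    /// Spawn an actual reference on demand.
    ///
    /// # Safety
    /// No other pointer may be used to access the target while the returned
    /// reference is live.
    unsafe fn get_mut(&mut self) -> &mut T {
        unsafe { self.ptr.as_mut() }
    }
}

fn main() {
    let mut value = 1u32;
    let mut pin = RawPinMut::new(&mut value);
    for _ in 0..3 {
        let mut reborrowed = pin.as_mut();
        unsafe { *reborrowed.get_mut() += 1 };
    }
    println!("{}", value); // prints 4
}
```

The point of the design is that passing the wrapper around or calling `as_mut` copies a raw pointer, so uniqueness is only asserted where `get_mut` is explicitly called.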

CAD97 (Contributor) commented on Jun 29, 2019

The way I understand how Pin works, the self referential references borrow their validity from the one in the pin. So under SB, they're retagged from the pin.

The most minimal change that would make Pin play nicely with SB would, I think, be to somehow keep the interior references valid even while the unsafe map_unchecked_mut reference is used.

Could it be possible to not retag for pin's map_unchecked_mut and only pop tags when the reference is used to access that memory location? (This is super spitball, sorry)

RalfJung (Member, Author) commented on Jun 29, 2019

> The way I understand how Pin works, the self referential references borrow their validity from the one in the pin. So under SB, they're retagged from the pin.

Correct.

However, then the pin itself eventually gets retagged, and that kills the self-referential borrows. This currently happens all the time, but could be reduced to just "inside the Pin implementation" if we make retag respect privacy, or just generally not enter structs, or so.

> Could it be possible to not retag for pin's map_unchecked_mut

Well it'd need a magic marker or so.

> and only pop tags when the reference is used to access that memory location? (This is super spitball, sorry)

That's basically option (2) from above.

changed the title from "Stacked Borrows vs self-referential generators" to "Stacked Borrows vs self-referential structs" on Aug 13, 2019
RalfJung (Member, Author) commented on Aug 13, 2019

#194 indicates that this is a problem with self-referential structs in general, not just self-referential generators. That's not surprising, so I generalized the issue title.

Aaron1011 (Member) commented on Aug 22, 2019

@RalfJung: Related to your example of:

fn main() {
    let mut local = 42;
    let raw_ptr = &mut local as *mut i32; // create raw pointer
    let safe_ref = &mut local; // create a reference, which is unique, and hence invalidates the raw pointer
    println!("{}", unsafe { *raw_ptr }); // UB
}

While working on pin-project, I wanted to write this code:

use std::pin::Pin;

struct Foo {
    // #[pin]
    field1: u8,
    field2: bool
}

struct FooProj<'a> {
    __self_ptr: *mut Foo,
    field1: Pin<&'a mut u8>,
    field2: &'a mut bool
}

impl Foo {
    fn project<'a>(self: Pin<&'a mut Self>) -> FooProj<'a> {
        let this = unsafe { self.get_unchecked_mut() };
        let __self_ptr: *mut Foo = this;
        let field1  = unsafe { Pin::new_unchecked(&mut this.field1) };
        let field2 = &mut this.field2;
        FooProj {
            __self_ptr,
            field1,
            field2
        }
    }
}

impl<'a> FooProj<'a> {
    fn unproject(self) -> Pin<&'a mut Foo> {
        unsafe {
            let this: &mut Foo = &mut *self.__self_ptr;
            Pin::new_unchecked(this)
        }
    }
}

fn main() {
    let mut foo = Foo { field1: 25, field2: true };
    let foo_pin: Pin<&mut Foo> = Pin::new(&mut foo);
    let foo_proj: FooProj = foo_pin.project();
    let foo_orig: Pin<&mut Foo> = foo_proj.unproject();
}

The key thing here is the unproject method. Basically, I want to be able to do the following:

  1. Convert an &mut Self to a *mut Self
  2. Create mutable references to fields of Self (&mut self.field, etc.)
  3. Later, upgrade the *mut Self to a &mut Self, after all of the mutable field references have gone out of scope.

However, Miri flags this as UB for (I believe) a similar reason as your example: creating a mutable reference to a field ends up transitively asserting that we have unique ownership of the base type. Therefore, any pre-existing raw pointers to the base type will be invalidated.
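For illustration, here is a sketch of the three steps confined to a single function, with one deliberate change from the code above: the field references are derived from the raw pointer rather than from the original `&mut`, so that creating them does not go through the parent of the raw pointer. The helper name `update` is hypothetical. This ordering is expected to be compatible with the stack discipline described earlier in the thread, but it has not been checked under Miri here, and it does not show that the same holds once the pieces are split across `project` and `unproject`.

```rust
struct Foo {
    field1: u8,
    field2: bool,
}

/// Hypothetical helper that follows steps 1-3 inside one function.
fn update(foo: &mut Foo) -> u8 {
    // Step 1: convert the `&mut Foo` to a `*mut Foo`.
    let base: *mut Foo = foo;
    {
        // Step 2: create mutable references to the fields, derived from the
        // raw pointer (assumption of this sketch) rather than from `foo`.
        let field1: &mut u8 = unsafe { &mut (*base).field1 };
        let field2: &mut bool = unsafe { &mut (*base).field2 };
        *field1 += 1;
        *field2 = !*field2;
    } // the field references go out of scope here

    // Step 3: upgrade the raw pointer back to a reference to the whole struct.
    let whole: &mut Foo = unsafe { &mut *base };
    whole.field1 += 10;
    whole.field1
}

fn main() {
    let mut foo = Foo { field1: 25, field2: true };
    println!("{}", update(&mut foo)); // prints 36
    println!("{}", foo.field2); // prints false
}
```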

Allowing this kind of pattern would make pin-project much more useful. It would allow the following pattern:

impl MyType {
    fn process(self: Pin<&mut Self>) {
        let this = self.project();

        // Use a pin-projected field - e.g. poll a wrapped future
        ...

        // Construct a new instance of MyType - e.g. a new enum variant
        let new_foo: MyType = ...;
        // Overwrite ourself using Pin::set (which takes `&mut self`, hence `mut`)
        let mut new_self: Pin<&mut Self> = this.unproject();
        new_self.set(new_foo);
    }
}

This is especially useful when working with enums - here's a real-world example in Hyper.

However, I can't see a way to avoid creating and later 'upgrading' a raw pointer when trying to safely abstract this pattern in pin-project.


Labels: A-aliasing-model (Topic: Related to the aliasing model, e.g. Stacked/Tree Borrows), C-open-question (Category: An open question that we should revisit), S-pending-design (Status: Resolving this issue requires addressing some open design questions)

Participants: @mitsuhiko, @Nemo157, @RalfJung, @thomcc, @Darksonn

Mutable references vs self-referential structs · Issue #148 · rust-lang/unsafe-code-guidelines