rustc performs auto-ref when a raw pointer would be enough #73987

Description

@RalfJung
Member

The following code:

#![feature(slice_ptr_len)]

pub struct Test {
    data: [u8],
}

pub fn test_len(t: *const Test) -> usize {
    unsafe { (*t).data.len() }
}

generates MIR like

        _2 = &((*_1).0: [u8]);
        _0 = const core::slice::<impl [u8]>::len(move _2) -> bb1;

This means that a reference to data gets created, even though a raw pointer would be enough. That is a problem because creating a reference makes aliasing and validity assumptions that could be avoided. It would be better if rustc did not implicitly introduce such assumptions.
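
For comparison, here is a minimal sketch of reaching the raw-pointer `len` by hand, without any reference being created. It assumes a toolchain on which `<*const [T]>::len` no longer needs the `slice_ptr_len` feature gate used above; `#[repr(transparent)]`, the name `test_len_raw`, and the pointer cast in `main` are additions made for this illustration and are not part of the original report.

```rust
use std::ptr::addr_of;

// `repr(transparent)` is an assumption of this sketch: it makes the cast from
// `*const [u8]` to `*const Test` in `main` layout-compatible.
#[repr(transparent)]
pub struct Test {
    data: [u8],
}

// Hypothetical variant of `test_len` that never leaves raw-pointer land.
pub fn test_len_raw(t: *const Test) -> usize {
    // `addr_of!` projects to the field as a place and yields `*const [u8]`;
    // no `&[u8]` is created along the way.
    // SAFETY (assumed by this sketch): `t` points to a live, aligned `Test`.
    let data: *const [u8] = unsafe { addr_of!((*t).data) };
    // The receiver is already a raw slice pointer, so this resolves to
    // `<*const [u8]>::len`, which only inspects the pointer metadata.
    data.len()
}

fn main() {
    let bytes = [1u8, 2, 3];
    let t = &bytes[..] as *const [u8] as *const Test;
    assert_eq!(test_len_raw(t), 3);
}
```

The explicit type annotation on `data` is there on purpose: it documents which `len` the method call is expected to pick.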

Cc @matthewjasper

Activity

Labels added on Jul 3, 2020: A-MIR, C-bug, T-compiler

SimonSapin (Contributor) commented on Jul 3, 2020

This is specifically about field projection of a raw pointer, right?

RalfJung (Member, Author) commented on Jul 3, 2020

I think so, yes. It is key that t starts out as a raw pointer.

SimonSapin (Contributor) commented on Jul 3, 2020

Oh I just realized something, and I think that the issue title and description are misleading. They make it sound like we’re calling <*const [u8]>::len(self), and in the process unnecessarily going through &[u8]. But the second line of MIR shows that the method called is actually <[u8]>::len(&self). On closer look that seems completely expected to me. The expression (*t).data by itself has type [u8], and method resolution ends up finding a result through auto-ref. But there is no equivalent to auto-ref for raw pointers. If we instead try to call a raw pointer method that doesn’t have a slice method of the same name, we get an error:

trait Foo {
    fn bar(self);
}

impl Foo for *const [u8] {
    fn bar(self) {}
}

pub struct Test {
    data: [u8],
}

pub fn test_len(t: *const Test) -> usize {
    unsafe { (*t).data.bar() }
}

Errors:

   Compiling playground v0.0.1 (/playground)
error[E0599]: no method named `bar` found for slice `[u8]` in the current scope
  --> src/lib.rs:14:24
   |
14 |     unsafe { (*t).data.bar() }
   |                        ^^^ method not found in `[u8]`
   |
   = help: items from traits can only be used if the trait is implemented and in scope
note: `Foo` defines an item `bar`, perhaps you need to implement it
  --> src/lib.rs:1:1
   |
1  | trait Foo {
   | ^^^^^^^^^

Another example without involving a struct field:

fn ptr_after<T>(x: &T) -> *const T {
    unsafe { (x as *const T).offset(1) }  // Ok (`offset` is an unsafe fn)
}

fn ptr_after2<T>(x: &T) -> *const T {
    x.offset(1)
}

Errors:

   Compiling playground v0.0.1 (/playground)
error[E0599]: no method named `offset` found for reference `&T` in the current scope
 --> src/lib.rs:6:7
  |
6 |     x.offset(1)
  |       ^^^^^^ method not found in `&T`

So I’d be inclined to call this not a bug.
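
For contrast, a sketch of how the `bar` call can be made to resolve: construct the `*const [u8]` explicitly, so the receiver already has the raw-pointer type and no auto-ref is involved. The return type on `bar`, `#[repr(transparent)]`, and the function `call_bar` are assumptions added for illustration; they are not part of the example above.

```rust
use std::ptr::addr_of;

trait Foo {
    // The return value is an addition of this sketch, so the call is observable.
    fn bar(self) -> &'static str;
}

impl Foo for *const [u8] {
    fn bar(self) -> &'static str {
        "called on *const [u8]"
    }
}

// `repr(transparent)` is assumed so that `main` can build a `*const Test`
// from a byte slice by a pointer cast.
#[repr(transparent)]
pub struct Test {
    data: [u8],
}

// Hypothetical counterpart of the failing `test_len` above.
pub fn call_bar(t: *const Test) -> &'static str {
    // Project to the field as a raw pointer first...
    // SAFETY (assumed by this sketch): `t` points to a live, aligned `Test`.
    let data: *const [u8] = unsafe { addr_of!((*t).data) };
    // ...then call the method on the pointer itself; no reference is created.
    data.bar()
}

fn main() {
    let bytes = [0u8; 4];
    let t = &bytes[..] as *const [u8] as *const Test;
    assert_eq!(call_bar(t), "called on *const [u8]");
}
```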

RalfJung (Member, Author) commented on Jul 3, 2020

> Oh I just realized something, and I think that the issue title and description are misleading. They make it sound like we’re calling <*const [u8]>::len(self), and in the process unnecessarily going through &[u8]. But the second line of MIR shows that the method called is actually <[u8]>::len(&self).

Yes, that is the problem. We should be calling the raw ptr method, but instead the compiler chooses to call the other method. That is what this bug is about. I am happy for suggestions for how to word this better. :)

Elsewhere you wrote:

> The example code in #73987 never involved a *const [u8] value at all. I’ve commented some more there.

The example code also doesn't involve a &[u8]. It just involves a [u8]. The issue is that the compiler chooses to introduce an &[u8] instead of introducing *const [u8]. Either choice works syntactically, but one makes way more assumptions, so we should be auto-introducing the one with fewer assumptions.

I am aware that the reason for this is auto-ref, and not having auto-raw-ptr. But that is IMO a big problem as it means it is actually very hard to call raw-self methods on paths -- and it is very easy to accidentally call the reference method instead.

RalfJung (Member, Author) commented on Jul 3, 2020

> If we instead try to call a raw pointer method that doesn’t have a slice method of the same name, we get an error:

Indeed. IMO we should not stabilize any raw ptr method where a reference method with the same name exists, until this bug is fixed. It's just too much of a footgun, with people accidentally calling the reference method instead of the raw ptr method.

SimonSapin (Contributor) commented on Jul 3, 2020

I’m not sure I agree that this is a bug in the first place. The language never had coercion from T to *const T in any context.

RalfJung (Member, Author) commented on Jul 3, 2020

Would you agree that it is a footgun, though?

I agree it is behavior as intended. I just don't think the intentions are fit for modern Rust any more -- after all, when this behavior was designed, there were no raw-self methods.

neocturne (Contributor) commented on Jul 3, 2020

In this particular example, the behaviour does not feel like a footgun to me: In my simplified mental model of the language, (*t) is already only valid when the usual aliasing and validity assumptions hold, even if these assumptions only actually need to hold when I do something with the result.

I would go as far as saying that having &raw const (*t).data as the supported way to get from the raw struct pointer to the raw field pointer is quite ugly because the code looks as if t is dereferenced - is there some nicer way to do this? Optimally, the test_len function in the original example shouldn't even need unsafe at all. (But I'm likely missing years of discussions on these topics, given that I've only recently taken an interest in Rust)

RalfJung (Member, Author) commented on Jul 4, 2020

> In my simplified mental model of the language, (*t) is already only valid when the usual aliasing and validity assumptions hold, even if these assumptions only actually need to hold when I do something with the result.

That is not the case though. *t (where t is a raw pointer) just requires t to be aligned and dereferenceable; creating a reference (& or &mut) makes a huge difference on top of that by also making aliasing assumptions.

(What you said is basically right when t is a reference, though.)

> I would go as far as saying that having &raw const (*t).data as the supported way to get from the raw struct pointer to the raw field pointer is quite ugly because the code looks as if t is dereferenced - is there some nicer way to do this?

Well, t does get dereferenced. No memory access happens, but all rules in the language that talk about pointers being dereferenced apply to *t, even when used in the context of &*t.

This is the same in C: &ptr->field is UB if ptr is dangling or unaligned.
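
A small sketch of that distinction, under the rules as described in this comment: `p` below is aligned and points into allocated memory, so `*p` may be used as a place, but no reference to the still-uninitialized fields is ever created. `Pair` and `init_pair` are hypothetical names chosen for the example.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

struct Pair {
    a: u32,
    b: u64,
}

// Hypothetical helper: initialize a `Pair` field by field through raw pointers.
fn init_pair(a: u32, b: u64) -> Pair {
    let mut slot = MaybeUninit::<Pair>::uninit();
    let p: *mut Pair = slot.as_mut_ptr();
    unsafe {
        // `(*p).a` is a place expression. Wrapping it in `addr_of_mut!` yields
        // a `*mut u32` without reading memory and without creating a `&mut`.
        addr_of_mut!((*p).a).write(a);
        addr_of_mut!((*p).b).write(b);
        // Both fields have been written, so the value is fully initialized.
        slot.assume_init()
    }
}

fn main() {
    let pair = init_pair(1, 2);
    assert_eq!((pair.a, pair.b), (1, 2));
}
```

Writing `&mut (*p).a` in place of `addr_of_mut!((*p).a)` would name the same place but also create a reference, which is the extra step this thread is concerned with.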

ssomers (Contributor) commented on Jul 15, 2020

Wearing an old hat, *t is not just dereferencing (for some definition) to me but how you get from a raw pointer back into the safe world. So I would expect (*t).data.len() to make all the assumptions it does. And to find in a back alley some notation like t + .data or &t->data to do pointer arithmetic, reading in the doc that pointer arithmetic is subject to the same pointer validation as dereferencing.

Wearing a newer hat, since unsafe {&raw const *t} and &raw const (*t).data exist, and don't dereference (as much as *t), it's much less clear to me what (*t).data.len() should do. Isn't quietly doing raw pointer access also a risk, leaving you unprotected by aliasing rules that you thought were being applied?

31 remaining items

RalfJung (Member, Author) commented on Jul 19, 2022

(As for why I included format_args!(), it can be easy to forget that it creates references even to Copy types; I could easily see a user writing println!("i32 field: {}", (*ptr).i32_field) and thinking that (*ptr).i32_field is a value expression, like it would be for regular function calls.)

If it were a value expression, then the ptr would be actually loaded from, so that would only have more UB. Therefore this is not a footgun.

The problem is writing code where you don't want a value expression, like addr_of_mut!((*(ptr))[..layout_size]), and then accidentally creating a reference with aliasing guarantees nonetheless.

LegionMammal978 (Contributor) commented on Jul 19, 2022

> If it were a value expression, then the ptr would be actually loaded from, so that would only have more UB. Therefore this is not a footgun.

Hmm, you're right about that, now that I think about it. At the very top of the borrow stack, it's only an issue when you hold on to the &mut longer than you need it. The transient &muts are mainly an issue in the middle of the borrow stack, where you end up with long-lived *mut <- &mut <- *mut patterns.

> The problem is writing code where you don't want a value expression, like addr_of_mut!((*(ptr))[..layout_size]), and then accidentally creating a reference with aliasing guarantees nonetheless.

So we agree that there's lots of ways to get implicit refs from place expressions, and this can cause issues with the aliasing and non-null restrictions. You seem to argue that we should lint on these, or change the semantics so they operate via pointer. But I think that we should keep place semantics as they are, and steer users away from writing (*ptr) places at all, unless they specifically want to access the value or reborrow as a reference.

Right now, the predominant case where (*ptr) places are necessary is in field projections such as addr_of_mut!((*ptr).field); this case could be eliminated with something like Gankra's path offset syntax ptr~field. The second-biggest case is probably your subslice case, which could be eliminated with the further extension ptr~[..layout_size], but that syntax is probably stretching it a bit. More conservatively, we could give clear examples in the docs of implementing the same behavior with #71146 + #74265.

If we could eliminate those two cases, then users could simply avoid (*ptr) places unless they want to access the value. Even though the mental model of "(*ptr) as inherently risky" would be somewhat inaccurate, it would be sufficient to prevent unexpected behavior through any of the implicit-ref operations.
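
As a rough sketch of the subslice case written without a `(*ptr)` place: the helper below rebuilds the raw slice from its data pointer and a new length. It swaps in `ptr::slice_from_raw_parts_mut` rather than the raw-slice methods tracked in #74265, so it does not depend on their stabilization status, and it assumes `<*mut [T]>::len` is usable without a feature gate. `raw_prefix` is a hypothetical name.

```rust
use std::ptr;

// Hypothetical helper: the first `n` elements of a raw slice, computed from
// the data pointer and the length only. No `&mut [u8]` is created here.
fn raw_prefix(slice: *mut [u8], n: usize) -> *mut [u8] {
    // `len` on a raw slice pointer reads the pointer metadata, not memory.
    assert!(n <= slice.len(), "prefix longer than slice");
    ptr::slice_from_raw_parts_mut(slice.cast::<u8>(), n)
}

fn main() {
    let mut buf = [0u8; 8];
    let whole: *mut [u8] = ptr::slice_from_raw_parts_mut(buf.as_mut_ptr(), buf.len());

    let head = raw_prefix(whole, 3);
    assert_eq!(head.len(), 3);

    // SAFETY: `head` points into `buf`, which is live and writable.
    unsafe { head.cast::<u8>().write(7) };
    assert_eq!(buf[0], 7);
}
```

By comparison, `addr_of_mut!((*ptr)[..n])` goes through the slice indexing operator, which takes its receiver by reference.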

RalfJung (Member, Author) commented on Jul 19, 2022

> steer users away from writing (*ptr) places at all

I mean, that'd be great, but it is a big change to the language -- much bigger than what I have the time for. So I dearly hope someone will pursue this. :) But meanwhile, I think there are smaller steps we can take that will help improve the situation, and those are the steps I am proposing.

And even once we reach that goal, many people will still write *ptr places, since that's how you do it in C. So we still need a plan for how to detect and lint against incorrect use of that pattern.

kornelski (Contributor) commented on Sep 9, 2022

I agree it's a trap, but at the same time if * magically temporarily preserved "raw-pointerness" of the value, then it would be inconsistent with:

let tmp = *ptr;
tmp.data.len();

I'm already spooked by &* cancelling each other in a special way, but at least in safe rust that's inconsequential.

So having a dedicated operator for a raw deref (->) or pointer offset (~) would be better: https://faultlore.com/blah/fix-rust-pointers/

RalfJung (Member, Author) commented on Sep 9, 2022

> it would be inconsistent with:

It's not inconsistent since these are completely different programs! Let's think more carefully about places and values to make sense of all this. We have to make place-to-value coercions explicit; I will use __load(place) to write the value expression that denotes the value stored in the place. In particular when x is a local variable, then x is a place expression denoting the address of the local variable, and __load(x) is a value expression denoting the value stored in that local variable. *value is a place expression and &place is a value expression. place.field is also a place expression. Aside from __load, none of these performs a memory access.

Your code then becomes

let tmp = __load(*__load(ptr));
len(&tmp.data);

whereas test_len from the OP becomes

pub fn test_len(t: *const Test) -> usize = unsafe {
    len(&(*__load(t)).data) // the & being inserted here is exactly the problem
}

IOW, by storing *ptr into a local, you are doing an extra __load, which makes a big difference for which kind of UB can happen where. It should not be surprising that when you do a __load from the raw pointer, the pointer must be valid. But in test_len we are never __loading from the pointer (we are just loading the pointer value itself, stored in the place t), so we should never assert any kind of validity.

> I'm already spooked by &* cancelling each other in a special way, but at least in safe rust that's inconsequential.

That is also entirely explained by a proper treatment of places and values. There's nothing special going on. MiniRust defines both & and * in a compositional modular way without special cases to get the right behavior (but so far it's not really in a state where it can serve as a tutorial for people not already versed in this kind of operational semantics).

I do agree that we could do a lot better teaching this place/value stuff.
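
The place/value distinction from this comment can be sketched in compilable Rust roughly as follows. `Header` and the three function names are hypothetical; the comments restate the `__load` reading given above and are not an authoritative description of the semantics.

```rust
use std::ptr::addr_of;

#[derive(Clone, Copy)]
struct Header {
    len: u32,
    flags: u32,
}

// Value use: `*p` is converted from a place to a value, so the whole
// `Header` is loaded from memory (the extra `__load` in the text above).
unsafe fn load_then_read(p: *const Header) -> u32 {
    let tmp = unsafe { *p };
    tmp.len
}

// Place use only: nothing is loaded through `p` and no reference is created;
// the result is a raw pointer to the field.
unsafe fn field_ptr(p: *const Header) -> *const u32 {
    unsafe { addr_of!((*p).len) }
}

// Same place, but `&` turns it into a reference. Auto-ref inserts this `&`
// implicitly when a `&self` method is called on the place.
unsafe fn field_ref<'a>(p: *const Header) -> &'a u32 {
    unsafe { &(*p).len }
}

fn main() {
    let h = Header { len: 4, flags: 1 };
    let p: *const Header = &h;
    unsafe {
        assert_eq!(load_then_read(p), 4);
        assert_eq!(field_ptr(p).read(), 4);
        assert_eq!(*field_ref(p), 4);
    }
    assert_eq!(h.flags, 1);
}
```

All three calls are fine here because `p` comes from a live, initialized `Header`; they differ in what each one requires of the pointer.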

kornelski (Contributor) commented on Sep 9, 2022

You're looking at this from a very low-level perspective, no doubt a technically correct one, but I mean it from a more lay perspective. Your view requires having a more low-level mental model of the language. For novice Rust users used to higher-level/GC languages it's already weird that x().foo() and let tmp = x(); tmp.foo() are semantically different, and this adds another such case.

So I don't think that having a special raw-pointer-temporary deref would solve the surprising behavior, it'd just move it around.

RalfJung (Member, Author) commented on Sep 9, 2022

If you are using raw pointers, then I don't think you can avoid learning about places and values. (Believe me, it can get a lot more low-level than that. ;)

The comparison with high-level/GC languages makes little sense since those languages don't have the features we are discussing here. For better or worse, Rust (like C and C++ but unlike, e.g., Java) is a place-based language, and it makes little sense to try and hide that fact from people that want to use low-level Rust features such as raw pointers.

adetaylor (Contributor) commented on Jan 27, 2025

#123239 seems to be a more recent PR trying to add a lint for some of these situations.

Labels: A-MIR, C-bug, F-arbitrary_self_types, T-compiler

Participants: @kornelski, @adetaylor, @SimonSapin, @RalfJung, @neocturne
