Moving #[rustc_box] to move_val_init intrinsic #110715

Closed
@est31

Description

@est31

I have opened this issue to discuss how to move #[rustc_box] (introduced by #97293) to something simpler. My original proposal (reproduced below) was to use builtin # syntax for it, but during the discussion it was brought up that there used to be a move_val_init intrinsic, which would have been a good fit. Initial tests, both with a builtin # ptr_write and with move_val_init directly, showed promising results: it can deliver the same codegen as #[rustc_box], which means that one could switch the vec![] macro from using #[rustc_box] to Box::new_uninit_slice, then writing into the pointer with move_val_init, then calling assume_init on it.

Right now, #[rustc_box] desugars to THIR ExprKind::Box, which desugars to code calling exchange_malloc, followed by ShallowInitBox, a construct specifically added to support box syntax, and then followed by something equivalent to move_val_init: a write into the pointer without putting the value into a local first.
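The three steps described above (allocate, turn the raw pointer into a box, write the value) can be approximated on stable Rust. The following is only a rough sketch: box_by_hand is a hypothetical name, std::alloc::alloc stands in for exchange_malloc, and Box::from_raw after a ptr::write stands in for ShallowInitBox followed by the write, so the order of the last two steps differs from the real lowering. An ordinary function call like this gives no guarantee that the value avoids a stack temporary.

```rust
use std::alloc::{alloc, handle_alloc_error, Layout};
use std::ptr::{self, NonNull};

/// Hypothetical stand-in for the lowering steps; this is not rustc's code.
pub fn box_by_hand<T>(value: T) -> Box<T> {
    let layout = Layout::new::<T>();

    // Step 1: obtain memory (the role exchange_malloc plays in the lowering).
    let raw: *mut T = if layout.size() == 0 {
        // Zero-sized types must not go through the allocator;
        // a dangling but aligned pointer is what Box expects here.
        NonNull::<T>::dangling().as_ptr()
    } else {
        // SAFETY: the layout has a non-zero size.
        let p = unsafe { alloc(layout) } as *mut T;
        if p.is_null() {
            handle_alloc_error(layout);
        }
        p
    };

    // SAFETY: `raw` is non-null, aligned and valid for writing one `T`,
    // and it came from the global allocator with the layout of `T`
    // (or is dangling for a zero-sized `T`), as Box::from_raw requires.
    unsafe {
        // Step 2: write the value without reading or dropping old contents.
        ptr::write(raw, value);
        // Step 3: hand ownership of the now initialized memory to a Box.
        Box::from_raw(raw)
    }
}
```

A call such as box_by_hand(String::from("hi")) returns an ordinary Box<String> that is freed by the usual Box drop glue.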

A move to using move_val_init directly would simplify the code that desugars #[rustc_box] at the cost of the code in the vec![] macro. To me that seems like a win because it would make the code around creating boxes less special.
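For illustration, the Box::new_uninit_slice route might look roughly like this in library code. This is a sketch under assumptions: boxed_slice_from_array is a hypothetical helper, MaybeUninit::write is used as a stable stand-in for move_val_init (so nothing here promises in-place construction), and it assumes a toolchain on which Box::new_uninit_slice is stable.

```rust
use std::mem::MaybeUninit;

/// Hypothetical helper showing the allocate / write / assume_init sequence.
pub fn boxed_slice_from_array<T, const N: usize>(values: [T; N]) -> Box<[T]> {
    // Allocate room for N elements without initializing them.
    let mut uninit: Box<[MaybeUninit<T>]> = Box::new_uninit_slice(N);

    // Move each element into its slot. In the proposal this write would be
    // move_val_init; MaybeUninit::write is only a stable approximation.
    for (slot, value) in uninit.iter_mut().zip(values) {
        slot.write(value);
    }

    // SAFETY: the loop above initialized all N slots.
    unsafe { uninit.assume_init() }
}
```

The result can be turned into a Vec with into_vec(), which is the shape the vec![] macro needs.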


Original proposal:

I have opened this issue for discussing how to move #[rustc_box] (introduced by #97293) to builtin# syntax (#110680).

My original proposal was here, to introduce builtin#ptr_write, but I made that proposal while being unaware of ShallowInitBox. So maybe we'd need both builtin#ptr_write and builtin#shallow_init_box.

The options I see:

  • Add builtin#box instead of #[rustc_box], replacing it 1:1.
  • Add builtin#ptr_write and builtin#shallow_init_box, first doing the latter and then writing into the result with the former. But what would the latter return on a type level? A Box<T>?
  • Add builtin#ptr_write to then do in the Vec macro: first Box::new_uninit_slice, then builtin#ptr_write, then call assume_init on it. This mirrors the proposal by oli-obk, but the problem seems to be very poor codegen (new godbolt example based on the one linked in the ShallowInitBox MCP). The problem might just be due to the .write call though.
  • Only add builtin#shallow_init_box. I'm not sure this will work though as the magic seems to hide in the pointer writing.

Maybe one could first add builtin#ptr_write without touching #[rustc_box] and then check if the godbolt example still has that behaviour?

cc @nbdd0121 @drmeepster @clubby789 @oli-obk

Earlier discussion: #110694 (comment)

Activity

clubby789

clubby789 commented on Apr 24, 2023

@clubby789
Contributor

I implemented a basic builtin#ptr_write and the codegen looks about equivalent:

#![feature(builtin_syntax)]
#![feature(new_uninit)]
type Type = [u8; 128];
pub fn foo2(f: fn() -> Type) -> Box<Type> {
    let mut b = Box::<Type>::new_uninit();
    unsafe {
        builtin#ptr_write(f(), b.as_mut_ptr());
        b.assume_init()
    }
}

_ZN3poc4foo217hf17d37ea238339f3E:
	push	r14
	push	rbx
	push	rax
	mov	rbx, rdi
	mov	edi, 128
	mov	esi, 1
	call	qword ptr [rip + __rust_alloc@GOTPCREL]
	test	rax, rax
	je	.LBB0_1
	mov	r14, rax
	mov	rdi, rax
	call	rbx
	mov	rax, r14
	add	rsp, 8
	pop	rbx
	pop	r14
	ret
.LBB0_1:
	mov	edi, 128
	mov	esi, 1
	call	qword ptr [rip + _ZN5alloc5alloc18handle_alloc_error17h9f155cb3ff8eda02E@GOTPCREL]
	ud2

If there's interest I can push my work (based on top of the offset_of PR), although it's very rudimentary: it is missing proper type checking and is probably not constructing MIR properly.
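For comparison, a version of foo2 that needs no feature gates could look like the sketch below. It assumes a toolchain on which Box::new_uninit is stable; foo2_stable is a hypothetical name, and MaybeUninit::write is an ordinary method call, so unlike builtin#ptr_write it does not ask the compiler to evaluate f() directly into the allocation.

```rust
use std::mem::MaybeUninit;

type Type = [u8; 128];

/// Hypothetical stable counterpart of foo2 from the comment above.
pub fn foo2_stable(f: fn() -> Type) -> Box<Type> {
    // Allocate uninitialized storage for the array.
    let mut b: Box<MaybeUninit<Type>> = Box::new_uninit();
    // Resolves to MaybeUninit::write through auto-deref of the box.
    b.write(f());
    // SAFETY: the slot was initialized by the write above.
    unsafe { b.assume_init() }
}
```

Whether the optimizer removes the intermediate copy in this variant is exactly the codegen question discussed in this thread, so the sketch makes no claim about it.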

est31

est31 commented on Apr 24, 2023

@est31
Member, Author

@clubby789 that looks very promising! I think this gives good support for builtin#ptr_write. A PR might be too early; I'd suggest opening one once #110694 is closer to getting merged (after an initial round of reviews, for example).

nbdd0121

nbdd0121 commented on Apr 24, 2023

@nbdd0121
Contributor

Although we could shift library code to do box creation, followed by pointer write and then type conversion, I still think the best thing to do is to make Box::new a lang item and magically replace it during the lowering, and therefore allow all users to enjoy the benefit.

My patch in #87781 didn't work due to interaction with two-phase borrow, but that should be fixable.

nbdd0121

nbdd0121 commented on Apr 24, 2023

@nbdd0121
Contributor

ShallowInitBox (rust-lang/compiler-team#460) converts a *mut T to Box<T>. What's magical is that it's an uninitialized box, a construct that we have to support anyway because we can move values out from (and back into) a box.

So if we were to implement Box::new using the builtin syntax, it would probably just look like this:

    pub fn new(x: T) -> Self {
        let ptr = Box::into_raw(Box::<T>::new_uninit()).cast::<T>();
        let ret = builtin#shallow_init_box(ptr);
        *ret = x;
        ret
    }

nbdd0121

nbdd0121 commented on Apr 24, 2023

@nbdd0121
Contributor

#![feature(builtin_syntax)]
#![feature(new_uninit)]
type Type = [u8; 128];
pub fn foo2(f: fn() -> Type) -> Box<Type> {
    let mut b = Box::<Type>::new_uninit();
    unsafe {
        builtin#ptr_write(f(), b.as_mut_ptr());
        b.assume_init()
    }
}

What are the semantics of ptr_write here? Does it evaluate f directly into *b? If so, it sounds like this is reintroducing the move_val_init intrinsic, which we got rid of in #80290. cc @RalfJung

clubby789

clubby789 commented on Apr 24, 2023

@clubby789
Contributor

let src = &this.thir[src];
let dst = &this.thir[dst];
let dst = unpack!(block = this.as_place(block, dst));
unpack!(block = this.expr_into_dest(this.tcx.mk_place_deref(dst), block, src));

This is the MIR building for it, so it is probably the same thing.

est31

est31 commented on Apr 24, 2023

@est31
Member, Author

If so, it sounds like this is reintroducing the move_val_init intrinsic, which we got rid of in #80290.

Yeah that would be it, more or less. Linux is building a huge macro to support something like builtin#ptr_write: https://twitter.com/LinaAsahi/status/1570119345510182913 (code link). I'm not sure if it's enough for their use cases though, because they want an error if something gets copied onto the stack, not just silently having that copy happen.

I think builtin#ptr_write would be a nice building block for functionality that would allow you to tag functions as accepting some args in-place, after doing some preparatory work to create memory to put those args inside. This could then be used for Vec::push/HashMap::insert/ptr::write/etc, and not just Box::new alone. I don't think it's a good idea to make all of these functions lang items, instead it would be better to have one approach that works for all of these functions.

est31

est31 commented on Apr 24, 2023

@est31
Member, Author

a construct that we have to support anyway because we can move values out from (and back into) a box.

@nbdd0121 what do you mean by that? This is a general property, right? I can't see ShallowInitBox being constructed outside of #[rustc_box] lowering. There is some special casing for boxes in elaborate_drop.rs, but that one just does a .is_box() check to then call open_drop_for_box. It doesn't rely on ShallowInitBox.

nbdd0121

nbdd0121 commented on Apr 24, 2023

@nbdd0121
Contributor

If so, it sounds like this is reintroducing the move_val_init intrinsic, which we got rid of in #80290.

Yeah that would be it, more or less. Linux is building a huge macro to support something like builtin#ptr_write: twitter.com/LinaAsahi/status/1570119345510182913 (code link). I'm not sure if it's enough for their use cases though, because they want an error if something gets copied onto the stack, not just silently having that copy happen.

I happen to be very familiar with this ;) Asahi's macro doesn't (and won't) get into the mainline. Instead, we have a more general version of in-place/pinned initialization macros designed by y86-dev and me.

move_val_init is not sufficient for our use case, because (1) as you said, we want a guarantee that no copy happens, and (2) we need initialization to be fallible, and move_val_init doesn't have a way to signal Err (in the kernel we can't use panicking).

I am familiar with this removed intrinsic precisely because I was exploring it as a potential solution to the in-place initialization problem in the Linux kernel, and it didn't work out eventually :(
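To make requirement (2) concrete, here is a minimal sketch of what a fallible initializer could look like in plain Rust. It is not the kernel's pin-init API: try_box_init and its closure contract are hypothetical, it assumes a toolchain with a stable Box::new_uninit, and it does nothing about requirement (1), since the closure may still build its value on the stack before writing it.

```rust
use std::mem::MaybeUninit;

/// Hypothetical fallible initializer (illustration only).
///
/// # Safety
/// `init` must fully initialize the slot whenever it returns `Ok(())`.
pub unsafe fn try_box_init<T, E>(
    init: impl FnOnce(&mut MaybeUninit<T>) -> Result<(), E>,
) -> Result<Box<T>, E> {
    let mut slot: Box<MaybeUninit<T>> = Box::new_uninit();
    // On Err, `slot` is dropped here: the allocation is freed and no
    // destructor of `T` runs, because the slot counts as uninitialized.
    init(&mut slot)?;
    // SAFETY: by the caller's contract, Ok(()) means the slot is initialized.
    Ok(unsafe { slot.assume_init() })
}
```

The Err path is the part a plain write-style intrinsic cannot express: the initializer reports failure as a value and the partially set up allocation is released without panicking.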

a construct that we have to support anyway because we can move values out from (and back into) a box.

@nbdd0121 what do you mean by that? This is a general property, right? I can't see ShallowInitBox being constructed outside of #[rustc_box] lowering. There is some special casing for boxes in elaborate_drop.rs, but that one just does a .is_box() check to then call open_drop_for_box. It doesn't rely on ShallowInitBox.

Sorry, by "a construct" I mean the fact that the box is uninitialized. I meant that we need to support a box being uninitialized regardless of whether ShallowInitBox is there.
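The "move out and back in" property can be seen in ordinary safe code. This is a small sketch with a hypothetical function name; between the two statements the box still owns its allocation while its contents count as uninitialized.

```rust
/// Hypothetical example: take the contents of a box and refill it.
pub fn swap_contents(mut b: Box<String>, replacement: String) -> (String, Box<String>) {
    // Move the value out of the box. The allocation stays alive, but `*b`
    // is now considered uninitialized by the borrow checker.
    let old = *b;
    // Move a new value back in. Nothing is dropped here, since there is
    // no initialized value left in the box to drop.
    *b = replacement;
    (old, b)
}
```

No ShallowInitBox or #[rustc_box] is involved here, which matches the point that the uninitialized-box state has to be supported independently of them.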

petrochenkov

petrochenkov commented on Apr 24, 2023

@petrochenkov
Contributor

builtin # foo is specifically for the cases where we cannot fit a feature into an existing syntax, neither into a function call, nor into an attribute, nor into anything else.

The cases described here very much fit into function calls, or other expressions (possibly with attributes) - all the arguments are expressions, you don't need to specify something like a field name as an argument like with offset_of.

I wouldn't want everything that is currently served by e.g. lang items to spread to syntactic level, there's no need for that, it's all about semantics.

scottmcm

scottmcm commented on Apr 24, 2023

@scottmcm
Member

Overall box syntax might make sense as builtin#, but I agree that builtin#ptr_write doesn't need to be. It would just lower to *p = move val in MIR, so would be fine as a normal intrinsic without parser logic (like the existing intrinsics::read lowers to target = copy *p in MIR).
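One thing to keep in mind when reading *p = move val: a MIR assignment does not by itself run the destructor of the old contents, whereas an assignment written in surface Rust does. The sketch below (Noisy and both functions are hypothetical names) contrasts the surface assignment with ptr::write, which has the write-without-drop behaviour.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical type that counts how many times it has been dropped.
pub struct Noisy(pub Rc<Cell<usize>>);

impl Drop for Noisy {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Surface assignment: the previous value in `slot` is dropped first.
pub fn overwrite_with_assignment(slot: &mut Noisy, new: Noisy) {
    *slot = new;
}

/// ptr::write: the previous value is overwritten without being dropped
/// (it is leaked here, which is safe but usually not what you want).
pub fn overwrite_with_ptr_write(slot: &mut Noisy, new: Noisy) {
    // SAFETY: `slot` is a valid, aligned, exclusive reference.
    unsafe { std::ptr::write(slot, new) }
}
```

With a shared counter, the first function increments it once per call and the second does not, which is why a write intrinsic is the right building block for filling uninitialized memory.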

est31

est31 commented on Apr 24, 2023

@est31
Member, Author

Instead, we have a more general version of in-place/pinned initialization macros designed by y86-dev and me.

Ohhh that's very interesting. Do you have a link to the macros or maybe any public discussions?

I still don't fully understand why ShallowInitBox is needed for moving out of boxes, because there are ways to create boxes without #[rustc_box] being involved at all, say Box::from_raw/from_raw_in or the safe Box::new_in. Are you saying that moving out of an initialized box in those instances involves UB?

I've tried using copy_nonoverlapping, which ptr::write has used since #80290, and the result looked like the Box::<MaybeUninit<[...]>>::write version (link). I then locally enabled RUSTC_BOOTSTRAP and compared it with a version that uses move_val_init (using rustc 1.50.0; link to code, but I didn't get it to work in godbolt as I didn't know how to set RUSTC_BOOTSTRAP there). It actually generates code that looks like the box syntax code, so if we reintroduce move_val_init, we might be able to use it to get rid of #[rustc_box] entirely. To me that would seem like a simplification.

The cases described here very much fit into function calls

@petrochenkov thanks for explaining why you are against using #[rustc_box] here. You, @scottmcm, as well as the precedent of move_val_init, convinced me that there should be no builtin#ptr_write, but that it should instead be done via intrinsics. In fact, it would be best to have #[rustc_box] as an intrinsic too, but there is no module in alloc for allocation-related intrinsics, only one in core, which obviously can't touch allocations.

The other thing to consider is the type_ascribe! macro. I guess it would be best not to migrate that one to builtin# either, but instead to turn it into a proper intrinsic with the signature fn ascribe<T>(v: T) -> T.
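As a rough illustration of that signature, an identity function already behaves like an ascription at the call site, because the explicit type argument fixes the type and function arguments are coercion sites. This is only a sketch of the idea with an ordinary function; it is not an intrinsic and not the actual type_ascribe! implementation.

```rust
/// Hypothetical library-level stand-in for an `ascribe` intrinsic.
pub fn ascribe<T>(v: T) -> T {
    v
}

/// Usage: the turbofish pins the literal's type.
pub fn example_integer() -> u64 {
    ascribe::<u64>(1)
}

/// Usage: the argument is coerced from Box<[u8; 3]> to Box<[u8]>.
pub fn example_unsized() -> Box<[u8]> {
    ascribe::<Box<[u8]>>(Box::new([1u8, 2, 3]))
}
```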

jyn514

jyn514 commented on Apr 26, 2023

@jyn514
Member

petrochenkov thanks for explaining why you are against using #[rustc_box] here. You, scottmcm, as well as the precedent of move_val_init, convinced me that there should be no builtin#ptr_write, but that it should instead be done via intrinsics. In fact, it would be best to have #[rustc_box] as an intrinsic too, but there is no module in alloc for allocation-related intrinsics, only one in core, which obviously can't touch allocations.

Does that mean we should close this issue? I'm not sure why this should be an intrinsic - rustc_box doesn't have function-like semantics, the whole point is that Box::new has different semantics than rustc_box.

Labels

A-allocators (Area: Custom and system allocators)
T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
T-libs (Relevant to the library team, which will review and decide on the PR/issue.)
Participants

@RalfJung @safinaskar @nbdd0121 @petrochenkov @est31