A Composable, Efficient, and Expressive 2-Instruction Design #40

RossTate · 2023-08-28T13:36:40Z

RossTate
Aug 28, 2023

The following is a design for stack-switching that has a number of useful properties that seem to make it a good fit for the needs of this extension to WebAssembly.
I have made an effort to address all the concerns that have been expressed throughout the lifetime of the Stacks subgroup to the best of my understanding of them.
For example, I have checked that all prior proposals have a straightforward and efficient translation into this design, and I have identified a composability property that this design (but no prior proposal) satisfies.
Please let me know your thoughts, and I am happy to provide a translation if anyone requests one.

The Design

When one performs a call $func, the currently executing function is conceptually left in a suspended state: it stores its current state (e.g. the values of its locals) as a frame on the stack and provides a code address to use to resume the function.
The $func is then performed on the same stack, with that "resumption" address as its return address.
The type of that return address is (result to*), where $func has type func (param ti*) (result to*).

This design treats suspended stacks as essentially having a dangling resumption address.
The type of a suspended stack indicates the type of its dangling resumption address.
The design then provides instructions generalizing call $func by letting $func be performed on a different stack with its dangling resumption address as the return address.

Types

First, consider function types in WebAssembly: func (param ti*) (result to*). We often think of this as saying the function takes ti* arguments and returns to* results. But at the assembly level it really says the function takes a pointer to a stack with a return address expecting to* results along with ti* arguments.
That is, (result to*) really describes how the caller of the function can be resumed, so we can think of (result to*) as the required resumption type (rt) of the caller of the function.
The design of WebAssembly ensures this resumption can always occur without allocating a new stack, but this perspective offers a nice generalization for typing stack switching.
Right now the only resumption types are of the form result to*, but this perspective leaves a nice path forward for easily integrating stack-switching with potential extensions to WebAssembly that might extend what a callee can do with its caller, such as inspecting the stack or return through alternate paths.

With that in mind, this design introduces just one new type: resumeref rt.
a resumeref rt is either null or a possibly-expired reference to a stack that can be resumed according to rt (e.g. by providing values for a result).
In other words, it is a stack with a dangling return address whose resumption type is rt.
A resumeref expires if and only if it has been resumed, ensuring that a resumption reference can be used at most once, that a stack has an unexpired resumption reference to it if and only if it is suspended, and that a suspended stack has at most one unexpired resumption reference to it.

Although resumeref is a kind of reference, it need not be a garbage-collected object; it can be implemented using a reference to a stack and a counter.
In fact, the only instructions in this design requiring memory allocation are those containing new that involve allocating a new stack.

Core Instructions

This proposal has just 2 core instructions, from which the remaining instructions can be derived .
Both instructions are constant-time, always.

`resume.new`

resume.new rt: [] -> [(resumeref rt)]

This instruction allocates a new stack whose root frame has resumption type rt but which will trap if it is ever resumed, meaning the new stack is conceptually empty.

`resume.switch_call`

The instruction resume.switch_call rt1 $func: [ti* (resumeref rt2)] -> rt1 references a function $func : [ti* (resumeref rt1)] -> rt2.

This instruction transfers control to the referenced suspended stack (trapping if the resumeref is null or expired), expires the resumeref, and calls $func on the target stack with its dangling return address as the function's return address, and with the arguments taken from the value stack and the resumeref referencing the following instruction on the now-suspended stack.

Since resumption types are a new concept, it might be helpful to illustrate what this instruction looks like using results:

resume.switch_call (result t1*) $func: [ti* (resumeref (result t2*))] -> [t1*] references a function $func : [ti* (resumeref (result t1*))] -> [t2*]

It is essentially performing call $func (which would normally return to the instruction following the call) but switching the stack-pointer register with the register containing resumeref's stack-pointer just before jumping to the head of the function.

High-Level Example: Synchronous Channels for Green Threads

To get a sense of how this can be used, consider implementing a green thread sending an i32 on a typed synchronous channel.

For context, suppose the program is supposed to trap if ever there is more than one waiting receiver or waiting sender.
Also, suppose the program has some global queue of paused threads that are simply waiting to be scheduled for a turn; this queue can be added to using $add_to_paused_queue: [(resumeref (result))] -> [] and can be removed from using $pop_from_paused_queue: [] -> [(resumeref (result))].

Channels (some channelref) would have two fields (though only one is non-null at a time): a resumeref (result i32) for a waiting receiver, and a resumeref (result (resumeref (result i32))) for a waiting sender.
For simplicity, suppose the program is supposed to trap if ever there is more than one waiting receiver or waiting sender.

The send function first needs to get a receiver to send to.

If there is a waiting receiver already on the channel, then set the local $receiver: (resumeref (result i32)) to it and null the field.

If there is no waiting receiver, then the send first needs to check if there is already a waiting sender.
If there is, then the program traps.
Otherwise, the send needs to add the current thread to the channel and yield control to some paused thread.
We can do this with resume.switch_call (result (resumeref (result i32))) $set_waiting_sender (local.get $channel) (call $pop_from_paused_queue), where $set_waiting_sender: [channelref (resumeref (result (resumeref (result i32))))] -> [] sets the waiting-sender field of its first argument to the value of its second argument and returns.
This leaves the stack in a suspended state, but---when it returns---a (resumeref (result i32)) will be on the value stack.
So the following instruction can be simply local.set $receiver.

Now that $receiver has been initialized, one way or another, the send needs to resume the receiver with the message.
This leaves the sender in a suspended state; the suspender does not need any data, it just needs to be given a turn to execute.
As such, it also needs to add itself to the queue of paused threads.
We can do all this with the instruction resume.switch_call $add_to_paused_queue_then_return (result) (local.get $message) (local.get $waiting_receiver), where $add_to_paused_queue_then_return : [i32 (resumeref (result))] -> [i32] calls $add_to_paused_queue with its second argument and then returns its first argument.

This leaves the sending thread in a suspended state.
When it gets resumed, no values will be on the value stack.
At this point, the send function simply returns.

Notice that this example does not use any derived instructions.
In particular, resume.switch is not expressive enough to support such a simple implementation of send; only the resume.switch_called functions have the critical ability to do something with the suspending stack before transferring control to the target resumption point, such as putting it in a channel or adding it to a queue.

Unabridged Version

The unabridged write-up for this design can be found here.
Below is an outline with links so that you can jump to whichever parts you're interested in.

RossTate · 2023-08-28T13:48:01Z

RossTate
Aug 28, 2023
Author

@kayceesrk, what do you think of the support for Multicore OCaml's effects? I incorporated the tail-resumption optimization that @daanx's research suggests can significantly improve performance.

0 replies

RossTate · 2023-08-28T13:48:11Z

RossTate
Aug 28, 2023
Author

@eqrion, I remember you were concerned about the maintenance burden of having both suspenders and core stack-switching. Does the "JSPI as a standard library" implementation I provided, with a suspender as simply a mutable reference to a core resumption reference, offer a reasonable solution that issue?

0 replies

RossTate · 2023-08-28T13:48:34Z

RossTate
Aug 28, 2023
Author

@slindley and crew, I'm curious what your thoughts are on the composability guarantee.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

A Composable, Efficient, and Expressive 2-Instruction Design #40

{{title}}

Replies: 3 comments

{{title}}

{{title}}

{{title}}

Select a reply

A Composable, Efficient, and Expressive 2-Instruction Design #40

RossTate Aug 28, 2023

The Design

Types

Core Instructions

resume.new

resume.switch_call

High-Level Example: Synchronous Channels for Green Threads

Unabridged Version

Replies: 3 comments

RossTate Aug 28, 2023 Author

RossTate Aug 28, 2023 Author

RossTate Aug 28, 2023 Author

RossTate
Aug 28, 2023

`resume.new`

`resume.switch_call`

RossTate
Aug 28, 2023
Author

RossTate
Aug 28, 2023
Author

RossTate
Aug 28, 2023
Author