(Re-)Consider multi-return vs unit #41

lukewagner · 2022-06-08T00:06:52Z

In #29, we had an unresolved comment thread about whether we should make component-level function results symmetric with parameters. The question predates #29 and is independent of it, so that PR merged and this issue continues the discussion.

One interesting new technical argument just came up via @peterhuene's work: Component-level start sections calling functions that return "nothing" technically always return unit. If we treated unit uniformly (pushed a new unit value into the value index space), then the "linearity" requirement (each value must be consumed exactly once) would mean we have to consume that unit... which would then produce a new unit (ad infinitum). Thus, we have to special-case unit in the validation rules (so it doesn't push a value index) which is irregular and adds validation complexity. It's not a massive problem, but it does add a point to the multi-return column (which can return an empty list of values).
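To make the linearity argument concrete, here is a minimal toy model. It is not the actual validator: `ValueSpace` and `run_start` are hypothetical names invented for illustration, and the model assumes a start call consumes each argument value and pushes one new value index per result.

```python
class ValueSpace:
    """Toy value index space; each pushed value must be consumed exactly once."""

    def __init__(self):
        self.consumed = []  # consumed[i] is True once value i has been used

    def push(self):
        self.consumed.append(False)
        return len(self.consumed) - 1

    def consume(self, idx):
        if self.consumed[idx]:
            raise ValueError(f"value {idx} consumed twice")
        self.consumed[idx] = True

    def unconsumed(self):
        return [i for i, used in enumerate(self.consumed) if not used]


def run_start(space, args, result_count):
    """Hypothetical start call: consume each argument, push one value per result."""
    for idx in args:
        space.consume(idx)
    return [space.push() for _ in range(result_count)]


# Uniform unit: a function returning "nothing" still pushes one unit value.
unit_space = ValueSpace()
first = run_start(unit_space, [], 1)
# Consuming that unit with another unit-returning call just pushes a fresh unit.
second = run_start(unit_space, first, 1)
unit_leftover = unit_space.unconsumed()

# Multi-return: "nothing" is an empty result list, so no value index is pushed.
multi_space = ValueSpace()
run_start(multi_space, [], 0)
multi_leftover = multi_space.unconsumed()
```

In this model `unit_leftover` always holds one unconsumed value however many unit-returning calls are chained, while `multi_leftover` is empty after a single call.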

lukewagner · 2022-06-17T19:17:31Z

#49 raises an interesting case where having an empty return (returning undefined to a JS caller) would produce a more-regular JS interface than a unit-return (returning null to a JS caller).

(I haven't forgotten about this issue; I just haven't had time to think and discuss more about it yet.)

Recently I found myself trying to debug an invalid component but the way that it was invalid meant that I couldn't print it with `wasmprinter`. This commit removes all the type management in `wasmprinter` which is currently needed for the `start` function of components (and other since-removed items I believe historically) to allow printing more components, namely those that even may be invalid. In general I personally think it's best for `wasmprinter` to print as much as possible, even if it's invalid, and leave validation primarily to the parsing phase later. This commit does effectively "break" the implementation of printing modules with `start` functions, however. The `(result ..)` production is never printed any more since the type information here is not tracked. This is tracked at WebAssembly/component-model#41 for updates to the `start` function.

lukewagner · 2022-07-11T18:15:20Z

Here's a third argument in favor of multi-return that I recently realized:

To support runtime instantiation of components (going beyond the declarative all-up-front instantiation we currently have), I've been imagining that we'd define a new canonical built-in (canon instantiate $Component (func $new)) that defines a function $new that, when called, creates a new instance of $Component. If $Component defines any function exports, then the return type of $new would be a (handle $Component) (with $Component serving as a resource type). But if $Component exclusively exports values, then $new can simply be a normal function returning values (thereby realizing the WASI "command" pattern¹). But now the asymmetry of parameters/results bites again: value import names can turn into parameter names but value export names must either be dropped or wrapped into a record.
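A rough sketch of that asymmetry, assuming a values-only component is modelled as two name-to-type mappings. The helpers `signature_multi_return` and `signature_single_result`, and the example import/export names, are hypothetical and not part of any proposal text.

```python
def signature_multi_return(value_imports, value_exports):
    """With named multi-return, import and export names both carry over directly."""
    return {
        "params": list(value_imports.items()),
        "results": list(value_exports.items()),
    }


def signature_single_result(value_imports, value_exports, keep_names):
    """With a single result type, export names are wrapped in a record or dropped."""
    if keep_names:
        result = ("record", list(value_exports.items()))
    else:
        result = ("tuple", list(value_exports.values()))
    return {"params": list(value_imports.items()), "result": result}


# Hypothetical command-style component: only value imports and value exports.
imports = {"args": "list<string>"}
exports = {"exit-code": "u32", "output": "string"}

multi = signature_multi_return(imports, exports)
wrapped = signature_single_result(imports, exports, keep_names=True)
dropped = signature_single_result(imports, exports, keep_names=False)
```

Only `multi` keeps the export names at the same level as the parameter names; `wrapped` needs an extra record layer and `dropped` loses the names entirely.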

lukewagner · 2022-07-11T19:23:23Z

So if we were to switch to multi-return, then an empty result type list would be the way to express what is today (result unit) and the existing function subtyping rules that allow superfluous results to be ignored by the caller would take the place of the special "everything is a subtype of unit" rule. Rather than have these two different ways to express "nothing" (requiring API authors to make an ad hoc choice), I think we should remove unit as a specialization and instead allow <valtype>* as the payload of the variant, union and expected cases, which are, iiuc, the only other places where unit is useful to express "no value" today.

So if expected now takes lists of types, with both lists being empty as a common case, "expected" as an adjective doesn't look right (there's nothing to "expect"). Mulling over alternative names for a while, "result" just seems like the much better name for a return value because, in all cases, you're getting "a result" (which may be simple or compound). (This is just from first principles and inability to find a better name; I'm not even a Rust programmer.) Originally, result was thrown out because result is already used in the s-expression text format for function types and it'd look funny to write (func (result (result))). Fast-forward a year and now our expectation is that component interfaces aren't authored in s-expressions but in wit which lets you write func() -> result. Sure, the resulting <functype> in the resulting component would have (result (result)), but I don't think it's grammatically ambiguous and so I think the tradeoff for, imo, the ideal type name is worth it.

Lastly, if the result list is to be exactly symmetric with the param list, that implies that a result list can contain an arbitrary mix of named and unnamed values. For parameters, this seemed fine because parameter names can always be ignored, falling back to positional parameter rules. But with results, we'll often need to synthesize a source-language value (e.g., to return to a JS caller) which means defining how to interpret these mixed-name types (as opposed to all-named (record) or all-positional (tuple)). It's certainly possible to imagine a binding, of course, but it feels like an unnecessary burden for a corner case that no one needs. Two restrictions that make sense to me are:

1. a param/result list is either all-named or all-unnamed
2. a param list is either all-named or contains a single unnamed type

Option 1 seems simpler and more regular, but if we consider the components-as-functions use case in the preceding comment, components can't express multiple unnamed params/results because of the uniqueness constraints on import/export names. More generally, parameter/result name uniqueness seems like it could be broadly useful and simplifying for the same reason as import/export name uniqueness. Thus, I actually lean toward option 2, which gives each param/result a unique name (with that name possibly being the empty string) while preserving the pragmatic performance goal that scalar return values are not needlessly and annoyingly object-wrapped in JS and all the other dynamic languages.
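One way to picture option 2 from a dynamic-language binding's side. This is a sketch only: `lift_results` is a hypothetical helper, `None` as a name stands in for an unnamed result, and the result names in the example are made up.

```python
def lift_results(results):
    """Turn a result list of (name, value) pairs into a host value.

    name is None for an unnamed result. Assumes the list already satisfies
    option 2: all-named, or exactly one unnamed entry.
    """
    if not results:
        return None  # empty result list: nothing to return
    if len(results) == 1 and results[0][0] is None:
        return results[0][1]  # single unnamed result: bare scalar, no wrapper
    return {name: value for name, value in results}  # all-named: object-like


scalar = lift_results([(None, 42)])
named = lift_results([("quotient", 3), ("remainder", 1)])
nothing = lift_results([])
```

Under this reading the binding never has to invent an interpretation for a mixed named/unnamed list, and a lone scalar result comes back unwrapped.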

So that's what I'm thinking a coherent switch to multi-return entails. What do folks think? (Sorry for the churn and flip-flopping here!)

peterhuene · 2022-07-12T17:16:16Z

I think the above makes a lot of sense to me. I don't have any objections with the other proposed changes.

Regarding option 2 from above:

a param list is either all-named or contains a single unnamed type

Is my understanding of this, and the following paragraph you wrote, correct with the following summary?

a param/result list is either empty, a single unnamed param/result, or all uniquely-named.

the empty string is considered a valid name.

If so, I think that option gives the most flexibility too.

lukewagner · 2022-07-12T22:27:03Z

Yep! Thanks for clarifying. Thus, you can write both (func (result "" u32)) and (func (result u32)) and they are (hopefully not too annoyingly in practice...) distinct types.
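The rule as summarized could be checked along these lines. This is a sketch, not the reference validator: `is_valid_list` is a hypothetical name, and `None` models an unnamed entry so that it stays distinct from the empty-string name.

```python
def is_valid_list(entries):
    """entries: list of (name, type) pairs; name is None when unnamed."""
    names = [name for name, _ in entries]
    if not entries:
        return True  # empty list
    if names == [None]:
        return True  # a single unnamed param/result
    if None in names:
        return False  # unnamed entries cannot be mixed with others
    return len(set(names)) == len(names)  # all named, names unique


unnamed_u32 = [(None, "u32")]   # models (func (result u32))
empty_name_u32 = [("", "u32")]  # models (func (result "" u32))
```

Both example lists pass the check, yet they compare unequal, which mirrors the two forms being distinct types.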

lukewagner · 2022-07-14T23:15:15Z

Writing this up, I realized we probably don't want to have union payloads be lists of types: it makes the syntax way more awkward (lists of lists of types) and the point of union was to specialize anyways. But other than that, everything described above is in #69.

peterhuene mentioned this issue Jun 8, 2022

components: Validation of the start section conditionally defines the result as a value bytecodealliance/wasm-tools#604

Closed

Liamolucko mentioned this issue Jun 17, 2022

Should unit map to null or undefined? #49

Closed

alexcrichton mentioned this issue Jul 11, 2022

Remove type management in wasmprinter bytecodealliance/wasm-tools#678

Merged

lukewagner added a commit that referenced this issue Jul 14, 2022

Switch (back) to multi-return, remove unit, s/expected/result/

b8349ea

Resolves #41

lukewagner added a commit that referenced this issue Jul 14, 2022

Switch (back) to multi-return, remove unit, s/expected/result/

0715a99

Resolves #41

lukewagner mentioned this issue Jul 14, 2022

Switch (back) to multi-return, remove unit, s/expected/result/ #69

Merged

lukewagner closed this as completed in 4d1e879 Aug 4, 2022

lukewagner mentioned this issue Apr 29, 2024

The output of a function in the component model is itself not a valid component model type #349

Closed
