Niche placement heuristic: place at beginning or end of type

I noticed that #102750 regressed the size of some types in the `syn` crate. I minimized the difference to the following:

```rust
pub enum Enum {
    A(A),
    B(B),
}

pub struct A {
    pub x: Thing,
    pub y: u16,
    pub z: u16,
}

pub struct B {
    pub x: Thing,
    pub y: u16,
}

pub enum Thing {
    C(u16, u16),
    D(u16, u16),
}

fn main() {
    println!("{}", std::mem::size_of::<Enum>());
}
```

```console
$ cargo +nightly-2022-11-23 run --quiet && cargo +nightly-2022-11-24 run --quiet
10
12
```

The type layout in nightly-2022-11-23 is:

```console
print-type-size type: `Enum`: 10 bytes, alignment: 2 bytes
print-type-size     variant `A`: 10 bytes
print-type-size         field `.0`: 10 bytes
print-type-size     variant `B`: 10 bytes
print-type-size         padding: 2 bytes
print-type-size         field `.0`: 8 bytes, alignment: 2 bytes
print-type-size type: `A`: 10 bytes, alignment: 2 bytes
print-type-size     field `.x`: 6 bytes
print-type-size     field `.y`: 2 bytes
print-type-size     field `.z`: 2 bytes
print-type-size type: `B`: 8 bytes, alignment: 2 bytes
print-type-size     field `.x`: 6 bytes
print-type-size     field `.y`: 2 bytes
print-type-size type: `Thing`: 6 bytes, alignment: 2 bytes
print-type-size     discriminant: 2 bytes
print-type-size     variant `C`: 4 bytes
print-type-size         field `.0`: 2 bytes
print-type-size         field `.1`: 2 bytes
print-type-size     variant `D`: 4 bytes
print-type-size         field `.0`: 2 bytes
print-type-size         field `.1`: 2 bytes
```

and in nightly-2022-11-24:

```console
print-type-size type: `Enum`: 12 bytes, alignment: 2 bytes
print-type-size     discriminant: 2 bytes
print-type-size     variant `A`: 10 bytes
print-type-size         field `.0`: 10 bytes
print-type-size     variant `B`: 8 bytes
print-type-size         field `.0`: 8 bytes
print-type-size type: `A`: 10 bytes, alignment: 2 bytes
print-type-size     field `.y`: 2 bytes
print-type-size     field `.z`: 2 bytes
print-type-size     field `.x`: 6 bytes
print-type-size type: `B`: 8 bytes, alignment: 2 bytes
print-type-size     field `.y`: 2 bytes
print-type-size     field `.x`: 6 bytes
print-type-size type: `Thing`: 6 bytes, alignment: 2 bytes
print-type-size     discriminant: 2 bytes
print-type-size     variant `C`: 4 bytes
print-type-size         field `.0`: 2 bytes
print-type-size         field `.1`: 2 bytes
print-type-size     variant `D`: 4 bytes
print-type-size         field `.0`: 2 bytes
print-type-size         field `.1`: 2 bytes
```

Graphically, the layouts of `Enum::A` and `Enum::B` before and after are:

<table>
<tr><th></th><th>Before</th><th>After</th></tr>
<tr><th>Enum::A</th><td><pre>
+-----+-----+-----+-----+-----+
| dsc | u16 | u16 | u16 | u16 |
+-----+-----+-----+-----+-----+
^~~Thing~~~~~~~~~~^
^~~A~~~~~~~~~~~~~~~~~~~~~~~~~~^
</pre></td><td><pre>
+-----+-----+-----+-----+-----+-----+
| dsc | u16 | u16 | dsc | u16 | u16 |
+-----+-----+-----+-----+-----+-----+
                  ^~~Thing~~~~~~~~~~^
      ^~~A~~~~~~~~~~~~~~~~~~~~~~~~~~^
</pre></td></tr>
<tr><th>Enum::B</th><td><pre>
      +-----+-----+-----+-----+
..... | dsc | u16 | u16 | u16 |
      +-----+-----+-----+-----+
      ^~~Thing~~~~~~~~~~^      
      ^~~B~~~~~~~~~~~~~~~~~~~~^
</pre></td><td><pre>
+-----+-----+-----+-----+-----+
| dsc | u16 | dsc | u16 | u16 | .....
+-----+-----+-----+-----+-----+
            ^~~Thing~~~~~~~~~~^
      ^~~B~~~~~~~~~~~~~~~~~~~~^
</pre></td></tr>
</table>

Notice how the old layout puts `Thing` at the beginning of `A` and `B`, while the new layout puts `Thing` at the end of `A` and `B`. The new layout is worse because, when building `Enum`, there is no longer any way to line up `A`'s existing niche with the padding of the `B` case, so a whole new discriminant has to be added.
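For anyone less familiar with niches, here is a minimal sketch of the effect being relied on. The helper `option_adds_no_space` is a hypothetical name introduced only for this illustration. The `NonZeroU16` case is guaranteed by the standard library; the `Thing` case is only what I would expect from current rustc, not something the language promises.

```rust
use std::mem::size_of;
use std::num::NonZeroU16;

// Same shape as `Thing` in the reproduction above.
#[allow(dead_code)]
pub enum Thing {
    C(u16, u16),
    D(u16, u16),
}

/// Hypothetical helper: true when wrapping `T` in `Option` costs no extra
/// bytes, i.e. when `None` is stored in an otherwise invalid bit pattern
/// (a niche) of `T` instead of in a separate discriminant.
fn option_adds_no_space<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}

fn main() {
    // `NonZeroU16` can never be 0, so 0 is free to encode `None`.
    println!("NonZeroU16 has a usable niche: {}", option_adds_no_space::<NonZeroU16>());
    // Every bit pattern of `u16` is valid, so `Option<u16>` needs its own tag.
    println!("u16 has a usable niche: {}", option_adds_no_space::<u16>());
    // `Thing`'s discriminant has only two valid values, leaving spare ones.
    println!("Thing has a usable niche: {}", option_adds_no_space::<Thing>());
}
```

`Enum` wants to do the same thing with `A`'s niche, but that only works if the other variant's data can sit entirely outside the bytes that hold the niche.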

From reading the description of #102750, I don't get the impression that the PR's justification applies to the change in where `Thing` is placed inside `A` and `B` in the code above. It seems to be just a coincidence of the implementation that the PR affected this code.

My observation is that, other things being equal, we should prefer to locate niches ~~in the very beginning or very end of a type, not in the middle~~ near the beginning or end of a type, as far from the middle of the type as possible. This will allow them to get lined up with the padding of smaller enum variants. Placing a niche as far from the middle of the type as possible makes room for the largest possible other type to line up before or after the niche.
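To make the arithmetic concrete, here is a rough sketch of the reasoning, not rustc's actual algorithm. The function `max_sibling_size` is a hypothetical name, and the model assumes a single niche and ignores alignment entirely. It computes how large another variant can be while still fitting entirely before or entirely after the niche.

```rust
/// Simplified model (an assumption for illustration, not rustc's layout code):
/// given a type of `total` bytes whose niche occupies
/// `niche_offset..niche_offset + niche_size`, return the largest number of
/// contiguous bytes available on one side of the niche. Alignment is ignored.
fn max_sibling_size(total: usize, niche_offset: usize, niche_size: usize) -> usize {
    let before = niche_offset;
    let after = total - (niche_offset + niche_size);
    before.max(after)
}

fn main() {
    let b_size = 8; // size of `B` in both nightlies

    // nightly-2022-11-23: `Thing` is the first field of `A`, so its 2-byte
    // discriminant (the niche) sits at offset 0 of the 10-byte struct.
    let old = max_sibling_size(10, 0, 2);
    // nightly-2022-11-24: `y` and `z` come first, pushing the niche to offset 4.
    let new = max_sibling_size(10, 4, 2);

    println!("old layout: {old} bytes beside the niche, B fits: {}", b_size <= old);
    println!("new layout: {new} bytes beside the niche, B fits: {}", b_size <= new);
}
```

Under this model the old layout leaves 8 contiguous bytes next to the niche, enough for `B`, while the new layout leaves only 4 on either side, which matches `Enum` growing from 10 to 12 bytes.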

Mentioning @the8472 @wesleywiser since you were recently working on layout.

Niche placement heuristic: place at beginning or end of type #104807

7 remaining items

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Participants

Niche placement heuristic: place at beginning or end of type #104807

Description

Activity
