Commit 9dcd6f1

Authored by tall-vase and gribozavr
"Token Types" chapter of Idiomatic Rust (#2921)
Materials on "token types."

---------

Co-authored-by: Dmitri Gribenko <[email protected]>
Co-authored-by: tall-vase <[email protected]>
1 parent d4fbf29 commit 9dcd6f1

8 files changed: +625 -0 lines changed

src/SUMMARY.md

Lines changed: 7 additions & 0 deletions
@@ -445,6 +445,13 @@
  - [Serializer: implement Struct](idiomatic/leveraging-the-type-system/typestate-pattern/typestate-generics/struct.md)
  - [Serializer: implement Property](idiomatic/leveraging-the-type-system/typestate-pattern/typestate-generics/property.md)
  - [Serializer: Complete implementation](idiomatic/leveraging-the-type-system/typestate-pattern/typestate-generics/complete.md)
+ - [Token Types](idiomatic/leveraging-the-type-system/token-types.md)
+ - [Permission Tokens](idiomatic/leveraging-the-type-system/token-types/permission-tokens.md)
+ - [Token Types with Data: Mutex Guards](idiomatic/leveraging-the-type-system/token-types/mutex-guard.md)
+ - [Branded pt 1: Variable-specific tokens](idiomatic/leveraging-the-type-system/token-types/branded-01-motivation.md)
+ - [Branded pt 2: `PhantomData` and Lifetime Subtyping](idiomatic/leveraging-the-type-system/token-types/branded-02-phantomdata.md)
+ - [Branded pt 3: Implementation](idiomatic/leveraging-the-type-system/token-types/branded-03-impl.md)
+ - [Branded pt 4: Branded types in action.](idiomatic/leveraging-the-type-system/token-types/branded-04-in-action.md)

  ---

Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
---
minutes: 15
---

# Token Types

Types with private constructors can be used to act as proof of invariants.

<!-- dprint-ignore-start -->
```rust,editable
pub mod token {
    // A public type with private fields behind a module boundary.
    pub struct Token { proof: () }

    pub fn get_token() -> Option<Token> {
        Some(Token { proof: () })
    }
}

pub fn protected_work(token: token::Token) {
    println!("We have a token, so we can make assumptions.")
}

fn main() {
    if let Some(token) = token::get_token() {
        // We have a token, so we can do this work.
        protected_work(token);
    } else {
        // We could not get a token, so we can't call `protected_work`.
    }
}
```
<!-- dprint-ignore-end -->

<details>

- Motivation: We want to be able to restrict a user's access to functionality
  until they've performed a specific task.

  We can do this by defining a type the API consumer cannot construct on their
  own, through the privacy rules of structs and modules.

  [Newtypes](./newtype-pattern.md) use the privacy rules in a similar way, to
  restrict construction unless a value is guaranteed to uphold an invariant at
  runtime.

- Ask: What is the purpose of the `proof: ()` field here?

  Without `proof: ()`, `Token` would have no private fields and users would be
  able to construct values of `Token` arbitrarily.

  Demonstrate: Try to construct the token manually in `main` and show the
  compilation error.

  Demonstrate: Remove the `proof` field from `Token` to show how users would be
  able to construct `Token` if it had no private fields.

- By putting the `Token` type behind a module boundary (`token`), users outside
  that module can't construct the value on their own as they don't have
  permission to access the `proof` field.

  The API developer gets to define methods and functions that produce these
  tokens. The user does not.

  The token becomes a proof that one has met the API developer's conditions of
  access for those tokens.

- Ask: How might an API developer accidentally introduce ways to circumvent
  this?

  Expect answers like "serialization implementations", other parser/"from
  string" implementations, or an implementation of `Default`.

</details>
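
One of the circumvention routes mentioned in the notes can be sketched as follows. This is only an illustration: the `authorized` parameter and the return value of `protected_work` are assumptions added here so the effect is observable, and the `#[derive(Default)]` is the hypothetical mistake being demonstrated.

```rust
pub mod token {
    // Hypothetical mistake: deriving `Default` adds a public constructor,
    // even though the `proof` field itself stays private.
    #[derive(Default)]
    pub struct Token { proof: () }

    // Assumed gate for illustration: tokens are only handed out when authorized.
    pub fn get_token(authorized: bool) -> Option<Token> {
        if authorized { Some(Token { proof: () }) } else { None }
    }
}

pub fn protected_work(_token: token::Token) -> &'static str {
    "protected work ran"
}

fn main() {
    // The intended path refuses to hand out a token...
    assert!(token::get_token(false).is_none());
    // ...but `Default` lets any caller forge one from outside the module.
    let forged = token::Token::default();
    println!("{}", protected_work(forged));
}
```

Any public trait implementation that returns `Self` (such as `Default`, `Clone` on a shared token, or a deserializer) acts as a constructor, so each one deserves the same scrutiny as `get_token` itself.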
Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
---
minutes: 10
---

# Variable-Specific Tokens (Branding 1/4)

What if we want to tie a token to a specific variable?

```rust,editable
struct Bytes {
    bytes: Vec<u8>,
}
struct ProvenIndex(usize);

impl Bytes {
    fn get_index(&self, ix: usize) -> Option<ProvenIndex> {
        if ix < self.bytes.len() { Some(ProvenIndex(ix)) } else { None }
    }
    fn get_proven(&self, token: &ProvenIndex) -> u8 {
        // Skips the bounds check, trusting the token as proof of validity.
        unsafe { *self.bytes.get_unchecked(token.0) }
    }
}

fn main() {
    let data_1 = Bytes { bytes: vec![0, 1, 2] };
    if let Some(token_1) = data_1.get_index(2) {
        data_1.get_proven(&token_1); // Works fine!

        // let data_2 = Bytes { bytes: vec![0, 1] };
        // data_2.get_proven(&token_1); // Out of bounds! Can we prevent this?
    }
}
```

<details>

- What if we want to tie a token to a _specific variable_ in our code? Can we do
  this in Rust's type system?

- Motivation: We want to have a Token Type that represents a known, valid index
  into a byte array.

  Once we have these proven indexes we would be able to avoid bounds checks
  entirely, as the tokens would act as the _proof of an existing index_.

  Since the index is known to be valid, `get_proven()` can skip the bounds
  check.

  In this example there's nothing stopping the proven index of one array being
  used on a different array. If an index is out of bounds in this case, it is
  undefined behavior.

- Demonstrate: Uncomment the two `data_2` lines.

  The code here reads out of bounds! A debug build may catch this with a panic,
  but it is still undefined behavior. We want to prevent this "crossover" of
  token types for indexes at compile time.

- Ask: How might we try to do this?

  Expect students to not reach a good implementation from this, but be willing
  to experiment and follow through on suggestions.

- Ask: What are the alternatives, and why are they not good enough?

  Expect runtime checking of index bounds, especially as both `Vec::get` and
  `Bytes::get_index` already use runtime checking.

  Runtime bounds checking does not prevent the erroneous crossover in the first
  place; it only guarantees a panic.

- The kind of token-association we will be doing here is called Branding. This
  is an advanced technique that expands applicability of token types to more API
  designs.

- [`GhostCell`](https://plv.mpi-sws.org/rustbelt/ghostcell/paper.pdf) is a
  prominent user of this; later slides will touch on it.

</details>
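
The runtime-checking alternative raised in the notes might look like the sketch below. `get_checked` is a hypothetical method added for illustration, not part of the slide. It shows that the crossover still compiles and is only detected once the program runs.

```rust
struct Bytes {
    bytes: Vec<u8>,
}
struct ProvenIndex(usize);

impl Bytes {
    fn get_index(&self, ix: usize) -> Option<ProvenIndex> {
        if ix < self.bytes.len() { Some(ProvenIndex(ix)) } else { None }
    }
    // Hypothetical safe variant: re-checks the bounds on every access.
    fn get_checked(&self, token: &ProvenIndex) -> Option<u8> {
        self.bytes.get(token.0).copied()
    }
}

fn main() {
    let data_1 = Bytes { bytes: vec![0, 1, 2] };
    let data_2 = Bytes { bytes: vec![0, 1] };
    let token_1 = data_1.get_index(2).expect("index 2 exists in data_1");

    assert_eq!(data_1.get_checked(&token_1), Some(2));
    // The crossover is accepted by the compiler and only caught at runtime.
    assert_eq!(data_2.get_checked(&token_1), None);
}
```

This variant is safe, but the token no longer proves anything: every access pays for the check again, and the misuse is reported late rather than rejected at compile time.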
Lines changed: 175 additions & 0 deletions
@@ -0,0 +1,175 @@
---
minutes: 30
---

# `PhantomData` and Lifetime Subtyping (Branding 2/4)

Idea:

- Use a lifetime as a unique brand for each token.
- Make lifetimes sufficiently distinct so that they don't implicitly convert
  into each other.

<!-- dprint-ignore-start -->
```rust,editable
use std::marker::PhantomData;

#[derive(Default)]
struct InvariantLifetime<'id>(PhantomData<&'id ()>); // The main focus

struct Wrapper<'a> { value: u8, invariant: InvariantLifetime<'a> }

fn lifetime_separator<T>(value: u8, f: impl for<'a> FnOnce(Wrapper<'a>) -> T) -> T {
    f(Wrapper { value, invariant: InvariantLifetime::default() })
}

fn try_coerce_lifetimes<'a>(left: Wrapper<'a>, right: Wrapper<'a>) {}

fn main() {
    lifetime_separator(1, |wrapped_1| {
        lifetime_separator(2, |wrapped_2| {
            // We want this to NOT compile
            try_coerce_lifetimes(wrapped_1, wrapped_2);
        });
    });
}
```
<!-- dprint-ignore-end -->

<details>

<!-- TODO: Link back to PhantomData in the borrowck invariants chapter.
- We saw `PhantomData` back in the Borrow Checker Invariants chapter.
-->

- In Rust, lifetimes can have subtyping relations between one another.

  This kind of relation allows the compiler to determine if one lifetime
  outlives another.

  Determining if a lifetime outlives another also allows us to say _the shortest
  common lifetime is the one that ends first_.

  This is useful in many cases, as it means two different lifetimes can be
  treated as if they were the same in the regions where they overlap.

  This is usually what we want. But here we want to use lifetimes as a way to
  distinguish values, so that we can say a token only applies to a single
  variable without having to create a newtype for every single variable we
  declare.

- **Goal**: We want two lifetimes for which the Rust compiler cannot determine
  whether one outlives the other.

  We are using `try_coerce_lifetimes` as a compile-time check to see if the
  lifetimes have a common shorter lifetime (AKA being subtyped).

- Note: This slide compiles. By the end of this slide it should only compile
  when the `try_coerce_lifetimes` call is commented out.

- There are two important parts of this code:
  - The `impl for<'a>` bound on the closure passed to `lifetime_separator`.
  - The way lifetimes are used in the parameter for `PhantomData`.

## `for<'a>` bound on a Closure

- We are using `for<'a>` as a way of introducing a lifetime generic parameter to
  a function type and asking that the body of the function work for all
  possible lifetimes.

  What this also does is remove some ability of the compiler to make assumptions
  about that specific lifetime for the function argument, as it must meet Rust's
  borrow checking rules regardless of the "real" lifetime its arguments are
  going to have. The caller substitutes in the actual lifetime; the function
  itself cannot.

  This is analogous to a forall (∀) quantifier in mathematics, or the way we
  introduce `<T>` as type variables, but only for lifetimes in trait bounds.

  When we write a function generic over a type `T`, we can't determine that type
  from within the function itself. Even if we call a function
  `fn foo<T, U>(first: T, second: U)` with two arguments of the same type, the
  body of this function cannot determine if `T` and `U` are the same type.

  This also prevents _the API consumer_ from defining a lifetime themselves,
  which would allow them to circumvent the restrictions we want to impose.

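A smaller sketch of the same idea, separate from the slide code: `with_local` is a hypothetical helper that lends a local `String` to a closure. Because the closure must work `for<'a>`, it cannot assume anything about how long the borrow lives, and since `T` cannot mention `'a`, the borrow is not expected to escape through the return value.

```rust
// Hypothetical helper: it owns a local `String` and lends it to the closure.
// The lifetime `'a` is chosen here, inside `with_local`, not by the caller.
fn with_local<T>(f: impl for<'a> FnOnce(&'a str) -> T) -> T {
    let local = String::from("hello");
    f(&local)
}

fn main() {
    // The closure works for any lifetime, so this is accepted.
    let len = with_local(|s| s.len());
    assert_eq!(len, 5);

    // Returning the borrow itself is expected to be rejected, because `T`
    // would have to name a lifetime that only exists inside `with_local`:
    // let escaped = with_local(|s| s);
}
```

This is the same mechanism `lifetime_separator` relies on: the API, not the consumer, decides which lifetime the closure receives.
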
## PhantomData and Lifetime Variance

- We already know `PhantomData`, which can introduce a formal no-op usage of an
  otherwise unused type or a lifetime parameter.

- Ask: What can we do with `PhantomData`?

  Expect mentions of the Typestate pattern, tying together the lifetimes of
  owned values.

- Ask: In other languages, what is subtyping?

  Expect mentions of inheritance, being able to use a value of type `B` when
  asked for a value of type `A` because `B` is a "subtype" of `A`.

- Rust does have Subtyping! But only for lifetimes.

  Ask: If one lifetime is a subtype of another lifetime, what might that mean?

  A lifetime is a "subtype" of another lifetime when it _outlives_ that other
  lifetime.

- The way that lifetimes used by `PhantomData` behave depends not only on where
  the lifetime "comes from" but on how the reference is defined too.

  The reason this compiles is that the
  [**Variance**](https://doc.rust-lang.org/stable/reference/subtyping.html#r-subtyping.variance)
  of the lifetime inside of `InvariantLifetime` is too lenient.

  Note: Do not expect to get students to understand variance entirely here, just
  treat it as a kind of ladder of restrictiveness on the ability of lifetimes to
  establish subtyping relations.

  <!-- Note: We've been using "invariants" in this module in a specific way, but subtyping introduces _invariant_, _covariant_, and _contravariant_ as specific terms. -->

- Ask: How can we make it more restrictive? How do we make a reference type more
  restrictive in Rust?

  Expect or demonstrate: Making it `&'id mut ()` instead. This will not be
  enough!

  We need to use a
  [**Variance**](https://doc.rust-lang.org/stable/reference/subtyping.html#r-subtyping.variance)
  on lifetimes where subtyping cannot be inferred except on _identical
  lifetimes_. That is, the only subtype of `'a` the compiler can know is `'a`
  itself.

  Note: Again, do not try to get the whole class to understand variance. Treat
  it as a ladder of restrictiveness for now.

  Demonstrate: Move from `&'id ()` (covariant in lifetime and type) to
  `&'id mut ()` (covariant in lifetime, invariant in type), then
  `*mut &'id mut ()` (invariant in lifetime and type), and finally
  `*mut &'id ()` (also invariant in lifetime, since `*mut T` is invariant in
  `T`).

  Those last two should not compile, which means we've finally found candidates
  for how to bind lifetimes to `PhantomData` so they can't be compared to one
  another in this context.

  Reason: `*mut` means
  [mutable raw pointer](https://doc.rust-lang.org/reference/types/pointer.html#r-type.pointer.raw).
  Rust has mutable pointers! But you cannot reason about them in safe Rust.
  Making this a mutable raw pointer to a reference that has a lifetime
  complicates the compiler's ability to subtype because it cannot reason about
  mutable raw pointers within the borrow checker.

- Wrap up: We've introduced ways to stop the compiler from deciding that
  lifetimes are "similar enough" by choosing a Variance for a lifetime in
  `PhantomData` that is restrictive enough to prevent this slide from compiling.

  That is, we can now create variables that can exist in the same scope as each
  other, but whose types are automatically made different from one another
  per-variable without much boilerplate.

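As a rough sketch of where this section ends up, the slide's `InvariantLifetime` can wrap `PhantomData<*mut &'id ()>` instead. The `same_brand` and `demo` functions and the `_brand` field name are hypothetical, added here for illustration. Calls that use a single brand compile; the commented-out mixed call is expected to be rejected.

```rust
use std::marker::PhantomData;

// `*mut T` is invariant in `T`, so `'id` can neither shrink nor grow.
#[derive(Default)]
struct InvariantLifetime<'id>(PhantomData<*mut &'id ()>);

struct Wrapper<'a> { value: u8, _brand: InvariantLifetime<'a> }

fn lifetime_separator<T>(value: u8, f: impl for<'a> FnOnce(Wrapper<'a>) -> T) -> T {
    f(Wrapper { value, _brand: InvariantLifetime::default() })
}

// Hypothetical helper: both arguments must carry the same brand `'a`.
fn same_brand<'a>(left: &Wrapper<'a>, right: &Wrapper<'a>) -> u8 {
    left.value + right.value
}

// Hypothetical driver that returns a value so the result can be checked.
fn demo() -> u8 {
    lifetime_separator(1, |w1| {
        lifetime_separator(2, |w2| {
            // same_brand(&w1, &w2); // expected to be rejected: brands differ
            same_brand(&w1, &w1) + same_brand(&w2, &w2)
        })
    })
}

fn main() {
    assert_eq!(demo(), 6);
}
```

Each call to `lifetime_separator` hands its closure a fresh, opaque lifetime, and invariance stops the compiler from unifying two of them, which is what makes the lifetime usable as a per-variable brand.
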
## More to Explore

- The `for<'a>` quantifier is not just for function types. It is a
  [**Higher-ranked trait bound**](https://doc.rust-lang.org/reference/subtyping.html?search=Hiher#r-subtype.higher-ranked).

</details>
