"Token Types" chapter of Idiomatic Rust #2921
Merged
tall-vase
merged 31 commits into
google:main
from
tall-vase:idiomatic/typesystem-tokens
Oct 10, 2025
b7e8128
Initial skeleton for token types
tall-vase 1a7d356
Reframe around mutexguard and a gentle walkthrough of branded tokens
tall-vase d7f3984
Another editing pass, focused on the branded tokens slide
tall-vase 36c07e3
docs: Clarify speaker note style for instructors (#2917)
gribozavr ba2ceda
Formatting pass
400c336
Initial skeleton for token types
tall-vase a70aa6b
Reframe around mutexguard and a gentle walkthrough of branded tokens
tall-vase 29872e3
Another editing pass, focused on the branded tokens slide
tall-vase bc6abb0
Formatting pass
aa3b402
Merge branch 'idiomatic/typesystem-tokens' of github.com-mainmatter:t…
d47ca92
fix merge artefact
a23df16
fix test errors
dfebb37
Address lints
6c2157d
Update src/idiomatic/leveraging-the-type-system/token-types.md
tall-vase 146a30f
Apply suggestions from review
tall-vase 63e40dc
Apply feedback to tokens/mutex slide
1adee3c
Rewrite token types speaker notes and correct mutex explanation
a680dd8
Address further feedback
36da55f
Fix compilation of branded tokens pt 1
af6523c
Editing pass
a7d0d76
Apply suggestions from code review
tall-vase 46b6b35
Address less complex feedback
c6160b9
Rewrite the phanomdata & lifetime subtyping slide
ae5f961
Expand on Branded material and perform another editing pass
c3aa869
Make the panic line in branded-01 be commented out by default
7267b17
Apply suggestions from code review
tall-vase 638c885
Address further feedback
25b739e
Copy in and edit gribozavr's suggestion of showing the issue with ret…
15f1d86
Apply suggestions from code review
tall-vase 33391af
Update branded-03-impl.md
tall-vase d96c9ba
Formatting pass
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,72 @@ | ||
--- | ||
minutes: 15 | ||
--- | ||
|
||
# Token Types | ||
|
||
Types with private constructors can be used to act as proof of invariants. | ||
|
||
<!-- dprint-ignore-start --> | ||
```rust,editable | ||
pub mod token { | ||
// A public type with private fields behind a module boundary. | ||
pub struct Token { proof: () } | ||
|
||
pub fn get_token() -> Option<Token> { | ||
Some(Token { proof: () }) | ||
} | ||
} | ||
|
||
pub fn protected_work(token: token::Token) { | ||
println!("We have a token, so we can make assumptions.") | ||
} | ||
|
||
fn main() { | ||
if let Some(token) = token::get_token() { | ||
// We have a token, so we can do this work. | ||
protected_work(token); | ||
} else { | ||
// We could not get a token, so we can't call `protected_work`. | ||
} | ||
} | ||
``` | ||
|
||
<!-- dprint-ignore-end --> | ||
|
||
<details> | ||
|
||
- Motivation: We want to be able to restrict users' access to functionality | |
until they've performed a specific task. | ||
|
||
We can do this by defining a type the API consumer cannot construct on their | ||
own, through the privacy rules of structs and modules. | ||
|
||
[Newtypes](./newtype-pattern.md) use the privacy rules in a similar way, to | ||
restrict construction unless a value is guaranteed to hold up an invariant at | ||
runtime. | ||
|
||
- Ask: What is the purpose of the `proof: ()` field here? | ||
|
||
Without `proof: ()`, `Token` would have no private fields and users would be | ||
able to construct values of `Token` arbitrarily. | ||
|
||
Demonstrate: Try to construct the token manually in `main` and show the | ||
compilation error. Demonstrate: Remove the `proof` field from `Token` to show | ||
how users would be able to construct `Token` if it had no private fields. | ||
|
||
- By putting the `Token` type behind a module boundary (`token`), users outside | ||
that module can't construct the value on their own as they don't have | ||
permission to access the `proof` field. | ||
|
||
The API developer gets to define methods and functions that produce these | ||
tokens. The user does not. | ||
|
||
The token becomes a proof that one has met the API developer's conditions of | ||
access for those tokens. | ||
|
||
- Ask: How might an API developer accidentally introduce ways to circumvent | ||
this? | ||
|
||
Expect answers like "serialization implementations", other parser/"from | ||
string" implementations, or an implementation of `Default`. | ||
|
||
</details> |
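
The last question in the notes above can be made concrete. The following is a minimal sketch, not part of the slide: it assumes a hypothetical variant of `Token` that derives `Default`, and a hypothetical `authorized` flag on `get_token` to stand in for the API developer's condition. The derive quietly adds a public constructor, so callers no longer need to satisfy that condition.

```rust
pub mod token {
    // Hypothetical variant of the slide's `Token`. The field is still
    // private, but `#[derive(Default)]` generates a public `Token::default()`.
    #[derive(Default)]
    pub struct Token {
        #[allow(dead_code)]
        proof: (),
    }

    // The intended way to obtain a token. The `authorized` flag is an
    // assumption made for this sketch.
    pub fn get_token(authorized: bool) -> Option<Token> {
        if authorized { Some(Token { proof: () }) } else { None }
    }
}

pub fn protected_work(_token: token::Token) -> &'static str {
    "protected work done"
}

// Code outside the `token` module can now skip `get_token` entirely.
pub fn bypass() -> &'static str {
    protected_work(token::Token::default())
}
```

Removing the derive makes `bypass` fail to compile, which restores the guarantee that every `Token` came from `get_token`.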
78 changes: 78 additions & 0 deletions
78
src/idiomatic/leveraging-the-type-system/token-types/branded-01-motivation.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,78 @@ | ||
--- | ||
minutes: 10 | ||
--- | ||
|
||
# Variable-Specific Tokens (Branding 1/4) | ||
|
||
What if we want to tie a token to a specific variable? | ||
|
||
```rust,editable | ||
struct Bytes { | ||
bytes: Vec<u8>, | ||
} | ||
struct ProvenIndex(usize); | ||
|
||
impl Bytes { | ||
fn get_index(&self, ix: usize) -> Option<ProvenIndex> { | ||
if ix < self.bytes.len() { Some(ProvenIndex(ix)) } else { None } | ||
} | ||
fn get_proven(&self, token: &ProvenIndex) -> u8 { | ||
unsafe { *self.bytes.get_unchecked(token.0) } | ||
} | ||
} | ||
|
||
fn main() { | ||
let data_1 = Bytes { bytes: vec![0, 1, 2] }; | ||
if let Some(token_1) = data_1.get_index(2) { | ||
data_1.get_proven(&token_1); // Works fine! | ||
|
||
// let data_2 = Bytes { bytes: vec![0, 1] }; | ||
// data_2.get_proven(&token_1); // Out of bounds (UB)! Can we prevent this? | |
} | ||
} | ||
``` | ||
|
||
<details> | ||
|
||
- What if we want to tie a token to a _specific variable_ in our code? Can we do | ||
this in Rust's type system? | ||
|
||
- Motivation: We want to have a Token Type that represents a known, valid index | ||
into a byte array. | ||
|
||
Once we have these proven indexes we would be able to avoid bounds checks | ||
entirely, as the tokens would act as the _proof of an existing index_. | ||
|
||
Since the index is known to be valid, `get_proven()` can skip the bounds | ||
check. | ||
|
||
In this example there's nothing stopping the proven index of one array being | ||
used on a different array. If an index is out of bounds in this case, it is | ||
undefined behavior. | ||
|
||
- Demonstrate: Uncomment the `data_2.get_proven(&token_1);` line. | ||
|
||
The code now reads out of bounds, which is undefined behavior (debug builds | |
may panic instead). We want to prevent this "crossover" of token types for | |
indexes at compile time. | |
|
||
- Ask: How might we try to do this? | ||
|
||
Expect students to not reach a good implementation from this, but be willing | ||
to experiment and follow through on suggestions. | ||
|
||
- Ask: What are the alternatives, why are they not good enough? | ||
|
||
Expect runtime checking of index bounds, especially as both `Vec::get` and | |
`Bytes::get_index` already use runtime checking. | |
|
||
Runtime bounds checking does not prevent the erroneous crossover in the first | |
place; it only guarantees a panic. | |
|
||
- The kind of token-association we will be doing here is called Branding. This | ||
is an advanced technique that expands applicability of token types to more API | ||
designs. | ||
|
||
- [`GhostCell`](https://plv.mpi-sws.org/rustbelt/ghostcell/paper.pdf) is a | |
prominent user of this technique; later slides will touch on it. | |
|
||
</details> | ||
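
To make the runtime alternative from the notes concrete, here is a minimal sketch. It is not the branding technique: it assumes a hypothetical `id` field on `Bytes`, a matching `owner` field on `ProvenIndex`, and a global counter named `NEXT_ID`. The crossover is detected when the program runs, so it still compiles; it is only reported later as `None`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical global counter handing out one id per `Bytes` value.
static NEXT_ID: AtomicUsize = AtomicUsize::new(0);

pub struct Bytes {
    id: usize,
    bytes: Vec<u8>,
}

pub struct ProvenIndex {
    owner: usize,
    ix: usize,
}

impl Bytes {
    pub fn new(bytes: Vec<u8>) -> Self {
        Bytes { id: NEXT_ID.fetch_add(1, Ordering::Relaxed), bytes }
    }

    pub fn get_index(&self, ix: usize) -> Option<ProvenIndex> {
        (ix < self.bytes.len()).then(|| ProvenIndex { owner: self.id, ix })
    }

    // Every access pays for an id comparison and a bounds check.
    pub fn get_proven(&self, token: &ProvenIndex) -> Option<u8> {
        if token.owner != self.id {
            return None; // Token was issued by a different `Bytes` value.
        }
        Some(self.bytes[token.ix])
    }
}
```

Calling `get_proven` on one `Bytes` with a token issued by another returns `None` instead of reading out of bounds, but the mistake is only caught at runtime and each value carries an extra `usize`.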
|
175 changes: 175 additions & 0 deletions
175
src/idiomatic/leveraging-the-type-system/token-types/branded-02-phantomdata.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,175 @@ | ||
--- | ||
minutes: 30 | ||
--- | ||
|
||
# `PhantomData` and Lifetime Subtyping (Branding 2/4) | ||
|
||
Idea: | ||
|
||
- Use a lifetime as a unique brand for each token. | ||
- Make lifetimes sufficiently distinct so that they don't implicitly convert | ||
into each other. | ||
|
||
<!-- dprint-ignore-start --> | ||
```rust,editable | ||
use std::marker::PhantomData; | ||
|
||
#[derive(Default)] | ||
struct InvariantLifetime<'id>(PhantomData<&'id ()>); // The main focus | ||
|
||
struct Wrapper<'a> { value: u8, invariant: InvariantLifetime<'a> } | ||
|
||
fn lifetime_separator<T>(value: u8, f: impl for<'a> FnOnce(Wrapper<'a>) -> T) -> T { | ||
f(Wrapper { value, invariant: InvariantLifetime::default() }) | ||
} | ||
|
||
fn try_coerce_lifetimes<'a>(left: Wrapper<'a>, right: Wrapper<'a>) {} | ||
|
||
fn main() { | ||
lifetime_separator(1, |wrapped_1| { | ||
lifetime_separator(2, |wrapped_2| { | ||
// We want this to NOT compile | ||
try_coerce_lifetimes(wrapped_1, wrapped_2); | ||
}); | ||
}); | ||
} | ||
``` | ||
<!-- dprint-ignore-end --> | ||
|
||
<details> | ||
|
||
<!-- TODO: Link back to PhantomData in the borrowck invariants chapter. | ||
- We saw `PhantomData` back in the Borrow Checker Invariants chapter. | ||
--> | ||
|
||
- In Rust, lifetimes can have subtyping relations between one another. | ||
|
||
This kind of relation allows the compiler to determine if one lifetime | ||
outlives another. | ||
|
||
Determining if a lifetime outlives another also allows us to say _the shortest | ||
common lifetime is the one that ends first_. | ||
|
||
This is useful in many cases, as it means two different lifetimes can be | ||
treated as if they were the same in the regions they do overlap. | ||
|
||
This is usually what we want. But here we want to use lifetimes as a way to | |
distinguish values, so that we can say a token applies only to a single | |
variable without having to create a newtype for every variable we declare. | |
|
||
- **Goal**: We want two lifetimes for which the Rust compiler cannot determine | |
whether one outlives the other. | |
|
||
We are using `try_coerce_lifetimes` as a compile-time check to see if the | |
lifetimes have a common shorter lifetime (that is, whether both can be | |
shortened to it through subtyping). | |
|
||
- Note: This slide compiles as written. By the end of this slide it should only | |
compile when the call to `try_coerce_lifetimes` is commented out. | |
|
||
- There are two important parts of this code: | ||
- The `impl for<'a>` bound on the closure passed to `lifetime_separator`. | ||
- The way lifetimes are used in the parameter for `PhantomData`. | ||
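
The ordinary lifetime subtyping described at the start of these notes can be seen in a small sketch. The names `shorter_of` and `demo` are made up for illustration. A `&'static str` and a borrow of a local `String` are passed where a single lifetime `'a` is expected, and the compiler accepts this by shortening both to a common lifetime.

```rust
// Both arguments must share one lifetime `'a`.
pub fn shorter_of<'a>(left: &'a str, right: &'a str) -> &'a str {
    if left.len() <= right.len() { left } else { right }
}

pub fn demo() -> usize {
    let long_lived: &'static str = "static";
    let local = String::from("a local string");
    // `'static` outlives the borrow of `local`, so the compiler treats
    // `long_lived` as if it had the shorter lifetime for this call.
    let picked = shorter_of(long_lived, &local);
    picked.len()
}
```

This implicit shortening is exactly what the branding technique has to switch off for the brand lifetime.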
|
||
## `for<'a>` bound on a Closure | ||
|
||
- We are using `for<'a>` as a way of introducing a lifetime generic parameter to | |
a function type and asking that the body of the function work for all | |
possible lifetimes. | |
|
||
What this also does is remove some of the compiler's ability to make | |
assumptions about that specific lifetime for the function argument, as it | |
must meet Rust's borrow checking rules regardless of the "real" lifetime its | |
arguments are going to have. The caller substitutes in the actual lifetime; | |
the function itself cannot. | |
|
||
This is analogous to a forall (∀) quantifier in mathematics, or the way we | |
introduce `<T>` as type variables, but only for lifetimes in trait bounds. | |
|
||
When we write a function generic over a type `T`, we can't determine that type | ||
from within the function itself. Even if we call a function | ||
`fn foo<T, U>(first: T, second: U)` with two arguments of the same type, the | ||
body of this function cannot determine if `T` and `U` are the same type. | ||
|
||
This also prevents _the API consumer_ from defining a lifetime themselves, | ||
which would allow them to circumvent the restrictions we want to impose. | ||
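
A minimal sketch of what the `for<'a>` bound permits, re-creating the slide's `Wrapper` and `lifetime_separator` with a hypothetical `_brand` field and a hypothetical `doubled` helper. The closure has to work for every possible `'a`, so it can read from the wrapper and return ordinary data, but the return type `T` is chosen outside the quantifier and cannot mention `'a`.

```rust
use std::marker::PhantomData;

// Re-creation of the slide's `Wrapper`, kept covariant for brevity.
pub struct Wrapper<'a> {
    pub value: u8,
    _brand: PhantomData<&'a ()>,
}

pub fn lifetime_separator<T>(value: u8, f: impl for<'a> FnOnce(Wrapper<'a>) -> T) -> T {
    f(Wrapper { value, _brand: PhantomData })
}

pub fn doubled(value: u8) -> u16 {
    // Fine: the closure returns a plain `u16`, which does not depend on `'a`.
    lifetime_separator(value, |wrapped| u16::from(wrapped.value) * 2)
}

// Should be rejected if uncommented: returning the wrapper would require `T`
// to name the closure's own `'a`, which the caller cannot do.
// pub fn leak() {
//     let _escaped = lifetime_separator(1, |wrapped| wrapped);
// }
```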
|
||
## PhantomData and Lifetime Variance | ||
|
||
- We already know `PhantomData`, which can introduce a formal no-op usage of an | ||
otherwise unused type or a lifetime parameter. | ||
|
||
- Ask: What can we do with `PhantomData`? | ||
|
||
Expect mentions of the Typestate pattern, tying together the lifetimes of | ||
owned values. | ||
|
||
- Ask: In other languages, what is subtyping? | ||
|
||
Expect mentions of inheritance, being able to use a value of type `B` when | |
asked for a value of type `A` because `B` is a "subtype" of `A`. | |
|
||
- Rust does have Subtyping! But only for lifetimes. | ||
|
||
Ask: If one lifetime is a subtype of another lifetime, what might that mean? | ||
|
||
A lifetime is a "subtype" of another lifetime when it _outlives_ that other | ||
lifetime. | ||
|
||
- The way that lifetimes used by `PhantomData` behave depends not only on where | ||
the lifetime "comes from" but on how the reference is defined too. | ||
|
||
The reason this compiles is that the | ||
[**Variance**](https://doc.rust-lang.org/stable/reference/subtyping.html#r-subtyping.variance) | ||
of the lifetime inside of `InvariantLifetime` is too lenient. | ||
|
||
Note: Do not expect to get students to understand variance entirely here, just | ||
treat it as a kind of ladder of restrictiveness on the ability of lifetimes to | ||
establish subtyping relations. | ||
|
||
<!-- Note: We've been using "invariants" in this module in a specific way, but subtyping introduces _invariant_, _covariant_, and _contravariant_ as specific terms. --> | ||
|
||
- Ask: How can we make it more restrictive? How do we make a reference type more | |
restrictive in Rust? | |
|
||
Expect or demonstrate: Making it `&'id mut ()` instead. This will not be | ||
enough! | ||
|
||
We need to use a | ||
[**Variance**](https://doc.rust-lang.org/stable/reference/subtyping.html#r-subtyping.variance) | ||
on lifetimes where subtyping cannot be inferred except on _identical | ||
lifetimes_. That is, the only subtype of `'a` the compiler can know is `'a` | ||
itself. | ||
|
||
Note: Again, do not try to get the whole class to understand variance. Treat | ||
it as a ladder of restrictiveness for now. | ||
|
||
Demonstrate: Move from `&'id ()` (covariant in lifetime and type) to | |
`&'id mut ()` (covariant in lifetime, invariant in type), then to | |
`*mut &'id mut ()` (invariant in lifetime and type), and finally to | |
`*mut &'id ()` (also invariant in lifetime: the inner reference alone would be | |
covariant, but the outer `*mut` makes the whole type invariant). | |
|
||
Those last two should not compile, which means we've finally found candidates | ||
for how to bind lifetimes to `PhantomData` so they can't be compared to one | ||
another in this context. | ||
|
||
Reason: `*mut` means | |
[mutable raw pointer](https://doc.rust-lang.org/reference/types/pointer.html#r-type.pointer.raw). | |
Rust has mutable pointers! But you cannot dereference them in safe Rust. | |
`*mut T` is invariant in `T`, so wrapping a reference that has a lifetime in a | |
mutable raw pointer stops the compiler from applying subtyping to that | |
lifetime: it may no longer shorten or lengthen it to make two types match. | |
|
||
- Wrap up: We've introduced ways to stop the compiler from deciding that | ||
lifetimes are "similar enough" by choosing a Variance for a lifetime in | ||
`PhantomData` that is restrictive enough to prevent this slide from compiling. | ||
|
||
That is, we can now create variables that can exist in the same scope as each | ||
other, but whose types are automatically made different from one another | ||
per-variable without much boilerplate. | ||
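
Putting the two parts together, the end state of this slide might look like the sketch below. It picks `PhantomData<*mut &'id ()>` from the list of candidates above; the `same_brand` and `sum_with_itself` helpers are hypothetical names added for illustration. Note that a raw pointer inside `PhantomData` also makes `Wrapper` neither `Send` nor `Sync`.

```rust
use std::marker::PhantomData;

// Invariant in `'id`: the compiler may not shorten or lengthen this lifetime.
#[derive(Default)]
struct InvariantLifetime<'id>(PhantomData<*mut &'id ()>);

pub struct Wrapper<'a> {
    pub value: u8,
    _invariant: InvariantLifetime<'a>,
}

pub fn lifetime_separator<T>(value: u8, f: impl for<'a> FnOnce(Wrapper<'a>) -> T) -> T {
    f(Wrapper { value, _invariant: InvariantLifetime::default() })
}

// Only accepts two wrappers that carry exactly the same brand.
pub fn same_brand<'a>(left: &Wrapper<'a>, right: &Wrapper<'a>) -> u16 {
    u16::from(left.value) + u16::from(right.value)
}

pub fn sum_with_itself(value: u8) -> u16 {
    // Accepted: both arguments come from the same `lifetime_separator` call.
    lifetime_separator(value, |wrapped| same_brand(&wrapped, &wrapped))
}

// Should be rejected if uncommented: the two brands are distinct, and
// invariance stops the compiler from unifying them.
// pub fn crossover() -> u16 {
//     lifetime_separator(1, |w1| lifetime_separator(2, |w2| same_brand(&w1, &w2)))
// }
```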
|
||
## More to Explore | ||
|
||
- The `for<'a>` quantifier is not just for function types. It is a | |
[**Higher-ranked trait bound**](https://doc.rust-lang.org/reference/subtyping.html#r-subtype.higher-ranked). | |
|
||
</details> | ||
|