Add Rust for Linux material #2622

Open
wants to merge 2 commits into
base: main
32 changes: 32 additions & 0 deletions src/SUMMARY.md
@@ -417,6 +417,38 @@
- [Broadcast Chat Application](concurrency/async-exercises/chat-app.md)
- [Solutions](concurrency/async-exercises/solutions.md)

# Rust for Linux

---

- [Welcome](rust-for-linux/welcome.md)
- [Interoperation Requirements](rust-for-linux/basic-requirements.md)
- [Building Kernel Modules](rust-for-linux/modules.md)
- [Type Mapping](rust-for-linux/types.md)
- [Bindings and Safe Interfaces](rust-for-linux/bindings-interfaces.md)
- [Avoiding Bloat](rust-for-linux/bloat.md)
- [Hands-on With Kernel Rust](rust-for-linux/hands-on.md)
- [Rust for Linux](rust-for-linux/rust-for-linux.md)
- [`rust-analyzer` Setup](rust-for-linux/rust-analyzer.md)
- [Macros](rust-for-linux/macros.md)
- [A Rust Kernel Module](rust-for-linux/kernel-module.md)
- [The `module!` Macro](rust-for-linux/modules/module-macro.md)
- [Module Setup and Teardown](rust-for-linux/modules/setup-and-teardown.md)
- [Module Parameters](rust-for-linux/modules/parameters.md)
- [Using Abstractions](rust-for-linux/using-abstractions.md)
- [Complications and Conflicts](rust-for-linux/complications.md)
- [`Pin` and Self-Reference](rust-for-linux/complications/pin.md)
- [The Kernel Rust Safety Model](rust-for-linux/complications/safety.md)
- [Atomic/Task Contexts and Sleep](rust-for-linux/complications/sleeping.md)
- [Memory Models](rust-for-linux/complications/memory-models.md)
- [Separate Compilation and Linking](rust-for-linux/complications/separate-compilation.md)
- [Fallible Allocation](rust-for-linux/complications/fallible-allocation.md)
- [Code Size](rust-for-linux/complications/code-size.md)
- [Documentation](rust-for-linux/complications/kernel-doc.md)
- [Security Mitigations](rust-for-linux/complications/mitigations.md)
- [Async](rust-for-linux/complications/async.md)
- [Next Steps](rust-for-linux/next-steps.md)

# Final Words

---
48 changes: 48 additions & 0 deletions src/rust-for-linux/basic-requirements.md
@@ -0,0 +1,48 @@
# Interoperation Requirements

To use Rust code in Linux, we can start by comparing this situation with C/Rust
interop in userspace.

## Building

In userspace, the most common setup is to use Cargo to compile our Rust code and
later integrate it into a C build system if needed. Meanwhile, the Linux kernel
compiles its C code with its custom Kbuild build system. In Rust for Linux, the
kernel build system invokes the Rust compiler directly, without Cargo.

## No `libstd`

Typical userspace Rust relies on the Rust standard library through the `std`
crate. Kernel Rust does not run atop an operating system, so it has to eschew
the standard library.
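
As a rough illustration of what remains available without `std`, the sketch
below uses only items from `core`. `FixedBuf` and `describe` are hypothetical
names invented for this example and are not part of the kernel's Rust API; the
`#![no_std]` attribute is left as a comment so the snippet also builds as
ordinary userspace code.

```rust
// A kernel crate would begin with `#![no_std]`; it is left as a comment here so
// that this sketch also compiles as ordinary userspace code.
use core::fmt::{self, Write};

/// Hypothetical fixed-capacity text buffer that never allocates.
pub struct FixedBuf<const N: usize> {
    data: [u8; N],
    len: usize,
}

impl<const N: usize> FixedBuf<N> {
    pub const fn new() -> Self {
        Self { data: [0; N], len: 0 }
    }

    pub fn as_str(&self) -> &str {
        // Only whole `&str` values are copied in, so the bytes are valid UTF-8.
        core::str::from_utf8(&self.data[..self.len]).unwrap_or("")
    }
}

impl<const N: usize> Write for FixedBuf<N> {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        // Reject writes that would not fit instead of growing the buffer.
        let end = self.len.checked_add(s.len()).ok_or(fmt::Error)?;
        if end > N {
            return Err(fmt::Error);
        }
        self.data[self.len..end].copy_from_slice(s.as_bytes());
        self.len = end;
        Ok(())
    }
}

/// Formats a short label using only `core` functionality.
pub fn describe(id: u32) -> FixedBuf<32> {
    let mut buf = FixedBuf::new();
    // The formatting machinery lives in `core`, so `write!` works without `std`.
    let _ = write!(buf, "device {}", id);
    buf
}
```

The point of the sketch is only that formatting, traits, and slices come from
`core`, while anything that needs an allocator or an operating system does not.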

## Module Support

Much code in the kernel is compiled into kernel modules rather than as part of
the core kernel. To write kernel modules in Rust we'll need to be able to match
the ABI of kernel modules.

## Safe Wrappers

To reap the benefits of Rust, we want to be able to write as much code as
possible in safe Rust. This means that we want safe wrappers for as much kernel
functionality as possible.

## Mapping Types

When writing these wrappers, we'll need to refer to the data types of values
passed to and from existing kernel functions in C. Unlike userspace C, the
kernel uses its own set of primitive types rather than those provided by the C
standard. We'll have to map back and forth between those kernel types and
compatible Rust ones when doing foreign calls.
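
A minimal sketch of such a mapping, assuming a hypothetical C structure
`struct sample_hdr` declared with the kernel's fixed-width typedefs (`u32`,
`u16`, `u8`). In practice `bindgen` generates declarations like this; the
struct and the helper `payload_len` are made up for illustration.

```rust
use core::mem::{align_of, size_of};

// Hypothetical C declaration being mirrored:
//
//     struct sample_hdr { u32 id; u16 len; u8 kind; u8 flags; };

/// Rust mirror of the C struct; `#[repr(C)]` requests C field order and padding.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SampleHdr {
    pub id: u32,
    pub len: u16,
    pub kind: u8,
    pub flags: u8,
}

// Compile-time checks that the layout matches what the C side expects.
const _: () = assert!(size_of::<SampleHdr>() == 8);
const _: () = assert!(align_of::<SampleHdr>() == 4);

/// Widens the C-sized length field to a native Rust `usize` explicitly.
pub fn payload_len(hdr: &SampleHdr) -> usize {
    usize::from(hdr.len)
}
```

The fixed-width kernel typedefs line up directly with Rust's `u8` through
`u64`; types whose width depends on the target need more care.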

## Keeping the Kernel Lean

Finally, even the core Rust library assumes a basic level of functionality that
includes some costly operations (e.g. Unicode processing) for which the kernel
does not want to pay implementation costs. To use Rust in the kernel we'll need
a way to disable this functionality.

# Outline

{{%segment outline}}
Binary file added src/rust-for-linux/bindgen-mapping.png
90 changes: 90 additions & 0 deletions src/rust-for-linux/bindings-interfaces.md
@@ -0,0 +1,90 @@
---
minutes: 18
---

# Bindings and Safe Interfaces

`bindgen` is used to generate low-level, unsafe bindings for C interfaces.

But to reap the benefits of Rust, we want to use safe, foolproof interfaces to
unsafe functionality.

Subsystems are expected to implement safe interfaces on top of the low-level
generated bindings. These safe interfaces are exposed as top-level modules
within the [`kernel` crate](https://rust.docs.kernel.org/kernel/). The top-level
`bindings` module holds the unsafe `bindgen`-generated bindings, which are
generated from the C headers included by `rust/bindings/bindings_helper.h`.

In Rust for Linux, unsafe `bindgen`-generated bindings should not be used
outside the `kernel` crate. Drivers and other subsystems will make use of the
safe abstractions from this crate.
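
The layering can be sketched in userspace as follows. Everything here is
hypothetical: the `bindings` module stands in for `bindgen` output and is
written in Rust so the example runs on its own, and `Counter`, `Error`, and
`counter_inc` do not exist in the kernel.

```rust
/// Stand-in for a `bindgen`-generated module (hypothetical, not real kernel API).
#[allow(non_camel_case_types)]
mod bindings {
    #[repr(C)]
    pub struct counter {
        pub value: u32,
    }

    /// # Safety
    ///
    /// `c` must point to a valid, initialized `counter` with no other users.
    pub unsafe fn counter_inc(c: *mut counter) -> i32 {
        let c = unsafe { &mut *c };
        match c.value.checked_add(1) {
            Some(v) => {
                c.value = v;
                0
            }
            // Mimics the C convention of returning a negative error code.
            None => -1,
        }
    }
}

/// Error type carrying the raw negative code returned by the binding.
#[derive(Debug, PartialEq, Eq)]
pub struct Error(pub i32);

/// Safe abstraction: callers never touch raw pointers or `unsafe`.
pub struct Counter {
    inner: bindings::counter,
}

impl Counter {
    pub fn starting_at(value: u32) -> Self {
        Self { inner: bindings::counter { value } }
    }

    pub fn increment(&mut self) -> Result<u32, Error> {
        // SAFETY: `self.inner` is valid and initialized, and `&mut self`
        // guarantees exclusive access for the duration of the call.
        let ret = unsafe { bindings::counter_inc(&mut self.inner) };
        if ret < 0 {
            Err(Error(ret))
        } else {
            Ok(self.inner.value)
        }
    }
}
```

A driver written against `Counter` would contain no `unsafe` blocks; the safety
argument is made once, inside the abstraction.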

Only a subset of Linux subsystems currently have such abstractions.

It's worth browsing the
[list of modules](https://rust.docs.kernel.org/kernel/#modules) exposed by the
`kernel` crate to see what exists currently. Many of these subsystems have only
partial bindings based on the needs of consumers so far.

## Adding a Module

To add a module for some subsystem, first its header must be added to
`bindings_helper.h`. It may be necessary to write some custom code to wrap
macros or `inline` functions that are not automatically handled by `bindgen`;
this code lives in the `rust/helpers/` directory.

Then we need to write a safe abstraction that uses these bindings and exposes
them to the rest of kernel Rust.

Some commits from work-in-progress bindings and abstractions can provide an idea
of what it looks like to expose new kernel functionality:

- GPIO Consumer:
[fecb4bd73f06bb2cac8e16aca7ef0e2f1b6acb50](https://github.com/Fabo/linux/commit/fecb4bd73f06bb2cac8e16aca7ef0e2f1b6acb50)
- Regmap:
[ec0b740ac5ab299e4c86011a0002919e5bbe5c2d](https://github.com/Fabo/linux/commit/ec0b740ac5ab299e4c86011a0002919e5bbe5c2d)
- I2C:
[70ed30fcdf8ec62fa91485c3c0a161a9d0194668](https://github.com/Fabo/linux/commit/70ed30fcdf8ec62fa91485c3c0a161a9d0194668)

## Guidelines for Abstractions

Abstractions may not be perfectly safe, but should try to be as safe as
possible. Unsafe functionality exposed should have its safety conditions
documented so that users have guidance on how to use the functionality and
justify such use.

Abstractions should also attempt to present relatively idiomatic Rust in their
interfaces:

- Follow Rust naming/capitalization conventions while remaining unsurprising to
kernel developers.
- Use RAII instead of manual resource management where possible.
- Avoid raw pointers to bound kernel objects in favor of safer, more limited
interfaces.

When exposing types from generated bindings, code should make use of the
[`Opaque<T>`](https://rust.docs.kernel.org/kernel/types/struct.Opaque.html)
type along with native Rust references, and of the
[`ARef<T>`](https://rust.docs.kernel.org/kernel/types/struct.ARef.html) type
for types that are inherently reference-counted. `ARef<T>` links a type's
built-in reference count operations to the `Clone` and `Drop` traits.
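
To make the idea concrete, here is a simplified, single-threaded userspace
imitation of tying an embedded reference count to `Clone` and `Drop`. The names
`RefCounted`, `Ref`, and `Device` are assumptions made for this sketch; the
kernel's actual `ARef<T>` and its supporting trait differ in detail.

```rust
use std::cell::Cell;
use std::ops::Deref;
use std::ptr::NonNull;

/// Hypothetical trait for types that carry their own reference count.
///
/// # Safety
///
/// Implementers must keep the object alive while the count is non-zero.
pub unsafe trait RefCounted {
    fn inc_ref(&self);
    /// # Safety
    ///
    /// The caller must own one reference, which this call gives up.
    unsafe fn dec_ref(obj: NonNull<Self>);
}

/// Stand-in for a C object with an embedded reference count.
pub struct Device {
    refcount: Cell<u32>,
    pub id: u32,
}

impl Device {
    pub fn create(id: u32) -> Ref<Device> {
        let raw = Box::into_raw(Box::new(Device { refcount: Cell::new(1), id }));
        // SAFETY: `Box::into_raw` never returns a null pointer.
        Ref { ptr: unsafe { NonNull::new_unchecked(raw) } }
    }

    pub fn count(&self) -> u32 {
        self.refcount.get()
    }
}

unsafe impl RefCounted for Device {
    fn inc_ref(&self) {
        self.refcount.set(self.refcount.get() + 1);
    }

    unsafe fn dec_ref(obj: NonNull<Self>) {
        let remaining = {
            // SAFETY: the caller owns a reference, so the object is alive.
            let dev = unsafe { obj.as_ref() };
            dev.refcount.set(dev.refcount.get() - 1);
            dev.refcount.get()
        };
        if remaining == 0 {
            // SAFETY: the last reference is gone and the object came from `Box`.
            drop(unsafe { Box::from_raw(obj.as_ptr()) });
        }
    }
}

/// Owned reference: cloning increments the count, dropping decrements it.
pub struct Ref<T: RefCounted> {
    ptr: NonNull<T>,
}

impl<T: RefCounted> Clone for Ref<T> {
    fn clone(&self) -> Self {
        (**self).inc_ref();
        Ref { ptr: self.ptr }
    }
}

impl<T: RefCounted> Drop for Ref<T> {
    fn drop(&mut self) {
        // SAFETY: this `Ref` owns exactly one reference and is giving it up.
        unsafe { T::dec_ref(self.ptr) }
    }
}

impl<T: RefCounted> Deref for Ref<T> {
    type Target = T;

    fn deref(&self) -> &T {
        // SAFETY: the object stays alive for as long as this `Ref` exists.
        unsafe { self.ptr.as_ref() }
    }
}
```

With a wrapper like this, users get ordinary Rust ownership semantics and never
call the increment and decrement operations by hand.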

## Submitting the Cyclic Dependency

We already know that drivers should not use unsafe bindings directly. However,
subsystem maintainers may balk at patches that add Rust abstractions without
motivation or consumers. At the same time, drivers and subsystem abstractions
may have to be submitted separately to different maintainers due to the
distributed nature of Linux development.

So how should a developer submit a driver that requires bindings/abstractions
for a subsystem not yet exposed to Rust?

There are two main approaches[^1]:

1. Submit the driver as an RFC before submitting the abstractions it relies upon
while referencing the RFC as a potential consumer.
2. Submit a stub driver and fill out non-stub functionality as subsystem
abstractions land.

[^1]: <https://rust-for-linux.zulipchat.com/#narrow/channel/288089-General/topic/Upstreaming.20a.20driver.20with.20unsave.20C.20API.20calls.3F/near/471677707>
21 changes: 21 additions & 0 deletions src/rust-for-linux/bloat.md
@@ -0,0 +1,21 @@
---
minutes: 5
---

# Avoiding Bloat

Rust for Linux makes use of `libcore` to avoid reimplementing all functionality
of the Rust standard library. But even `libcore` has some functionality built-in
that is not portable to all targets the kernel would like to support or that is
not necessary for the kernel while occupying valuable code space.

This includes[^1]:

- Support for math with 128-bit integers
- String formatting for floating-point numbers
- Unicode support for strings

Work is ongoing to make these features optional. In the meantime, the `libcore`
used by Rust for Linux is larger and less portable than it could be.

[^1]: <https://github.com/Rust-for-Linux/linux/issues/514>
40 changes: 40 additions & 0 deletions src/rust-for-linux/complications.md
@@ -0,0 +1,40 @@
---
minutes: 5
---

# Complications and Conflicts

{{%segment outline}}

There are a number of subtleties and unresolved conflicts between the Rust
paradigm and the kernel one. These must be resolved to ship Rust code in the
kernel.

Some issues are deeper problems that require additional research and development
before Rust for Linux is ready for prime time; others merely require some
additional learning and attention on the part of aspiring Rust for Linux
developers.

##

Resolving these conflicts involves changes on both sides of the collaboration.
On the Rust side, new features land first in the nightly channel of the compiler
before being stabilized.

To avoid waiting for stabilization, the kernel uses an
[escape hatch](https://rustc-dev-guide.rust-lang.org/building/bootstrapping/what-bootstrapping-does.html#complications-of-bootstrapping)
(the `RUSTC_BOOTSTRAP` environment variable) to access unstable features even in
stable releases of the compiler. This assists in the goal of eventually
deploying Rust for Linux in Linux distributions that ship only a stable version
of the Rust toolchain.

Nonetheless, being able to build Rust for Linux using only stable Rust features
is a significant goal; the issues blocking this are tracked specifically by both
the Rust for Linux project[^1] and the Rust developers themselves[^2].

In the next slides we'll explore the most significant sources of friction
between Rust and Linux kernel development to be aware of challenges we are
likely to encounter when trying to implement kernel functionality in Rust.

[^1]: <https://github.com/Rust-for-Linux/linux/issues/2>

[^2]: <https://github.com/rust-lang/rust-project-goals/issues/116>
38 changes: 38 additions & 0 deletions src/rust-for-linux/complications/async.md
@@ -0,0 +1,38 @@
---
minutes: 8
---

# Async

The kernel performs many operations concurrently and involves significant
amounts of interaction between CPU cores and other devices. For this reason, one
might expect async Rust to be a fundamental requirement for using Rust in the
kernel. But the kernel is the central arbiter of most synchronization and is
currently written in regular, synchronous C.

Rust code making use of `async` mostly exists to write composable code that will
run atop event loops, but the Linux kernel is not really organized as an event
loop: user tasks call directly into the kernel; control flow for interrupts is
handled by hardware.

As such, `async` support is not critical for most kernel programming tasks.
However, it is possible to view some components of the kernel as async
executors, and some work has been done in this direction. Wedson Almeida Filho
implemented both workqueue-based[^1] and single-threaded async executors as
proofs of concept.

There is no fundamental incompatibility between Rust for Linux and Rust `async`.
The situation is similar to embedded Rust programming, where `async` has proven
usable without an operating system (e.g. the Embassy project).

Nonetheless, no killer application of `async` in Rust for Linux has made it a
priority.
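
For a sense of how little an executor fundamentally needs, here is a minimal
single-threaded `block_on` written against the standard library. It is a
userspace sketch only: it busy-polls and ignores wakeups, whereas the
proof-of-concept kernel executors mentioned above hand tasks to kernel
facilities such as workqueues. `NoopWake`, `block_on`, and `Countdown` are names
invented for this example.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; acceptable only because `block_on` polls in a loop.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Drives a single future to completion on the current thread.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    // Pin the future on the heap so it can be polled repeatedly.
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Future that reports `Pending` a fixed number of times before finishing.
/// Its output is the total number of times it was polled.
pub struct Countdown {
    remaining: u32,
    polls: u32,
}

impl Countdown {
    pub fn new(remaining: u32) -> Self {
        Self { remaining, polls: 0 }
    }
}

impl Future for Countdown {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        self.polls += 1;
        if self.remaining == 0 {
            Poll::Ready(self.polls)
        } else {
            self.remaining -= 1;
            // Signal that progress is possible, as a well-behaved future should.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}
```

A real executor would park the task until its waker fires instead of spinning;
in the kernel, that rescheduling step is where an existing mechanism could be
plugged in.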

<details>

[^1]: <https://github.com/Rust-for-Linux/linux/tree/rust/rust/kernel/kasync>

An example of an async server using the kernel async executor may be found
[here](https://github.com/Rust-for-Linux/linux/blob/rust/samples/rust/rust_echo_server.rs).

</details>
54 changes: 54 additions & 0 deletions src/rust-for-linux/complications/code-size.md
@@ -0,0 +1,54 @@
---
minutes: 10
---

# Code Size

One pitfall when writing Rust code can be the multiplicative increase in
generated machine code when using generics.

For the Linux kernel, which must be suitable for space-limited embedded
environments, keeping code size low is a significant concern.

Experiments with Rust in the kernel so far have shown that Rust code can be of
similar code size to C, but may also be larger in some cases[^1].

## Assessing Bloat

Tools such as [`cargo-bloat`](https://github.com/RazrFalcon/cargo-bloat) help
attribute the size of compiled code to the source items that produce it.

## Shrinking Code Size

The reasons for code bloat vary and are not generally specific to Linux kernel
usage of Rust. The most common causes for code bloat are excessive use of
generics and forced inlining. In general, generics should be preferred over
trait objects when writing abstractions that are expected to "compile out" or
where generating separate code for different types is critical for performance
(e.g. inner loops or arithmetic on values of a generic type).

In other situations, trait objects should be preferred to allow reusing
definitions without machine-code duplication, which may more closely mirror
patterns that would be most natural in C.
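
The trade-off can be seen in a small userspace sketch. The two functions below
are hypothetical and behave identically; they differ only in how the compiler
generates code for them.

```rust
use std::fmt::Display;

/// Generic version: the compiler emits one copy of this function for every
/// concrete `T` it is called with (monomorphization).
pub fn describe_generic<T: Display>(value: T) -> String {
    format!("value = {}", value)
}

/// Trait-object version: a single copy of the function exists, and the call to
/// the `Display` implementation is dispatched through a vtable at run time.
pub fn describe_dyn(value: &dyn Display) -> String {
    format!("value = {}", value)
}
```

Calling `describe_generic` with `u32`, `&str`, and `f64` produces three
instantiations, while `describe_dyn` is compiled once at the cost of an indirect
call.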

When accepting generic parameters that get converted to a concrete type before
use, follow the pattern of defining an inner monomorphic function that can be
shared[^2]:

```rust
pub fn read_to_string<P: AsRef<Path>>(path: P) -> io::Result<String> {
    fn inner(path: &Path) -> io::Result<String> {
        let mut file = File::open(path)?;
        let size = file.metadata().map(|m| m.len() as usize).ok();
        let mut string = String::with_capacity(size.unwrap_or(0));
        io::default_read_to_string(&mut file, &mut string, size)?;
        Ok(string)
    }
    inner(path.as_ref())
}
```
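
The excerpt above relies on helpers that are internal to the standard library,
so it does not compile on its own. A self-contained instance of the same
pattern, using a hypothetical `count_words` function, might look like this:

```rust
/// Accepts anything that can be viewed as `&str`, such as `&str` or `String`.
pub fn count_words<S: AsRef<str>>(text: S) -> usize {
    // The real work lives in a non-generic inner function, so it is compiled
    // once no matter how many different `S` types callers use.
    fn inner(text: &str) -> usize {
        text.split_whitespace().count()
    }
    // The generic outer shell only performs the cheap conversion.
    inner(text.as_ref())
}
```

Only the thin outer shell is duplicated per type; the body that would dominate
code size is shared.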

[^1]: <https://www.usenix.org/system/files/atc24-li-hongyu.pdf>

[^2]: <https://github.com/rust-lang/rust/blob/ae612bedcbfc7098d1711eb35bc7ca994eb17a4c/library/std/src/fs.rs#L295-L304>
62 changes: 62 additions & 0 deletions src/rust-for-linux/complications/fallible-allocation.md
@@ -0,0 +1,62 @@
---
minutes: 13
---

# Fallible Allocation

Allocation in the Rust standard library is usually treated as infallible; if
memory runs out, the program aborts rather than returning an error:

```rust
let x = Box::new(5);
```

In the Linux kernel, memory allocation is much more complex.

```C
void *kmalloc(size_t size, gfp_t flags);
```

`flags` is one of `GFP_KERNEL`, `GFP_NOWAIT`, `GFP_ATOMIC`, etc.[^1]

The return value must be checked against `NULL` to see whether allocation
succeeded.

In Rust for Linux, rather than using the infallible allocation APIs provided by
`liballoc`, the kernel library has its own allocation interfaces:

## `KBox`

```rust
let b = KBox::new(24_u64, GFP_KERNEL)?;
assert_eq!(*b, 24_u64);
```

[`KBox::new`](https://rust.docs.kernel.org/kernel/alloc/kbox/struct.Box.html#tymethod.new)
returns a `Result<Self, AllocError>`. Here we propagate this error with the `?`
operator.
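
The same shape can be imitated in userspace to show where the `Result` comes
from. The sketch below is an assumption-laden analogue, not the kernel
implementation: `try_box`, `AllocError`, and `boxed_sum` are hypothetical, there
is no flags parameter, and the global allocator stands in for `kmalloc`.

```rust
use std::alloc::{alloc, Layout};

/// Hypothetical error type signalling that the allocator returned null.
#[derive(Debug, PartialEq, Eq)]
pub struct AllocError;

/// Fallible counterpart to `Box::new`: reports failure instead of aborting.
pub fn try_box<T>(value: T) -> Result<Box<T>, AllocError> {
    let layout = Layout::new::<T>();
    if layout.size() == 0 {
        // Zero-sized values need no memory; `Box::new` does not allocate here.
        return Ok(Box::new(value));
    }
    // SAFETY: `layout` has a non-zero size, as `alloc` requires.
    let ptr = unsafe { alloc(layout) } as *mut T;
    if ptr.is_null() {
        // The equivalent of checking the allocator's return value against NULL.
        return Err(AllocError);
    }
    // SAFETY: `ptr` is non-null, aligned for `T`, and was allocated by the
    // global allocator with `T`'s layout, which is what `Box::from_raw` expects.
    unsafe {
        ptr.write(value);
        Ok(Box::from_raw(ptr))
    }
}

/// Callers propagate allocation failure with `?`, as with `KBox::new`.
pub fn boxed_sum() -> Result<u64, AllocError> {
    let a = try_box(24_u64)?;
    let b = try_box(18_u64)?;
    Ok(*a + *b)
}
```

Making the failure path part of the return type forces every caller to decide
what to do when memory is unavailable.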

## `KVec`

[`KVec`](https://rust.docs.kernel.org/kernel/alloc/kvec/type.KVec.html) presents
an API similar to the standard `Vec`, except that operations that may allocate
take a flags parameter:

```rust
let mut v = KVec::new();
v.push(1, GFP_KERNEL)?;
assert_eq!(&v, &[1]);
```

## `FromIterator`

The standard
[`FromIterator`](https://doc.rust-lang.org/std/iter/trait.FromIterator.html)
trait builds new collections, which usually involves allocating memory, and its
`from_iter` method returns the collection directly, leaving no way to report
allocation failure. As a result, the `.collect()` method on iterators is not
available in Rust for Linux in its original form. Work is ongoing to design an
equivalent API[^2], but for now we do without its convenience.
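
Until such an API exists, collecting has to be written as an explicit loop. The
following userspace sketch shows the idea with the standard `Vec::try_reserve`
in place of the kernel's flag-taking methods; `try_collect` is a hypothetical
helper, not an existing kernel or standard library function.

```rust
use std::collections::TryReserveError;

/// Collects an iterator into a `Vec`, surfacing allocation failure as an error.
pub fn try_collect<I>(iter: I) -> Result<Vec<I::Item>, TryReserveError>
where
    I: IntoIterator,
{
    let iter = iter.into_iter();
    let mut out = Vec::new();
    // Reserve the lower bound of the size hint up front to limit regrowth.
    out.try_reserve(iter.size_hint().0)?;
    for item in iter {
        // Ensure room for one more element; this does nothing while spare
        // capacity remains, so the following `push` never has to allocate.
        out.try_reserve(1)?;
        out.push(item);
    }
    Ok(out)
}
```

For example, `try_collect((1..=4).map(|x| x * 2))` yields `Ok` with the doubled
values, and any allocation failure would be returned to the caller rather than
aborting.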

[^1]: <https://docs.kernel.org/core-api/memory-allocation.html>

[^2]: <https://rust-for-linux.zulipchat.com/#narrow/channel/288089-General/topic/flat_map.20collecting.20with.20Kvec>