Tracking Issue for secure random data generation in `std` #130703

Open

Labels

C-tracking-issue, I-libs-api-nominated, T-libs-api

joboet

Member

Feature gate: #![feature(random)]

This is a tracking issue for secure random data generation support in std.

Central to this feature are the Random and RandomSource traits inside core::random. The Random trait defines a method to create a new random value of the implementing type from random bytes generated by a RandomSource. std also exposes the platform's secure random number generator via the DefaultRandomSource type, which can be conveniently accessed via the random::random function.

Public API

// core::random

pub trait RandomSource {
    fn fill_bytes(&mut self, bytes: &mut [u8]);
}

pub trait Random {
    fn random(source: &mut (impl RandomSource + ?Sized)) -> Self;
}

impl Random for bool { ... }
impl Random for /* all integer types */ { ... }

// std::random (additionally)

pub struct DefaultRandomSource;

impl RandomSource for DefaultRandomSource { ... }

pub fn random<T: Random>() -> T { ... }
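
To make the shape of this API concrete, here is a minimal sketch that re-declares the two traits locally so it compiles on stable. The traits only mirror the signatures listed above; the real ones sit behind the feature gate and may change. CountingSource is a hypothetical, deterministic stand-in for a real source and is not secure.

```rust
// Local stand-ins mirroring the signatures listed above; the real traits
// live behind #![feature(random)] and may differ.
pub trait RandomSource {
    fn fill_bytes(&mut self, bytes: &mut [u8]);
}

pub trait Random {
    fn random(source: &mut (impl RandomSource + ?Sized)) -> Self;
}

/// Hypothetical deterministic source for illustration only. NOT secure.
pub struct CountingSource(pub u8);

impl RandomSource for CountingSource {
    fn fill_bytes(&mut self, bytes: &mut [u8]) {
        // Emit consecutive byte values so the output is predictable.
        for b in bytes.iter_mut() {
            *b = self.0;
            self.0 = self.0.wrapping_add(1);
        }
    }
}

// An integer is built from exactly as many bytes as it occupies.
impl Random for u32 {
    fn random(source: &mut (impl RandomSource + ?Sized)) -> Self {
        let mut buf = [0u8; 4];
        source.fill_bytes(&mut buf);
        u32::from_le_bytes(buf)
    }
}

// A bool only needs a single bit of the generated byte.
impl Random for bool {
    fn random(source: &mut (impl RandomSource + ?Sized)) -> Self {
        let mut buf = [0u8; 1];
        source.fill_bytes(&mut buf);
        buf[0] & 1 == 1
    }
}
```

Because of the ?Sized bound, the same Random::random call also accepts a &mut dyn RandomSource.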

Steps / History

ACP: Simple secure random number generation libs-team#393
Implementation: std: implement the random feature (alternative version) #129201
Final comment period (FCP)¹
Stabilization PR

Unresolved Questions

Naming: the ACP used gen_bytes and DefaultRng, the implementation PR uses fill_bytes and DefaultRandomSource (see arguments pro gen_bytes and pro fill_bytes)
Concerns listed at Simple secure random number generation libs-team#393 (comment) should be addressed

¹ https://std-dev-guide.rust-lang.org/feature-lifecycle/stabilization.html

newpavlov

Contributor

Disclaimer: I am one of the getrandom developers.

I think it's important for RandomSource methods to properly return potential errors. Getting randomness is an IO operation and it may fail. In some contexts it's important to process such errors instead of panicking. The error may be either io::Error or something like getrandom::Error (i.e. a thin wrapper around NonZeroU32).

It may be worth adding the following methods to RandomSource:

fill_bytes which works with uninitialized buffers, e.g. based on BorrowedBuf. Yes, zeroization of buffers is usually a very small cost compared to a syscall, but it still goes against the zero-cost spirit.
Generation of u32 and u64. Some platforms support direct generation of such values (e.g. RDRAND, WASI, etc.). Going through fill_bytes will be a bit less efficient in such cases.
Methods for potentially "insecure" generation of random values, but which are less prone to blocking. The HashMap seeding is the most obvious use-case for this.

It's also not clear whether it's allowed to overwrite the default RandomSource supplied with std similarly to GlobalAlloc.
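
As a rough sketch of two of the suggestions above (fallible methods and direct u32/u64 generation), the trait could look something like the following. TryRandomSource, RandomError, WordSource and BrokenSource are all hypothetical names invented for this illustration; the uninitialized-buffer variant is left out because BorrowedBuf is not stable.

```rust
use core::num::NonZeroU32;

/// Hypothetical error type: a thin wrapper around a non-zero code.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RandomError(pub NonZeroU32);

/// Hypothetical fallible variant of RandomSource.
pub trait TryRandomSource {
    fn try_fill_bytes(&mut self, bytes: &mut [u8]) -> Result<(), RandomError>;

    /// Default implementation goes through the byte interface; a platform
    /// with a native word-sized primitive can override it.
    fn try_next_u32(&mut self) -> Result<u32, RandomError> {
        let mut buf = [0u8; 4];
        self.try_fill_bytes(&mut buf)?;
        Ok(u32::from_ne_bytes(buf))
    }

    fn try_next_u64(&mut self) -> Result<u64, RandomError> {
        let mut buf = [0u8; 8];
        self.try_fill_bytes(&mut buf)?;
        Ok(u64::from_ne_bytes(buf))
    }
}

/// Toy source that produces whole words natively. NOT random.
pub struct WordSource(pub u32);

impl TryRandomSource for WordSource {
    fn try_fill_bytes(&mut self, bytes: &mut [u8]) -> Result<(), RandomError> {
        // Bytes are derived from words here, not the other way around.
        for chunk in bytes.chunks_mut(4) {
            let word = self.try_next_u32()?.to_ne_bytes();
            chunk.copy_from_slice(&word[..chunk.len()]);
        }
        Ok(())
    }

    // Overridden: no detour through a byte buffer.
    fn try_next_u32(&mut self) -> Result<u32, RandomError> {
        self.0 = self.0.wrapping_add(1);
        Ok(self.0)
    }
}

/// Toy source that always fails, to show the error reaching the caller.
pub struct BrokenSource;

impl TryRandomSource for BrokenSource {
    fn try_fill_bytes(&mut self, _bytes: &mut [u8]) -> Result<(), RandomError> {
        Err(RandomError(NonZeroU32::new(5).unwrap()))
    }
}
```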

joboet

Member, Author

Would you consider rust-lang/libs-team#159 to be a better solution? That one used the Read trait to fulfil everything you mention.

newpavlov

Contributor

No, I don't think it's an appropriate solution. Firstly, it relies on io::Error, while IIUC you intend for this API to be available in core. Secondly, it does not provide methods for generation of u32 and u64. As I wrote, going through the byte interface is not always efficient. Finally, most of io::Read methods are not relevant here.

For the last point I guess we could define a separate DefaultInsecureRandomSource type.

bstrie

Contributor

Firstly, it relies on io::Error, while IIUC you intend for this API to be available in core.

I don't think this needs to be a blocker. IMO a lot of std::io should be moved into core--not the OS-specific implementations obviously, but all the cross-platform things like type definitions, same as what happened with core::net.

wwylele

Contributor

fn random(source: &mut (impl RandomSource + ?Sized)) -> Self;

Would it be better to make this explicit with generics? Like fn random<R: RandomSource + ?Sized>(source: &mut R) -> Self;. This gives the user the ability to specify the type name when needed.
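
A small sketch of what the named type parameter buys: with impl Trait in argument position the caller cannot supply the type explicitly, whereas a named parameter R can be given via turbofish. The traits below are local stand-ins and FixedSource is a hypothetical constant-byte source, not a real generator.

```rust
pub trait RandomSource {
    fn fill_bytes(&mut self, bytes: &mut [u8]);
}

pub trait Random {
    // Named type parameter instead of `impl Trait` in argument position.
    fn random<R: RandomSource + ?Sized>(source: &mut R) -> Self;
}

/// Hypothetical source that repeats one byte. NOT random.
pub struct FixedSource(pub u8);

impl RandomSource for FixedSource {
    fn fill_bytes(&mut self, bytes: &mut [u8]) {
        bytes.fill(self.0);
    }
}

impl Random for u16 {
    fn random<R: RandomSource + ?Sized>(source: &mut R) -> Self {
        let mut buf = [0u8; 2];
        source.fill_bytes(&mut buf);
        u16::from_ne_bytes(buf)
    }
}

/// The turbofish names the source type explicitly; here it forces the call
/// through the trait object rather than the concrete type.
pub fn sample_via_dyn(source: &mut FixedSource) -> u16 {
    <u16 as Random>::random::<dyn RandomSource>(source)
}
```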

ericlagergren

I think it's important for RandomSource methods to properly return potential errors.

I (mostly) disagree. CSPRNGs should almost never fail. When they do, users are almost never qualified to diagnose the problem.

For example: golang/go#66821

A compromise is something like this:

trait RandomSource {
    type Error;
    fn fill_bytes(...) {
        self.try_fill_bytes(...).unwrap();
    }
    fn try_fill_bytes(...) -> Result<..., Self::Error>
}

This allows most CSPRNGs to use Error = Infallible, but still has support for weird HSMs, etc.
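
Spelled out with concrete signatures, the compromise might look like the sketch below. The argument lists and the Debug bound (needed for unwrap) are assumptions filled in for illustration; ConstSource and FlakyHsm are hypothetical sources.

```rust
use std::convert::Infallible;
use std::fmt::Debug;

pub trait RandomSource {
    // `Debug` is required so that `unwrap` can report the failure.
    type Error: Debug;

    fn try_fill_bytes(&mut self, bytes: &mut [u8]) -> Result<(), Self::Error>;

    /// Panicking convenience wrapper around the fallible method.
    fn fill_bytes(&mut self, bytes: &mut [u8]) {
        self.try_fill_bytes(bytes).unwrap();
    }
}

/// Hypothetical source that cannot fail (fills with a constant, NOT random).
pub struct ConstSource(pub u8);

impl RandomSource for ConstSource {
    type Error = Infallible;

    fn try_fill_bytes(&mut self, bytes: &mut [u8]) -> Result<(), Infallible> {
        bytes.fill(self.0);
        Ok(())
    }
}

/// Hypothetical hardware module that may be offline.
#[derive(Debug, PartialEq)]
pub struct HsmUnavailable;

pub struct FlakyHsm {
    pub online: bool,
}

impl RandomSource for FlakyHsm {
    type Error = HsmUnavailable;

    fn try_fill_bytes(&mut self, bytes: &mut [u8]) -> Result<(), HsmUnavailable> {
        if !self.online {
            return Err(HsmUnavailable);
        }
        bytes.fill(0x42); // placeholder data, NOT random
        Ok(())
    }
}
```

Callers that do not care about the error keep using fill_bytes; callers of an HSM-like source can match on try_fill_bytes instead.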

newpavlov

Contributor

@ericlagergren
I've assumed this trait is for a "system" RNG, which will work together with a #[global_allocator]-like way to register an implementation. I don't think that we need a general RNG trait in std/core as I wrote in this comment.

As for design of fallible RNG traits, see the new rand_core crate.

dhardy

Contributor

pub trait Random {
    fn random(source: &mut (impl RandomSource + ?Sized)) -> Self;
}

This trait (and the topic of random value generation) should be removed from this discussion entirely in my opinion, focussing only on "secure random data generation" as in the title. Why: because (1) provision of secure random data is an important topic by itself (with many users only wanting a byte slice and with methods like from_ne_bytes already providing safe conversion) and (2) because random value generation is a whole other topic (including uniform-ranged samples and much more).

Disclaimer: I am one of the rand developers. rand originally had a similar trait which got removed; the closest surviving equivalent is StandardUniform.

hanna-kruppe

Contributor

@newpavlov

It's also not clear whether it's allowed to overwrite the default RandomSource supplied with std similarly to GlobalAlloc.

Overriding the default source in an application that already has one from linking std seems questionable. It's not that I can't imagine any use case for it, but the established pattern for such overrides allows any crate in the dependency graph to do it (it's only an error if you link two such crates), instead of putting the leaf binary/cdylib/staticlib artifact in charge. As you articulated in the context of getrandom, that's a security risk for applications. So a RandomSource equivalent should probably be more restrictive in who can override it, but that's not the existing pattern. It also doesn't seem to fit with the proposed generalization of that pattern via "externally implementable functions" (rust-lang/rfcs#3632) -- if that RFC is accepted, any new API surface should use it instead of adding new one-off mechanisms.

If overriding the std source isn't supported, then it could work the same way as #[panic_handler]: you must supply an implementation if you don't link std, but if you do link std then supplying your own is an error. This would still be extremely useful. Currently, every crate that's (optionally) no_std and needs some randomness, most commonly for seeding Hashers, has to cobble together some sub-par ad-hoc solution to try and get some entropy from somewhere. There's a bunch of partial solutions that are better than nothing (const-random, taking addresses of global/local variables and praying that there's some ASLR, a global counter when atomics are available, cfg-gated access to target-specific sources like CPU cycle counters or x86 RDRAND) but:

How well this works ends up being highly platform-specific; in particular, none of them work well for the wasm32-unknown-unknown and wasm32v1-none targets.
Applications that have access to a better source of entropy and (directly or transitively) use such libraries don't have a good way to enumerate them and make all of them use the better source.
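
For a sense of what such an ad-hoc fallback looks like, here is a sketch combining two of the partial solutions mentioned above: the address of a local variable (hoping for ASLR) and a global atomic counter. The function name weak_seed and the mixing constant are choices made for this illustration. The result is only suitable for something like hasher seeding and is not cryptographically secure.

```rust
use core::sync::atomic::{AtomicUsize, Ordering};

// Global counter so that repeated calls never reuse the same input.
static COUNTER: AtomicUsize = AtomicUsize::new(0);

/// Best-effort seed for no_std code without an entropy source.
/// NOT secure: the inputs are guessable, this only adds some variation.
pub fn weak_seed() -> u64 {
    let local = 0u8;
    // Stack address: varies between runs only if the platform has ASLR.
    let addr = &local as *const u8 as usize as u64;
    let count = COUNTER.fetch_add(1, Ordering::Relaxed) as u64;
    // Spread the counter over all 64 bits before combining.
    addr ^ count.wrapping_mul(0x9E37_79B9_7F4A_7C15)
}
```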

This wouldn't be a problem if the entire ecosystem could agree to always delegate this problem to one specific crate (version) with appropriate hooks, like getrandom, but evidently that's not happening. Putting this capability into core (or a new no_std sysroot crate, comparable to alloc) has a better chance of solving this coordination problem. Well, at least eventually, once everyone's MSRV has caught up.

Edit: almost forgot that even std::collections::Hash{Map,Set} depend on having a source of random seeds. A way to supply such a source without linking std could help with moving those types to alloc, although as #27242 (comment) points out, it's not backwards compatible to make such a source mandatory for no_std + alloc applications.

newpavlov

Contributor

@hanna-kruppe

Overriding the default source in an application that already has one from linking std seems questionable.

There are a number of reasons to allow overriding:

An alternative interface may be more efficient than the default one (e.g. reading the RNDR register vs doing syscall)
It may help reduce binary size and eliminate potentially problematic fallback paths (e.g. if you know that you do not need the file fallback on Linux)
In some cases it's useful to eliminate non-deterministic inputs (testing, fuzzing)

So a RandomSource equivalent should probably be more restrictive in who can override it, but that's not the existing pattern.

Yes. How about following the getrandom path and allow override only when a special configuration flag is passed to the compiler?

Either way, overriding can probably be left for later. I think we both agree that we need a way to expose a "system" entropy source in std and a way to define this source for std-less targets.

It also doesn't seem to fit with the proposed generalization of that pattern via "externally implementable functions" -- if that RFC is accepted, any new API surface should use it instead of adding new one-off mechanisms.

I agree that ideally we need a unified approach for this kind of problem. I made a similar proposal once upon a time.

But I think it fits fine? Targets with std could implicitly use std_random_impl crate for "external implementation" of the getrandom-like functions and users will be able to override it in application crates if necessary.

How well this works ends up being highly platform-specific; in particular, none of them work well for the wasm32-unknown-unknown and wasm32v1-none targets.

I believe that having std for wasm32-unknown-unknown was a big mistake in the first place and the wasm32v1-none target is a good step in the direction of amending it. So I hope we will not give too much attention to its special circumstances.

This wouldn't be a problem if the entire ecosystem could agree to always delegate this problem to one specific crate (version) with appropriate hooks, like getrandom, but evidently that's not happening.

Well, it has happened, sort of. getrandom is reasonably popular in the ecosystem even after excluding rand users.

The problem is that std already effectively includes its variant of getrandom for HashMap seeding and people reasonably want to get access to that. And I think problem of getting "system" entropy is fundamental enough for having it in std (well, not in the std per se, let's say in the sysroot crate set).

A way to supply such a source without linking std could help with moving those types to alloc, although as #27242 (comment) points out, it's not backwards compatible to make such a source mandatory for no_std + alloc applications.

Can we add yet another sysroot crate for HashMap which will depend on both alloc and the hypothetical "system entropy" crate?

hanna-kruppe

Contributor

But I think it fits fine? Targets with std could implicitly use std_random_impl crate for "external implementation" of the getrandom-like functions and users will be able to override it in application crates if necessary.

The RFC (and the competing ones I've looked at) only supports a default implementation in the crate that "declares" the externally-implementable thing. If that crate isn't std, then an implementation from std would not count as "default" but conflict with any other definition. So we'd need another special carve-out for std (the very thing we'd want to avoid by adding a general language feature), or the language feature needs to become much more general to support overrideable default implementations from another source.

I believe that having std for wasm32-unknown-unknown was a big mistake in the first place and the wasm32v1-none target is a good step in the direction of amending it. So I hope we will not give too much attention to its special circumstances.

I was specifically talking about no_std libraries, for which the two targets are basically equivalent. Both don't have any source of entropy implied by the target tuple (instruction set, OS, env, etc.), and if you want to add one it'll have to involve whatever application-specific interface the wasm module has with its host.

Well, it has happened, sort of. getrandom is reasonably popular in the ecosystem even after excluding rand users.

Not to point any fingers but a counter example that's fresh on my mind because I looked at its code recently is foldhash. As another example, ahash only uses getrandom optionally (though it's on by default). If you're only using ahash indirectly through another library that disables the feature, then it's not gonna use getrandom unless you happen to notice this and add a direct dependency to enable the feature. In that case there is a solution, at least, but it's still not discoverable.

Can we add yet another sysroot crate for HashMap which will depend on both alloc and the hypothetical "system entropy" crate?

Possibly, but people may object to a proliferation of sysroot crates so let's hope there's a better solution.

143 remaining items

Amanieu

Member

Regarding error handling, one possibility would be to only produce an error when the DefaultRandomSource is created, but keeping the RandomSource trait infallible. This means that the default random source can be seeded once when it is created (which may fail due to the OS). Once seeded, generating random data is infallible and should still be secure as long as a proper CSPRNG is used.
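
The API shape being described could be sketched roughly as follows: construction is the only fallible step, and generation afterwards cannot fail. SeededSource, SeedError and the seed_fn parameter (standing in for the OS call) are hypothetical. The SplitMix64 mixer is only a placeholder to keep the example short; it is NOT a CSPRNG and a real implementation would need one.

```rust
/// Hypothetical error returned when the initial seed cannot be obtained.
#[derive(Debug)]
pub struct SeedError;

pub struct SeededSource {
    state: u64,
}

impl SeededSource {
    /// `seed_fn` stands in for the OS call; it is the only step that can fail.
    pub fn new(
        seed_fn: impl FnOnce(&mut [u8; 8]) -> Result<(), SeedError>,
    ) -> Result<Self, SeedError> {
        let mut seed = [0u8; 8];
        seed_fn(&mut seed)?;
        Ok(Self { state: u64::from_le_bytes(seed) })
    }

    // SplitMix64 step: placeholder mixer, NOT cryptographically secure.
    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Infallible once the source exists.
    pub fn fill_bytes(&mut self, bytes: &mut [u8]) {
        for chunk in bytes.chunks_mut(8) {
            let word = self.next_u64().to_le_bytes();
            chunk.copy_from_slice(&word[..chunk.len()]);
        }
    }
}
```

Note that this sketch does nothing about fork safety or reseeding, which a user-space generator would have to handle.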

ericlagergren

Regarding error handling, one possibility would be to only produce an error when the DefaultRandomSource is created, but keeping the RandomSource trait infallible. This means that the default random source can be seeded once when it is created (which may fail due to the OS). Once seeded, generating random data is infallible and should still be secure as long as a proper CSPRNG is used.

I strongly disagree with this. Rust should not provide an in-process CSPRNG. Nobody (*) should be using an in-process CSPRNG for cryptographic purposes. See this comment and the responses #130703 (comment)

(I really do appreciate the error handling consideration, though.)

*: Except when it's the only option, or when required to (FIPS, Common Criteria, etc.).

newpavlov

Contributor

@Amanieu
I don't think it's a good option. You effectively repeat the rand::ThreadRng design and it's not as straightforward as it seems at first glance. For example, have you thought about fork safety? In cryptographic applications it's also somewhat easier to rely on the OS API than on a user-space re-seeded PRNG. You don't need to review the CSPRNG implementation in std, and in some cases you simply cannot use "non-standard" cryptographic algorithms in your software (e.g. in certified software).

I think it may be worth including a ThreadRng-like type in std, but it should be a separate source in addition to a "system" source.

ChrisDenton

Member

This thread is long and has gone on a few tangents so I'll attempt to go over some of the possibilities put forward. I'll try to keep it short so sorry if I miss any nuances. I'll list them in an order that roughly reflects my preference.

Names below are open to bikeshedding but for the purposes of designing the API, I feel it important to settle on the shape of the API before figuring out the perfect name.

mod random {
    // Fill some bytes using the OS's cryptographically secure rng.
    fn fill_bytes(buf: &mut [u8]);
}

The argument for this is that it has the fewest moving parts, is easy to document and documentation is all in one place. It doesn't rely on anything outside of the function (so just read its docs) and can't go wrong which makes auditing cryptographic code easier. No error is returned; either it fills all the bytes or, in the unlikely event it fails, the program does not continue. This means there's no potential for the buffer to be used after the call unless it succeeds. It does also mean that there's no way to recover from an error (similar to HashMap with the default hasher). See #130703 (comment) and golang/go#66821 for a discussion on (in)fallibility.

A slight variation on the first option is to return an error instead of dying.

mod random {
    fn fill_bytes(buf: &mut [u8]) -> Result<(), SecureRngError>;
}

This allows applications to recover from the failure (e.g. maybe they didn't need cryptographically secure random data after all?). SecureRngError could have From/Into implementations for io::Error so ? just works.

Alternatively we could simply use io::Error though I don't think there's anything applications can do with a specific OS error code here, other than maybe log it. Matching on error kinds, for example, would be wrong. However, an io::Error may be more convenient I guess.
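
A sketch of the fallible variant, showing how a From conversion into io::Error lets ? work in functions returning io::Result. Everything here is an assumption made for illustration: SecureRngError is modelled as a unit struct, make_token is a hypothetical caller, and the body reads /dev/urandom as a stand-in backend on Unix-like systems, which is not necessarily what std would do.

```rust
use std::fmt;
use std::fs::File;
use std::io::{self, Read};

/// Hypothetical opaque error: it carries no OS code to match on.
#[derive(Debug)]
pub struct SecureRngError;

impl fmt::Display for SecureRngError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("secure random source unavailable")
    }
}

impl std::error::Error for SecureRngError {}

// This conversion is what makes `?` work in `io::Result` functions.
impl From<SecureRngError> for io::Error {
    fn from(err: SecureRngError) -> io::Error {
        io::Error::new(io::ErrorKind::Other, err)
    }
}

/// Stand-in backend: assumes a Unix-like system with /dev/urandom.
pub fn fill_bytes(buf: &mut [u8]) -> Result<(), SecureRngError> {
    File::open("/dev/urandom")
        .and_then(|mut f| f.read_exact(buf))
        .map_err(|_| SecureRngError)
}

/// Hypothetical caller that already returns `io::Result`.
pub fn make_token() -> io::Result<[u8; 16]> {
    let mut token = [0u8; 16];
    fill_bytes(&mut token)?; // SecureRngError converts into io::Error here
    Ok(token)
}
```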

Another option is to use the RandomSource trait to be generic over non-OS random sources. This is the original ACP design. A short excerpt from the OP:

// A trait that can be implemented for all kinds of random sources.
trait RandomSource {
    fn fill_bytes(&mut self, buf: &mut [u8]); 
}
// The OS's cryptographically secure rng.
pub struct DefaultRandomSource;
impl RandomSource for DefaultRandomSource { ... }

This would allow it to be used with the proposed Random API. The nature of that API is somewhat off topic for this issue but it has been brought up that coupling insecure and secure random sources does not provide much benefit but increases the chances of misuse. And even if we did want to do this then having a free function does not mean we can't add the type in the future that expands its use. For example, alloc::alloc::alloc and Vec both exist at the same time.

The fill_bytes function that returns an io::Error looks quite a bit like the Read trait, or at least the Read::read_exact function. So another option would be to have a type that implements Read.

// Uses the OS's cryptographically secure rng.
struct DefaultRandomSource;
impl Read for DefaultRandomSource {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> { ... }
    fn read_exact(&mut self, buf: &mut [u8]) -> io::Result<()> {
        self.read(buf).map(|_| ())
    }
}

The upside to this is that it uses a familiar trait. The downside is that the Read trait has very loose requirements (e.g. it can partially fill a buffer, it can be interrupted, etc) whereas it's important for people implementing cryptographic code to be confident that the buffer is always fully filled. That can be solved by reading the docs for DefaultRandomSource (and mostly ignoring the docs for Read) but using the generic Read trait for cryptographically secure random bytes does not seem to add any benefit that's worth the hassle.
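
The looseness of the Read contract can be demonstrated with a small sketch. ChunkyReader is a hypothetical reader that returns at most 3 bytes per call, which Read permits; a caller using a single read ends up with a partially filled key, while read_exact keeps calling until the buffer is full.

```rust
use std::io::{self, Read};

/// Hypothetical reader that yields at most 3 bytes per `read` call.
/// It writes 0xFF so the filled part of the buffer is easy to see.
pub struct ChunkyReader;

impl Read for ChunkyReader {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = buf.len().min(3);
        buf[..n].fill(0xFF);
        Ok(n)
    }
}

/// One `read` call: succeeds, but may leave most of the key untouched.
pub fn single_read(key: &mut [u8; 8]) -> io::Result<usize> {
    ChunkyReader.read(key)
}

/// `read_exact` loops over `read` until the whole key is filled.
pub fn full_read(key: &mut [u8; 8]) -> io::Result<()> {
    ChunkyReader.read_exact(key)
}
```

With an all-zero 8-byte key, single_read reports 3 and leaves the last five bytes at zero, which is exactly the kind of mistake that is easy to miss in cryptographic code.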

ChrisDenton

Member

I would also just add that crates like getrandom and rand will likely always exist. The standard library does not have to fully subsume all ecosystem crates nor provide functions for every possible use case.

And I'd again emphasise that adding a free standing function does not necessarily prevent adding a type + trait later so it doesn't have to be an either/or situation.

newpavlov

Contributor

@ChrisDenton
As the most minimal viable function and as a starting point, I support your first potentially panicking fill_bytes function. As a small amendment, I would relax its docs a bit from "the OS's cryptographically secure rng" to "cryptographically secure non-deterministic rng". In future it may be reasonable for it to use a ThreadRng-like source.

But this function should not be the end point. I strongly believe that we need RNG traits in core and RNG source structs in std, but I guess it's better to move their discussion into a separate issue.

I would also just add that crates like getrandom and rand will likely always exist.

As a getrandom maintainer, I really hope that getrandom will be eventually fully deprecated in favor of a new sysroot crate.

ericlagergren

@ChrisDenton As the most minimal viable function and as a starting point, I support your first potentially panicking fill_bytes function.

I very much agree with this. I think it is a good decision.

As a small amendment, I would relax its docs a bit from "the OS's cryptographically secure rng" to "cryptographically secure non-deterministic rng". In future it may be reasonable for it to use a ThreadRng-like source.

But I also very much disagree with this for the reasons stated in my (and your) previous comments. If Rust ever needs an in-process CSPRNG then it should have a separate API that informs the user about the risks, e.g., fork safety, swap safety, reseeding, etc.