
Commit 5d0211d
Committed Oct 7, 2022

std: use futex in Once

1 parent a2cdcb3

File tree

5 files changed: +483 -289 lines
 

library/std/src/sync/once.rs

Lines changed: 23 additions & 289 deletions
@@ -3,99 +3,12 @@
 //! This primitive is meant to be used to run one-time initialization. An
 //! example use case would be for initializing an FFI library.
 
-// A "once" is a relatively simple primitive, and it's also typically provided
-// by the OS as well (see `pthread_once` or `InitOnceExecuteOnce`). The OS
-// primitives, however, tend to have surprising restrictions, such as the Unix
-// one doesn't allow an argument to be passed to the function.
-//
-// As a result, we end up implementing it ourselves in the standard library.
-// This also gives us the opportunity to optimize the implementation a bit which
-// should help the fast path on call sites. Consequently, let's explain how this
-// primitive works now!
-//
-// So to recap, the guarantees of a Once are that it will call the
-// initialization closure at most once, and it will never return until the one
-// that's running has finished running. This means that we need some form of
-// blocking here while the custom callback is running at the very least.
-// Additionally, we add on the restriction of **poisoning**. Whenever an
-// initialization closure panics, the Once enters a "poisoned" state which means
-// that all future calls will immediately panic as well.
-//
-// So to implement this, one might first reach for a `Mutex`, but those cannot
-// be put into a `static`. It also gets a lot harder with poisoning to figure
-// out when the mutex needs to be deallocated because it's not after the closure
-// finishes, but after the first successful closure finishes.
-//
-// All in all, this is instead implemented with atomics and lock-free
-// operations! Whee! Each `Once` has one word of atomic state, and this state is
-// CAS'd on to determine what to do. There are four possible state of a `Once`:
-//
-// * Incomplete - no initialization has run yet, and no thread is currently
-//                using the Once.
-// * Poisoned - some thread has previously attempted to initialize the Once, but
-//              it panicked, so the Once is now poisoned. There are no other
-//              threads currently accessing this Once.
-// * Running - some thread is currently attempting to run initialization. It may
-//             succeed, so all future threads need to wait for it to finish.
-//             Note that this state is accompanied with a payload, described
-//             below.
-// * Complete - initialization has completed and all future calls should finish
-//              immediately.
-//
-// With 4 states we need 2 bits to encode this, and we use the remaining bits
-// in the word we have allocated as a queue of threads waiting for the thread
-// responsible for entering the RUNNING state. This queue is just a linked list
-// of Waiter nodes which is monotonically increasing in size. Each node is
-// allocated on the stack, and whenever the running closure finishes it will
-// consume the entire queue and notify all waiters they should try again.
-//
-// You'll find a few more details in the implementation, but that's the gist of
-// it!
-//
-// Atomic orderings:
-// When running `Once` we deal with multiple atomics:
-// `Once.state_and_queue` and an unknown number of `Waiter.signaled`.
-// * `state_and_queue` is used (1) as a state flag, (2) for synchronizing the
-//   result of the `Once`, and (3) for synchronizing `Waiter` nodes.
-//     - At the end of the `call_inner` function we have to make sure the result
-//       of the `Once` is acquired. So every load which can be the only one to
-//       load COMPLETED must have at least Acquire ordering, which means all
-//       three of them.
-//     - `WaiterQueue::Drop` is the only place that may store COMPLETED, and
-//       must do so with Release ordering to make the result available.
-//     - `wait` inserts `Waiter` nodes as a pointer in `state_and_queue`, and
-//       needs to make the nodes available with Release ordering. The load in
-//       its `compare_exchange` can be Relaxed because it only has to compare
-//       the atomic, not to read other data.
-//     - `WaiterQueue::Drop` must see the `Waiter` nodes, so it must load
-//       `state_and_queue` with Acquire ordering.
-//     - There is just one store where `state_and_queue` is used only as a
-//       state flag, without having to synchronize data: switching the state
-//       from INCOMPLETE to RUNNING in `call_inner`. This store can be Relaxed,
-//       but the read has to be Acquire because of the requirements mentioned
-//       above.
-// * `Waiter.signaled` is both used as a flag, and to protect a field with
-//   interior mutability in `Waiter`. `Waiter.thread` is changed in
-//   `WaiterQueue::Drop` which then sets `signaled` with Release ordering.
-//   After `wait` loads `signaled` with Acquire and sees it is true, it needs to
-//   see the changes to drop the `Waiter` struct correctly.
-// * There is one place where the two atomics `Once.state_and_queue` and
-//   `Waiter.signaled` come together, and might be reordered by the compiler or
-//   processor. Because both use Acquire ordering such a reordering is not
-//   allowed, so no need for SeqCst.
-
 #[cfg(all(test, not(target_os = "emscripten")))]
 mod tests;
 
-use crate::cell::Cell;
 use crate::fmt;
-use crate::marker;
 use crate::panic::{RefUnwindSafe, UnwindSafe};
-use crate::ptr;
-use crate::sync::atomic::{AtomicBool, AtomicPtr, Ordering};
-use crate::thread::{self, Thread};
-
-type Masked = ();
+use crate::sys_common::once as sys;
 
 /// A synchronization primitive which can be used to run a one-time global
 /// initialization. Useful for one-time initialization for FFI or related
@@ -114,19 +27,9 @@ type Masked = ();
 /// ```
 #[stable(feature = "rust1", since = "1.0.0")]
 pub struct Once {
-    // `state_and_queue` is actually a pointer to a `Waiter` with extra state
-    // bits, so we add the `PhantomData` appropriately.
-    state_and_queue: AtomicPtr<Masked>,
-    _marker: marker::PhantomData<*const Waiter>,
+    inner: sys::Once,
 }
 
-// The `PhantomData` of a raw pointer removes these two auto traits, but we
-// enforce both below in the implementation so this should be safe to add.
-#[stable(feature = "rust1", since = "1.0.0")]
-unsafe impl Sync for Once {}
-#[stable(feature = "rust1", since = "1.0.0")]
-unsafe impl Send for Once {}
-
 #[stable(feature = "sync_once_unwind_safe", since = "1.59.0")]
 impl UnwindSafe for Once {}
 
@@ -136,10 +39,8 @@ impl RefUnwindSafe for Once {}
 /// State yielded to [`Once::call_once_force()`]’s closure parameter. The state
 /// can be used to query the poison status of the [`Once`].
 #[stable(feature = "once_poison", since = "1.51.0")]
-#[derive(Debug)]
 pub struct OnceState {
-    poisoned: bool,
-    set_state_on_drop_to: Cell<*mut Masked>,
+    pub(crate) inner: sys::OnceState,
 }
 
 /// Initialization value for static [`Once`] values.
@@ -159,49 +60,14 @@ pub struct OnceState {
 )]
 pub const ONCE_INIT: Once = Once::new();
 
-// Four states that a Once can be in, encoded into the lower bits of
-// `state_and_queue` in the Once structure.
-const INCOMPLETE: usize = 0x0;
-const POISONED: usize = 0x1;
-const RUNNING: usize = 0x2;
-const COMPLETE: usize = 0x3;
-
-// Mask to learn about the state. All other bits are the queue of waiters if
-// this is in the RUNNING state.
-const STATE_MASK: usize = 0x3;
-
-// Representation of a node in the linked list of waiters, used while in the
-// RUNNING state.
-// Note: `Waiter` can't hold a mutable pointer to the next thread, because then
-// `wait` would both hand out a mutable reference to its `Waiter` node, and keep
-// a shared reference to check `signaled`. Instead we hold shared references and
-// use interior mutability.
-#[repr(align(4))] // Ensure the two lower bits are free to use as state bits.
-struct Waiter {
-    thread: Cell<Option<Thread>>,
-    signaled: AtomicBool,
-    next: *const Waiter,
-}
-
-// Head of a linked list of waiters.
-// Every node is a struct on the stack of a waiting thread.
-// Will wake up the waiters when it gets dropped, i.e. also on panic.
-struct WaiterQueue<'a> {
-    state_and_queue: &'a AtomicPtr<Masked>,
-    set_state_on_drop_to: *mut Masked,
-}
-
 impl Once {
     /// Creates a new `Once` value.
     #[inline]
     #[stable(feature = "once_new", since = "1.2.0")]
     #[rustc_const_stable(feature = "const_once_new", since = "1.32.0")]
     #[must_use]
     pub const fn new() -> Once {
-        Once {
-            state_and_queue: AtomicPtr::new(ptr::invalid_mut(INCOMPLETE)),
-            _marker: marker::PhantomData,
-        }
+        Once { inner: sys::Once::new() }
     }
 
     /// Performs an initialization routine once and only once. The given closure
@@ -261,19 +127,20 @@ impl Once {
     /// This is similar to [poisoning with mutexes][poison].
     ///
     /// [poison]: struct.Mutex.html#poisoning
+    #[inline]
     #[stable(feature = "rust1", since = "1.0.0")]
     #[track_caller]
     pub fn call_once<F>(&self, f: F)
     where
         F: FnOnce(),
     {
         // Fast path check
-        if self.is_completed() {
+        if self.inner.is_completed() {
             return;
         }
 
         let mut f = Some(f);
-        self.call_inner(false, &mut |_| f.take().unwrap()());
+        self.inner.call(false, &mut |_| f.take().unwrap()());
     }
 
     /// Performs the same function as [`call_once()`] except ignores poisoning.
@@ -320,18 +187,19 @@ impl Once {
     /// // once any success happens, we stop propagating the poison
     /// INIT.call_once(|| {});
     /// ```
+    #[inline]
     #[stable(feature = "once_poison", since = "1.51.0")]
     pub fn call_once_force<F>(&self, f: F)
     where
         F: FnOnce(&OnceState),
     {
         // Fast path check
-        if self.is_completed() {
+        if self.inner.is_completed() {
             return;
         }
 
         let mut f = Some(f);
-        self.call_inner(true, &mut |p| f.take().unwrap()(p));
+        self.inner.call(true, &mut |p| f.take().unwrap()(p));
     }
 
     /// Returns `true` if some [`call_once()`] call has completed
@@ -378,119 +246,7 @@ impl Once {
     #[stable(feature = "once_is_completed", since = "1.43.0")]
    #[inline]
     pub fn is_completed(&self) -> bool {
-        // An `Acquire` load is enough because that makes all the initialization
-        // operations visible to us, and, this being a fast path, weaker
-        // ordering helps with performance. This `Acquire` synchronizes with
-        // `Release` operations on the slow path.
-        self.state_and_queue.load(Ordering::Acquire).addr() == COMPLETE
-    }
-
-    // This is a non-generic function to reduce the monomorphization cost of
-    // using `call_once` (this isn't exactly a trivial or small implementation).
-    //
-    // Additionally, this is tagged with `#[cold]` as it should indeed be cold
-    // and it helps let LLVM know that calls to this function should be off the
-    // fast path. Essentially, this should help generate more straight line code
-    // in LLVM.
-    //
-    // Finally, this takes an `FnMut` instead of a `FnOnce` because there's
-    // currently no way to take an `FnOnce` and call it via virtual dispatch
-    // without some allocation overhead.
-    #[cold]
-    #[track_caller]
-    fn call_inner(&self, ignore_poisoning: bool, init: &mut dyn FnMut(&OnceState)) {
-        let mut state_and_queue = self.state_and_queue.load(Ordering::Acquire);
-        loop {
-            match state_and_queue.addr() {
-                COMPLETE => break,
-                POISONED if !ignore_poisoning => {
-                    // Panic to propagate the poison.
-                    panic!("Once instance has previously been poisoned");
-                }
-                POISONED | INCOMPLETE => {
-                    // Try to register this thread as the one RUNNING.
-                    let exchange_result = self.state_and_queue.compare_exchange(
-                        state_and_queue,
-                        ptr::invalid_mut(RUNNING),
-                        Ordering::Acquire,
-                        Ordering::Acquire,
-                    );
-                    if let Err(old) = exchange_result {
-                        state_and_queue = old;
-                        continue;
-                    }
-                    // `waiter_queue` will manage other waiting threads, and
-                    // wake them up on drop.
-                    let mut waiter_queue = WaiterQueue {
-                        state_and_queue: &self.state_and_queue,
-                        set_state_on_drop_to: ptr::invalid_mut(POISONED),
-                    };
-                    // Run the initialization function, letting it know if we're
-                    // poisoned or not.
-                    let init_state = OnceState {
-                        poisoned: state_and_queue.addr() == POISONED,
-                        set_state_on_drop_to: Cell::new(ptr::invalid_mut(COMPLETE)),
-                    };
-                    init(&init_state);
-                    waiter_queue.set_state_on_drop_to = init_state.set_state_on_drop_to.get();
-                    break;
-                }
-                _ => {
-                    // All other values must be RUNNING with possibly a
-                    // pointer to the waiter queue in the more significant bits.
-                    assert!(state_and_queue.addr() & STATE_MASK == RUNNING);
-                    wait(&self.state_and_queue, state_and_queue);
-                    state_and_queue = self.state_and_queue.load(Ordering::Acquire);
-                }
-            }
-        }
-    }
-}
-
-fn wait(state_and_queue: &AtomicPtr<Masked>, mut current_state: *mut Masked) {
-    // Note: the following code was carefully written to avoid creating a
-    // mutable reference to `node` that gets aliased.
-    loop {
-        // Don't queue this thread if the status is no longer running,
-        // otherwise we will not be woken up.
-        if current_state.addr() & STATE_MASK != RUNNING {
-            return;
-        }
-
-        // Create the node for our current thread.
-        let node = Waiter {
-            thread: Cell::new(Some(thread::current())),
-            signaled: AtomicBool::new(false),
-            next: current_state.with_addr(current_state.addr() & !STATE_MASK) as *const Waiter,
-        };
-        let me = &node as *const Waiter as *const Masked as *mut Masked;
-
-        // Try to slide in the node at the head of the linked list, making sure
-        // that another thread didn't just replace the head of the linked list.
-        let exchange_result = state_and_queue.compare_exchange(
-            current_state,
-            me.with_addr(me.addr() | RUNNING),
-            Ordering::Release,
-            Ordering::Relaxed,
-        );
-        if let Err(old) = exchange_result {
-            current_state = old;
-            continue;
-        }
-
-        // We have enqueued ourselves, now lets wait.
-        // It is important not to return before being signaled, otherwise we
-        // would drop our `Waiter` node and leave a hole in the linked list
-        // (and a dangling reference). Guard against spurious wakeups by
-        // reparking ourselves until we are signaled.
-        while !node.signaled.load(Ordering::Acquire) {
-            // If the managing thread happens to signal and unpark us before we
-            // can park ourselves, the result could be this thread never gets
-            // unparked. Luckily `park` comes with the guarantee that if it got
-            // an `unpark` just before on an unparked thread it does not park.
-            thread::park();
-        }
-        break;
+        self.inner.is_completed()
     }
 }
 
@@ -501,37 +257,6 @@ impl fmt::Debug for Once {
     }
 }
 
-impl Drop for WaiterQueue<'_> {
-    fn drop(&mut self) {
-        // Swap out our state with however we finished.
-        let state_and_queue =
-            self.state_and_queue.swap(self.set_state_on_drop_to, Ordering::AcqRel);
-
-        // We should only ever see an old state which was RUNNING.
-        assert_eq!(state_and_queue.addr() & STATE_MASK, RUNNING);
-
-        // Walk the entire linked list of waiters and wake them up (in lifo
-        // order, last to register is first to wake up).
-        unsafe {
-            // Right after setting `node.signaled = true` the other thread may
-            // free `node` if there happens to be has a spurious wakeup.
-            // So we have to take out the `thread` field and copy the pointer to
-            // `next` first.
-            let mut queue =
-                state_and_queue.with_addr(state_and_queue.addr() & !STATE_MASK) as *const Waiter;
-            while !queue.is_null() {
-                let next = (*queue).next;
-                let thread = (*queue).thread.take().unwrap();
-                (*queue).signaled.store(true, Ordering::Release);
-                // ^- FIXME (maybe): This is another case of issue #55005
-                //    `store()` has a potentially dangling ref to `signaled`.
-                queue = next;
-                thread.unpark();
-            }
-        }
-    }
-}
-
 impl OnceState {
     /// Returns `true` if the associated [`Once`] was poisoned prior to the
     /// invocation of the closure passed to [`Once::call_once_force()`].
@@ -568,13 +293,22 @@ impl OnceState {
     ///     assert!(!state.is_poisoned());
     /// });
     #[stable(feature = "once_poison", since = "1.51.0")]
+    #[inline]
     pub fn is_poisoned(&self) -> bool {
-        self.poisoned
+        self.inner.is_poisoned()
     }
 
     /// Poison the associated [`Once`] without explicitly panicking.
-    // NOTE: This is currently only exposed for the `lazy` module
+    // NOTE: This is currently only exposed for `OnceLock`.
+    #[inline]
     pub(crate) fn poison(&self) {
-        self.set_state_on_drop_to.set(ptr::invalid_mut(POISONED));
+        self.inner.poison();
+    }
+}
+
+#[stable(feature = "std_debug", since = "1.16.0")]
+impl fmt::Debug for OnceState {
+    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+        f.debug_struct("OnceState").field("poisoned", &self.is_poisoned()).finish()
     }
 }

library/std/src/sys_common/mod.rs

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@ pub mod io;
 pub mod lazy_box;
 pub mod memchr;
 pub mod mutex;
+pub mod once;
 pub mod process;
 pub mod remutex;
 pub mod rwlock;
library/std/src/sys_common/once/futex.rs

Lines changed: 134 additions & 0 deletions
@@ -0,0 +1,134 @@
+use crate::cell::Cell;
+use crate::sync as public;
+use crate::sync::atomic::{
+    AtomicU32,
+    Ordering::{Acquire, Relaxed, Release},
+};
+use crate::sys::futex::{futex_wait, futex_wake_all};
+
+// On some platforms, the OS is very nice and handles the waiter queue for us.
+// This means we only need one atomic value with 5 states:
+
+/// No initialization has run yet, and no thread is currently using the Once.
+const INCOMPLETE: u32 = 0;
+/// Some thread has previously attempted to initialize the Once, but it panicked,
+/// so the Once is now poisoned. There are no other threads currently accessing
+/// this Once.
+const POISONED: u32 = 1;
+/// Some thread is currently attempting to run initialization. It may succeed,
+/// so all future threads need to wait for it to finish.
+const RUNNING: u32 = 2;
+/// Some thread is currently attempting to run initialization and there are threads
+/// waiting for it to finish.
+const QUEUED: u32 = 3;
+/// Initialization has completed and all future calls should finish immediately.
+const COMPLETE: u32 = 4;
+
+// Threads wait by setting the state to QUEUED and calling `futex_wait` on the state
+// variable. When the running thread finishes, it will wake all waiting threads using
+// `futex_wake_all`.
+
+pub struct OnceState {
+    poisoned: bool,
+    set_state_to: Cell<u32>,
+}
+
+impl OnceState {
+    #[inline]
+    pub fn is_poisoned(&self) -> bool {
+        self.poisoned
+    }
+
+    #[inline]
+    pub fn poison(&self) {
+        self.set_state_to.set(POISONED);
+    }
+}
+
+struct CompletionGuard<'a> {
+    state: &'a AtomicU32,
+    set_state_on_drop_to: u32,
+}
+
+impl<'a> Drop for CompletionGuard<'a> {
+    fn drop(&mut self) {
+        // Use release ordering to propagate changes to all threads checking
+        // up on the Once. `futex_wake_all` does its own synchronization, hence
+        // we do not need `AcqRel`.
+        if self.state.swap(self.set_state_on_drop_to, Release) == QUEUED {
+            futex_wake_all(&self.state);
+        }
+    }
+}
+
+pub struct Once {
+    state: AtomicU32,
+}
+
+impl Once {
+    #[inline]
+    pub const fn new() -> Once {
+        Once { state: AtomicU32::new(INCOMPLETE) }
+    }
+
+    #[inline]
+    pub fn is_completed(&self) -> bool {
+        // Use acquire ordering to make all initialization changes visible to the
+        // current thread.
+        self.state.load(Acquire) == COMPLETE
+    }
+
+    // This uses FnMut to match the API of the generic implementation. As this
+    // implementation is quite light-weight, it is generic over the closure and
+    // so avoids the cost of dynamic dispatch.
+    #[cold]
+    #[track_caller]
+    pub fn call(&self, ignore_poisoning: bool, f: &mut impl FnMut(&public::OnceState)) {
+        let mut state = self.state.load(Acquire);
+        loop {
+            match state {
+                POISONED if !ignore_poisoning => {
+                    // Panic to propagate the poison.
+                    panic!("Once instance has previously been poisoned");
+                }
+                INCOMPLETE | POISONED => {
+                    // Try to register the current thread as the one running.
+                    if let Err(new) =
+                        self.state.compare_exchange_weak(state, RUNNING, Acquire, Acquire)
+                    {
+                        state = new;
+                        continue;
+                    }
+                    // `waiter_queue` will manage other waiting threads, and
+                    // wake them up on drop.
+                    let mut waiter_queue =
+                        CompletionGuard { state: &self.state, set_state_on_drop_to: POISONED };
+                    // Run the function, letting it know if we're poisoned or not.
+                    let f_state = public::OnceState {
+                        inner: OnceState {
+                            poisoned: state == POISONED,
+                            set_state_to: Cell::new(COMPLETE),
+                        },
+                    };
+                    f(&f_state);
+                    waiter_queue.set_state_on_drop_to = f_state.inner.set_state_to.get();
+                    return;
+                }
+                RUNNING | QUEUED => {
+                    // Set the state to QUEUED if it is not already.
+                    if state == RUNNING
+                        && let Err(new) = self.state.compare_exchange_weak(RUNNING, QUEUED, Relaxed, Acquire)
+                    {
+                        state = new;
+                        continue;
+                    }
+
+                    futex_wait(&self.state, QUEUED, None);
+                    state = self.state.load(Acquire);
+                }
+                COMPLETE => return,
+                _ => unreachable!("state is never set to invalid values"),
+            }
+        }
+    }
+}
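
A sketch of the behavior this futex implementation provides (an illustrative
test against the public API, not part of the commit): several threads race
`call_once`, one wins the RUNNING state and runs the closure, and the rest
move the state to QUEUED and block in `futex_wait` until the winner's
`CompletionGuard` wakes them:

    use std::sync::atomic::{AtomicUsize, Ordering};
    use std::sync::Once;
    use std::thread;

    static INIT: Once = Once::new();
    static RUNS: AtomicUsize = AtomicUsize::new(0);

    fn main() {
        let handles: Vec<_> = (0..8)
            .map(|_| {
                thread::spawn(|| {
                    // All threads race here; the losers sleep until the
                    // winner's guard wakes them with `futex_wake_all`.
                    INIT.call_once(|| {
                        RUNS.fetch_add(1, Ordering::Relaxed);
                    });
                })
            })
            .collect();
        for handle in handles {
            handle.join().unwrap();
        }
        // However the race resolves, the closure ran exactly once.
        assert_eq!(RUNS.load(Ordering::Relaxed), 1);
    }
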
library/std/src/sys_common/once/generic.rs

Lines changed: 282 additions & 0 deletions
@@ -0,0 +1,282 @@
+// Each `Once` has one word of atomic state, and this state is CAS'd on to
+// determine what to do. There are four possible state of a `Once`:
+//
+// * Incomplete - no initialization has run yet, and no thread is currently
+//                using the Once.
+// * Poisoned - some thread has previously attempted to initialize the Once, but
+//              it panicked, so the Once is now poisoned. There are no other
+//              threads currently accessing this Once.
+// * Running - some thread is currently attempting to run initialization. It may
+//             succeed, so all future threads need to wait for it to finish.
+//             Note that this state is accompanied with a payload, described
+//             below.
+// * Complete - initialization has completed and all future calls should finish
+//              immediately.
+//
+// With 4 states we need 2 bits to encode this, and we use the remaining bits
+// in the word we have allocated as a queue of threads waiting for the thread
+// responsible for entering the RUNNING state. This queue is just a linked list
+// of Waiter nodes which is monotonically increasing in size. Each node is
+// allocated on the stack, and whenever the running closure finishes it will
+// consume the entire queue and notify all waiters they should try again.
+//
+// You'll find a few more details in the implementation, but that's the gist of
+// it!
+//
+// Atomic orderings:
+// When running `Once` we deal with multiple atomics:
+// `Once.state_and_queue` and an unknown number of `Waiter.signaled`.
+// * `state_and_queue` is used (1) as a state flag, (2) for synchronizing the
+//   result of the `Once`, and (3) for synchronizing `Waiter` nodes.
+//     - At the end of the `call` function we have to make sure the result
+//       of the `Once` is acquired. So every load which can be the only one to
+//       load COMPLETED must have at least acquire ordering, which means all
+//       three of them.
+//     - `WaiterQueue::drop` is the only place that may store COMPLETED, and
+//       must do so with release ordering to make the result available.
+//     - `wait` inserts `Waiter` nodes as a pointer in `state_and_queue`, and
+//       needs to make the nodes available with release ordering. The load in
+//       its `compare_exchange` can be relaxed because it only has to compare
+//       the atomic, not to read other data.
+//     - `WaiterQueue::drop` must see the `Waiter` nodes, so it must load
+//       `state_and_queue` with acquire ordering.
+//     - There is just one store where `state_and_queue` is used only as a
+//       state flag, without having to synchronize data: switching the state
+//       from INCOMPLETE to RUNNING in `call`. This store can be Relaxed,
+//       but the read has to be Acquire because of the requirements mentioned
+//       above.
+// * `Waiter.signaled` is both used as a flag, and to protect a field with
+//   interior mutability in `Waiter`. `Waiter.thread` is changed in
+//   `WaiterQueue::drop` which then sets `signaled` with release ordering.
+//   After `wait` loads `signaled` with acquire ordering and sees it is true,
+//   it needs to see the changes to drop the `Waiter` struct correctly.
+// * There is one place where the two atomics `Once.state_and_queue` and
+//   `Waiter.signaled` come together, and might be reordered by the compiler or
+//   processor. Because both use acquire ordering such a reordering is not
+//   allowed, so no need for `SeqCst`.
+
+use crate::cell::Cell;
+use crate::fmt;
+use crate::ptr;
+use crate::sync as public;
+use crate::sync::atomic::{AtomicBool, AtomicPtr, Ordering};
+use crate::thread::{self, Thread};
+
+type Masked = ();
+
+pub struct Once {
+    state_and_queue: AtomicPtr<Masked>,
+}
+
+pub struct OnceState {
+    poisoned: bool,
+    set_state_on_drop_to: Cell<*mut Masked>,
+}
+
+// Four states that a Once can be in, encoded into the lower bits of
+// `state_and_queue` in the Once structure.
+const INCOMPLETE: usize = 0x0;
+const POISONED: usize = 0x1;
+const RUNNING: usize = 0x2;
+const COMPLETE: usize = 0x3;
+
+// Mask to learn about the state. All other bits are the queue of waiters if
+// this is in the RUNNING state.
+const STATE_MASK: usize = 0x3;
+
+// Representation of a node in the linked list of waiters, used while in the
+// RUNNING state.
+// Note: `Waiter` can't hold a mutable pointer to the next thread, because then
+// `wait` would both hand out a mutable reference to its `Waiter` node, and keep
+// a shared reference to check `signaled`. Instead we hold shared references and
+// use interior mutability.
+#[repr(align(4))] // Ensure the two lower bits are free to use as state bits.
+struct Waiter {
+    thread: Cell<Option<Thread>>,
+    signaled: AtomicBool,
+    next: *const Waiter,
+}
+
+// Head of a linked list of waiters.
+// Every node is a struct on the stack of a waiting thread.
+// Will wake up the waiters when it gets dropped, i.e. also on panic.
+struct WaiterQueue<'a> {
+    state_and_queue: &'a AtomicPtr<Masked>,
+    set_state_on_drop_to: *mut Masked,
+}
+
+impl Once {
+    #[inline]
+    pub const fn new() -> Once {
+        Once { state_and_queue: AtomicPtr::new(ptr::invalid_mut(INCOMPLETE)) }
+    }
+
+    #[inline]
+    pub fn is_completed(&self) -> bool {
+        // An `Acquire` load is enough because that makes all the initialization
+        // operations visible to us, and, this being a fast path, weaker
+        // ordering helps with performance. This `Acquire` synchronizes with
+        // `Release` operations on the slow path.
+        self.state_and_queue.load(Ordering::Acquire).addr() == COMPLETE
+    }
+
+    // This is a non-generic function to reduce the monomorphization cost of
+    // using `call_once` (this isn't exactly a trivial or small implementation).
+    //
+    // Additionally, this is tagged with `#[cold]` as it should indeed be cold
+    // and it helps let LLVM know that calls to this function should be off the
+    // fast path. Essentially, this should help generate more straight line code
+    // in LLVM.
+    //
+    // Finally, this takes an `FnMut` instead of a `FnOnce` because there's
+    // currently no way to take an `FnOnce` and call it via virtual dispatch
+    // without some allocation overhead.
+    #[cold]
+    #[track_caller]
+    pub fn call(&self, ignore_poisoning: bool, init: &mut dyn FnMut(&public::OnceState)) {
+        let mut state_and_queue = self.state_and_queue.load(Ordering::Acquire);
+        loop {
+            match state_and_queue.addr() {
+                COMPLETE => break,
+                POISONED if !ignore_poisoning => {
+                    // Panic to propagate the poison.
+                    panic!("Once instance has previously been poisoned");
+                }
+                POISONED | INCOMPLETE => {
+                    // Try to register this thread as the one RUNNING.
+                    let exchange_result = self.state_and_queue.compare_exchange(
+                        state_and_queue,
+                        ptr::invalid_mut(RUNNING),
+                        Ordering::Acquire,
+                        Ordering::Acquire,
+                    );
+                    if let Err(old) = exchange_result {
+                        state_and_queue = old;
+                        continue;
+                    }
+                    // `waiter_queue` will manage other waiting threads, and
+                    // wake them up on drop.
+                    let mut waiter_queue = WaiterQueue {
+                        state_and_queue: &self.state_and_queue,
+                        set_state_on_drop_to: ptr::invalid_mut(POISONED),
+                    };
+                    // Run the initialization function, letting it know if we're
+                    // poisoned or not.
+                    let init_state = public::OnceState {
+                        inner: OnceState {
+                            poisoned: state_and_queue.addr() == POISONED,
+                            set_state_on_drop_to: Cell::new(ptr::invalid_mut(COMPLETE)),
+                        },
+                    };
+                    init(&init_state);
+                    waiter_queue.set_state_on_drop_to = init_state.inner.set_state_on_drop_to.get();
+                    break;
+                }
+                _ => {
+                    // All other values must be RUNNING with possibly a
+                    // pointer to the waiter queue in the more significant bits.
+                    assert!(state_and_queue.addr() & STATE_MASK == RUNNING);
+                    wait(&self.state_and_queue, state_and_queue);
+                    state_and_queue = self.state_and_queue.load(Ordering::Acquire);
+                }
+            }
+        }
+    }
+}
+
+fn wait(state_and_queue: &AtomicPtr<Masked>, mut current_state: *mut Masked) {
+    // Note: the following code was carefully written to avoid creating a
+    // mutable reference to `node` that gets aliased.
+    loop {
+        // Don't queue this thread if the status is no longer running,
+        // otherwise we will not be woken up.
+        if current_state.addr() & STATE_MASK != RUNNING {
+            return;
+        }
+
+        // Create the node for our current thread.
+        let node = Waiter {
+            thread: Cell::new(Some(thread::current())),
+            signaled: AtomicBool::new(false),
+            next: current_state.with_addr(current_state.addr() & !STATE_MASK) as *const Waiter,
+        };
+        let me = &node as *const Waiter as *const Masked as *mut Masked;
+
+        // Try to slide in the node at the head of the linked list, making sure
+        // that another thread didn't just replace the head of the linked list.
+        let exchange_result = state_and_queue.compare_exchange(
+            current_state,
+            me.with_addr(me.addr() | RUNNING),
+            Ordering::Release,
+            Ordering::Relaxed,
+        );
+        if let Err(old) = exchange_result {
+            current_state = old;
+            continue;
+        }
+
+        // We have enqueued ourselves, now lets wait.
+        // It is important not to return before being signaled, otherwise we
+        // would drop our `Waiter` node and leave a hole in the linked list
+        // (and a dangling reference). Guard against spurious wakeups by
+        // reparking ourselves until we are signaled.
+        while !node.signaled.load(Ordering::Acquire) {
+            // If the managing thread happens to signal and unpark us before we
+            // can park ourselves, the result could be this thread never gets
+            // unparked. Luckily `park` comes with the guarantee that if it got
+            // an `unpark` just before on an unparked thread it does not park.
+            thread::park();
+        }
+        break;
+    }
+}
+
+#[stable(feature = "std_debug", since = "1.16.0")]
+impl fmt::Debug for Once {
+    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
+        f.debug_struct("Once").finish_non_exhaustive()
+    }
+}
+
+impl Drop for WaiterQueue<'_> {
+    fn drop(&mut self) {
+        // Swap out our state with however we finished.
+        let state_and_queue =
+            self.state_and_queue.swap(self.set_state_on_drop_to, Ordering::AcqRel);
+
+        // We should only ever see an old state which was RUNNING.
+        assert_eq!(state_and_queue.addr() & STATE_MASK, RUNNING);
+
+        // Walk the entire linked list of waiters and wake them up (in lifo
+        // order, last to register is first to wake up).
+        unsafe {
+            // Right after setting `node.signaled = true` the other thread may
+            // free `node` if there happens to be has a spurious wakeup.
+            // So we have to take out the `thread` field and copy the pointer to
+            // `next` first.
+            let mut queue =
+                state_and_queue.with_addr(state_and_queue.addr() & !STATE_MASK) as *const Waiter;
+            while !queue.is_null() {
+                let next = (*queue).next;
+                let thread = (*queue).thread.take().unwrap();
+                (*queue).signaled.store(true, Ordering::Release);
+                // ^- FIXME (maybe): This is another case of issue #55005
+                //    `store()` has a potentially dangling ref to `signaled`.
+                queue = next;
+                thread.unpark();
+            }
+        }
+    }
+}
+
+impl OnceState {
+    #[inline]
+    pub fn is_poisoned(&self) -> bool {
+        self.poisoned
+    }
+
+    #[inline]
+    pub fn poison(&self) {
+        self.set_state_on_drop_to.set(ptr::invalid_mut(POISONED));
+    }
+}
library/std/src/sys_common/once/mod.rs

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
+// A "once" is a relatively simple primitive, and it's also typically provided
+// by the OS as well (see `pthread_once` or `InitOnceExecuteOnce`). The OS
+// primitives, however, tend to have surprising restrictions, such as the Unix
+// one doesn't allow an argument to be passed to the function.
+//
+// As a result, we end up implementing it ourselves in the standard library.
+// This also gives us the opportunity to optimize the implementation a bit which
+// should help the fast path on call sites.
+//
+// So to recap, the guarantees of a Once are that it will call the
+// initialization closure at most once, and it will never return until the one
+// that's running has finished running. This means that we need some form of
+// blocking here while the custom callback is running at the very least.
+// Additionally, we add on the restriction of **poisoning**. Whenever an
+// initialization closure panics, the Once enters a "poisoned" state which means
+// that all future calls will immediately panic as well.
+//
+// So to implement this, one might first reach for a `Mutex`, but those cannot
+// be put into a `static`. It also gets a lot harder with poisoning to figure
+// out when the mutex needs to be deallocated because it's not after the closure
+// finishes, but after the first successful closure finishes.
+//
+// All in all, this is instead implemented with atomics and lock-free
+// operations! Whee!
+
+cfg_if::cfg_if! {
+    if #[cfg(any(
+        target_os = "linux",
+        target_os = "android",
+        all(target_arch = "wasm32", target_feature = "atomics"),
+        target_os = "freebsd",
+        target_os = "openbsd",
+        target_os = "dragonfly",
+        target_os = "fuchsia",
+        target_os = "hermit",
+    ))] {
+        mod futex;
+        pub use futex::{Once, OnceState};
+    } else {
+        mod generic;
+        pub use generic::{Once, OnceState};
+    }
+}
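
Poison handling is identical through either back end: a panicking closure
leaves the state at POISONED, a later `call_once` panics, and
`call_once_force` can observe and clear the poison. A small sketch using the
public API (not part of the commit):

    use std::sync::Once;
    use std::thread;

    static INIT: Once = Once::new();

    fn main() {
        // The closure panics, so the completion guard's drop stores the
        // poisoned state instead of COMPLETE.
        let _ = thread::spawn(|| INIT.call_once(|| panic!("init failed"))).join();

        // `call_once_force` ignores the poison and exposes it via `OnceState`.
        INIT.call_once_force(|state| {
            assert!(state.is_poisoned());
            // Returning normally stores COMPLETE, clearing the poison.
        });
        assert!(INIT.is_completed());
    }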
