316 changes: 316 additions & 0 deletions std/atomic.d
@@ -0,0 +1,316 @@
/**
* The atomic module provides atomic struct support for lock-free
* concurrent programming.
*
* Copyright: Copyright Roy David Margalit 2022 - 2025.
* License: $(LINK2 http://www.boost.org/LICENSE_1_0.txt, Boost License 1.0)
* Authors: Roy David Margalit
* Source: $(DRUNTIMESRC core/_atomic.d)
*/
module std.atomic;

/** Atomic data like std::atomic

Params:
T = Integral type or pointer for atomic operations

Example:
-------------------------
__gshared Atomic!int a;
assert(a == 0);
assert(a++ == 0);
assert(a == 1);
-------------------------

*/
struct Atomic(T)
if (__traits(isIntegral, T) || isPointer!T)
{
import core.atomic : atomicLoad, atomicStore, atomicExchange, atomicFetchAdd,
atomicFetchSub, atomicCas = cas, atomicCasWeak = casWeak, atomicOp;

private T val;
Contributor

@TurkeyMan TurkeyMan Oct 29, 2025

This is the shared thing, you should put shared here.


/// Constructor
this(T init) shared
Contributor

Remove shared, the whole point of this type is to wrap that detail out of the public API.

{
val.atomicStore(init);
}

private shared(T)* ptr() shared
Contributor

This method should not be shared, but it does correctly return a shared(*).

Author

How do I make sure Atomic!((Atomic!int)*) works?

{
return &val;
}

/** Load the value from the atomic location with SC access
Params:
mo = Memory order

Returns: The stored value
*/
T load(MemoryOrder mo = MemoryOrder.seq)() shared
Contributor

More shared; get rid of all of these; these methods are not shared.
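A minimal sketch of the layout the review comments above seem to ask for: `shared` sits on the private field and the methods carry no `shared` qualifier. `SketchAtomic` and `fetchAdd` are hypothetical names, not the `Atomic` struct from this PR, and the sketch covers integral types only.

```d
import core.atomic : atomicLoad, atomicStore, atomicFetchAdd, MemoryOrder;

// Hypothetical name; not the Atomic struct under review.
struct SketchAtomic(T)
if (__traits(isIntegral, T))
{
    // `shared` lives on the field, so every access must go through core.atomic.
    private shared T val;

    // Non-shared constructor and methods: callers never spell `shared`.
    this(T init)
    {
        atomicStore(val, init);
    }

    T load(MemoryOrder mo = MemoryOrder.seq)()
    {
        return atomicLoad!mo(val);
    }

    void store(MemoryOrder mo = MemoryOrder.seq)(T newVal)
    {
        atomicStore!mo(val, newVal);
    }

    // Returns the value held before the addition.
    T fetchAdd(MemoryOrder mo = MemoryOrder.seq)(T mod)
    {
        return atomicFetchAdd!mo(val, mod);
    }
}
```

With this shape a plain `auto a = SketchAtomic!int(3);` works without a `shared` qualifier at the use site, while the field type still forces atomic access internally.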

{
return val.atomicLoad!(mo.toCore);
}

/// ditto
alias load this;


/** Store the value to the atomic location
Params:
mo = Memory order
newVal = Value to store to atomic
*/
void store(MemoryOrder mo = MemoryOrder.seq)(T newVal) shared
{
return val.atomicStore!(mo.toCore)(newVal);
}

/// Store using SC access
alias opAssign = store;

/** Atomically increment the value
Params:
mo = Memory order
mod = Value to add to atomic

Returns: The old stored value
*/
T fadd(MemoryOrder mo = MemoryOrder.seq)(T mod) shared
{
return atomicFetchAdd!(mo.toCore)(val, mod);
}

/** Atomically decrement the value
Params:
mo = Memory order
mod = Value to decrement from atomic

Returns: The old stored value
*/
T fsub(MemoryOrder mo = MemoryOrder.seq)(T mod) shared
{
return atomicFetchSub!(mo.toCore)(val, mod);
}

/** Atomically swap the value
Params:
mo = Memory order
desired = New value to store

Returns: The old stored value
*/
T exchange(MemoryOrder mo = MemoryOrder.seq)(T desired) shared
{
return atomicExchange!(mo.toCore)(&val, desired);
}

/** Compare and swap
Params:
mo = Memory order on success
fmo = Memory order on failure
oldVal = Expected value to perform the swap
newVal = New value to store if condition holds

Returns: If the value was swapped
*/
bool cas(MemoryOrder mo = MemoryOrder.seq, MemoryOrder fmo = MemoryOrder.seq)(T oldVal, T newVal) shared
{
return atomicCas!(mo.toCore, fmo.toCore)(ptr, oldVal, newVal);
}

/** Compare and swap (May fail even if current value is equal to oldVal)
Params:
mo = Memory order on success
fmo = Memory order on failure
oldVal = Expected value to perform the swap
newVal = New value to store if condition holds

Returns: If the value was swapped
*/
bool casWeak(MemoryOrder mo = MemoryOrder.seq, MemoryOrder fmo = MemoryOrder.seq)(T oldVal,
T newVal) shared
{
return atomicCasWeak!(mo.toCore, fmo.toCore)(ptr, oldVal, newVal);
}

/** Op assign with SC semantics
Params:
op = Assignment operator
rhs = Value to assign

Returns: Computation result
*/
T opOpAssign(string op)(T rhs) shared
{
return val.atomicOp!(op ~ `=`)(rhs);
}

/** Implicit conversion to FADD with SC access
Params:
op = ++

Returns: Pre-incremented value
*/
T opUnary(string op)() shared if (op == `++`)
{
return fadd(1);
}

/** Implicit conversion to FSUB with SC access
Params:
op = --

Returns: Pre-decremented value
*/
T opUnary(string op)() shared if (op == `--`)
{
return fsub(1);
}

/** Dereference atomic pointer with SC access
Params:
op = *

Returns: Reference to the pointed location
*/
auto ref opUnary(string op)() shared if (op == `*`)
{
return *(load);
}
}

static import core.atomic;
/**
* Specifies the memory ordering semantics of an atomic operation.
*
* See_Also:
* $(HTTP en.cppreference.com/w/cpp/atomic/memory_order)
*/
enum MemoryOrder
Contributor

Why this? I don't see a reason to repeat the type.
Why not just alias druntime's type here?

Author

To name relaxed access correctly for the struct.

Contributor

Oh man, don't get me started!
Just rename it in druntime, and wear the people that complain that you broke their code... maybe add a synonym to the druntime enum, and deprecate the stupid name?

Contributor

/// ...
rlx,
raw=rlx,

It is a simple change over in druntime.
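The synonym idea in the comment above can be sketched as follows. `SketchOrder` is a hypothetical stand-in for the druntime enum, and its member values are illustrative only, not druntime's actual values.

```d
// Hypothetical stand-in for the druntime enum; values are illustrative.
enum SketchOrder
{
    rlx,        // proposed name for relaxed access
    raw = rlx,  // old name kept as a synonym so existing code still compiles
    acq,
    rel,
    acq_rel,
    seq,
}

// Both names denote the same member, so old and new spellings interoperate.
static assert(SketchOrder.raw == SketchOrder.rlx);
```

Because the two names compare equal, code written against either spelling keeps working while the old one is phased out.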

Contributor

Doing god's work :P

{
/**
* Corresponds to $(LINK2 https://llvm.org/docs/Atomics.html#monotonic, LLVM AtomicOrdering.Monotonic)
* and C++11/C11 `memory_order_relaxed`.
*/
rlx = cast(int) core.atomic.MemoryOrder.raw,
/**
* Corresponds to $(LINK2 https://llvm.org/docs/Atomics.html#acquire, LLVM AtomicOrdering.Acquire)
* and C++11/C11 `memory_order_acquire`.
*/
acq = cast(int) core.atomic.MemoryOrder.acq,
/**
* Corresponds to $(LINK2 https://llvm.org/docs/Atomics.html#release, LLVM AtomicOrdering.Release)
* and C++11/C11 `memory_order_release`.
*/
rel = cast(int) core.atomic.MemoryOrder.rel,
/**
* Corresponds to $(LINK2 https://llvm.org/docs/Atomics.html#acquirerelease, LLVM AtomicOrdering.AcquireRelease)
* and C++11/C11 `memory_order_acq_rel`.
*/
acq_rel = cast(int) core.atomic.MemoryOrder.acq_rel,
/**
* Corresponds to $(LINK2 https://llvm.org/docs/Atomics.html#sequentiallyconsistent, LLVM AtomicOrdering.SequentiallyConsistent)
* and C++11/C11 `memory_order_seq_cst`.
*/
seq = cast(int) core.atomic.MemoryOrder.seq,
}

private auto toCore(MemoryOrder mo)
{
static import core.atomic;
return cast(core.atomic.MemoryOrder) mo;
}

@safe unittest
{
shared Atomic!int a;
assert(a == 0);
assert(a.load == 0);
assert(a.fadd!(MemoryOrder.rlx)(5) == 0);
assert(a.load!(MemoryOrder.acq) == 5);
assert(!a.casWeak(4, 5));
assert(!a.cas(4, 5));
assert(a.cas!(MemoryOrder.rel, MemoryOrder.acq)(5, 4));
assert(a.fsub!(MemoryOrder.acq_rel)(2) == 4);
assert(a.exchange!(MemoryOrder.acq_rel)(3) == 2);
assert(a.load!(MemoryOrder.rlx) == 3);
a.store!(MemoryOrder.rel)(7);
assert(a.load == 7);
a = 32;
assert(a == 32);
a += 5;
assert(a == 37);
assert(a++ == 37);
assert(a == 38);
}

// static array of shared atomics
@safe unittest
{
static shared(Atomic!int)[5] arr;
Contributor

An array of 5 atomic ints! This is a perfect example of a disaster waiting to happen.
Interactions between more than one atomic always require an exceptionally high amount of care. I would be concerned that this benign looking line essentially communicates to an expert that you're not qualified to be messing with this sort of thing ;)

Author

I'm not sure why. An implementation of a bounded chase-lev queue will do exactly this. The atomic is there to give you easy to work with primitives to write safe concurrent data structures. If you want the unbounded version then yes, you'll need the array pointer to also be atomic.

Contributor

> An implementation of a bounded chase-lev queue will do exactly this.

Are you sure? I would expect to need exactly 2 indices, for the top and the bottom.
If there are multiple queues; then I would expect them to be individually encapsulated and that would be the array-ed object.

That aside, of course I could certainly manifest situations where this isn't strictly invalid; but a cluster of atomics in an array doesn't give a strong implication that they are strictly independent. If the elements together are taken to represent some sort of coupled logical state, this is almost certainly a disaster waiting to happen. This feels like a code-smell at best. Most people in my experience fail to handle multiple atomic moving parts, at least in the event it becomes more complicated than a pair of queue cursors.

I mean, it's just not clear where this line is going. It's rare to see more than 2 cursors at a time, unless this were an array of cursors (length == num_threads). My feeling is that as sample code, it presents a dangerous idea.

I'm nit-picking here because my key concern with this whole thing is that it presents to a user the impression that atomics are like, no big deal.
In more complicated scenarios where you have multiple work-stealing queues (one for each worker thread) or something like that, then I would expect the atomic details to be enclosed in the queue objects. It would be best to structurally prevent handling them in conjunction; it's too difficult to reason about in practice.

And again, that principle generally just leads me to the position that a call to cas, load/acq, store/rel, inside a tool (like a queue implementation) is just not a big deal, and somewhat more direct and readable in practice.

Convenience is just not a goal where atomics are concerned from my perspective; absolute maximum clarity is the only goal I would recognise. It's almost always only one or one pair of lines; a whole tool to hide one or 2 lines of code which you can follow in a direct and linear way just doesn't feel like it carries its weight to me.

Do you strongly feel a tool like this here has value? Be honest with yourself; what lines of code are you trying to make disappear behind this tool? How many such lines exist in your software? Chances are the number of lines is countable on your fingers... and if it's more than that, I would get nervous.

If you feel like you can make a strong case for its value, I'd like the unit tests (ie; samples) to present realistic patterns, for the sake of not misleading readers.
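As an illustration of "exactly two cursors, enclosed in the queue object", here is a minimal sketch of a bounded single-producer/single-consumer ring buffer. This is a simpler technique than the Chase-Lev deque mentioned in the thread, and `SketchSpscQueue` is a hypothetical name. It assumes one producer thread and one consumer thread, and calls `core.atomic` directly, as the comment above prefers.

```d
import core.atomic : atomicLoad, atomicStore, MemoryOrder;

// Hypothetical bounded SPSC queue: one producer thread, one consumer thread.
struct SketchSpscQueue(T, size_t capacity)
{
    static assert(capacity > 0 && (capacity & (capacity - 1)) == 0,
        "capacity must be a power of two so index wrap-around stays consistent");

    private T[capacity] slots;   // plain storage, published via the cursors
    private shared size_t head;  // next slot to pop; written only by the consumer
    private shared size_t tail;  // next slot to push; written only by the producer

    // Producer side. Returns false when the queue is full.
    bool tryPush(T value)
    {
        immutable t = atomicLoad!(MemoryOrder.raw)(tail);
        immutable h = atomicLoad!(MemoryOrder.acq)(head);
        if (t - h == capacity)
            return false;
        slots[t % capacity] = value;
        // Release: the slot write above becomes visible before the new tail.
        atomicStore!(MemoryOrder.rel)(tail, t + 1);
        return true;
    }

    // Consumer side. Returns false when the queue is empty.
    bool tryPop(out T value)
    {
        immutable h = atomicLoad!(MemoryOrder.raw)(head);
        immutable t = atomicLoad!(MemoryOrder.acq)(tail);
        if (h == t)
            return false;
        value = slots[h % capacity];
        // Release: the slot read above completes before the slot is handed back.
        atomicStore!(MemoryOrder.rel)(head, h + 1);
        return true;
    }
}
```

The element slots are not atomic; only the two cursors are, and nothing outside the struct can touch them.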

arr[4] = 4;
assert(arr[4].load == 4);
}

@system unittest
{
import core.thread : Thread;

shared(Atomic!int)[2] arr;

void reltest() @safe
{
arr[0].store!(MemoryOrder.rel)(1);
Contributor

Release is unnecessary here. Also, there's no reason for arr[0] to be atomic at all.

Author

It really depends on what this example signifies. If you're thinking of this tiny benchmark, then yes. If you're thinking of more complicated things then no. This is the canonical example of the message-passing (MP) idiom.

Contributor

@TurkeyMan TurkeyMan Oct 30, 2025

I mean, I'm not sure what the example signifies; all I see is a bad example... this example as written is showing wrong code; 2 consecutive releases and 2 consecutive acquires are not a correct implementation of the pattern shown (or anything like this, I couldn't describe this as canonical?). In this example, only arr[1] should be atomic.

Showing arr[0] as a second atomic could only confuse readers; they'll probably assume the example was written by an expert and try and make sense of it... whatever conclusion they manifest to explain the code they see will be wrong, and they may then go off and write bad code with their misunderstanding.
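For reference, a minimal sketch of the message-passing shape described in this comment, with a single atomic flag and a plain payload, written against `core.atomic` directly. The names `payload`, `ready`, `producer`, `consumer` and `runMessagePassing` are hypothetical. As noted elsewhere in the thread, a single run like this shows the shape of the pattern and does not test memory ordering.

```d
import core.atomic : atomicLoad, atomicStore, MemoryOrder;
import core.thread : Thread;

__gshared int payload;  // plain data: ordered only by the flag below
shared int ready;       // the single atomic in the pattern

void producer()
{
    payload = 42;                             // ordinary write
    atomicStore!(MemoryOrder.rel)(ready, 1);  // release publishes the write above
}

int consumer()
{
    // Acquire pairs with the release store; once 1 is seen, payload is visible.
    while (atomicLoad!(MemoryOrder.acq)(ready) != 1) { }
    return payload;                           // ordinary read
}

// Runs one producer and one consumer; returns what the consumer observed.
int runMessagePassing()
{
    payload = 0;
    atomicStore(ready, 0);
    int seen;
    auto c = new Thread({ seen = consumer(); });
    auto p = new Thread(&producer);
    c.start();
    p.start();
    p.join();
    c.join();
    return seen;
}
```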

Author

I never considered unit-tests to be examples of how the code should be used. Only as a utility to test that the code does work.
For the record, if arr[0] was not atomic and the underlying implementation of the store to arr[1] was downgraded to relaxed instead of release, the unittest becomes undefined behavior instead of just failing.

Contributor

If a unittest block has a ddoc comment on it, that becomes part of the documentation as an example. It attaches to the previous non-unittest symbol.

///
unittest {
}

Without the ddoc comment, it's only for verifying the behaviour.
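A slightly fuller sketch of the distinction described above, using a hypothetical function `twice`: the first unittest carries a ddoc comment and so is attached to `twice` as a documentation example, while the second only verifies behaviour.

```d
/// Hypothetical function used only to illustrate documented unittests.
int twice(int x)
{
    return 2 * x;
}

/// Shown as an example in the generated documentation for `twice`.
unittest
{
    assert(twice(21) == 42);
}

// No ddoc comment: runs with -unittest but stays out of the documentation.
unittest
{
    assert(twice(-1) == -2);
}
```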

Contributor

> I never considered unit-tests to be examples of how the code should be used. Only as a utility to test that the code does work. For the record, if arr[0] was not atomic and the underlying implementation of the store to arr[1] was downgraded to relaxed instead of release, the unittest becomes undefined behavior instead of just failing.

arr[1] is very clearly acq/rel, it's not relaxed... I have no idea what "downgraded" means; I think you mean "changing the code arbitrarily and introducing a bug"; anyone can introduce any bug anywhere at any time by arbitrarily changing code in a way that creates a bug.

The point of lockless/atomic code is to be efficient; you don't arbitrarily perform excessive cache synchronisations, that defeats the whole purpose. It's not 'defensive', it's just wrong.

> Without the ddoc comment, it's only for verifying the behaviour.

People can still read unittests and nobody expects a unittest to be wrong.

arr[1].store!(MemoryOrder.rel)(1);
}

void acqtest() @safe
{
while (arr[1].load!(MemoryOrder.acq) != 1) { }
assert(arr[0].load!(MemoryOrder.acq) == 1);
Contributor

Second acquire is unnecessary.

Frankly, these kinds of mistakes are demonstrating why I wouldn't introduce a tool like this. Atomics are exclusively for experts.

Author

@rymrg rymrg Oct 30, 2025

If we optimize, might as well make the first load relaxed as well and put a fence acquire after the loop. But interestingly enough it will change behaviors on hardware (not only because fence acquire, acquires from everything, but because I could not reproduce Store-Buffer on some ARM when using acq/rel instead of rlx/rlx).

Contributor

I don't think that's likely to be an optimisation (depends on the probability of contention), and if it is, there would be a better optimisation...
I would leave acquire in the loop, assume the loop is likely to succeed, and save the bytes of program code from the extra operation.
If the probability of contention here is not close to zero, you would use a well-formed spinlock with a backoff strategy instead of hot spinning while waiting for the value to change; let the hyper-threads have more time, etc.
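A minimal sketch of what a spinlock with a backoff strategy might look like. `SketchSpinLock` is a hypothetical name, the yield-based exponential backoff and the cap of 64 are assumptions chosen for illustration, and the `cas` call uses the default sequentially consistent ordering for simplicity.

```d
import core.atomic : atomicLoad, atomicStore, cas, MemoryOrder;
import core.thread : Thread;

// Hypothetical spinlock; backoff policy and limits are illustrative only.
struct SketchSpinLock
{
    private shared bool locked;

    // Single attempt; returns true if the lock was taken.
    bool tryLock()
    {
        return cas(&locked, false, true);
    }

    void lock()
    {
        size_t spins = 1;
        while (!tryLock())
        {
            // Back off: wait on cheap relaxed reads and yield the time slice
            // rather than hammering the cache line with failed CAS attempts.
            foreach (_; 0 .. spins)
            {
                if (!atomicLoad!(MemoryOrder.raw)(locked))
                    break;
                Thread.yield();
            }
            if (spins < 64)
                spins *= 2;  // exponential backoff with a cap
        }
    }

    void unlock()
    {
        // Release: writes made while holding the lock become visible first.
        atomicStore!(MemoryOrder.rel)(locked, false);
    }
}
```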

Contributor

I guess you could have both :P

}

auto t1 = new Thread(&acqtest);
auto t2 = new Thread(&reltest);
t2.start;
t1.start;
t2.join;
t1.join;
Contributor

You can't possibly test anything like this with one sample.

Author

True. I also cannot test it on x86 at all. I think ARM also requires adding alignment to the variables (and even then, we would need to run it millions of times).

Contributor

How to add alignment:

align(16) struct Atomic {
}

Author

The alignment needs to be for the variable instance, we don't want to align all atomics.

Contributor

Ah no, you don't want it on the field, it has to go on the struct.

On the field it only affects the layout of the Atomic struct, whereas you want the struct placed in other layouts like classes and structs with that alignment.

Author

I guess I used the wrong terminology. This was needed on ARM for the SB example (not this MP example).

align (1024) __gshared int x;
align (1024) __gshared int y;
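The two kinds of alignment discussed in this thread can be sketched side by side. `PaddedCounter`, `Holder`, `alignedX` and `alignedY` are hypothetical names, and 64 is an assumed cache-line size rather than a value taken from the PR.

```d
// Alignment on the type: it travels with every instance, including
// instances embedded in other aggregates.
align(64) struct PaddedCounter
{
    int value;
}

struct Holder
{
    byte tag;
    PaddedCounter counter;  // placed at the next 64-byte boundary
}

static assert(PaddedCounter.alignof == 64);
static assert(Holder.counter.offsetof % 64 == 0);

// Alignment on individual variables: only these two are affected,
// other ints keep their default alignment.
align(64) __gshared int alignedX;
align(64) __gshared int alignedY;
```

The first form matches the reviewer's suggestion for the struct; the second matches the author's per-variable use.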

}

@safe unittest
{
shared Atomic!(shared(int)) a = 5;
Contributor

The use of shared here is pointless; but shared in D is completely broken, and we really need to enable the preview that's been sitting there for 5-6 years.

Author

How do I allocate a global without shared / __gshared?

Contributor

You can use static for TLS. However I do question if it couldn't be on the stack.
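A small sketch of the storage classes in question, assuming nothing beyond `core.thread`: module-level variables in D are thread-local by default, while `__gshared` gives one instance for the whole process. The names `tlsValue`, `globalValue` and `observeAfterWorker` are hypothetical.

```d
import core.thread : Thread;

int tlsValue;               // module-level variables are thread-local by default
__gshared int globalValue;  // one instance shared by all threads

// Lets a worker thread write both variables, then reports what the
// calling thread sees afterwards.
int[2] observeAfterWorker()
{
    tlsValue = 0;
    globalValue = 0;
    auto worker = new Thread({
        tlsValue = 10;     // writes the worker's own copy
        globalValue = 10;  // writes the process-wide variable
    });
    worker.start();
    worker.join();
    return [tlsValue, globalValue];
}
```

The calling thread keeps its own `tlsValue` and only observes the change to `globalValue`.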

assert(a.load == shared(int)(5));
a = 2;
assert(a == 2);
}

@safe unittest
{
shared Atomic!(shared(int)*) ptr = new shared(int);
*ptr.load!(MemoryOrder.rlx)() = 5;
assert(*ptr.load == 5);
*(ptr.load) = 42;
assert(*ptr.load == 42);
}

@safe unittest
{
shared Atomic!(shared(int)*) ptr = new shared(int);
Contributor

ptr is not shared; it is a stack local which is thread-local. The T = shared(int)* is also needlessly shared, because shared is transitive and val should be shared internally so it's redundant here.

*ptr = 5;
assert(*ptr == 5);
*ptr = 42;
assert(*ptr == 42);
}

@safe unittest
{
//shared Atomic!(shared(Atomic!(int))*) ptr = new shared(Atomic!int);
}

private enum bool isAggregateType(T) = is(T == struct) || is(T == union)
|| is(T == class) || is(T == interface);
private enum bool isPointer(T) = is(T == U*, U) && !isAggregateType!T;