Actor systems can span complex communication graphs that make it hard to decide when actors are no longer needed. As a result, manually managing the lifetime of actors is nearly impossible. For this reason, CAF implements a garbage collection strategy for actors based on weak and strong reference counts.
The C++ standard library already offers shared_ptr and weak_ptr to manage
objects with complex shared ownership. The standard implementation is a solid
general-purpose design that covers most use cases. Weak and strong references
to an object are stored in a control block. However, CAF uses a slightly
different design. The reason for this is twofold. First, we need the control
block to store the identity of an actor. Second, we wanted a design that
requires fewer indirections, because actor handles are copied extensively for
messaging, and this overhead adds up.
Before discussing the approach to shared ownership in CAF, we look at the design of shared pointers in the C++ standard library.
The figure above depicts the default memory layout when using shared pointers.
The control block is allocated separately from the data and thus stores a
pointer to the data. This is the case when using manually allocated objects,
for example shared_ptr<int> iptr{new int}. The benefit of this design is that
one can destroy T independently from its control block. While irrelevant for
small objects, this matters for large objects, because their memory can be
released while weak references still keep the control block alive. Notably, the
shared pointer stores two pointers internally. Otherwise, dereferencing it
would require fetching the data location from the control block first.
When using make_shared or allocate_shared, the standard library can store
reference count and data in a single memory block as shown above. However,
shared_ptr still has to store two pointers, because it is unaware where the
data is allocated.
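The two allocation styles can be tried out with the standard library alone. The following is a minimal sketch, not taken from CAF: the type payload and the function shared_ptr_demo are hypothetical names used only for illustration. Whether make_shared really uses a single allocation is up to the standard library implementation.

```cpp
#include <cassert>
#include <memory>

// Hypothetical payload type, used only for this illustration.
struct payload {
  int value = 0;
};

// Exercises both allocation styles; returns true if the counts match the text.
bool shared_ptr_demo() {
  // Two allocations: one for the payload, one for the control block.
  std::shared_ptr<payload> separate{new payload};
  // Usually one allocation that holds control block and payload together.
  std::shared_ptr<payload> combined = std::make_shared<payload>();
  // Copying adds a strong reference to the same control block.
  std::shared_ptr<payload> copy = separate;
  // A weak reference observes the payload without keeping it alive.
  std::weak_ptr<payload> observer = combined;
  assert(separate.use_count() == 2);
  assert(combined.use_count() == 1);
  // Dropping the last strong reference destroys the payload.
  combined.reset();
  return observer.expired() && copy.use_count() == 2;
}
```

In both cases the shared pointer itself carries a pointer to the data and a pointer to the control block, which is the overhead CAF avoids.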
Finally, the design of the standard library becomes convoluted when an object
should be able to hand out a shared_ptr to itself. Classes must inherit from
std::enable_shared_from_this to navigate from an object to its control block.
This additional navigation path is required because std::shared_ptr needs two
pointers: one to the data and one to the control block. Programmers can still
use make_shared for such objects, in which case the object is again stored
along with the control block.
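As a brief illustration of this extra navigation path, the sketch below lets an object hand out a shared_ptr to itself. The class session and the function shared_from_this_demo are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <memory>

// Hypothetical class that needs to hand out shared pointers to itself.
struct session : std::enable_shared_from_this<session> {
  // Navigates from the object back to its control block.
  std::shared_ptr<session> self() {
    return shared_from_this();
  }
};

// Returns true if the pointer obtained from the object shares ownership
// with the original shared_ptr.
bool shared_from_this_demo() {
  // make_shared still works and stores object and control block together.
  auto owner = std::make_shared<session>();
  auto again = owner->self();
  assert(again.get() == owner.get());
  return owner.use_count() == 2;
}
```

Note that shared_from_this is only valid once the object is already owned by a shared_ptr.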
In CAF, we use a different approach than the standard library because (1) we
always allocate actors along with their control block, (2) we need additional
information in the control block, and (3) we can store only a single raw
pointer internally instead of the two raw pointers std::shared_ptr needs. The
following figure summarizes the design of smart pointers to actors.
CAF uses strong_actor_ptr instead of std::shared_ptr<...> and weak_actor_ptr
instead of std::weak_ptr<...>. Unlike the counterparts from the standard
library, both smart pointer types only store a single pointer.
Also, the control block in CAF is not a template and stores the identity of an
actor (actor_id plus node_id). This allows CAF to access this information even
after an actor has died. The control block fits exactly into a single cache
line (64 bytes). This makes sure no false sharing occurs between an actor and
other actors that have references to it. Since the size of the control block is
fixed and CAF guarantees the memory layout enforced by actor_storage, CAF can
compute the address of an actor from the pointer to its control block by
offsetting it by 64 bytes. Likewise, an actor can compute the address of its
control block.
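The offset arithmetic can be sketched with a toy layout. This is not CAF's actual actor_control_block or actor_storage: the names toy_control_block, toy_storage, toy_actor, data_of, ctrl_of, and layout_roundtrip are assumptions made for illustration, and the members of the real control block differ.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t cache_line_size = 64;

// Hypothetical control block: two counters plus an identity, padded to 64 bytes.
struct alignas(cache_line_size) toy_control_block {
  std::atomic<std::size_t> strong_refs{1};
  std::atomic<std::size_t> weak_refs{1};
  std::uint64_t id = 0; // stands in for actor_id plus node_id
};

static_assert(sizeof(toy_control_block) == cache_line_size,
              "control block must fill exactly one cache line");

// Hypothetical counterpart to actor_storage: control block first, data after.
template <class T>
struct toy_storage {
  toy_control_block ctrl;
  alignas(cache_line_size) T data;
};

// Control block -> data: move forward by one cache line.
inline void* data_of(toy_control_block* ctrl) {
  return reinterpret_cast<char*>(ctrl) + cache_line_size;
}

// Data -> control block: move backward by one cache line.
inline toy_control_block* ctrl_of(void* data) {
  return reinterpret_cast<toy_control_block*>(static_cast<char*>(data)
                                              - cache_line_size);
}

// Hypothetical actor type, used only for this illustration.
struct toy_actor {
  int mailbox_size = 0;
};

// Returns true if both directions of the address computation agree.
bool layout_roundtrip() {
  toy_storage<toy_actor> storage;
  assert(data_of(&storage.ctrl) == static_cast<void*>(&storage.data));
  return ctrl_of(&storage.data) == &storage.ctrl;
}
```

Because the offset is a compile-time constant, a smart pointer only needs to store the address of the control block.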
The smart pointer design in CAF relies on a few assumptions about actor types.
Most notably, the actor object is placed 64 bytes after the control block. This
starting address is cast to abstract_actor*. Hence, T* must be convertible to
abstract_actor* via reinterpret_cast. In practice, this means actor subclasses
must not use virtual inheritance, which is enforced in CAF with a
static_assert.
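One possible way to reject virtual inheritance at compile time is sketched below. This is an assumption about how such a check could look, not necessarily the check CAF performs: the trait static_downcastable and the types toy_abstract_actor, plain_actor, and virtual_actor are hypothetical. The trait exploits the rule that static_cast from a base pointer to a derived pointer is ill-formed when the base is virtual.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for abstract_actor.
struct toy_abstract_actor {
  virtual ~toy_abstract_actor() = default;
};

// Regular inheritance: accepted.
struct plain_actor : toy_abstract_actor {};

// Virtual inheritance: must be rejected.
struct virtual_actor : virtual toy_abstract_actor {};

// Primary template: assume the downcast is not possible.
template <class Base, class Derived, class = void>
struct static_downcastable : std::false_type {};

// Chosen only if static_cast<Derived*>(Base*) is well-formed, which is not
// the case when Base is a virtual base of Derived.
template <class Base, class Derived>
struct static_downcastable<
  Base, Derived,
  decltype(void(static_cast<Derived*>(std::declval<Base*>())))>
  : std::true_type {};

static_assert(static_downcastable<toy_abstract_actor, plain_actor>::value,
              "regular inheritance must be accepted");
static_assert(!static_downcastable<toy_abstract_actor, virtual_actor>::value,
              "virtual inheritance must be rejected");
```

This check only detects virtual bases. It does not prove that the base subobject sits at the start of the derived object, which is what the reinterpret_cast ultimately relies on.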
A strong reference manipulates the strong refs counter as shown above. An actor
is destroyed if there are zero strong references to it. If two actors keep
strong references to each other via member variables, neither actor can ever be
destroyed because they form a cycle. Strong references are formed by
strong_actor_ptr, actor, and typed_actor<...>.
A weak reference manipulates the weak refs counter. This counter keeps track of
how many references to the control block exist. The control block is destroyed
if there are zero weak references to an actor (which cannot occur before strong
refs has reached zero as well). No cycle occurs if two actors keep weak
references to each other, because the actor objects themselves can get
destroyed independently from their control block. A weak reference is only
formed by actor_addr.
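The interplay of the two counters can be modeled in a few lines. The sketch below is a single-threaded toy model, not CAF code: toy_block, release_weak, release_strong, and counter_demo are hypothetical names, and flags stand in for the actual destruction. It assumes that all strong references together hold one implicit weak reference, which is one common way to guarantee that the control block outlives the object.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, single-threaded model of a control block.
struct toy_block {
  std::size_t strong_refs = 1; // one initial strong reference
  std::size_t weak_refs = 1;   // assumed implicit weak reference of the strong refs
  bool object_alive = true;    // false once the actor object is destroyed
  bool block_alive = true;     // false once the control block is destroyed
};

// Drops one weak reference; the last one destroys the control block.
inline void release_weak(toy_block& block) {
  if (--block.weak_refs == 0)
    block.block_alive = false;
}

// Drops one strong reference; the last one destroys the object and then
// gives up the implicit weak reference.
inline void release_strong(toy_block& block) {
  if (--block.strong_refs == 0) {
    block.object_alive = false;
    release_weak(block);
  }
}

// Returns true if object and control block die in the expected order.
bool counter_demo() {
  toy_block block;
  ++block.weak_refs;      // somebody stores a weak reference
  release_strong(block);  // last strong reference goes away
  assert(!block.object_alive && block.block_alive);
  release_weak(block);    // last weak reference goes away
  return !block.block_alive;
}
```

A real implementation would use atomic counters and run destructors instead of toggling flags.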
The function actor_cast converts between actor pointers and handles. The first
common use case is to convert a strong_actor_ptr to either actor or
typed_actor<...> before being able to send messages to an actor. The second
common use case is to convert actor_addr to strong_actor_ptr to upgrade a weak
reference to a strong reference. Note that casting actor_addr to a strong actor
pointer or handle can result in invalid handles. The syntax for actor_cast
resembles built-in C++ casts. For example, actor_cast<actor>(x) converts x to a
handle of type actor.
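Upgrading a weak reference can fail once the last strong reference is gone, so the result needs to be checked before use. The sketch below shows the analogous pattern with the standard library only, where std::weak_ptr::lock plays the role of the upgrade. It does not use actor_cast, and the function read_or is a hypothetical helper.

```cpp
#include <cassert>
#include <memory>

// Returns the referenced value if the object is still alive and the
// fallback otherwise.
int read_or(const std::weak_ptr<int>& weak, int fallback) {
  // lock() yields an empty shared_ptr when no strong reference is left.
  if (auto strong = weak.lock())
    return *strong;
  return fallback;
}
```

The same rule of thumb applies to handles obtained from an actor_addr: test them for validity before sending messages.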
Cycles can occur only when class-based actors store references to other actors
in member variables. Stateful actors break cycles by destroying the state when
an actor terminates, before the destructor of the actor itself runs. This means
an actor releases all references to others automatically after calling quit.
However, class-based actors have to break cycles manually, because references
to others are not released until the destructor of an actor runs. Two actors
storing references to each other via member variables produce a cycle and
neither destructor can ever be called.
Class-based actors can break cycles manually by overriding on_exit() and
calling destroy(x) on each handle. Using a handle after destroying it is
undefined behavior, but it is safe to assign a new value to the handle.
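The effect of breaking a cycle by hand can be reproduced with standard shared pointers. The sketch below is an analogy rather than CAF code: toy_actor, its on_exit member, the counter live_objects, and leaked_after_cycle are hypothetical, and peer.reset() stands in for calling destroy(x) on a handle.

```cpp
#include <cassert>
#include <memory>

// Number of toy_actor instances that have not been destroyed yet.
int live_objects = 0;

// Hypothetical class-based actor that stores a strong reference as member.
struct toy_actor {
  std::shared_ptr<toy_actor> peer;
  toy_actor() { ++live_objects; }
  ~toy_actor() { --live_objects; }
  // Stand-in for on_exit(): release references to other actors.
  void on_exit() { peer.reset(); }
};

// Builds a cycle of two actors and returns how many objects survive after
// all external references are gone.
int leaked_after_cycle(bool break_manually) {
  int before = live_objects;
  {
    auto first = std::make_shared<toy_actor>();
    auto second = std::make_shared<toy_actor>();
    first->peer = second;
    second->peer = first;
    if (break_manually) {
      first->on_exit();
      second->on_exit();
    }
  }
  return live_objects - before;
}
```

Without the manual step, both objects keep each other alive and neither destructor runs. With it, both are destroyed as soon as the external references disappear.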