Description
While working on the MRA benchmark, I noticed that the design of the streaming terminals forces us to make two copies for each reduction: the new element to be reduced and the copy of the result into the old element. That seems way too inefficient...
The current signature looks like this:
T&& reduce_fn(T&& t, U&& u);
Notice that reduce_fn takes ownership of both arguments and returns ownership of the result (which might be t, as in MRA). This might be OK for simple types (integers, doubles) but is terrible for structures, especially PODs, for which a move is no cheaper than a copy. It forces us to make a copy before invocation in order to keep the copy that we're tracking intact (after all, we might want to pass it to other tasks too); see https://github.com/TESSEorg/ttg/blob/master/ttg/ttg/parsec/ttg.h#L1349. Note also that since reduce_fn is wrapped by std::function, the compiler cannot elide the return value copy.
I'd like to propose changing the signature so that no copies are required in the backend:
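To make the copy before invocation concrete, here is a minimal sketch, not the actual TTG backend code. `Payload`, `payload_copies`, `reduce_current` and `reduce_keeping_input` are hypothetical names made up for illustration; the sketch only models the input copy, not the copy of the result.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical counter of copy constructions (illustration only).
int payload_copies = 0;

// Hypothetical payload type; not a TTG type.
struct Payload {
  int value = 0;
  explicit Payload(int v) : value(v) {}
  Payload(const Payload& o) : value(o.value) { ++payload_copies; }
  Payload(Payload&& o) noexcept : value(o.value) {}
};

// Reducer with the current signature: consumes both arguments and
// hands back its first argument, as described for MRA.
Payload&& reduce_current(Payload&& t, Payload&& u) {
  t.value += u.value;
  return std::move(t);
}

// Hypothetical caller that must keep `incoming` usable afterwards,
// e.g. because it may still be passed to other tasks.
int reduce_keeping_input(Payload& accumulator, const Payload& incoming) {
  std::function<Payload && (Payload&&, Payload&&)> fn = reduce_current;
  // Preemptive copy: fn is allowed to consume its second argument.
  Payload scratch(incoming);
  Payload&& result = fn(std::move(accumulator), std::move(scratch));
  // This particular reducer returns a reference to its first argument.
  assert(&result == &accumulator);
  return result.value;
}
```

The caller cannot know whether the reducer consumes u, so the copy into `scratch` has to happen regardless of what the reducer actually does with it.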
void reduce_fn(T& t, const U& u);
That way, t can be updated with u but u stays constant. If that involves copying u, so be it, but we shouldn't be forced to make a copy preemptively in the backend.
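A rough sketch of what a caller could look like with the proposed signature, again not actual TTG code. `Node`, `node_copies`, `reduce_in_place` and `reduce_without_copy` are hypothetical names, and the sketch assumes t and u never alias.

```cpp
#include <cassert>
#include <functional>

// Hypothetical counter of copy constructions (illustration only).
int node_copies = 0;

// Hypothetical payload type; not a TTG type.
struct Node {
  int value = 0;
  explicit Node(int v) : value(v) {}
  Node(const Node& o) : value(o.value) { ++node_copies; }
  Node(Node&& o) noexcept : value(o.value) {}
};

// Reducer with the proposed signature: updates t in place, u stays constant.
void reduce_in_place(Node& t, const Node& u) {
  assert(&t != &u);  // assumption of this sketch: arguments do not alias
  t.value += u.value;
}

// Hypothetical caller: both objects are passed by reference through
// std::function, so no copy is made and `incoming` remains usable.
int reduce_without_copy(Node& accumulator, const Node& incoming) {
  std::function<void(Node&, const Node&)> fn = reduce_in_place;
  fn(accumulator, incoming);
  return accumulator.value;
}
```

Since nothing is returned, there is also no return value for std::function to pass along, and a reducer that really needs its own copy of u can still make one itself.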