RFC: more verbose, more readable TT construction #264
devreal wants to merge 3 commits into TESSEorg:master
Idea: create a TT from a functor and add inputs, outputs, and names step by step to improve code readability. Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
|
Aren't you missing the ttb in the |
|
Ahh right, added. |
|
Alternative: make the edge the centerpiece of the connection instead of the TT:
Since edges can have names too, we would default to the edge name for the terminals if no explicit name is given. This really is syntactic sugar based on the initial proposal. It makes it clearer which inputs and outputs are connected.
|
Inspired by https://github.com/mangpo/floem: do we really need edges? We use them as a vehicle to glue two terminals together. A possible alternative could use the stream operators on the terminals directly:
When connecting only two terminals, the order can be reversed:
Connecting multiple TT inputs to the first output of
The stream operator does suggest some form of ordering between the TT inputs, which is not intended here. That may be somewhat misleading?
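To make the stream-operator idea concrete, here is a minimal, self-contained sketch. It does not use TTG: `InTerminal`, `OutTerminal`, and both operators are hypothetical stand-ins that only record which inputs an output feeds.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for terminals; these are NOT TTG types.
struct InTerminal {
  std::string name;
};

struct OutTerminal {
  std::string name;
  std::vector<const InTerminal*> successors;  // inputs fed by this output
};

// out >> in: connect an output to an input.
// Returns the output so that several inputs can be chained onto it.
inline OutTerminal& operator>>(OutTerminal& out, const InTerminal& in) {
  out.successors.push_back(&in);
  return out;
}

// in << out: the same connection written in reverse order.
inline const InTerminal& operator<<(const InTerminal& in, OutTerminal& out) {
  out.successors.push_back(&in);
  return in;
}
```

In this toy model `out >> b >> c` attaches both `b` and `c` to `out`. The left-to-right chain is exactly where a reader could infer an ordering between `b` and `c` that the model does not define, which is the concern raised above.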
|
Notes from the August 31, 2023 meeting

The current state with strong typing of inputs and outputs for improved readability:

    ttg::Edge<int, float> a2b, b2a;
    /* construct without edges */
    auto tta = ttg::make_tt([](int key, float value){}, ttg::input(a2b), ttg::output(b2a),
                            "A", ttg::input_names{"b2a"}, ttg::output_names{"a2b"});
    auto ttb = ttg::make_tt([](int key, float value){}, ttg::input(b2a), ttg::output(a2b),
                            "B", ttg::input_names{"a2b"}, ttg::output_names{"b2a"});

Still somewhat confusing and even more cluttered (IMO). As @therault suggested, having a way to specify the signature of the task upon creation and providing implementations later would be useful for multi-implementation tasks:

    auto tta = ttg::make_multi_tt(ttg::key<int>{}, ttg::args<float>{}); /* names are placeholders */
    tta->add_impl<ttg::ExecutionSpace::CPU>([](int, float){ /* host implementation here */ });
    tta->add_impl<ttg::ExecutionSpace::CUDA>([](int, float){ /* CUDA implementation here */ });
    tta->add_impl<ttg::ExecutionSpace::HIP>([](int, float){ /* HIP implementation here */ });

This would allow us to use generic arguments again. The question on how to express the connection between TTs is orthogonal. @evaleev expressed concern that this adds yet another way of doing the same thing (on top of the original |
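The "signature first, implementations later" idea from the meeting notes could be modeled roughly as follows. This is a standalone sketch rather than TTG code: `MultiTask`, `has_impl`, and `invoke` are hypothetical names, and the `ExecutionSpace` enum is redefined locally for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>

// Local stand-in for an execution-space tag; not the TTG enum.
enum class ExecutionSpace { CPU, CUDA, HIP };

// Hypothetical task whose key/argument types are fixed at construction,
// while per-device implementations are registered afterwards.
template <typename Key, typename... Args>
class MultiTask {
 public:
  using Impl = std::function<void(const Key&, const Args&...)>;

  // Register (or replace) the implementation for one execution space.
  template <ExecutionSpace Space>
  void add_impl(Impl impl) {
    impls_[Space] = std::move(impl);
  }

  bool has_impl(ExecutionSpace space) const {
    return impls_.count(space) != 0;
  }

  // Run the implementation for `space`; returns false if none is registered.
  bool invoke(ExecutionSpace space, const Key& key, const Args&... args) const {
    auto it = impls_.find(space);
    if (it == impls_.end()) return false;
    it->second(key, args...);
    return true;
  }

 private:
  std::map<ExecutionSpace, Impl> impls_;
};
```

Because the signature is part of the type, every registered callable is checked against `(Key, Args...)` at compile time, and generic lambdas such as `[](auto key, auto value){}` are accepted as well.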
|
Alternative to chaining the
|
|
Another proposal at the end of the day: |
could be shortened to: by overloading |
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
Signed-off-by: Joseph Schuchart <schuchart@icl.utk.edu>
|
At the last meeting we discussed using

    auto tta = ttg::make_tt([](int key, float value){});
    auto ttb = ttg::make_tt([](int key, float value){});
    ttg::connect(ttb->out<0>(), tta->in<0>()); // B -> A
    ttg::connect(tta->out<0>(), ttb->in<0>()); // A -> B

The problem is that we know the type of input terminals (from the argument list) but not the type of output terminals (because we have no edges to infer from). I found that there is already a version of

    auto tta = ttg::make_tt([](int key, float value){});
    auto ttb = ttg::make_tt([](int key, float value){});
    ttg::connect<0, 0>(ttb, tta); // B -> A
    ttg::connect<0, 0>(tta, ttb); // A -> B

We could default the indices to 0 if we wanted. Probably should add the ability to pass names... Also missing: output terminal fusion. Maybe we can pass a tuple of TTs and array of indices from which to fuse the outputs?

    auto tta = ttg::make_tt([](int key, float value){});
    auto ttb = ttg::make_tt([](int key, float value){});
    auto ttc = ttg::make_tt([](int key, float value){});
    ttg::connect<{0, 0}, 0>({ttb, ttc}, tta); // {B|C} -> A
    ttg::connect<0, 0>(tta, ttb); // A -> B

(C++20 allows arrays as template arguments without specifying their size, but that only seems to work with GCC so far)
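The index-based `connect` with defaulted indices can be illustrated with a small standalone model. Nothing here is TTG API: `In`, `Out`, `Task`, and this `connect` are hypothetical, and the sketch assumes output terminal types are known, which is precisely the open problem mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>

// Hypothetical typed terminals that only count their connections.
template <typename T>
struct In {
  using value_type = T;
  int num_producers = 0;
};

template <typename T>
struct Out {
  using value_type = T;
  int num_consumers = 0;
};

// Hypothetical task: a tuple of inputs and a tuple of outputs.
template <typename InTuple, typename OutTuple>
struct Task {
  InTuple ins;
  OutTuple outs;
};

// Connect output OutIdx of `producer` to input InIdx of `consumer`.
// Both indices default to 0, as suggested in the discussion.
template <std::size_t OutIdx = 0, std::size_t InIdx = 0,
          typename Producer, typename Consumer>
void connect(Producer& producer, Consumer& consumer) {
  auto& out = std::get<OutIdx>(producer.outs);
  auto& in = std::get<InIdx>(consumer.ins);
  using OutT = typename std::decay_t<decltype(out)>::value_type;
  using InT = typename std::decay_t<decltype(in)>::value_type;
  static_assert(std::is_same<OutT, InT>::value,
                "output and input terminal types must match");
  ++out.num_consumers;
  ++in.num_producers;
}
```

In this model, fusing the outputs of B and C into the same input of A is simply two `connect` calls that target input 0 of A. Whether that is acceptable in place of the tuple-and-index-array form proposed above is a design question the sketch does not settle.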
|
I just explained the interface of TTG to someone using the MRA code and I realized once again how non-intuitive this is. Esp. the Edges make it hard because just by looking at the code it's not clear what data goes where. So instead of

    ttg::Edge<mra::Key<NDIM>, void> project_control;
    ttg::Edge<mra::Key<NDIM>, mra::FunctionsReconstructedNode<T, NDIM>> project_result;
    ttg::Edge<mra::Key<NDIM>, mra::FunctionsCompressedNode<T, NDIM>> compress_result;
    auto start = make_start(project_control);
    auto project = make_project(db, gauss_buffer, N, K, functiondata, T(1e-6), project_control, project_result);
    // C(P)
    auto compress = make_compress(N, K, functiondata, project_result, compress_result);
    // R(C(P))
    auto reconstruct = make_reconstruct(N, K, functiondata, compress_result, reconstruct_result);

I think this would be easier to parse:

    auto start = make_start(project_control);
    auto project = make_project(db, gauss_buffer, N, K, functiondata, T(1e-6), start.out<0>());
    // C(P)
    auto compress = make_compress(N, K, functiondata, project.out<0>());
    // R(C(P))
    auto reconstruct = make_reconstruct(N, K, functiondata, compress.out<0>());

Inside the |
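A rough standalone model of passing `out<0>()` from one factory to the next is sketched below. All names (`OutTerminal`, `Node`, `Reconstructed`, `Compressed`, and the simplified `make_project` / `make_compress` signatures) are hypothetical and much reduced compared to the MRA code; the point is only that the data flow becomes visible at the call site and is checked by the type system.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical typed output terminal; records the names of its consumers.
template <typename T>
struct OutTerminal {
  std::vector<std::string> consumers;
};

// Hypothetical task node that exposes its outputs by index.
template <typename... OutTypes>
struct Node {
  std::string name;
  std::tuple<OutTerminal<OutTypes>...> outs;

  template <std::size_t I>
  auto& out() {
    return std::get<I>(outs);
  }
};

// Placeholder payload types standing in for the real node types.
struct Reconstructed {};
struct Compressed {};

inline Node<Reconstructed> make_project() {
  Node<Reconstructed> node;
  node.name = "project";
  return node;
}

// The factory takes the upstream terminal directly, so the call site
// shows where the input comes from and no separate edge is declared.
inline Node<Compressed> make_compress(OutTerminal<Reconstructed>& input) {
  input.consumers.push_back("compress");
  Node<Compressed> node;
  node.name = "compress";
  return node;
}
```

With these definitions, `make_compress(project.out<0>())` compiles, while passing `compress.out<0>()` to `make_compress` would be rejected because the terminal carries `Compressed` rather than `Reconstructed`.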
a2878ce to a49eb91
Based on a discussion between @bosilca, @therault and myself, I started to extend the API to improve readability. This gets us a bit closer to the style of TaskFlow.
Example:
The old way:
The new way:
This patch preserves the old way and adds the partial construction. The named member functions provide context for the arguments and help structure the code, at the cost of being more verbose.
Notes:
make_tt.