Commit d962cc5

doc: Use inner/outer splitter, analogize to zip/product
Additional minor wording changes for clarity.
1 parent 3233fab commit d962cc5

2 files changed: +16 -15 lines changed

docs/source/explanation/design-approach.rst

Lines changed: 7 additions & 7 deletions
@@ -7,16 +7,16 @@ Rationale
77

88
Scientific workflows often require sophisticated analyses that encompass a large collection
99
of algorithms.
10-
The algorithms, that were originally not necessarily designed to work together,
11-
and were written by different authors.
12-
Some may be written in Python, while others might require calling external programs.
10+
These algorithms are frequently written by different authors, and rarely designed to work together.
11+
Some may be written in Python, our language of choice,
12+
while others might require calling external programs.
1313
It is a common practice to create semi-manual workflows that require the scientists
1414
to handle the files and interact with partial results from algorithms and external tools.
1515
This approach is conceptually simple and easy to implement, but the resulting workflow
1616
is often time consuming, error-prone and difficult to share with others.
1717
Consistency, reproducibility and scalability demand scientific workflows
18-
to be organized into fully automated pipelines.
19-
This was the motivation behind Pydra - a new dataflow engine written in Python.
18+
to be organized into fully-automated pipelines.
19+
This was the motivation behind Pydra - a dataflow engine written in Python.
2020

2121
History
2222
-------
@@ -39,8 +39,8 @@ Goals
3939

4040
The goal of Pydra is to provide a lightweight dataflow engine for computational graph construction,
4141
manipulation, and distributed execution, as well as ensuring reproducibility of scientific pipelines.
42-
In Pydra, a dataflow is represented as a directed acyclic graph, where each node represents a Python
43-
function, execution of an external tool, or another reusable dataflow.
42+
In Pydra, a dataflow is represented as a directed acyclic graph, where each node represents
43+
the invocation of a Python function, an external tool, or another reusable dataflow.
4444
The combination of several key features makes Pydra a customizable and powerful dataflow engine:
4545

4646
- Composable dataflows: Any node of a dataflow graph can be another dataflow, allowing for nested

docs/source/explanation/splitting-combining.rst

Lines changed: 9 additions & 8 deletions
@@ -31,30 +31,30 @@ nodes represent stateless copies of the original Task after splitting the input,
3131

3232
Types of Splitter
3333
-----------------
34-
Whenever a *Task* has more complicated inputs,
35-
i.e. multiple fields, there are two ways of creating the mapping,
36-
each one is used for different application.
37-
These *splitters* are called *scalar splitter* and *outer splitter*.
34+
Whenever a *Task* has more complicated inputs, for example, multiple fields,
35+
there are two ways of creating the mapping, each used for a different application.
36+
These *splitters* are called *inner splitter* and *outer splitter*.
3837
They use a special, but Python-based syntax as described next.
3938

40-
Scalar Splitter
39+
Inner Splitter
4140
---------------
42-
A *scalar splitter* performs element-wise mapping and requires that the lists of
43-
values for two or more fields to have the same length. The *scalar splitter* uses
41+
An *inner splitter* performs element-wise mapping and requires that the lists of
42+
values for two or more fields have the same length. The *inner splitter* uses
4443
Python tuples and its operation is therefore represented by a parenthesis, ``()``:
4544

4645
.. math::
4746
4847
S = (x, y) : x=[x_1, x_2, .., x_n],~y=[y_1, y_2, .., y_n] \mapsto (x, y)=(x_1, y_1),..., (x, y)=(x_n, y_n),
4948
50-
5149
where `S` represents the *splitter*, `x` and `y` are the input fields.
5250
This is also represented as a diagram:
5351

5452
.. figure:: ../_static/images/nd_spl_4.png
5553
:figclass: h!
5654
:scale: 80%
5755

56+
Inner splitters can be analogized to the Python builtin function :func:`zip`.
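
The :func:`zip` analogy can be illustrated in plain Python. This is a minimal sketch, not Pydra code: the lists ``x`` and ``y`` and their values are hypothetical stand-ins for the input fields in the formula above, and no Pydra API is used.

```python
# Hypothetical input fields; the names x and y mirror the formula above.
x = [1, 2, 3]
y = [10, 20, 30]

# An inner splitter requires equal-length lists. Plain zip() would silently
# truncate to the shorter list, so the length is checked explicitly here.
if len(x) != len(y):
    raise ValueError("inner splitting needs lists of the same length")

# Element-wise pairing: (x_1, y_1), (x_2, y_2), ..., (x_n, y_n)
inner_pairs = list(zip(x, y))
print(inner_pairs)  # [(1, 10), (2, 20), (3, 30)]
```

Each resulting pair corresponds to one set of inputs in the element-wise mapping, so ``n`` input values per field yield ``n`` pairs.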
57+
5858

5959
Outer Splitter
6060
--------------
@@ -85,5 +85,6 @@ and `inp3`. This can be extended to arbitrary complexity.
8585
In addition, the output can be merged at the end if needed.
8686
This will be explained in the next section.
8787

88+
Outer splitters can be analogized to the Python function :func:`itertools.product`.
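
The :func:`itertools.product` analogy can likewise be sketched in plain Python. The lists ``x`` and ``y`` and their values are hypothetical, and the snippet only illustrates the pairing pattern rather than any Pydra API.

```python
from itertools import product

# Hypothetical input fields; unlike the element-wise case, the lengths may differ.
x = [1, 2]
y = [10, 20, 30]

# Every value of x is paired with every value of y.
outer_pairs = list(product(x, y))
print(outer_pairs)
# [(1, 10), (1, 20), (1, 30), (2, 10), (2, 20), (2, 30)]
```

With ``n`` values for ``x`` and ``m`` values for ``y``, this pairing produces ``n * m`` combinations, here ``2 * 3 = 6``.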
8889

8990
.. _Map-Reduce: https://en.wikipedia.org/wiki/MapReduce
