Commit 2705bc8

alamb, wiedld, and 2010YOUY01 authored
Minor: improve documentation of AggregateMode (#14946)
* Minor: improve documentation of AggregateMode

* Update datafusion/physical-plan/src/aggregates/mod.rs

Co-authored-by: wiedld <[email protected]>

* Update datafusion/physical-plan/src/aggregates/mod.rs

Co-authored-by: Yongting You <[email protected]>

* chore: fix link

---------

Co-authored-by: wiedld <[email protected]>
Co-authored-by: Yongting You <[email protected]>
1 parent 787adf0 commit 2705bc8

1 file changed, +28 -9 lines changed

datafusion/physical-plan/src/aggregates/mod.rs

Lines changed: 28 additions & 9 deletions
@@ -57,41 +57,60 @@ mod row_hash;
 mod topk;
 mod topk_stream;
 
-/// Hash aggregate modes
+/// Aggregation modes
 ///
 /// See [`Accumulator::state`] for background information on multi-phase
 /// aggregation and how these modes are used.
 #[derive(Debug, Copy, Clone, PartialEq, Eq)]
 pub enum AggregateMode {
+    /// One of multiple layers of aggregation, any input partitioning
+    ///
     /// Partial aggregate that can be applied in parallel across input
     /// partitions.
     ///
     /// This is the first phase of a multi-phase aggregation.
     Partial,
+    /// *Final* of multiple layers of aggregation, in exactly one partition
+    ///
     /// Final aggregate that produces a single partition of output by combining
     /// the output of multiple partial aggregates.
     ///
     /// This is the second phase of a multi-phase aggregation.
+    ///
+    /// This mode requires that the input is a single partition
+    ///
+    /// Note: Adjacent `Partial` and `Final` mode aggregation is equivalent to a `Single`
+    /// mode aggregation node. The `Final` mode is required since this is used in an
+    /// intermediate step. The [`CombinePartialFinalAggregate`] physical optimizer rule
+    /// will replace this combination with `Single` mode for more efficient execution.
+    ///
+    /// [`CombinePartialFinalAggregate`]: https://docs.rs/datafusion/latest/datafusion/physical_optimizer/combine_partial_final_agg/struct.CombinePartialFinalAggregate.html
     Final,
+    /// *Final* of multiple layers of aggregation, input is *Partitioned*
+    ///
     /// Final aggregate that works on pre-partitioned data.
     ///
-    /// This requires the invariant that all rows with a particular
-    /// grouping key are in the same partitions, such as is the case
-    /// with Hash repartitioning on the group keys. If a group key is
-    /// duplicated, duplicate groups would be produced
+    /// This mode requires that all rows with a particular grouping key are in
+    /// the same partitions, such as is the case with Hash repartitioning on the
+    /// group keys. If a group key is duplicated, duplicate groups would be
+    /// produced
     FinalPartitioned,
+    /// *Single* layer of Aggregation, input is exactly one partition
+    ///
     /// Applies the entire logical aggregation operation in a single operator,
     /// as opposed to Partial / Final modes which apply the logical aggregation using
     /// two operators.
     ///
     /// This mode requires that the input is a single partition (like Final)
     Single,
+    /// *Single* layer of Aggregation, input is *Partitioned*
+    ///
     /// Applies the entire logical aggregation operation in a single operator,
-    /// as opposed to Partial / Final modes which apply the logical aggregation using
-    /// two operators.
+    /// as opposed to Partial / Final modes which apply the logical aggregation
+    /// using two operators.
     ///
-    /// This mode requires that the input is partitioned by group key (like
-    /// FinalPartitioned)
+    /// This mode requires that the input has more than one partition, and is
+    /// partitioned by group key (like FinalPartitioned).
     SinglePartitioned,
 }
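
The partitioning requirements documented for these modes can be illustrated with a small toy model. The sketch below is not DataFusion code: `Row`, `Partition`, `sum_by_key`, `partial_then_final`, `single`, and `per_partition` are hypothetical names invented for illustration, and a plain `SUM ... GROUP BY` over in-memory vectors stands in for the real operators. Under those assumptions, it shows that a `Partial` step followed by a `Final` step gives the same result as a `Single` step, and that aggregating each partition independently (as the `FinalPartitioned` and `SinglePartitioned` docs describe) yields duplicate groups when the input is not partitioned by group key.

```rust
use std::collections::HashMap;

// Toy stand-ins for rows and partitions; these are NOT DataFusion types.
type Row = (&'static str, i64); // (group key, value)
type Partition = Vec<Row>;

/// SUM(value) GROUP BY key over one partition.
/// Plays the role of one `Partial` (or `Single`) step in this toy model.
fn sum_by_key(rows: &[Row]) -> HashMap<&'static str, i64> {
    let mut acc = HashMap::new();
    for &(key, value) in rows {
        *acc.entry(key).or_insert(0) += value;
    }
    acc
}

/// `Partial` on every partition, then a `Final`-style merge of the
/// intermediate sums into a single output.
fn partial_then_final(partitions: &[Partition]) -> HashMap<&'static str, i64> {
    let mut merged = HashMap::new();
    for partition in partitions {
        for (key, partial_sum) in sum_by_key(partition) {
            *merged.entry(key).or_insert(0) += partial_sum;
        }
    }
    merged
}

/// `Single`-style: treat the whole input as one partition.
fn single(partitions: &[Partition]) -> HashMap<&'static str, i64> {
    let all_rows: Vec<Row> = partitions.iter().flatten().copied().collect();
    sum_by_key(&all_rows)
}

/// `FinalPartitioned` / `SinglePartitioned`-style: every partition is
/// aggregated on its own and the outputs are concatenated, never merged.
fn per_partition(partitions: &[Partition]) -> Vec<(&'static str, i64)> {
    let mut out: Vec<_> = partitions.iter().flat_map(|p| sum_by_key(p)).collect();
    out.sort(); // sorted only to make the output deterministic
    out
}

fn main() {
    // Key "a" appears in both partitions: NOT partitioned by group key.
    let unpartitioned = vec![vec![("a", 1), ("b", 2)], vec![("a", 3)]];
    assert_eq!(partial_then_final(&unpartitioned), single(&unpartitioned));
    println!("{:?}", per_partition(&unpartitioned)); // group "a" is duplicated

    // Each key lives in exactly one partition, as after hash repartitioning.
    let by_key = vec![vec![("a", 1), ("a", 3)], vec![("b", 2)]];
    println!("{:?}", per_partition(&by_key)); // one row per group
}
```

In this model the only difference between the two inputs is where the rows for key `a` live, which is the invariant the `FinalPartitioned` and `SinglePartitioned` docs call out.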
