Use "aggregating functions" consistently (#686) (#689)
Plus some minor editorial corrections.
JPryce-Aklundh authored Aug 11, 2023
1 parent 5ff2645 commit fd00ac1
Showing 1 changed file with 30 additions and 31 deletions.
61 changes: 30 additions & 31 deletions modules/ROOT/pages/functions/aggregating.adoc
@@ -7,7 +7,7 @@

Aggregating functions take a set of values and calculate an aggregated value over them.
Aggregation can be computed over all the matching paths, or it can be further divided by introducing grouping keys.
Grouping keys are non-aggregate expressions that are used to group the values going into the aggregate functions.
Grouping keys are non-aggregating expressions that are used to group the values going into the aggregating functions.

For example, given the following query containing two return expressions, `n` and `+count(*)+`:

@@ -16,25 +16,25 @@ For example, given the following query containing two return expressions, `n` an
RETURN n, count(*)
----

The first, `n` is not an aggregate function, so it will be the grouping key.
The latter, `+count(*)+` is an aggregate expression.
The first, `n`, is not an aggregating function, so it will be the grouping key.
The latter, `count(*)`, is an aggregating function.
The matching paths will be divided into different buckets, depending on the grouping key.
The aggregate function will then be run on these buckets, calculating an aggregate value per bucket.
The aggregating function will then be run on these buckets, calculating an aggregate value per bucket.
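
As a minimal sketch of this bucketing, the following query groups on `person.city` and counts the rows in each bucket.
The inline list of maps and the names `person`, `city`, and `people` are made up for illustration and are not part of the example graph.

[source, cypher]
----
// Hypothetical inline data, used so the query needs no existing graph.
UNWIND [
  {name: 'Ada', city: 'Oslo'},
  {name: 'Bo', city: 'Oslo'},
  {name: 'Cy', city: 'Malmo'}
] AS person
// person.city is the grouping key; count(*) is evaluated once per bucket.
RETURN person.city AS city, count(*) AS people
----

Under these assumptions the rows fall into one bucket per distinct city, and `count(*)` is computed separately for each bucket.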

The input expression of an aggregation function can contain any expression, including expressions that are not grouping keys.
However, not all expressions can be composed with aggregation functions.
The example below will throw an error since `n.x`, which is not a grouping key, is combined with the aggregation expression `+count(*)+`.
The input expression of an aggregating function can contain any expression, including expressions that are not grouping keys.
However, not all expressions can be composed with aggregating functions.
The example below will throw an error since `n.x`, which is not a grouping key, is combined with the aggregating function `count(*)`.
For more information, see xref:functions/aggregating.adoc#grouping-keys[Grouping keys].

[source, cypher, role=test-skip]
----
RETURN n.x + count(*)
----

To use aggregations to sort the result set, the aggregation must be included in the `RETURN` to be used in the `ORDER BY`.
To sort the result set using aggregating functions, the aggregation must be included in the `ORDER BY` sub-clause following the `RETURN` clause.
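
One way to sort on an aggregated value is sketched below, assuming hypothetical inline data rather than the example graph.
The aggregation is given an alias in `RETURN` (here `people`, an illustrative name) and `ORDER BY` sorts on that alias.

[source, cypher]
----
// Hypothetical inline data, used so the query needs no existing graph.
UNWIND [
  {name: 'Ada', city: 'Oslo'},
  {name: 'Bo', city: 'Oslo'},
  {name: 'Cy', city: 'Malmo'}
] AS person
// Group by city, then sort the groups by their aggregated count.
RETURN person.city AS city, count(*) AS people
ORDER BY people DESC
----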

The `DISTINCT` operator works in conjunction with aggregation.
It is used to make all values unique before running them through an aggregate function.
It is used to make all values unique before running them through an aggregating function.
More information about `DISTINCT` can be found in xref::syntax/operators.adoc#query-operators-aggregation[Syntax -> Aggregation operators].

== Example graph
@@ -1063,30 +1063,29 @@ The sum of the two supplied Durations is returned:
[[grouping-keys]]
== Grouping keys

Aggregation expressions are expressions which contain one or more aggregation functions.
A simple aggregation expression consists of a single aggregation function.
For instance, `SUM(x.a)` is an aggregation expression that only consists of the aggregation function `SUM( )` with `x.a` as its argument.
Aggregation expressions are also allowed to be more complex, where the result of one or more aggregation functions are input arguments to other expressions.
For instance, `0.1 * (SUM(x.a) / COUNT(x.b))` is an aggregation expression that contains two aggregation functions, `SUM( )` with `x.a` as its argument and `COUNT( )` with `x.b` as its argument.
Aggregating expressions are expressions which contain one or more aggregating functions.
A simple aggregating expression consists of a single aggregating function.
For instance, `sum(x.a)` is an aggregating expression that only consists of the aggregating function `sum( )` with `x.a` as its argument.
Aggregating expressions are also allowed to be more complex, where the results of one or more aggregating functions are input arguments to other expressions.
For instance, `0.1 * (sum(x.a) / count(x.b))` is an aggregating expression that contains two aggregating functions, `sum( )` with `x.a` as its argument and `count( )` with `x.b` as its argument.
Both are input arguments to the division expression.


For aggregation expressions to be correctly computable for the buckets formed by the grouping key(s), they have to fulfill some requirements.
Specifically, each sub expression in an aggregation expression has to be either:
For aggregating expressions to be correctly computable for the buckets formed by the grouping key(s), they have to fulfill some requirements.
Specifically, each sub-expression in an aggregating expression has to be either:

* an aggregation function, e.g. `SUM(x.a)`,
* an aggregating function, e.g. `sum(x.a)`,
* a constant, e.g. `0.1`,
* a parameter, e.g. `$param`,
* a grouping key, e.g. the `a` in `RETURN a, count(*)`,
* a local variable, e.g. the `x` in `count(*) + size([ x IN range(1, 10) | x ])`, or
* a subexpression, all whose operands are operands allowed in an aggregation expression.
* a sub-expression, all operands of which have to be allowed in an aggregating expression.
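
The requirements above can be illustrated with a small sketch.
The inline data and the names `x`, `g`, and `ratio` are assumptions made for this illustration only.
Every sub-expression outside the aggregating functions is either a constant (`0.1`) or a grouping key (`x.g`), so the query should be accepted.

[source, cypher]
----
// Hypothetical inline data, used so the query needs no existing graph.
UNWIND [
  {g: 'a', a: 10, b: 1},
  {g: 'a', a: 20, b: 2},
  {g: 'b', a: 30, b: 3}
] AS x
// x.g is the grouping key (a map access).
// sum(x.a) and count(x.b) are aggregating functions; 0.1 is a constant.
RETURN x.g AS g, 0.1 * (sum(x.a) / count(x.b)) AS ratio
----

Note that `x.a` and `x.b` are not grouping keys, which is fine here because they only appear inside the aggregating functions.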


[[grouping-key-examples]]
=== Examples of aggregation expressions.
=== Examples of aggregating expressions


.Simple aggregation without any grouping keys:
.Simple aggregation without any grouping keys
======
.Query
@@ -1107,7 +1106,7 @@ RETURN max(p.age)
======


.Addition of an aggregation and a constant, without any grouping keys:
.Addition of an aggregation and a constant, without any grouping keys
======
.Query
@@ -1128,7 +1127,7 @@ RETURN max(p.age) + 1
======


.Subtraction of a property access and an aggregation.
.Subtraction of a property access and an aggregation
======
Note that `p` is a grouping key:
@@ -1174,17 +1173,17 @@ RETURN p.age, p.age - max(f.age)
Grouping keys themselves can be complex expressions.
For better query readability, Cypher only recognizes a sub-expression in aggregation expressions as a grouping key if the grouping key is either:
For better query readability, Cypher only recognizes a sub-expression in aggregating expressions as a grouping key if the grouping key is either:
- A variable - e.g. the `p` in `RETURN p, p.age - max(f.age)`
- A property access - e.g. the `p.age` in `RETURN p.age, p.age - max(f.age)`
- A map access - e.g. the `p.age` in `WITH {name:'Keanu Reeves', age:58} AS p RETURN p.age, p.age - max(p.age)`
- A variable - e.g. the `p` in `RETURN p, p.age - max(f.age)`.
- A property access - e.g. the `p.age` in `RETURN p.age, p.age - max(f.age)`.
- A map access - e.g. the `p.age` in `WITH {name:'Keanu Reeves', age:58} AS p RETURN p.age, p.age - max(p.age)`.
If more complex grouping keys are needed as operands in aggregation expression, it is always possible to project them in advance with `WITH`.
If more complex grouping keys are needed as operands in an aggregating expression, it is always possible to project them in advance using `WITH`.
Using the property `p.age` will throw an exception, since `p.age` is not a grouping key.
Therefore, it cannot be used in the expressions which contain the aggregating function.
The below two queries would consequently return the same error message:
Using the property `p.age` will throw an exception, since `p.age` is not a grouping key.
Therefore, it cannot be used in the expressions which contain the aggregating function.
The below two queries would consequently return the same error message:
.Query
[source, cypher, role=test-fail]