wildarch · wildarch · Nov 13, 2025 · Nov 10, 2025 · Nov 12, 2025 · Nov 12, 2025
diff --git a/_config.yml b/_config.yml
@@ -5,6 +5,10 @@ theme: just-the-docs
 url: https://wildarch.dev/graphalg
 
 callouts:
+  note:
+    title: Note
+    color: blue
+
   warning:
     title: Warning
     color: red

diff --git a/spec/core/relalg.md b/spec/core/relalg.md
@@ -6,8 +6,211 @@ nav_order: 3
 ---
 
 # Conversion to Relational Algebra
-TODO:
-- Data model for the conversion
-- Definition of the assumed loop operator
-- Definition of the assumed aggregation operation
-- Rewrite rules for the conversion
+This page describes how GraphAlg Core operations can be converted into an extended relational algebra.
+
+{: .note }
+Conversion to relational algebra requires that all matrices use concrete types rather than abstract dimension symbols.
+The `graphalg-set-dimensions` pass can be used to set concrete values for dimension symbols.
+
+## Data Model
+The target system/algebra is assumed to support the following data types:
+- Boolean, denoted `i1`
+- 64-bit signed integer, denoted `i64`
+- 64-bit floating-point (IEEE 754 is assumed not strictly required), denoted `f64`
+
+It must support the following standard relational algebra operators:
+- projection: Drops or reorders input columns, or computes new columns based on existing ones using per-tuple operations.
+- selection: Keeps a subset of the input tuples based on a per-tuple predicate.
+- join: Combines two or more relations.
+
+Additionally, an operator to perform aggregation is required, which is not strictly part of relational algebra, but is commonly available in relational database systems.
+The `aggregate` operator must support grouping tuples by key columns, and it must support the following aggregator functions to combine values:
+- `sum`, `min` and `max` over `i64` and `f64`
+- `or` over `i1` (representing logical OR)
+- `argmin(arg, val)`, which finds the tuple with minimal `val` and produces the `arg` for that tuple.
+
+A last requirement, and one unusual to relational database systems, is a loop operator.
+Its semantics are similar to `ForConstOp`.
+
+## Loop Operator
+The loop operator repeatedly evaluates a collection of relational algebra expressions based on *loop variables* whose state can change between iterations. The initial state of the loop variable is provided as inputs to the loop operator in the form of
+
+The loop operator has a number of inputs:
+- initial states for the loop variables, provided as relational algebra expressions
+- a maximum number of iterations, given as an integer constant
+- a result index, indicating which of the loop variables will become the final output of the loop
+
+The loop operator embeds *loop body expressions*, one relational algebra expression per loop variable, which together represent the loop body.
+Besides all the usual query operators, a loop body expression can reference loop variables to read the current state of a variable.
+
+Optionally included is a *break expression*, a relational algebra expression that indicates if the loop should terminate early, before completing the maximum number of iterations. This expression should produce tuples with a single boolean value, where `true` in one of the tuple indicates that the loop should terminate.
+
+To run one iteration of the loop, first the loop body expressions are evaluated based on the current state of the loop variables.
+Then, the values of the loop variables are replaced with the results of loop body expressions.
+This process continue until the maximum number of iterations has been reached, or until the *break expression* returns a tuple with value to `true`.
+
+## Functions
+The conversion applies to individual functions, i.e. a single function call (with relational algebra expressions for the parameters) converts into a relational algebra expression.
+This is sufficient because as opposed to the full GraphAlg language, in GraphAlg Core functions do not call other functions.
+The `return` expression in the body becomes the root of the resulting expression.
+
+## Decomposing `MatMulOp` and `ReduceOp`
+`MatMulOp` and `ReduceOp` are replaced with `MatMulJoinOp`, `UnionOp` and `DeferredReduceOp` in the `graphalg-split-aggregate` pass.
+`MatMulOp` is decomposed into `MatMulJoinOp + DeferredReduceOp`, whereas `ReduceOp` becomes `UnionOp + DeferredReduceOp`.
+For this reason, only the translation of `MatMulJoinOp`, `UnionOp` and `DeferredReduceOp` is considered.
+
+## Matrix Expressions
+
+### `TransposeOp`
+Translates into a projection that swaps the row and column indices.
+
+```
+projection(
+    <expr>,
+    {
+        row = col,
+        col = row,
+        val = val,
+    }
+)
+```
+
+For vector inputs that lack either `row` or `col` columns, transpose is a no-op.
+
+### `DiagOp`
+Translates into a projection that adds a column index equal to the row index.
+
+```
+projection(
+    <expr>,
+    {
+        row = row,
+        col = row,
+        val = val,
+    }
+)
+```
+
+If the input is a row vector rather than a column vector, a row index equal to the column index is added instead.
+
+### `BroadcastOp`
+For each dimension added, join with a constant table of all possible indices for that dimension.
+
+### `ForConstOp`
+Translates directly to the relational algebra loop definition.
+For loops that produce multiple outputs, the loop is duplicated.
+Because all operations are deterministic, this does not change the semantics of the program.
+
+### `PickAnyOp`
+Remove zero elements, then select zero (col,val) combinations per row:
+
+```
+aggregate(
+    selection(
+        <expr>,
+        { val != <zero> }
+    ),
+    group by { row },
+    aggregate {
+        min(col),
+        argmin(val, col)
+    }
+)
+```
+
+{: .note }
+For boolean matrices, the value of `argmin(val, col)` is always `true`, so this aggregator can be omitted.
+
+### `ApplyOp`
+- Join all inputs based on available columns (some inputs may not have all dimensions and need to be broadcast)
+- If none of the inputs provides a requested output column, add additional joins to constant tables to broadcast them
+- Create a projection with the body of the `ApplyOp`. Keep only one row/col slot (as necessary for the output).
+
+### `TrilOp`
+Keep all tuples with `col < row`:
+
+```
+selection(
+    <expr>,
+    { col < row }
+)
+```
+
+### `ConstantMatrixOp`
+Create a constant table.
+For a
+
+```
+(row, col, val)
+(0, 0, 42)
+(0, 1, 42)
+(0, 2, 42)
+...
+(1, 0, 42)
+(1, 1, 42)
+(1, 2, 42)
+...
+```
+
+### `MatMulJoinOp`
+- Join the matrices on `<lhs>.col = <rhs>.row`
+- Project out the multiplied values
+
+### `DeferredReduceOp`
+Aggregate, grouping by row/col, merging values according to the semiring add operator (see `AddOp` conversion).
+
+### `UnionOp`
+Simple relational union.
+
+## Scalar Expressions
+### `ApplyReturnOp`
+Becomes return op inside projection.
+
+### `ConstantOp`
+Becomes a constant in a plain data type depending on the semiring:
+- `i1` and `f64` are kept as-is
+- `i64` becomes `i64`
+- `trop_int` becomes `i64`. If the constant value is infinity, we map this to the the maximum value of the `i64` type (i.e. the largest possible integer)
+- `trop_max_int` becomes `i64`. If the constant value is infinity, we map this to the the minimum value of the `i64` type.
+- `trop_real` becomes `f64`. In IEEE 754, `f64` has a proper infinity value, so we can map `trop_real` infinity to that.
+
+### `CastScalarOp`
+Cases:
+- `* -> i1`: rewrite to `input != zero(inRing)`
+- `i1 -> *`: rewrite to `input ? one(outRing) : zero(outRing)`
+- `i64 -> f64`: i64eger promotion
+- `i64 -> trop_i64`: map to infinity (max value) if 0, otherwise leave unchanged.
+- `i64 -> trop_max_i64`: map to infinity if 0, otherwise leave unchanged.
+- `i64 -> trop_f64`: map to infinity if 0, otherwise promote
+- `f64 -> i64`: truncate
+- `f64 -> trop_i64`: map to infinity if 0, otherwise promote
+- `f64 -> trop_max_i64`: map to infinity if 0, otherwise promote.
+- `f64 -> trop_f64`: map to infinity if 0, otherwise leave unchanged.
+
+### `AddOp`
+Pick the operation based on the semiring:
+- `i1`: logical OR
+- `i64`: signed integer add
+- `f64`: floating point add
+- `trop_i64`/`trop_real`: `min`
+- `trop_max_i64`: `max`
+
+### `mlir::arith::SubIOp`
+Keep as-is.
+
+### `mlir::arith::SubFOp`
+Keep as-is.
+
+### `MulOp`
+Pick the operation based on the semiring:
+- `i1`: logical AND
+- `i64`: signed integer multiply
+- `f64`: floating point multiply
+- `trop_i64`/`trop_max_i64`: signed integer add
+- `trop_real`: floating point add
+
+### `mlir::arith::DivFOp`
+Keep as-is.
+
+### `EqOp`
+Convert to the equivalent compare operation in the target representation.