diff --git a/benchmarks/sql/linq_fold_chain_audit.md b/benchmarks/sql/linq_fold_chain_audit.md new file mode 100644 index 000000000..bc5924ec1 --- /dev/null +++ b/benchmarks/sql/linq_fold_chain_audit.md @@ -0,0 +1,1413 @@ +# `_fold` chain coverage audit — silent fall-off catalog + +Generated 2026-05-23 from `0a2da407f`. Probe files live under +`/tmp/audit_probes/` (re-runnable; see "How to re-run" at the bottom). + +## Status — what this audit has closed + +**Theme 1 (terminal `_select` extension) — landed 2026-05-24** (`59c4f3f98`): + +- **1a, 1e + motivating** (`plan_order_family` / `plan_decs_order_family`): terminal `_select` accepted after `take(N)`. Bounded-heap holds the raw element; projection runs ≤K times at return. +- **plan_reverse / plan_decs_reverse**: terminal `_select` accepted after `reverse [+ take(N)]`. Closes the natural "filter, reverse for newest-first, take K, project" idiom. NOT closed: the `reverse + _select + take` ordering (2c / 2e exact shape) — user must reorder to `reverse + take + _select`. +- **8b** (`plan_decs_join`): single trailing `_select` between `_join` and the implicit `to_array`. Substitutes via a let-bound join-result + projection. +- **7a** (`plan_zip`): 3-arg `zip(a, b, sel)` pre-lowered to 2-arg `zip(a, b) |> _select(sel-as-tuple)` — the natural dot-product idiom now splices. + +Coverage extension: 1395 → 1415 linq tests (10 new tests in `tests/linq/test_linq_fold_terminal_select.das`). + +**Theme 2 (trailing `_where` / HAVING) — landed 2026-05-24**: + +- **8a, C6** (`plan_decs_join`): single trailing `_where` between `_join` and the terminator. Predicate references join-result fields; emission binds the result once per pair and gates `count++` / `push_clone`. Composes with the terminal `_select` from Theme 1. +- **4a** (`plan_group_by`) + **4e** (`plan_decs_group_by` via shared `plan_group_by_core`): trailing `_where` AFTER `_select(reducer)`, i.e. SQL HAVING on the post-aggregate tuple. 
Binds the constructed output once per bucket and gates buf-emit / count-emit. Distinct from the existing `having_` slot (which is pre-select and can lift hidden reducer slots) — both can fire on the same chain. +- **5c** (`plan_loop_or_count` across all 4 lanes — counter / accumulator / early-exit / array): `take(N)._where(p).terminator` accepted. Take cap ticks unconditionally per element; trailing `_where` gates only the per-element contribution, preserving the "first N elements, then keep matching" semantic that auto-rewriting can't reproduce. + +Coverage extension: 1415 → 1437 linq tests (12 new tests in `tests/linq/test_linq_fold_theme2_trailing_where.das`). + +Still open (queued for the next session per the cross-cutting findings below): + +- Theme 3 — cross-arm composition (5 of 6 composition probes). +- Themes 4–8 — see "Cross-cutting findings" section. + + +The audit catalogs **silent fall-off** in `daslib/linq_fold.das`: chains where a +natural user phrasing makes the splice arm return null and the planner falls +back to the slow default cascade (`fold_linq_default`) — without any warning. +Every row below shows the post-macro `ast_dump` of one probe, classified as +SPLICE-FIRES (single-pass specialized loop) or FALLS-OFF (cascade of +`__::linq\`helper\`` calls plus intermediate `array<...>` allocations). + +Each "FALLS-OFF" row names the bail line in `linq_fold.das` and proposes +either a cheap user-side rewrite or an arm extension. The audit does NOT +change any code — every finding is a follow-up TODO. + +--- + +## Motivating example — closest 10 sounds + +The audit was prompted by this user scenario: "I have an array of sounds with +`(id, position)` and a head position. Give me the ids of the 10 closest." The +natural translation looks like: + +```das +let closest_ids <- _fold(each(sounds) + ._order_by(distance(_.position, head)) + .take(10) + ._select(_.id) + .to_array()) +``` + +That chain SILENTLY FALLS OFF the splice. 
The arm that should fire is +`plan_order_family` (linq_fold.das:1234), which emits a bounded-heap walk +holding at most N elements — but its accept list is `[where_*] order_* +[take|first]?`, and the `._select(_.id)` between `take` and `to_array` is not +in that list. The chain falls through line 1284's `else { return null }` and +into the default 3-pass cascade. + +**With `_select` (the natural form)** — `/tmp/audit_probes/motivating_with_select.das`: + +```das +return <- invoke($(var source : iterator) : array { + var pass_0 <- __::linq`order_by_to_array(source, $(_) { return distance(_.position, head); }); + __::linq`take_inplace(pass_0, 10); + var pass_2 <- __::linq`select(pass_0, $(_) { return _.id; }); + __::builtin`finalize(pass_0); + return <- pass_2; +}, __::builtin`each(sounds)) +``` + +Default cascade: full sort over N elements + allocation, then truncate, then +another allocation for the projection. + +**Without `_select` (the splice-eligible form)** — `/tmp/audit_probes/motivating_without_select.das`: + +```das +return <- invoke($(source : array) : array { + var order_buf : array; + for (it in source) + if (length(order_buf) < 10) + push_clone(order_buf, it) + spliced_push_heap(order_buf, $(v1, v2) => less(distance(v1.position, head), distance(v2.position, head))) + elif (less(distance(it.position, head), distance(order_buf[0].position, head))) + spliced_pop_heap(order_buf, ...) + order_buf[length(order_buf) - 1] = it + spliced_push_heap(order_buf, ...) + order_inplace(order_buf, ...) + return <- order_buf +}, sounds) +``` + +Single-pass bounded heap holding ≤10 elements. No N-sized allocation. + +The rest of this document is the systematic version of that comparison +across every `plan_*` arm in the splice machinery. 
+ +--- + +## Chain 1 — `plan_order_family` / `plan_decs_order_family` + +**Accepts**: `[where_*] order_* [take(N)|first|first_or_default]?` (linq_fold.das:1234 array, :4547 decs) +**Common bails**: select-anywhere (line 1284 / 4594 fall-through), where-after-order (line 1252 / 4566), reverse-in-chain (line 1284 / 4594 fall-through), explicit comparator on bare `order` / `order_descending` (line 1264 / 4575) + +### 1a — Closest 10 sounds, return ids (array) + +**Probe** (`/tmp/audit_probes/chain1_1a.das`): +```das +def probe_1a(sounds : array; head : float3) : array { + unsafe { + return <- _fold(each(sounds)._order_by(distance(_.position, head)).take(10)._select(_.id).to_array()) + } +} +``` + +**Generated**: +```das +return <- invoke($(var source : iterator) : array { + var pass_0 <- __::linq`order_by_to_array(source, $(_) { return distance(_.position, head); }); + __::linq`take_inplace(pass_0, 10); + var pass_2 <- __::linq`select(pass_0, $(_) { return _.id; }); + __::builtin`finalize(pass_0); + return <- pass_2; +}, __::builtin`each(sounds)) +``` + +**Classification**: FALLS-OFF — default cascade. + +**Conclusion**: `_select(_.id)` between `take(10)` and `to_array()` is not in plan_order_family's allowed call list (linq_fold.das:1284 fall-through). The cascade fully sorts N elements into `pass_0`, then truncates, then allocates a fresh `array` for the projection. Rewrite: split into two steps — `let top <- _fold(...take(10).to_array()); let ids <- [for (s in top); s.id]`. Extending the arm to accept a terminal `_select` after `take(N)` / `first` / `first_or_default` is the highest-impact fix since the bounded heap already holds ≤N elements; a final projection at emission time is essentially free. 
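To make the cost difference behind 1a concrete, the two shapes can be modeled outside daslang. This is a minimal Python sketch, not the code `linq_fold.das` emits: `closest_ids_cascade` mirrors the default cascade (full sort, truncate, project) and `closest_ids_bounded` mirrors a bounded-heap walk that applies the projection only to the K survivors. The function names, the dict-shaped `sounds` records, and the index tie-break used to imitate a stable sort are all assumptions made for illustration.

```python
import heapq
import math


def closest_ids_cascade(sounds, head, k):
    """Model of the cascade: sort all N elements, truncate, then project."""
    ordered = sorted(sounds, key=lambda s: math.dist(s["position"], head))
    return [s["id"] for s in ordered[:k]]


def closest_ids_bounded(sounds, head, k):
    """Model of the bounded-heap walk: keep at most k raw elements,
    project only the survivors at return time."""
    if k <= 0:
        return []
    # heapq is a min-heap, so keys are negated: the root is the farthest
    # (and, among equal distances, the latest) element currently kept.
    heap = []
    for i, s in enumerate(sounds):
        entry = (-math.dist(s["position"], head), -i, s)
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry[:2] > heap[0][:2]:
            # Strictly closer than the farthest kept element: replace it.
            heapq.heapreplace(heap, entry)
    # Final ordering pass over at most k elements, then at most k projections.
    survivors = sorted(heap, key=lambda e: (-e[0], -e[1]))
    return [s["id"] for _, _, s in survivors]
```

In this model the projection `s["id"]` runs at most `k` times in the bounded version, and the only buffer that grows is the heap, which never exceeds `k` entries.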
+ +### 1b — Top 5 scores descending via order + reverse (array) + +**Probe** (`/tmp/audit_probes/chain1_1b.das`): +```das +def probe_1b(scores : array) : array { + unsafe { + return <- _fold(each(scores)._order_by(_.score).reverse().take(5).to_array()) + } +} +``` + +**Generated**: +```das +return <- invoke($(var source : iterator) : array { + var pass_0 <- __::linq`order_by_to_array(source, $(_) { return _.score; }); + __::linq`reverse_inplace(pass_0); + __::linq`take_inplace(pass_0, 5); + return <- pass_0; +}, __::builtin`each(scores)) +``` + +**Classification**: FALLS-OFF — default cascade. + +**Conclusion**: `reverse()` between `_order_by` and `take` is not in the accepted vocabulary (line 1284 fall-through). The natural way for a user to ask for "top 5 descending" via the ascending key is `order_by(...).reverse().take(5)`; the splice path requires the user to know `_order_by_descending(...).take(5)` instead. The arm could recognize `_order_by(k).reverse()` and rewrite it to the `_order_by_descending(k)` form before the comparator-emission step. + +### 1c — Where-after-order (array) + +**Probe** (`/tmp/audit_probes/chain1_1c.das`): +```das +def probe_1c(employees : array) : array { + unsafe { + return <- _fold(each(employees)._order_by(_.salary)._where(_.dept == "eng").take(10).to_array()) + } +} +``` + +**Generated**: +```das +return <- invoke($(var source : iterator) : array { + var pass_0 <- __::linq`order_by_to_array(source, $(_) { return _.salary; }); + var pass_1 <- __::linq`where_(pass_0, $(_) { return _.dept == "eng"; }); + __::builtin`finalize(pass_0); + __::linq`take_inplace(pass_1, 10); + return <- pass_1; +}, __::builtin`each(employees)) +``` + +**Classification**: FALLS-OFF — default cascade. + +**Conclusion**: `_where` after `_order_by` is explicitly rejected by `if (hasOrder) return null` at line 1252. 
Sorting first and then filtering is genuinely wasteful (sorts ~N elements just to drop most of them); the rewrite is mechanical — move the `_where` before the `_order_by`. Worth a lint suggestion rather than a splice extension: keeping the post-sort `_where` semantically correct in the splice would require either re-running the filter inside the bounded heap or re-allocating after the sort, both of which lose the cascade's correctness while not actually buying anything over the trivial user rewrite. + +### 1d — Closest 10 sounds, baseline (array) + +**Probe** (`/tmp/audit_probes/chain1_1d.das`): +```das +def probe_1d(sounds : array; head : float3) : array { + unsafe { + return <- _fold(each(sounds)._order_by(distance(_.position, head)).take(10).to_array()) + } +} +``` + +**Generated** (trimmed): +```das +return <- invoke($(source : array) : array { + var order_buf : array; + for (it in source) + if (length(order_buf) < 10) + push_clone(order_buf, it) + spliced_push_heap(order_buf, ...) + elif (less(distance(it.position, head), distance(order_buf[0].position, head))) + spliced_pop_heap(order_buf, ...) + order_buf[length(order_buf) - 1] = it + spliced_push_heap(order_buf, ...) + order_inplace(order_buf, ...) + return <- order_buf; +}, sounds) +``` + +**Classification**: SPLICE-FIRES — bounded-heap arm (line 1300 `useBoundedHeap`). + +**Conclusion**: Baseline confirms the bounded-heap splice path — O(N log K) heap maintenance over the walk, no full-N sort, no full-N allocation. + +### 1e — Closest 10 sounds, return ids (decs) + +**Probe** (`/tmp/audit_probes/chain1_1e.das`): same shape as 1a over `from_decs_template(type)`. 
+ +**Generated** (trimmed): +```das +return <- invoke($(var source : iterator>) : array { + var pass_0 <- __::linq`order_by_to_array(source, ...); + __::linq`take_inplace(pass_0, 10); + var pass_2 <- __::linq`select(pass_0, ...); + __::builtin`finalize(pass_0); + return <- pass_2; +}, invoke($() : iterator> { + var res : array>; + for_each_archetype(..., $(arch) { + for (ds_id, ds_position in get_ro(arch, "ds_id", type), get_ro(arch, "ds_position", type)) + push(res, tuple(ds_id, ds_position)); + }); + return <- __::linq`to_sequence(res); +})) +``` + +**Classification**: FALLS-OFF — default cascade, with the **double penalty** that the decs source bridge eagerly materializes ALL rows into `res` before the array cascade even starts. + +**Conclusion**: Same trailing `_select` mismatch as 1a (line 4594 fall-through in `plan_decs_order_family`). Worse than 1a because when `plan_decs_order_family` returns null, the `from_decs_template` bridge has no other splice to bind it to — it degenerates to full materialization of every archetype row into a temp `res`, wrapped in `to_sequence` for the array cascade. Two allocations of the full row set + one projection. Extension fix: same as 1a — accept a terminal `_select` after `take(N)` / `first` / `first_or_default` in `plan_decs_order_family`. + +### 1f — Closest 10 sounds, baseline (decs) + +**Probe** (`/tmp/audit_probes/chain1_1f.das`): same shape as 1d over decs. + +**Generated** (trimmed): +```das +return <- invoke($() : array> { + var decs_buf : array>; + for_each_archetype(..., $(arch) { + for (ds_id, ds_position in get_ro(...), get_ro(...)) + if (length(decs_buf) < 10) + push_clone(decs_buf, tuple(ds_id, ds_position)) + spliced_push_heap(decs_buf, ...) + elif (less(distance(ds_position, head), distance(decs_buf[0].position, head))) + spliced_pop_heap(decs_buf, ...) + decs_buf[length(decs_buf) - 1] = tuple(...) + spliced_push_heap(decs_buf, ...) 
+ }); + order_inplace(decs_buf, ...); + return <- decs_buf; +}) +``` + +**Classification**: SPLICE-FIRES — bounded-heap arm fused into a single `for_each_archetype` (line 4609 `useBoundedHeap`). + +**Conclusion**: Decs baseline confirms the bounded-heap path through the archetype walk: ≤10 push_clones to `decs_buf`, no eager materialization. + +### Chain 1 — follow-up TODOs + +- **Highest impact**: extend `plan_order_family` (line 1234) and `plan_decs_order_family` (line 4547) to accept a terminal `_select` after `take(N)` / `first` / `first_or_default`. The bounded-heap arm already holds at most N elements; emitting the projection during the final `order_inplace` walk is essentially free. The "closest N, return projected field" idiom is extremely natural. +- Recognize `_order_by(k).reverse()` and rewrite to `_order_by_descending(k)` (and dual) before the comparator-emission step. Currently `reverse()` mid-chain is a hard bail (line 1284 / 4594). +- Lint suggestion (style rule, not a splice extension): `_order_by(...)._where(...)` should reorder to `_where(...)._order_by(...)`. Sorting before filtering is wasteful in any execution mode. +- Decs FALLS-OFF cases are doubly penalized — the bridge's eager-materialize default lands behind every `plan_decs_order_family` bail. Worth a dedicated diagnostic when this path is hit. + +--- + +## Chain 2 — `plan_reverse` / `plan_decs_reverse` + +**Accepts**: `[where_*][select?] reverse [take(N)]? [count|first|first_or_default]?` (linq_fold.das:1764 array, :4802 decs) +**Common bails**: `where_` / `select` AFTER reverse or AFTER select (line 1797-1800 / 4833-4838 — `seenSelect || hasReverse` guards), order-anywhere (line 1813 / 4851 fall-through), double-reverse (line 1804 / 4843), `take(N)` paired with a separate terminator (line 1816 / 4854 bail). 
+ +Note: `_where → _select → reverse → take → to_array` (select BEFORE reverse, no further select/where after) is ACCEPTED — the guards prevent `where AFTER select` and `select AFTER select`, not `select BEFORE reverse`. + +### 2a — Reverse + distinct_by (array) + +**Probe** (`/tmp/audit_probes/chain2_2a.das`): +```das +def probe_2a(events : array) : array { + unsafe { + return <- _fold(each(events).reverse()._distinct_by(_.kind).to_array()) + } +} +``` + +**Generated**: +```das +return <- invoke($(var source : iterator) : array { + var pass_0 <- __::linq`reverse_to_array(source); + __::linq`distinct_by_inplace(pass_0, $(_) { return _.kind; }); + return <- pass_0; +}, __::builtin`each(events)) +``` + +**Classification**: FALLS-OFF — default cascade. + +**Conclusion**: `distinct_by` after `reverse` is not in plan_reverse's vocabulary (line 1813 fall-through), and the call appears before any recognized terminator so plan_reverse cannot peel one. plan_distinct in turn doesn't model `reverse` in its prelude either (line 1999 fall-through). Result: full `reverse_to_array` allocation + `distinct_by_inplace`. Cheap user rewrite: `_distinct_by` is order-stable, so the user could write `_distinct_by(_.kind).reverse()` — but that's a behavior change (different element survives per kind). A real splice extension would need `plan_reverse` to recognize `reverse + distinct_by` and emit a single-pass walk that retains the LAST element per key, then reverses. + +### 2b — Order then reverse (array) + +**Probe** (`/tmp/audit_probes/chain2_2b.das`): +```das +def probe_2b(events : array) : array { + unsafe { + return <- _fold(each(events)._order_by(_.ts).reverse().to_array()) + } +} +``` + +**Generated**: +```das +return <- invoke($(var source : iterator) : array { + var pass_0 <- __::linq`order_by_to_array(source, $(_) { return _.ts; }); + __::linq`reverse_inplace(pass_0); + return <- pass_0; +}, __::builtin`each(events)) +``` + +**Classification**: FALLS-OFF — default cascade. 
+ +**Conclusion**: Symmetric to 1b. `plan_reverse` bails at `_order_by` (line 1813 fall-through); `plan_order_family` bails at `reverse()` (line 1284 fall-through). Neither splice fires. User rewrite: `_order_by_descending(_.ts).to_array()`. Same TODO as chain 1: recognize `_order_by(k).reverse()` and normalize to `_order_by_descending(k)` before either planner gets it. + +### 2c — Select-after-reverse (array) + +**Probe** (`/tmp/audit_probes/chain2_2c.das`): +```das +def probe_2c(users : array) : array { + unsafe { + return <- _fold(each(users)._where(_.active).reverse()._select(_.name).take(5).to_array()) + } +} +``` + +**Generated**: +```das +return <- invoke($(var source : iterator) : array { + var pass_0 <- __::linq`where_to_array(source, $(_) { return _.active; }); + __::linq`reverse_inplace(pass_0); + var pass_2 <- __::linq`select(pass_0, $(_) { return _.name; }); + __::builtin`finalize(pass_0); + __::linq`take_inplace(pass_2, 5); + return <- pass_2; +}, __::builtin`each(users)) +``` + +**Classification**: FALLS-OFF — default cascade. + +**Conclusion**: `_select` AFTER `reverse` trips the `hasReverse` half of the guard at line 1800 (`if (hasReverse || seenSelect) return null` inside the select arm). User rewrite to fire the splice is non-obvious — move the select before reverse: `_where(_.active)._select(_.name).reverse().take(5).to_array()` (which DOES splice, since select BEFORE reverse is accepted). Extension fix: in `plan_reverse`, when a terminal `_select(...)` follows `reverse` (or `reverse + take(N)`), treat it as a projection applied on the buffer return — same shape as the chain 1 terminal-projection extension. 
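The 2c rewrite is safe because a per-element projection commutes with `reverse` when the projection has no side effects. The Python sketch below models both orderings on plain lists; it is an illustration of that equivalence under the purity assumption, not a description of what either planner emits, and the function names and dict-shaped `users` records are hypothetical.

```python
def where_reverse_select_take(users, k):
    """Ordering from probe 2c: filter, reverse, project, take."""
    kept = [u for u in users if u["active"]]
    kept.reverse()
    names = [u["name"] for u in kept]
    return names[:k]


def where_select_reverse_take(users, k):
    """Reordered form: filter, project, reverse, take."""
    names = [u["name"] for u in users if u["active"]]
    names.reverse()
    return names[:k]
```

Both functions return the same list for any input as long as reading `u["name"]` does not depend on evaluation order. An impure projection (one that mutates shared state or consumes a counter) would break the equivalence.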
+ +### 2d — Where + reverse + take (array baseline) + +**Probe** (`/tmp/audit_probes/chain2_2d.das`): +```das +def probe_2d(events : array) : array { + unsafe { + return <- _fold(each(events)._where(_.active).reverse().take(5).to_array()) + } +} +``` + +**Generated** (trimmed): +```das +return <- invoke($(source : array) : array { + var buf : array; + for (it in source) + if (it.active) + push_clone(buf, it); + reverse_inplace(buf); + resize(buf, length(buf) > 5 ? 5 : length(buf)); + return <- buf; +}, events) +``` + +**Classification**: SPLICE-FIRES — R5 buffer + reverse_inplace + resize arm. + +**Conclusion**: Baseline confirms the standard plan_reverse buffer arm — one filtered push pass, one in-place reverse, one resize to take(N). + +### 2e — Select-after-reverse (decs) + +**Probe** (`/tmp/audit_probes/chain2_2e.das`): same shape as 2c over decs. + +**Generated** (trimmed): identical structure to 2c, plus eager decs bridge materializing `res` first. + +**Classification**: FALLS-OFF — default cascade, doubled by eager decs materialization. + +**Conclusion**: Same root cause as 2c (line 4838 in plan_decs_reverse). Three full-N allocations: `res`, `pass_0` (where filtered), `pass_2` (selected). Same extension fix as 2c. + +### 2f — Where + reverse + take (decs baseline) + +**Probe** (`/tmp/audit_probes/chain2_2f.das`): same shape as 2d over decs. + +**Generated** (trimmed): +```das +return <- invoke($() : array> { + var decs_buf : array>; + for_each_archetype(..., $(arch) { + for (de_id, de_kind, de_ts, de_active in get_ro(...), get_ro(...), get_ro(...), get_ro(...)) + if (de_active) + push_clone(decs_buf, tuple(de_id, de_kind, de_ts, de_active)); + }); + reverse_inplace(decs_buf); + resize(decs_buf, 5 < length(decs_buf) ? 5 : length(decs_buf)); + return <- decs_buf; +}) +``` + +**Classification**: SPLICE-FIRES — plan_decs_reverse buffer + reverse_inplace + resize. + +**Conclusion**: Decs-side mirror of 2d. 
+ +### Chain 2 — follow-up TODOs + +- **Highest impact**: accept a terminal `_select(...)` after `reverse` (and after `reverse + take(N)`) in BOTH `plan_reverse` (line 1764) and `plan_decs_reverse` (line 4802). The current bail catches the natural "filter-then-reverse-for-newest-first-then-project" idiom. +- Recognize `reverse + distinct_by` as a fused walk retaining the LAST element per key, single archetype walk for decs and single source walk + buffer for arrays. +- Recognize `_order_by(k).reverse()` and rewrite to `_order_by_descending(k)` (chain 1 also wants this). +- Decs FALLS-OFF cases hit the same eager-materialize double penalty as chain 1. + +--- + +## Chain 3 — `plan_distinct` / `plan_decs_distinct` + +**Accepts**: `[where_*][select?] (distinct|distinct_by) [take(N)]? [count|long_count|sum|to_array]?` (linq_fold.das:1945 array, :5049 decs) +**Common bails**: `where_` AFTER `select` or `distinct` (line 1979 / 5085), second `select` (line 1982 / 5090, seenSelect), order-anywhere (line 1998 / 5108 fall-through), reverse-anywhere (same), 2-arg `count` / `long_count` / `sum` with predicate (line 1953-1957 / 5057-5063 — terminator-peel only fires for 1-arg form), distinct-after-distinct (line 1986 / 5095), `take(N)` paired with non-implicit terminator (line 2002 / 5112). + +Note: `_select(_.field)._where(...) → distinct → count` flips the order to `_where AFTER _select` which DOES hit the seenSelect bail (line 1979). `_where → _select → distinct → count` is ACCEPTED. 
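Swapping `_select` and `_where` into the accepted order takes more than moving the calls: a predicate written against the projected value has to be restated against the source element. The Python sketch below models that substitution for a "count distinct non-empty emails" query. It assumes a pure projection and uses hypothetical function names and dict-shaped `events`. The first function imitates the multi-pass cascade and the second a single streaming walk with a seen-set.

```python
def count_select_then_where(events):
    """select -> where -> distinct -> count, one intermediate list per step."""
    emails = [e["email"] for e in events]             # projection pass
    nonempty = [m for m in emails if len(m) > 0]      # predicate on projected value
    distinct = list(dict.fromkeys(nonempty))          # order-preserving dedup
    return len(distinct)


def count_where_then_select(events):
    """where -> select -> distinct -> count as one walk.
    The predicate len(_) > 0 is restated as len(e["email"]) > 0."""
    seen = set()
    for e in events:
        if len(e["email"]) > 0:
            seen.add(e["email"])
    return len(seen)
```

The second form needs no element buffer at all, because the terminator only asks for the number of distinct keys.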
+ +### 3a — Select then where then distinct then count (array) + +**Probe** (`/tmp/audit_probes/chain3_3a.das`): +```das +def probe_3a(events : array) : int { + unsafe { + return _fold(each(events)._select(_.email)._where(length(_) > 0).distinct().count()) + } +} +``` + +**Generated**: +```das +return invoke($(var source : iterator) : int { + var pass_0 <- __::linq`select_to_array(source, $(_) { return _.email; }); + var pass_1 <- __::linq`where_(pass_0, $(_) { return length(_) > 0; }); + __::builtin`finalize(pass_0); + __::linq`distinct_inplace(pass_1); + var pass_3 = __::linq`count(pass_1); + __::builtin`finalize(pass_1); + return <- pass_3; +}, __::builtin`each(events)) +``` + +**Classification**: FALLS-OFF — default cascade. + +**Conclusion**: `_where` AFTER `_select` trips the seenSelect bail at line 1979. 4-pass cascade for what could be a single-walk streaming dedup (no buffer at all because terminator is `count`). `_where → _select → distinct → count` is splice-eligible; the user just has to swap the where and select. + +### 3b — Distinct then order (array) + +**Probe** (`/tmp/audit_probes/chain3_3b.das`): +```das +def probe_3b(rows : array) : array { + unsafe { + return <- _fold(each(rows)._distinct_by(_.user_id)._order_by(_.ts).to_array()) + } +} +``` + +**Generated**: +```das +return <- invoke($(var source : iterator) : array { + var pass_0 <- __::linq`distinct_by_to_array(source, $(_) { return _.user_id; }); + __::linq`order_by_inplace(pass_0, $(_) { return _.ts; }); + return <- pass_0; +}, __::builtin`each(rows)) +``` + +**Classification**: FALLS-OFF — default cascade. + +**Conclusion**: `_order_by` after `_distinct_by` is unrecognized in plan_distinct (line 1998 fall-through), and plan_order_family doesn't model `_distinct_by` as an upstream call (line 1284). Cascade materializes the distinct-by result, then in-place sorts. The two ops don't commute (distinct-then-sort ≠ sort-then-distinct), so no obvious user rewrite. 
Extension fix: plan_order_family could recognize an upstream `_distinct_by(keyFn)` and emit a fused walk that hash-tracks seen keys while feeding survivors into the bounded heap. + +### 3c — Distinct then predicated count (array) + +**Probe** (`/tmp/audit_probes/chain3_3c.das`): +```das +def probe_3c(events : array) : int { + unsafe { + return _fold(each(events)._distinct_by(_.region)._count(_.active)) + } +} +``` + +**Generated**: +```das +return invoke($(var source : iterator) : int { + var pass_0 <- __::linq`distinct_by_to_array(source, $(_) { return _.region; }); + var pass_1 = __::linq`count(pass_0, $(_) { return _.active; }); + __::builtin`finalize(pass_0); + return <- pass_1; +}, __::builtin`each(events)) +``` + +**Classification**: FALLS-OFF — default cascade (tier-2 helpers). + +**Conclusion**: `_count(predicate)` is the 2-arg form, and the terminator-peel at line 1953 requires `length(calls.back()._0.arguments) == 1` — bails BY DESIGN since the 1-arg splice template would silently drop the predicate (emit `length(seen)` instead of counting predicate-true survivors). Extension fix: extend plan_distinct's terminator branch to recognize 2-arg `_count(p)` and `_long_count(p)` and emit `if (p(it)) cnt++` at the fresh-key site. + +### 3d — Where + distinct_by + count (array baseline) + +**Probe** (`/tmp/audit_probes/chain3_3d.das`): +```das +def probe_3d(events : array) : int { + return _fold(each(events)._where(_.recent)._distinct_by(_.user_id).count()) +} +``` + +**Generated** (trimmed): +```das +return invoke($(source : array) : int { + var inscope seen : table; + for (it in source) + if (it.recent) { + let k = unique_key(it.user_id); + if (!key_exists(seen, k)) + insert(seen, k); + } + return length(seen); +}, events) +``` + +**Classification**: SPLICE-FIRES — buffer-free count arm (terminator is `length(seen)`). + +**Conclusion**: Streaming-dedup arm — single source walk, hashed dedup, return length. 
No element buffer (line 2014-2016, `needBuffer = false` when terminator is count). + +### 3e — Select then where then distinct then count (decs) + +**Probe** (`/tmp/audit_probes/chain3_3e.das`): same shape as 3a over decs. + +**Classification**: FALLS-OFF — default cascade, doubled by eager decs materialization. + +**Conclusion**: plan_decs_distinct bails at line 5085. Three full-N allocations (`res` + `pass_0` + `pass_1`) to compute a scalar count. Same user rewrite (swap where and select) and same extension fix as 3a. + +### 3f — Where + distinct_by + count (decs baseline) + +**Probe** (`/tmp/audit_probes/chain3_3f.das`): + +**Generated** (trimmed): +```das +return invoke($() : int { + var inscope decs_seen : table; + for_each_archetype(..., $(arch) { + for (ded_user_id, ded_recent in get_ro(...), get_ro(...)) + if (ded_recent) { + let decs_k = unique_key(ded_user_id); + if (!key_exists(decs_seen, decs_k)) + insert(decs_seen, decs_k); + } + }); + return length(decs_seen); +}) +``` + +**Classification**: SPLICE-FIRES — plan_decs_distinct streaming-dedup arm with hoisted `decs_seen` table. + +**Conclusion**: Hoisted seen-table spans archetypes, single walk with where + key insert, return length. + +### Chain 3 — follow-up TODOs + +- **Highest impact**: extend the 1-arg terminator peel in plan_distinct (line 1953) and plan_decs_distinct (line 5057) to accept 2-arg `_count(p)` / `_long_count(p)`. Emit predicate as a gate at the fresh-key site. +- Document (possibly as a STYLE lint) that `_select → _where → distinct → terminator` should be rewritten to `_where → _select → distinct → terminator` — the pre-select form is splice-eligible. +- Niche: recognize `_distinct_by(keyFn) + _order_by(otherKey) + to_array` and emit a fused hash-track + bounded sort walk. +- Decs FALLS-OFF inherits the eager-materialize double penalty from the bridge. + +--- + +## Chain 4 — `plan_group_by` / `plan_decs_group_by` + +**Accepts**: `[where_*][select*] group_by_lazy(key) [having_]? 
select(reducer) [count]?` (linq_fold.das:3030 array, :4500 decs, shared core :2729) +**Common bails**: missing terminal select (line 3046 / 4516), missing group_by_lazy (line 3056 / 4526), unrecognized reducer specs (line 2808), bare reducer + hidden HAVING slots (line 2821). + +### 4a — Inventory: sum price per category, keep categories totaling >1000 + +**Probe** (`/tmp/audit_probes/chain4_4a.das`): +```das +return <- _fold(items + ._group_by(_.category) + ._select((C = _._0, T = _._1 |> select(@(i : Item) => i.price) |> sum)) + ._where(_.T > 1000)) +``` + +**Generated**: +```das +var pass_0 <- __::linq`group_by_lazy(source, $(_) { return _.category; }); +var pass_1 <- __::linq`select(pass_0, $(_) { + return tuple(_._0, __::linq`sum(__::linq`select(_._1, ))); +}); +finalize(pass_0); +var pass_2 <- __::linq`where_(pass_1, $(_) { return _.T > 1000; }); +finalize(pass_1); +return <- pass_2; +``` + +**Classification**: FALLS-OFF (default cascade — three eager array allocations). + +**Conclusion**: A `_where` AFTER `_select(reducer)` lives outside the group_by recognizer (linq_fold.das:3046 demands `select` to be the immediate tail, optionally with one `having_` between it and `group_by_lazy`). The user's "post-aggregate HAVING" is exactly what the optional `having_` slot is for — rewrite to `._group_by(_.category)._having(_._1 |> select(@(i:Item)=>i.price) |> sum > 1000)._select(...)` and the splice fires. Arm extension: peel a single trailing `_where` and translate it to a `having_` slot when its predicate references only post-projection field names. 
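The post-aggregate filter in 4a can be modeled as a gate applied once per bucket after accumulation. The Python sketch below contrasts a three-stage pipeline (group, reduce, filter) with a single table-of-accumulators pass followed by a gated emit. It illustrates the semantics only; the function names, the dict-shaped `items`, and the use of insertion-ordered dicts for bucket order are assumptions, not the emitted daslang.

```python
def totals_cascade(items, threshold):
    """group_by -> select(sum) -> where, with one intermediate per stage."""
    groups = {}
    for it in items:
        groups.setdefault(it["category"], []).append(it)
    totals = [(c, sum(i["price"] for i in g)) for c, g in groups.items()]
    return [(c, t) for c, t in totals if t > threshold]


def totals_fused(items, threshold):
    """One accumulation pass, then a single emit loop gated by the predicate."""
    acc = {}
    for it in items:
        acc[it["category"]] = acc.get(it["category"], 0) + it["price"]
    # The filter sees the finished aggregate, so it must run after the walk.
    return [(c, t) for c, t in acc.items() if t > threshold]
```

The gate cannot move into the accumulation loop: a bucket that is below the threshold halfway through the walk may still cross it later.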
+ +### 4b — Brands sorted by count + +**Probe** (`/tmp/audit_probes/chain4_4b.das`): +```das +return <- _fold(items + ._group_by(_.brand) + ._select((B = _._0, C = _._1 |> length)) + ._order_by(_.C)) +``` + +**Generated**: +```das +var pass_0 <- __::linq`group_by_lazy(source, $(_) { return _.brand; }); +var pass_1 <- __::linq`select(pass_0, $(_) { return tuple(_._0, length(_._1)); }); +finalize(pass_0); +__::linq`order_by_inplace(pass_1, $(_) { return _.C; }); +return <- pass_1; +``` + +**Classification**: FALLS-OFF (default cascade). + +**Conclusion**: Any post-`select(reducer)` op forces the recognizer to bail (line 3046). `plan_group_by_core` finishes first; `_order_by` then operates on the bucket-array shape via tier-2. No clean rewrite — this is a genuine two-stage pipeline. Arm extension: after `plan_group_by_core` emits its table, peel a trailing `_order_by` / `_reverse` / `take` cascade on the bucket output as a buffer-required post-pass. + +### 4c — Distinct names per brand + +**Probe** (`/tmp/audit_probes/chain4_4c.das`): +```das +return <- _fold(items + ._group_by(_.brand) + ._select((B = _._0, Names = _._1 |> select(@(i : Item) => i.name) |> distinct))) +``` + +**Generated**: +```das +var pass_0 <- __::linq`group_by_lazy(source, $(_) { return _.brand; }); +var pass_1 <- __::linq`select(pass_0, $(_) { + return tuple(_._0, __::linq`distinct(__::linq`select(_._1, ))); +}); +finalize(pass_0); +return <- pass_1; +``` + +**Classification**: FALLS-OFF (default cascade). + +**Conclusion**: `recognize_reducer_specs` (line 2807) only knows count / length / long_count / sum / min / max / first / average + their `select(...) |> reducer` variants. `distinct` is not a recognized reducer spec — `specs` comes back empty and we bail at line 2808. Extension: accept `distinct[_by]` / `reverse` / `to_array` as reducer ends, accumulating to `array` slot type (table-of-arrays accumulator pattern already exists). 
+ +### 4d — Baseline: count per brand + +**Probe** (`/tmp/audit_probes/chain4_4d.das`): +```das +return <- _fold(items + ._group_by(_.brand) + ._select((B = _._0, C = _._1 |> length))) +``` + +**Generated** (trimmed): +```das +var inscope tab : table> +var dummy : tuple +for (it in source) { + let k = it.brand + let uk = unique_key(k) + var entry & = tab?[uk] ?? dummy + if (addr(entry) == addr(dummy)) { + entry._1 = 1 + dummy._0 = k; tab[uk] = dummy; dummy = default + } else { + ++entry._1 + } +} +var buf; reserve(buf, length(tab)) +for (kv in values(tab)) { buf |> push_clone(kv) } +return <- buf +``` + +**Classification**: SPLICE-FIRES (`plan_group_by_core` table-state arm — line 2853). + +**Conclusion**: Reference arm — table-of-accumulators + addr-compare first-key-wins state machine. + +### 4e — DECS variant of 4a (post-aggregate HAVING) + +Same shape as 4a over `from_decs_template`. Decs bridge unrolls (good) but bucket then materializes through the standard array cascade — worst of both worlds (for_each_archetype expansion + three array allocations + per-bucket reducer invoke). + +**Classification**: FALLS-OFF (line 4516 mirrors line 3046). + +**Conclusion**: `plan_group_by_core` is shared, so one extension covers both planners. + +### 4f — DECS baseline (count per brand) + +**Generated** (trimmed): +```das +var inscope decs_tab : table> +var decs_dummy : tuple +for_each_archetype(, , $(arch) { + for (e_brand in get_ro(arch,"e_brand",type)) + let decs_k = e_brand + let decs_uk = unique_key(decs_k) + var decs_entry & = decs_tab?[decs_uk] ?? decs_dummy + if (addr(decs_entry) == addr(decs_dummy)) { + decs_entry._1 = 1 + decs_dummy._0 = decs_k; decs_tab[decs_uk] = decs_dummy + decs_dummy = default + } else { ++decs_entry._1 } +}) +... reserve(decs_buf); for (kv in values(decs_tab)); push_clone ... +``` + +**Classification**: SPLICE-FIRES (`plan_decs_group_by` → shared core with decs adapter). 
+ +**Conclusion**: Decs adapter routes the table-accumulator through `for_each_archetype`. User MUST write `.to_array()` explicitly here. + +### Chain 4 — follow-up TODOs + +- **HAVING-shaped trailing `_where`**: post-aggregate filter after `_select(reducer)` is the natural shape for "GROUP BY ... HAVING SUM(x) > N". Peel one trailing `_where` and translate to synthetic `having_`. +- **`_order_by` / `_reverse` / `take` on group buckets**: very common SQL shape. Add a post-pass to `plan_group_by_core` that inlines these into the buf-emit loop. +- **`distinct[_by]` as a per-bucket reducer**: extend `recognize_reducer_specs` with array-shaped reducers; slot type becomes `array`. + +--- + +## Chain 5 — `plan_loop_or_count` + +**Accepts**: `[where_*][select*][skip?][skip_while?][take_while?][take?] [terminator]?` over 18 terminator names (count / long_count / sum / min / max / average / first / first_or_default / any / all / contains / element_at / element_at_or_default / last / last_or_default / single / single_or_default / aggregate). Source must be array-typed via `each(...)` (linq_fold.das:1563). +**Common bails**: where-after-range (line 1603), select-after-range (line 1630), impure select before where (line 1607), duplicate range ops (lines 1647/1654/1662/1669), buffer-required op (line 1674 — order_by/distinct/group_by/reverse all bail here to their planners), unknown op (line 1678), identity ARRAY chain (line 1748). + +### 5a — Two `where_`s around a pure `_select` + +**Probe** (`/tmp/audit_probes/chain5_5a.das`): +```das +return _fold(each(items)._where(_.active)._select(_.score)._where(_ > 100).count()) +``` + +**Generated**: +```das +return invoke($(source) { + var acc = 0; + for (it in source) + if (it.active && (it.score > 100)) ++acc + return acc; +}, items) +``` + +**Classification**: SPLICE-FIRES (counter lane with merged predicate).
+ +**Conclusion**: where-after-select is HANDLED by the planner (linq_fold.das:1605-1620) — when the projection is pure, the second `where_` substitutes the projection into the predicate and merges with the first via `&&`. Zero allocation. Pure-select fast path does exactly what users expect. + +### 5b — Select after a range op + +**Probe** (`/tmp/audit_probes/chain5_5b.das`): +```das +return <- _fold(each(items)._select((x = _.score, y = _.active)).skip(5)._select(_.x).to_array()) +``` + +**Generated**: +```das +var pass_0 <- __::linq`select_to_array(source, ); +__::linq`skip_inplace(pass_0, 5); +var pass_2 <- __::linq`select(pass_0, $(_) { return _.x; }); +finalize(pass_0); +return <- pass_2; +``` + +**Classification**: FALLS-OFF (default cascade). + +**Conclusion**: linq_fold.das:1630 — the second `_select` arrives with `seenSkip == true` and the planner bails. A `_select` after any range op is structurally incompatible with the single-pass shape because the projection identity shifts mid-chain. User rewrite: collapse to a single projection. Arm extension would require multi-segment shape with per-segment binds — almost a new planner. + +### 5c — Where after take + +**Probe** (`/tmp/audit_probes/chain5_5c.das`): +```das +return _fold(each(items).take(100)._where(_.active).count()) +``` + +**Generated**: +```das +var pass_0 <- __::linq`take_to_array(source, 100); +var pass_1 <- __::linq`where_(pass_0, $(_) { return _.active; }); +finalize(pass_0); +var pass_2 = __::linq`count(pass_1); +finalize(pass_1); +return <- pass_2; +``` + +**Classification**: FALLS-OFF (default cascade). + +**Conclusion**: linq_fold.das:1603 — `where_` arrives with `seenTake == true`. Semantically distinct from `where.take`: `take(100)._where(...)` = "first 100 elements, then keep active ones" (count ≤ 100); `_where(...).take(100)` = "first 100 active ones" (count exactly 100 if there are ≥100 active). No automatic rewrite is safe. 
Extension: counter lane with take-cap that ticks BEFORE the where filter. + +### 5d — Aggregate terminator + +**Probe** (`/tmp/audit_probes/chain5_5d.das`): +```das +return _fold(each(items).aggregate(0, $(acc : int; x : Item) => acc + x.score)) +``` + +**Generated**: +```das +return invoke($(source) { + var agg = 0; + for (it in source) + agg = agg + it.score + return agg; +}, items) +``` + +**Classification**: SPLICE-FIRES (walk lane, Slice 5). + +**Conclusion**: `aggregate` is a recognized walk-lane terminator. Seed and reducer block are inlined; per-element body is just `agg = agg + body`. No invoke into `aggregate_impl`. + +### 5e — Baseline: where + select + take + sum + +**Probe** (`/tmp/audit_probes/chain5_5e.das`): +```das +return _fold(each(items)._where(_.active)._select(_.score).take(10).sum()) +``` + +**Generated**: +```das +return invoke($(source) { + var tc = 0; + var acc = 0; + for (it in source) + if (it.active) + if (tc >= 10) break + else { ++tc; acc += it.score } + return acc; +}, items) +``` + +**Classification**: SPLICE-FIRES (accumulator lane + take cap). + +**Conclusion**: Canonical happy path — where → select fuses, take adds a counter, sum is an accumulator. Reference arm. + +### Chain 5 — follow-up TODOs + +- **`where` after `take` / `take_while`**: not algebraically equivalent so can't auto-reorder, but the counter-lane shape could handle it manually. +- **`select` after `skip` / `take` / `take_while` / `skip_while`**: requires per-segment bind handling. Probably better to document canonical order in `skills/linq.md` and lint-warn. +- Unrecognized op cascade (order/distinct/group/reverse): the bail at line 1674 is the explicit handoff to per-family planners, not a fall-off. 
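The reason `take(N)._where(p)` cannot be auto-reordered into `_where(p).take(N)` is easiest to see with both counter loops side by side. The following Python sketch is a semantic model only; the function names are hypothetical and it is not the code the planner emits. In the first loop the take cap ticks for every element before the predicate is consulted, and in the second the cap ticks only for matching elements.

```python
def count_take_then_where(items, n, pred):
    """Models take(n)._where(pred).count(): cap ticks unconditionally."""
    taken = 0
    count = 0
    for it in items:
        if taken >= n:
            break
        taken += 1          # every element consumes one slot of the cap
        if pred(it):
            count += 1      # the predicate gates only the contribution
    return count


def count_where_then_take(items, n, pred):
    """Models _where(pred).take(n).count(): cap ticks only on matches."""
    count = 0
    for it in items:
        if pred(it):
            if count >= n:
                break
            count += 1
    return count
```

For example, with `[False, True, False, True, True, True]`, `n = 3` and `bool` as the predicate, the first loop sees only the first three elements while the second keeps scanning until it has three matches, so the two counts differ.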
+ +--- + +## Chain 6 — `plan_decs_unroll` + +**Accepts**: same shape as `plan_loop_or_count` (count/long_count/sum/min/max/average/first/first_or_default/any/all/contains/element_at/element_at_or_default/last/last_or_default/single/single_or_default/aggregate/min_by/max_by + implicit-to_array) over `from_decs_template(...)` bridges. Delegates to plan_decs_order_family / plan_decs_reverse / plan_decs_distinct / plan_decs_group_by / plan_decs_join for buffer-required shapes. +**Common bails**: source not a decs bridge (line 4455), no recognized terminator + no implicit to_array (line 4493), range extraction failed (line 4476), chain info failed (line 4478), sum/min/max/average over tuple element (line 4483), select before predicate-driven range (line 3568). + +### 6a — Select before predicate-driven range + +**Probe** (`/tmp/audit_probes/chain6_6a.das`): +```das +return _fold(from_decs_template(type)._select(_.score)._skip_while(_ < 0).count()) +``` + +**Generated**: +```das +var pass_0 <- __::linq`select_to_array(source, $(_) { return _.score; }); +var pass_1 <- __::linq`skip_while(pass_0, $(_) { return _ < 0; }); +finalize(pass_0); +var pass_2 = __::linq`count(pass_1); +finalize(pass_1); +return <- pass_2; +``` + +**Classification**: FALLS-OFF (default cascade for linq chain; bridge IS unrolled but doesn't connect to the rest). + +**Conclusion**: linq_fold.das:3568 — when suffix contains `skip_while` / `take_while`, prefix must be select-free (predicates use source tuple, not projected scalar). User rewrite: drop the `_select`, move comparison into `_skip_while`: `._skip_while(_.score < 0).count()`. Make the rule explicit in `skills/linq.md`. Arm extension would require predicate rewriting through projection. 
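The user rewrite suggested for 6a relies on the projection being pure, so that testing the projected value is the same as testing the projection applied to the source row. A small Python model of that equivalence follows; the dict row shape and both function names are assumptions made for illustration, and `itertools.dropwhile` stands in for `skip_while`.

```python
from itertools import dropwhile


def count_select_then_skip_while(rows):
    """Models _select(_.score)._skip_while(_ < 0).count()."""
    scores = (row["score"] for row in rows)
    return sum(1 for _ in dropwhile(lambda s: s < 0, scores))


def count_skip_while_on_source(rows):
    """Models ._skip_while(_.score < 0).count(): predicate reads the source row."""
    return sum(1 for _ in dropwhile(lambda row: row["score"] < 0, rows))
```

Both functions count the same elements because `count` does not care whether the surviving items are rows or projected scores; a terminator that returns elements would need the projection reapplied after the skip.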
+ +### 6b — Aggregate over decs + +**Probe** (`/tmp/audit_probes/chain6_6b.das`): +```das +return _fold(from_decs_template(type).aggregate(0, $(acc : int; x : auto) => acc + x.score)) +``` + +**Generated**: +```das +return invoke($() { + var decs_agg = 0; + for_each_archetype(, , $(arch) { + for (e_score in get_ro(arch, "e_score", type)) + decs_agg = decs_agg + e_score + }); + return decs_agg; +}) +``` + +**Classification**: SPLICE-FIRES (`emit_decs_walk_lane`, Slice 5f). + +**Conclusion**: Aggregate is in the `isWalk` set. Bridge fuses into accumulator loop — pruner trimmed to ONLY `e_score` reads (no `e_id`/`e_active`). Best-in-class shape. + +### 6c — min_by with where + +**Probe** (`/tmp/audit_probes/chain6_6c.das`): +```das +let r = _fold(from_decs_template(type)._where(_.active)._min_by(_.score)) +``` + +**Generated**: +```das +return invoke($() { + var decs_first = true; + var decs_bkey : int; + var decs_belem : tuple; + for_each_archetype(..., $(arch) { + for (e_id, e_score, e_active in get_ro(...), get_ro(...), get_ro(...)) + if (e_active) + let decs_key = e_score + if (decs_first) + decs_bkey = decs_key + decs_belem = tuple(e_id, e_score, e_active) + decs_first = false + elif (decs_key < decs_bkey) + decs_bkey = decs_key + decs_belem = tuple(e_id, e_score, e_active) + }); + return decs_belem; +}) +``` + +**Classification**: SPLICE-FIRES (`emit_decs_min_max_by`, streaming single-best state). + +**Conclusion**: Canonical streaming-min shape on decs. All three columns read since `min_by` returns the full element. 
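The streaming single-best state used by the 6c emission can be restated compactly. This Python sketch mirrors the shape of the generated loop (a first-element flag, a best key, a best element, strict `<` so the earliest element wins ties); the name `min_by_where` and the tuple row layout are hypothetical, and returning `None` for an empty input is an assumption of the sketch, not a claim about what the daslang arm returns.

```python
def min_by_where(rows, pred, key):
    """Single pass: smallest-key row among those passing pred, first wins ties."""
    first = True
    best_key = None
    best = None
    for row in rows:
        if not pred(row):
            continue
        k = key(row)
        # `first` short-circuits, so best_key is never compared while it is None.
        if first or k < best_key:
            best_key = k
            best = row
            first = False
    return best
```

No buffer is kept at any point, which is why this lane reads all columns of the element but allocates nothing.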
+ +### 6d — element_at with where + +**Probe** (`/tmp/audit_probes/chain6_6d.das`): +```das +let r = _fold(from_decs_template(type)._where(_.active).element_at(3)) +``` + +**Generated**: +```das +return invoke($() { + var decs_ec = 0; + var decs_found = false; + var decs_res : tuple<...>; + for_each_archetype_find(..., $(arch) : bool { + for (e_id, e_score, e_active in get_ro(...), get_ro(...), get_ro(...)) + if (e_active) + if (decs_ec == 3) + decs_res = tuple(e_id, e_score, e_active) + decs_found = true + return true + else { ++decs_ec } + return false; + }); + if (!decs_found) panic("element index out of range", ...) + return <- decs_res; +}) +``` + +**Classification**: SPLICE-FIRES (`emit_decs_element_at`, Slice 5f). + +**Conclusion**: `for_each_archetype_find` outer (returns bool to break early across archetypes) + counter inside, then panics if not found. Reference arm. + +### 6e — reverse + take + to_array (delegates to plan_decs_reverse) + +**Probe** (`/tmp/audit_probes/chain6_6e.das`): +```das +return <- _fold(from_decs_template(type).reverse().take(10).to_array()) +``` + +**Generated**: +```das +return invoke($() { + var decs_total : int64 = 0; + for_each_archetype(..., $(arch) { decs_total += arch.size; }); + let decs_actual = (decs_total > 10) ? 10 : decs_total; + let decs_skip = decs_total - decs_actual; + var decs_buf; + if (decs_actual == 0) return <- decs_buf + reserve(decs_buf, int(decs_actual)); + var decs_seen : int64 = 0; + for_each_archetype_find(..., $(arch) : bool { + if (decs_seen + arch.size <= decs_skip) + decs_seen += arch.size; return false + var decs_skips = (decs_skip > decs_seen) ? 
(decs_skip - decs_seen) : 0; + for (e_id, e_score, e_active in get_ro(...), get_ro(...), get_ro(...)) + if (decs_skips > 0) { --decs_skips; continue } + else + var decs_tup = tuple(e_id, e_score, e_active) + push_clone(decs_buf, decs_tup) + if (int64(length(decs_buf)) >= decs_actual) break + decs_seen += arch.size; + return int64(length(decs_buf)) >= decs_actual; + }); + __::linq`reverse_inplace(decs_buf); + return <- decs_buf; +}) +``` + +**Classification**: SPLICE-FIRES (`plan_decs_reverse` — PR #2834 reverse skip-into-tail pattern). + +**Conclusion**: `plan_decs_unroll` does NOT handle `reverse` itself — dispatch happens earlier through `plan_decs_reverse`. 2-pass shape (sum sizes → skip into tail → reverse_inplace) is exactly the PR #2834 win. Dispatch works as designed. + +### Chain 6 — follow-up TODOs + +- **`select` before `skip_while` / `take_while`**: same root cause as Chain 5 5b (predicate semantics differ pre- vs post-projection). Document canonical order. +- **sum/min/max/average over tuple element without `_select`**: line 4483 bail is correct but silent — emit a planner diagnostic when `isAccum && selectCount == 0`. +- **Implicit to_array gate**: line 4493 requires `expr._type.isGoodArrayType`. Failure mode for "no terminator at all" is opaque. + +--- + +## Chain 7 — `plan_zip` + +**Accepts**: `zip(srcB) [where_*][select?][skip?][skip_while?][take_while?][take?] 
[terminator]?` STRICTLY 2-arg zip (linq_fold.das:5395) +**Common bails**: 3-arg result-selector zip (line 5402), unrecognized intermediate op (line 5528), chained selects (line 5486) + +### 7a — `zip(srcB, result_selector)` 3-arg form + `sum()` + +**Probe** (`/tmp/audit_probes/chain7_7a.das`): +```das +return _fold(each(a) |> zip(each(b), $(x, y : int) => x + y) |> sum()) +``` + +**Generated**: +```das +return invoke($(var source : iterator) : int { + var pass_0 <- __::linq`zip_to_array(source, each(b), $(x,y:int) => x+y); + var pass_1 = __::linq`sum(pass_0); + finalize(pass_0); + return <- pass_1; +}, each(a)) +``` + +**Classification**: FALLS-OFF — default cascade. + +**Conclusion**: The natural "sum of (a[i] op b[i])" — the dot-product idiom — bails at line 5402 because the result-selector lives inside `zip(...)`. To recover splice, rewrite as `zip(b) |> _select(_._0 * _._1) |> sum()` (probe 7d). Either lower the 3-arg form to 2-arg `zip + _select` inside the macro before reaching `plan_zip`, or extend `plan_zip` to peel a 3-arg zip's result_selector into the chain `projection` slot. + +### 7b — `zip` + `_order_by` terminator + +**Probe** (`/tmp/audit_probes/chain7_7b.das`): +```das +return <- _fold(each(a) |> zip(each(b)) |> _order_by(_._0) |> to_array()) +``` + +**Generated**: +```das +return <- invoke($(var source : iterator) : array> { + var pass_0 <- __::linq`zip_to_array(source, each(b)); + __::linq`order_by_inplace(pass_0, $(_) { return _._0; }); + return <- pass_0; +}, each(a)) +``` + +**Classification**: FALLS-OFF — default cascade. + +**Conclusion**: `_order_by` after `zip` is unrecognized intermediate op (line 5528). A targeted "zip-then-order-by-then-take" arm would be the right next splice. 
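To make the proposed "zip-then-order-by-then-take" shape concrete, here is a minimal Python model. It is a sketch under stated assumptions: `zip_order_take` is a hypothetical name, the sort key is the first tuple field as in probe 7b, and `heapq.nsmallest` stands in for whatever bounded-heap emission the arm would actually use.

```python
import heapq


def zip_order_take(a, b, k):
    """Pairs a[i] with b[i], then returns the k pairs with the smallest first field.

    heapq.nsmallest keeps at most k candidates, so the full zipped
    array is never materialized and sorted.
    """
    return heapq.nsmallest(k, zip(a, b), key=lambda pair: pair[0])
```

Without a trailing `take`, as in the 7b probe itself, the whole zipped sequence has to be sorted anyway, so the saving comes from the bounded case.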
+ +### 7c — `zip` + chained `_select`s + +**Probe** (`/tmp/audit_probes/chain7_7c.das`): +```das +return <- _fold(each(a) |> zip(each(b)) |> _select(_._0) |> _select(_ * 2) |> to_array()) +``` + +**Generated**: +```das +var pass_0 <- __::linq`zip_to_array(source, each(b)); +var pass_1 <- __::linq`select(pass_0, $(_) { return _._0; }); +finalize(pass_0); +var pass_2 <- __::linq`select(pass_1, $(_) { return _ * 2; }); +finalize(pass_1); +return <- pass_2; +``` + +**Classification**: FALLS-OFF — default cascade (3 buffer allocations + 2 finalize calls). + +**Conclusion**: Two `_select`s in a row bail at line 5486. Collapse N consecutive `_select`s into a single projection via repeated `peel_lambda_rename_var` + body composition. Same shape unblocks plan_loop_or_count. + +### 7d — Baseline: `zip` + `_select` + `sum` + +**Probe** (`/tmp/audit_probes/chain7_7d.das`): +```das +return _fold(each(a) |> zip(each(b)) |> _select(_._0 * _._1) |> sum()) +``` + +**Generated**: +```das +return invoke($(srcA : array; srcB : array) : int { + var acc = 0; + for (itA, itB in srcA, srcB) + let it : tuple = tuple(itA, itB) + acc += (it._0 * it._1) + return acc; +}, a, b) +``` + +**Classification**: SPLICE-FIRES — inline parallel `for` + accumulator, zero intermediate buffers. + +**Conclusion**: User-facing gap: 7d's wording (`zip(b) |> _select(_._0 * _._1) |> sum()`) is strictly less readable than 7a's `zip(b, $(x,y) => x*y) |> sum()` form, yet the latter falls off. Splice ergonomics suffer when the "fast path" requires the awkward spelling. + +### Chain 7 — follow-up TODOs + +- Pre-lower 3-arg `zip(a, b, sel)` to 2-arg `zip(a, b) |> _select(...)` inside `LinqFold.visit` (or hoist the selector into `projection` directly inside `plan_zip`). Closes 7a. +- Extend `plan_zip` to accept `_order_by` / `reverse` between zip and a terminator that needs full materialization anyway. Closes 7b and unblocks "top-K of zip" patterns. 
+- Collapse N consecutive `_select` projections (line 5486 + plan_loop_or_count's analog) — symmetric with how N consecutive `where_` already compose via `&&`. Closes 7c. + +--- + +## Chain 8 — `plan_decs_join` + +**Accepts**: `_join(srcA, srcB, on, into) [count]?` strictly binary, primitive keys, no intermediate chain ops (linq_fold.das:5267) +**Common bails**: post-join chain op of ANY kind (line 5284), non-primitive key type (lines 5296-5303), keya/keyb untyped (line 5293). + +### 8a — `_join` + post-join `_where` + `count` + +**Probe** (`/tmp/audit_probes/chain8_8a.das`): +```das +return _fold(from_decs_template(type) |> _join(from_decs_template(type), + $(l, r) => l.dealer_id == r.id, + $(l, r) => (CarName = l.name, DealerName = r.name)) + |> _where(_.CarName != "") + |> count()) +``` + +**Generated** (trimmed): +```das +var pass_0 <- __::linq`join_to_array(source, invoke($() : iterator<...> { + var res : array<...>; for_each_archetype(...) { ... push(...) } + return <- to_sequence(res); +}), keya_block, keyb_block, result_block); +var pass_1 <- __::linq`where_(pass_0, predicate); +finalize(pass_0); +var pass_2 = __::linq`count(pass_1); +finalize(pass_1); +``` + +**Classification**: FALLS-OFF — default cascade. + +**Conclusion**: Bails at line 5284 because `_where` sits between `_join` and `count`. Worst-case: full materialization of BOTH dealer and car archetypes into per-iterator buffers before `join_impl` runs, plus a second `where_to_array` pass and a third `count`. Fix: wrap the count-bump in `if (predicate) { ... }` inside the probe loop. + +### 8b — `_join` + post-join `_select` + `to_array` + +**Probe** (`/tmp/audit_probes/chain8_8b.das`): +```das +return <- _fold(from_decs_template(type) |> _join(from_decs_template(type), ..., ...) + |> _select(_.CarName) + |> to_array()) +``` + +**Classification**: FALLS-OFF — default cascade (3 buffer allocations). + +**Conclusion**: Same bail (line 5284). 
The natural projection-shaping idiom — produce the full join row then project — is universally faster as inline projection. Fix is symmetric with 8a: accept single trailing `_select` and substitute body into the result lambda position. + +### 8c — Composite (tuple) join key + +**Probe** (`/tmp/audit_probes/chain8_8c.das`): +```das +return _fold(_join(srcA, srcB, $(l, r) => (l.dealer_id, l.id) == (r.region, r.id), ...) |> count()) +``` + +**Generated** (trimmed): `join_to_array` instantiated with `tuple` keys; `unique_key(tuple) : string` invoked by `join_impl`. + +**Classification**: FALLS-OFF — default cascade. + +**Conclusion**: Bails at primitive-key gate (lines 5296-5303): `keyType.baseType == Type.tTuple` not in whitelist. The `_join` macro itself accepts tuple-equi form, so the gate is the ONLY reason this falls off. Two fixes: (a) plumb `unique_key(keyBody)` into probe/insert sites of the splice (matches `join_impl`); or (b) accept tuples-of-primitives directly as `table; array<...>>` key, since daslang tables hash tuples natively. (b) is cleaner. + +### 8d — Baseline: `_join` + `count` + +**Generated**: +```das +return invoke($() : int { + var decs_jcnt = 0; + var decs_hash : table>>; + for_each_archetype(, , $(arch) { + for (dealer_id, dealer_name in get_ro(arch,"dealer_id",type), get_ro(arch,"dealer_name",type)) + var decs_tup_b = tuple(dealer_id, dealer_name) + push_clone(decs_hash[decs_tup_b.id], decs_tup_b) + }); + for_each_archetype(, , $(arch) { + for (car_id, car_dealer_id, car_name in get_ro(...) [...3 cols]) + var decs_tup_a = tuple(car_id, car_dealer_id, car_name) + get(decs_hash, decs_tup_a.dealer_id, $(var decs_jarr) { + decs_jcnt += length(decs_jarr); + }) + }); + return decs_jcnt; +}) +``` + +**Classification**: SPLICE-FIRES — single hash, two `for_each_archetype` passes, count bumped by bucket-length. + +**Conclusion**: Confirms hashed-join splice fires for bench-supported shape. 
Narrow surface is the issue: any meaningful post-processing reverts to full materialization. + +### Chain 8 — follow-up TODOs + +- Add a single-trailing-`_where` arm (mirror plan_zip's `whereCond` slot). Closes 8a + C6. +- Add a single-trailing-`_select` arm: substitute select's lambda body into result-push position. Closes 8b. +- Accept tuples-of-primitives as keys directly (`table; array<...>>`); cascade non-primitive structs to `unique_key`. Closes 8c. +- Document the splice's narrow shape in `LinqJoin`'s docstring. + +--- + +## Composition probes + +When a user chain combines two splice families, dispatch order (linq_fold.das:5700-5727) claims one and the other op bails the whole arm. Six obvious user-natural compositions, all FALLS-OFF: + +### C1 — Distinct + order + take + +**Why interesting**: "Top-K most recent distinct users" — both `_distinct_by` and `_order_by` have splice arms, neither tolerates the other op. + +**Probe** (`/tmp/audit_probes/comp_C1.das`): +```das +return <- _fold(each(items) |> _distinct_by(_.user) |> _order_by(_.ts) |> take(10) |> to_array()) +``` + +**Generated**: +```das +var pass_0 <- __::linq`distinct_by_to_array(source, $(_) { return _.user; }); +__::linq`order_by_inplace(pass_0, $(_) { return _.ts; }); +__::linq`take_inplace(pass_0, 10); +return <- pass_0; +``` + +**Classification**: FALLS-OFF — `plan_distinct` runs first (dispatch line 5712), sees non-distinct trailing op, returns null; `plan_order_family` runs second, sees `distinct_by` upstream, returns null. Tier-2 cascade. + +**Conclusion**: Two splice arms exist but the planner picks neither because each insists on owning the whole chain. Fast shape would be bounded-heap of size 10 keyed on `(seen_users_set, _.ts)` — collect into heap during single source pass, gated by set-insert success. Cross-splice composition is the obvious gap. + +### C2 — Group-by + select + order-by + to_array + +**Why interesting**: "Brands sorted by frequency" — canonical SQL `GROUP BY ... 
ORDER BY COUNT(*)`. + +**Probe** (`/tmp/audit_probes/comp_C2.das`): +```das +return <- _fold(each(items) + |> _group_by(_.brand) + |> _select((B = _._0, C = _._1 |> count())) + |> _order_by(_.C) + |> to_array()) +``` + +**Generated**: +```das +var pass_0 <- __::linq`group_by_lazy_to_array(source, $(_) { return _.brand; }); +var pass_1 <- __::linq`select(pass_0, $(_) { return tuple(_._0, __::linq`count(_._1)); }); +finalize(pass_0); +__::linq`order_by_inplace(pass_1, $(_) { return _.C; }); +return <- pass_1; +``` + +**Classification**: FALLS-OFF — `plan_group_by` bails because trailing op is `_order_by`. + +**Conclusion**: `plan_group_by_core` already builds the bucket map directly. Letting `_select` + `_order_by` consume the bucket inside the same emission would give 1 hashmap walk + 1 inplace sort but skip the intermediate `array>>` materialization. Cross-cuts with 7b/C1 observation. + +### C3 — Decs join + select + group_by + select + +**Why interesting**: "Join cars onto dealers, group by region, count" — universal BI shape on decs. + +**Probe** (`/tmp/audit_probes/comp_C3.das`): +```das +return <- _fold(_join(decsCars, decsDealers, on=..., into=(Region=r.region, CarName=l.name)) + |> _group_by(_.Region) + |> _select((R = _._0, N = _._1 |> count())) + |> to_array()) +``` + +**Classification**: FALLS-OFF — `plan_decs_join` bails at 5284 (trailing chain ops); `plan_decs_group_by` requires a decs source on top, not a `_join` invoke; default cascade builds dealer-array → join-array → group-map → select-array. Three intermediate allocations. + +**Conclusion**: This is the "killer demo" composition. The structural fix is to refactor `plan_decs_join` so its emission integrates with `plan_decs_group_by`'s bucket-fill — instead of `push_clone(buf, result_lam(...))` in the probe loop, emit `bucket[keyExpr] |> push_clone(...)` directly. Largest single architectural change suggested by the audit. 
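The structural fix described for C3 amounts to letting the join's probe loop write into the group-by buckets instead of pushing join rows into a buffer. The Python sketch below models that idea for the count reducer only; `count_cars_by_region` and the dict field names are hypothetical, chosen to match the probe's `dealer_id` / `id` / `region` wording.

```python
from collections import defaultdict


def count_cars_by_region(cars, dealers):
    """Hash join cars onto dealers and count per region without a join buffer."""
    # Build side: dealers hashed by id, as in the existing join splice.
    dealers_by_id = defaultdict(list)
    for dealer in dealers:
        dealers_by_id[dealer["id"]].append(dealer)

    # Probe side: each match bumps its region bucket directly.
    counts = {}
    for car in cars:
        for dealer in dealers_by_id.get(car["dealer_id"], ()):
            region = dealer["region"]
            counts[region] = counts.get(region, 0) + 1
    return list(counts.items())
```

Compared with the default cascade, the join array and the group map's per-bucket element arrays both disappear; only the hash side and the counter table remain.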
+ +### C4 — Zip + reverse + to_array + +**Why interesting**: "Pair two parallel sequences, walk backward" — natural for time-reversed analyses. + +**Probe** (`/tmp/audit_probes/comp_C4.das`): +```das +return <- _fold(each(a) |> zip(each(b)) |> reverse() |> to_array()) +``` + +**Generated**: +```das +var pass_0 <- __::linq`zip_to_array(source, each(b)); +__::linq`reverse_inplace(pass_0); +return <- pass_0; +``` + +**Classification**: FALLS-OFF — `plan_zip` lists `reverse` as unrecognized op (line 5528); `plan_reverse` doesn't recognize a 2-source zip head. + +**Conclusion**: Cheapest fall-off in absolute cost (1 buffer + 1 inplace), but trivial to absorb: zip's natural emission can be `for i in length downto 0` parallel `for` — 1-line change when `reverse` is the only intermediate. Bundle with the 7b TODO. + +### C5 — Order-by + distinct + take + to_array + +**Why interesting**: Variant on C1 with order-first. + +**Probe** (`/tmp/audit_probes/comp_C5.das`): +```das +return <- _fold(each(items) |> _order_by(_.score) |> distinct() |> take(10) |> to_array()) +``` + +**Classification**: FALLS-OFF — `plan_order_family` doesn't recognize `distinct`; `plan_distinct` doesn't recognize `_order_by` upstream. + +**Conclusion**: Identical reasoning to C1, operator-order swapped. Confirms the "two splice families never cooperate" pattern is symmetric — not a property of which arm runs first. + +### C6 — Decs_join + post-join filter + +**Why interesting**: Composition view of 8a — confirms the failure mode is the same whether reached via "splice arm couldn't extend" or "two arms collide". + +**Probe** (`/tmp/audit_probes/comp_C6.das`): same as 8a. + +**Classification**: FALLS-OFF — same root cause as 8a (linq_fold.das:5284). + +**Conclusion**: Listed here to make the symmetry explicit. Closing 8a TODO closes this row too. 
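The fix shared by 8a and C6, gating the count bump inside the probe loop, can be modeled in a few lines. This is an illustrative Python sketch rather than the splice itself: `count_join_where`, the `(car_name, dealer_name)` result tuple, and the dict field names are assumptions.

```python
from collections import defaultdict


def count_join_where(cars, dealers, pred):
    """Counts join rows that pass pred, binding each result once per pair."""
    dealers_by_id = defaultdict(list)
    for dealer in dealers:
        dealers_by_id[dealer["id"]].append(dealer)

    count = 0
    for car in cars:
        for dealer in dealers_by_id.get(car["dealer_id"], ()):
            row = (car["name"], dealer["name"])  # join result, built once
            if pred(row):
                count += 1
    return count
```

Note the cost trade: once a predicate is present, the count can no longer be bumped by bucket length as in the 8d baseline, because each pair has to be inspected.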
+ +### Composition — cross-cutting observation + +Five of six composition probes (C1, C2, C3, C5, C6) are blocked by the same architectural pattern: each splice arm currently requires `flatten_linq` to yield a contiguous run of recognized ops, with the planner pipeline trying one arm at a time and falling to tier-2 the moment ANY arm refuses. There is no cross-arm composition mechanism. The highest-leverage next investment isn't another arm — it's a "compose-aware" planner step that walks the call chain once, attributes each op to a candidate arm (or "boundary op" like `_where`/`_select` that any arm can host), and stitches the emissions. C4 is the lone outlier where one arm could absorb the second op trivially; the other five point at the same missing infrastructure. + +--- + +## Cross-cutting findings + +Synthesizing the per-chain TODOs into prioritized themes: + +### Theme 1 — Terminal `_select` extension (HIGH impact, MEDIUM effort) + +Recurs in: **chains 1, 2, 7, 8**. Almost every arm that produces a buffer or holds a bounded-K state could accept a terminal `_select` that projects during the emission/return — currently bails almost universally. The bounded-heap, R5 buffer, and join probe-loop arms all hold `≤K` or per-element values they then need to discard or project; absorbing the projection is a small qmacro splice each. + +Specific arms to extend: +- `plan_order_family` line 1234 + `plan_decs_order_family` line 4547 — accept terminal `_select` after `take(N)` / `first` / `first_or_default`. Closes 1a, 1e. +- `plan_reverse` line 1764 + `plan_decs_reverse` line 4802 — accept terminal `_select` after `reverse [take(N)]`. Closes 2c, 2e. +- `plan_decs_join` line 5267 — accept single trailing `_select` substituting into result lambda. Closes 8b. +- `plan_zip` line 5395 — pre-lower 3-arg `zip(a, b, sel)` to 2-arg `zip(a, b) |> _select(sel)`. Closes 7a. + +### Theme 2 — Trailing `_where` / HAVING (HIGH impact, MEDIUM effort) + +Recurs in: **chains 4, 5, 8**. 
The "trailing post-aggregate filter" idiom is universal in SQL-like usage and falls off whenever it appears in a splice arm: +- `plan_group_by_core` — peel trailing `_where` to synthetic `having_` slot (closes 4a, 4e). +- `plan_decs_join` — accept single trailing `_where` mirroring plan_zip's `whereCond` (closes 8a, C6). +- `plan_loop_or_count` — counter-lane with take-cap that ticks BEFORE the where filter (closes 5c). + +### Theme 3 — Cross-arm composition (HIGHEST impact, LARGE effort) + +Recurs in: **5 of 6 composition probes** (C1, C2, C3, C5, C6). The planner pipeline tries arms in order and falls back to tier-2 if any arm refuses; there is no mechanism for two arms to share a chain. The structural fix is a "compose-aware" planner step that: +1. Walks the call chain once +2. Attributes each call to a candidate arm or to a "boundary op" (`_where` / `_select` are universal) +3. Stitches arm emissions at boundary points (e.g. plan_decs_join emits `bucket[keyExpr] |> push_clone(...)` directly into plan_decs_group_by's bucket-fill loop) + +This is the largest architectural change suggested by the audit but unlocks the most common BI-style queries. Closes C1, C2, C3, C5. + +### Theme 4 — 2-arg terminator predicates (LOW effort, MEDIUM impact) + +Recurs in: **chain 3, chain 5, chain 7**. Several splice arms only accept 1-arg `count()` / `long_count()` / `sum()` etc., and silently bail when the user adds a predicate. The extension is trivial: emit `if (p(it)) cnt++` at the existing increment site. + +- `plan_distinct` line 1953, `plan_decs_distinct` line 5057 — accept 2-arg `count(p)` / `long_count(p)`. Closes 3c. +- `plan_zip` lines 5412-5436 — same shape. (Not probed explicitly but observed in agent 1 inventory.) +- `plan_decs_unroll` line 4458 — same shape. + +### Theme 5 — `_order_by(k).reverse()` → `_order_by_descending(k)` normalization + +Recurs in: **chains 1, 2**. Pure rewrite at the macro level, before any planner sees the chain. Closes 1b, 2b.
Trivial to implement; roughly half a day of work. + +### Theme 6 — Decs-bridge double penalty (MEDIUM impact, LOW effort) + +Whenever a `plan_decs_*` arm bails, the `from_decs_template` bridge degenerates to full `for_each_archetype` materialization into a temp `res` array, which is then wrapped in `to_sequence` for the array-side cascade. This costs an EXTRA allocation on top of whatever cascade follows. + +Fix: in `FromDecsMacro` (or at the `_fold` dispatch point), emit a diagnostic (`compile_warning` style) when the bridge survives without any decs-side splice arm claiming it. This does not fix the underlying chain, but it tells the user where the perf cliff is. + +### Theme 7 — Chained `_select` collapse + +Recurs in: **chain 5 (5b), chain 7 (7c)**. N consecutive `_select` projections should collapse into a single projection via repeated `peel_lambda_rename_var` + body composition — symmetric with how N consecutive `_where` already compose via `&&`. Same mechanism unblocks both plan_loop_or_count and plan_zip. + +### Theme 8 — Specialized fusion arms (low priority) + +Recurs in: **chains 2, 3, C4**. Several "two specific arms could fuse" cases: +- `reverse + distinct_by` (chain 2a) — single walk retaining LAST element per key. +- `_distinct_by(keyFn) + _order_by(otherKey)` (chain 3b) — hash-track + bounded sort walk. +- `zip + reverse` (C4) — emit `for i in length downto 0`. + +Each is small and self-contained. Lower priority than themes 1-3 but cheap follow-ups when in the area. + +### Out-of-scope observations + +- **`linq_fold_patterns.rst` cross-check**: this audit did NOT systematically verify that every "splice arm exists" claim in the RST page is reachable via the canonical chain shape. A future doc-only PR should walk the RST table row-by-row and probe each shape (most are covered above; rows not represented are likely doc-only fictions). +- **JIT verification**: all probes here are INTERP-only. The JIT lane may behave differently — e.g.
the bounded-heap arm's `spliced_push_heap` may or may not optimize well under llvm_jit. +- **Bench impact quantification**: the cross-cutting findings are ordered by "how natural is the user phrasing" + "how expensive is the cascade", not by measured ns/op. A follow-up bench round (writing N FALLS-OFF chains as new benches, measuring fall-off cost) would sharpen the prioritization. + +--- + +## How to re-run + +The audit is reproducible. Per-probe workflow: + +```bash +# Single probe: compile + dump +mcp__daslang__compile_check /tmp/audit_probes/chain1_1a.das +mcp__daslang__ast_dump file=/tmp/audit_probes/chain1_1a.das function=probe_1a mode=source + +# Whole audit: compile all probes +for f in /tmp/audit_probes/*.das; do + mcp__daslang__compile_check "$f" +done +``` + +To re-create the probe set after deleting `/tmp/audit_probes/`, follow each probe's "Probe" code block — each is self-contained (`options gen2` + `require` lines + struct + one `[export] def probe_NX` + stub `def main(){}`). The audit doesn't depend on any fixture outside the probe files themselves. + +Classification rules: +- `for_each_archetype` + inline state (heap, accumulator, counter, table) → **SPLICE-FIRES** +- `__::linq\`*_to_array\`` / `__::linq\`*_inplace\`` / cascade of `pass_0 → pass_1 → ...` → **FALLS-OFF** +- Direct `min_by_impl` / `top_n_by_impl` invocation without inlining → **BAILS-TO tier-2** diff --git a/benchmarks/sql/results.md b/benchmarks/sql/results.md index 4020ed60f..999c7add9 100644 --- a/benchmarks/sql/results.md +++ b/benchmarks/sql/results.md @@ -1,6 +1,6 @@ # Benchmarks — SQL / Array / Decs comparison -Generated 2026-05-23 from `62336a4a7` (PR for `plan_decs_join`). +Generated 2026-05-24 from `4b13eed9a` (Theme 2 — trailing-`_where` extension). Fixture size: n = 100 000 (cars), 100 dealers, 5 brands. Each row is one bench family in `benchmarks/sql/`; columns are nanoseconds per logical operation. 
`—` marks an intentionally absent lane — see @@ -26,114 +26,113 @@ before the timer resolution can measure them — they should be read as | Benchmark | SQL (m1) | Array (m3f) | Decs (m4) | Decs vs Array | |---|---:|---:|---:|---:| -| `aggregate_match` | 35.3 | 5.9 | 5.8 | 0.98× | -| `all_match` | 28.1 | 3.6 | 3.5 | 0.97× | +| `aggregate_match` | 35.1 | 6.0 | 5.8 | 0.97× | +| `all_match` | 27.8 | 3.6 | 3.5 | 0.97× | | `any_match` | 0.00 | 0.00 | 0.00 | — | -| `average_aggregate` | 29.9 | 5.9 | 8.8 | 1.49× | -| `bare_order_where` | 278.2 | 118.4 | 126.6 | 1.07× | -| `chained_where` | 38.5 | 6.7 | 6.7 | 1.00× | +| `average_aggregate` | 29.9 | 6.4 | 10.4 | 1.62× | +| `bare_order_where` | 278.4 | 119.0 | 126.8 | 1.07× | +| `chained_where` | 36.0 | 6.7 | 6.7 | 1.00× | | `contains_match` | 0.00 | 2.2 | 1.4 | 0.64× | -| `count_aggregate` | 29.1 | 4.1 | 4.2 | 1.02× | -| `distinct_by_count` | 40.6 | 15.8 | 16.0 | 1.01× | -| `distinct_count` | 41.1 | 16.1 | 16.0 | 0.99× | +| `count_aggregate` | 29.2 | 4.1 | 4.1 | 1.00× | +| `distinct_by_count` | 41.0 | 15.7 | 16.1 | 1.03× | +| `distinct_count` | 41.0 | 16.1 | 16.0 | 0.99× | | `distinct_take` | 0.00 | 0.00 | 0.00 | — | | `element_at_match` | 0.00 | 0.00 | 0.00 | — | | `first_match` | 0.00 | 0.00 | 0.00 | — | | `first_or_default_match` | 0.00 | 0.00 | 0.00 | — | -| `groupby_average` | 174.3 | 30.3 | 30.2 | 1.00× | -| `groupby_count` | 144.3 | 19.4 | 19.3 | 0.99× | -| `groupby_first` | — | 20.0 | 19.3 | 0.97× | -| `groupby_having_count` | 142.6 | 19.3 | 19.4 | 1.01× | -| `groupby_having_hidden_sum` | 175.7 | 24.5 | 24.1 | 0.98× | -| `groupby_max` | 176.7 | 25.0 | 25.4 | 1.02× | -| `groupby_min` | 175.7 | 25.1 | 25.4 | 1.01× | -| `groupby_multi_reducer` | 191.4 | 33.7 | 32.7 | 0.97× | -| `groupby_select_sum` | 207.1 | 36.8 | 36.7 | 1.00× | -| `groupby_sum` | 172.4 | 18.8 | 18.8 | 1.00× | -| `groupby_where_count` | 75.7 | 14.7 | 15.0 | 1.02× | -| `groupby_where_sum` | 86.7 | 14.3 | 14.7 | 1.03× | -| `indexed_lookup` | 1454.5 | 
204673.2 | 472.2 | 0.00× | -| `join_count` | 38.0 | 121.2 | 64.0 | 0.53× | +| `groupby_average` | 171.0 | 30.3 | 30.5 | 1.01× | +| `groupby_count` | 143.6 | 19.8 | 19.5 | 0.98× | +| `groupby_first` | — | 18.6 | 19.3 | 1.04× | +| `groupby_having_count` | 142.7 | 19.2 | 19.3 | 1.01× | +| `groupby_having_hidden_sum` | 177.5 | 24.5 | 24.1 | 0.98× | +| `groupby_max` | 176.1 | 25.2 | 25.4 | 1.01× | +| `groupby_min` | 176.4 | 25.1 | 25.4 | 1.01× | +| `groupby_multi_reducer` | 191.6 | 32.5 | 32.6 | 1.00× | +| `groupby_select_sum` | 210.3 | 36.8 | 36.5 | 0.99× | +| `groupby_sum` | 173.0 | 18.8 | 18.8 | 1.00× | +| `groupby_where_count` | 75.6 | 14.8 | 15.0 | 1.01× | +| `groupby_where_sum` | 86.9 | 14.3 | 14.8 | 1.03× | +| `indexed_lookup` | 1499.4 | 204476.6 | 495.4 | 0.00× | +| `join_count` | 38.2 | 122.4 | 64.1 | 0.52× | | `last_match` | 0.00 | 5.9 | 14.0 | 2.37× | -| `long_count_aggregate` | 29.3 | 4.2 | 4.2 | 1.00× | -| `max_aggregate` | 30.8 | 6.1 | 6.9 | 1.13× | -| `min_aggregate` | 30.4 | 6.2 | 6.9 | 1.11× | -| `order_take_desc` | 38.1 | 15.9 | 20.1 | 1.26× | +| `long_count_aggregate` | 29.6 | 4.3 | 4.1 | 0.95× | +| `max_aggregate` | 30.5 | 6.1 | 6.9 | 1.13× | +| `min_aggregate` | 30.4 | 6.4 | 6.9 | 1.08× | +| `order_take_desc` | 37.8 | 16.0 | 20.1 | 1.26× | | `reverse_take` | 0.10 | 0.00 | 9.3 | — | -| `select_count` | 0.10 | 0.00 | 2.2 | — | -| `select_where` | 194.2 | 11.1 | 19.5 | 1.76× | -| `select_where_count` | 32.5 | 5.2 | 7.4 | 1.42× | -| `select_where_order_take` | 36.4 | 12.2 | 14.9 | 1.22× | -| `select_where_sum` | 37.0 | 7.5 | 7.5 | 1.00× | +| `select_count` | 0.10 | 0.00 | 2.9 | — | +| `select_where` | 193.4 | 11.2 | 22.2 | 1.98× | +| `select_where_count` | 32.6 | 5.2 | 7.4 | 1.42× | +| `select_where_order_take` | 36.3 | 12.2 | 14.8 | 1.21× | +| `select_where_sum` | 37.1 | 7.8 | 7.5 | 0.96× | | `single_match` | 0.00 | 2.9 | 5.5 | 1.90× | | `skip_take` | 0.50 | 0.10 | 0.20 | 2.00× | -| `skip_while_match` | 3.4 | 5.3 | 5.3 | 1.00× | -| `sort_first` | 37.9 
| 11.1 | 13.4 | 1.21× | -| `sort_take` | 38.1 | 16.4 | 20.3 | 1.24× | -| `sum_aggregate` | 30.0 | 2.2 | 2.1 | 0.95× | -| `sum_where` | 32.8 | 4.3 | 4.3 | 1.00× | -| `take_count` | 3.6 | 0.20 | 0.40 | 2.00× | +| `skip_while_match` | 3.5 | 5.3 | 5.3 | 1.00× | +| `sort_first` | 37.9 | 11.5 | 13.5 | 1.17× | +| `sort_take` | 38.1 | 16.4 | 20.4 | 1.24× | +| `sum_aggregate` | 30.1 | 2.2 | 2.1 | 0.95× | +| `sum_where` | 33.0 | 4.3 | 4.3 | 1.00× | +| `take_count` | 3.7 | 0.20 | 0.40 | 2.00× | | `take_count_filtered` | — | 0.20 | 0.20 | 1.00× | | `take_sum_aggregate` | — | 0.10 | 0.10 | 1.00× | -| `take_while_match` | 7.9 | 2.5 | 2.5 | 1.00× | -| `to_array_filter` | 70.1 | 11.7 | 11.9 | 1.02× | -| `zip_dot_product` | — | 8.1 | 4.8 | 0.59× | +| `take_while_match` | 8.0 | 2.5 | 2.5 | 1.00× | +| `to_array_filter` | 70.3 | 11.8 | 11.8 | 1.00× | +| `zip_dot_product` | — | 8.0 | 4.8 | 0.60× | ## JIT - | Benchmark | SQL (m1) | Array (m3f) | Decs (m4) | Decs vs Array | |---|---:|---:|---:|---:| -| `aggregate_match` | 34.4 | 0.40 | 0.70 | 1.75× | -| `all_match` | 27.4 | 0.30 | 0.20 | 0.67× | +| `aggregate_match` | 34.3 | 0.40 | 0.70 | 1.75× | +| `all_match` | 27.4 | 0.40 | 0.20 | 0.50× | | `any_match` | 0.00 | 0.00 | 0.00 | — | -| `average_aggregate` | 29.7 | 1.0 | 3.6 | 3.60× | -| `bare_order_where` | 185.9 | 33.7 | 35.0 | 1.04× | -| `chained_where` | 35.9 | 0.60 | 0.80 | 1.33× | +| `average_aggregate` | 29.8 | 1.0 | 3.6 | 3.60× | +| `bare_order_where` | 187.4 | 33.8 | 35.0 | 1.04× | +| `chained_where` | 36.1 | 0.60 | 0.80 | 1.33× | | `contains_match` | 0.00 | 0.20 | 0.10 | 0.50× | -| `count_aggregate` | 29.0 | 0.40 | 0.60 | 1.50× | +| `count_aggregate` | 29.1 | 0.40 | 0.60 | 1.50× | | `distinct_by_count` | 40.8 | 2.1 | 2.1 | 1.00× | -| `distinct_count` | 41.0 | 2.1 | 2.1 | 1.00× | +| `distinct_count` | 41.2 | 2.1 | 2.1 | 1.00× | | `distinct_take` | 0.00 | 0.00 | 0.00 | — | | `element_at_match` | 0.00 | 0.00 | 0.00 | — | | `first_match` | 0.00 | 0.00 | 0.00 | — | | 
`first_or_default_match` | 0.00 | 0.00 | 0.00 | — | -| `groupby_average` | 170.7 | 2.6 | 2.9 | 1.12× | -| `groupby_count` | 141.1 | 2.4 | 2.5 | 1.04× | +| `groupby_average` | 171.0 | 2.6 | 2.9 | 1.12× | +| `groupby_count` | 142.0 | 2.4 | 2.5 | 1.04× | | `groupby_first` | — | 2.2 | 3.1 | 1.41× | -| `groupby_having_count` | 147.0 | 2.4 | 2.5 | 1.04× | -| `groupby_having_hidden_sum` | 174.1 | 2.5 | 2.8 | 1.12× | -| `groupby_max` | 172.0 | 2.4 | 2.7 | 1.13× | -| `groupby_min` | 174.2 | 2.4 | 2.7 | 1.13× | -| `groupby_multi_reducer` | 191.1 | 2.7 | 3.0 | 1.11× | -| `groupby_select_sum` | 198.8 | 3.2 | 3.7 | 1.16× | -| `groupby_sum` | 173.6 | 2.4 | 2.7 | 1.13× | -| `groupby_where_count` | 75.6 | 1.7 | 1.8 | 1.06× | -| `groupby_where_sum` | 86.8 | 1.7 | 1.8 | 1.06× | -| `indexed_lookup` | 1266.6 | 36139.0 | 104.1 | 0.00× | -| `join_count` | 38.0 | 36.2 | 13.3 | 0.37× | -| `last_match` | 0.00 | 0.60 | 1.4 | 2.33× | -| `long_count_aggregate` | 29.3 | 0.40 | 0.60 | 1.50× | -| `max_aggregate` | 30.6 | 0.60 | 0.50 | 0.83× | -| `min_aggregate` | 30.7 | 0.60 | 0.50 | 0.83× | -| `order_take_desc` | 37.8 | 0.70 | 1.4 | 2.00× | +| `groupby_having_count` | 141.3 | 2.4 | 2.5 | 1.04× | +| `groupby_having_hidden_sum` | 175.4 | 2.5 | 2.8 | 1.12× | +| `groupby_max` | 172.6 | 2.4 | 2.7 | 1.13× | +| `groupby_min` | 173.8 | 2.4 | 2.7 | 1.13× | +| `groupby_multi_reducer` | 190.9 | 2.7 | 3.0 | 1.11× | +| `groupby_select_sum` | 207.6 | 3.2 | 3.7 | 1.16× | +| `groupby_sum` | 170.5 | 2.4 | 2.7 | 1.13× | +| `groupby_where_count` | 76.1 | 1.7 | 1.8 | 1.06× | +| `groupby_where_sum` | 87.0 | 1.7 | 1.9 | 1.12× | +| `indexed_lookup` | 1258.1 | 35549.6 | 103.3 | 0.00× | +| `join_count` | 37.9 | 36.1 | 13.4 | 0.37× | +| `last_match` | 0.00 | 0.50 | 1.4 | 2.80× | +| `long_count_aggregate` | 36.6 | 0.40 | 0.70 | 1.75× | +| `max_aggregate` | 48.1 | 0.70 | 0.50 | 0.71× | +| `min_aggregate` | 31.7 | 0.70 | 0.50 | 0.71× | +| `order_take_desc` | 37.9 | 0.70 | 1.4 | 2.00× | | `reverse_take` | 0.00 | 0.00 | 1.1 
| — | | `select_count` | 0.10 | 0.00 | 0.00 | — | -| `select_where` | 105.6 | 4.1 | 5.5 | 1.34× | +| `select_where` | 105.6 | 4.7 | 5.5 | 1.17× | | `select_where_count` | 32.4 | 0.40 | 0.60 | 1.50× | | `select_where_order_take` | 36.4 | 0.70 | 1.4 | 2.00× | -| `select_where_sum` | 36.9 | 0.50 | 0.60 | 1.20× | +| `select_where_sum` | 36.8 | 0.50 | 0.60 | 1.20× | | `single_match` | 0.00 | 0.40 | 1.1 | 2.75× | | `skip_take` | 0.30 | 0.00 | 0.00 | — | -| `skip_while_match` | 3.5 | 0.40 | 0.40 | 1.00× | -| `sort_first` | 37.5 | 0.40 | 1.3 | 3.25× | -| `sort_take` | 38.0 | 0.70 | 1.4 | 2.00× | -| `sum_aggregate` | 30.3 | 0.40 | 0.30 | 0.75× | +| `skip_while_match` | 3.4 | 0.40 | 0.40 | 1.00× | +| `sort_first` | 37.4 | 0.40 | 1.3 | 3.25× | +| `sort_take` | 37.9 | 0.70 | 1.4 | 2.00× | +| `sum_aggregate` | 29.9 | 0.40 | 0.40 | 1.00× | | `sum_where` | 33.0 | 0.40 | 0.60 | 1.50× | | `take_count` | 1.8 | 0.10 | 0.10 | 1.00× | | `take_count_filtered` | — | 0.00 | 0.00 | — | | `take_sum_aggregate` | — | 0.00 | 0.00 | — | | `take_while_match` | 8.0 | 0.20 | 0.30 | 1.50× | -| `to_array_filter` | 48.3 | 3.2 | 3.4 | 1.06× | +| `to_array_filter` | 48.5 | 3.3 | 3.4 | 1.03× | | `zip_dot_product` | — | 0.50 | 0.50 | 1.00× | ## Notes on missing lanes (the `—` cells) diff --git a/daslib/linq_fold.das b/daslib/linq_fold.das index 89504c0cd..0faf8bf21 100644 --- a/daslib/linq_fold.das +++ b/daslib/linq_fold.das @@ -664,6 +664,7 @@ def private emit_accumulator_lane( var topExprs : array; var projection : Expression?; var whereCond : Expression?; + var postTakeWhereCond : Expression?; var intermediateBinds : array; var preCondStmts : array; var elementType : TypeDeclPtr; @@ -745,6 +746,12 @@ def private emit_accumulator_lane( } perMatchStmts |> push_from <| build_accumulator_perelement_stmts(opName, accName, valBindName, firstName, cntName, valueExpr, workhorse, isDoubleAccType) prepend_binds(perMatchStmts, intermediateBinds) + // Theme 2 5c: post-take where wraps the per-match work (acc++ / 
+= / cmp+update) BEFORE the take cap so the cap still ticks unconditionally per iteration. + if (postTakeWhereCond != null) { + var gated = wrap_with_condition(stmts_to_expr(perMatchStmts), postTakeWhereCond) + perMatchStmts |> clear + perMatchStmts |> push(gated) + } wrap_with_ranges(perMatchStmts, skipExpr, takeExpr, skipWhileCond, takeWhileCond, names) var loopBody = prepend_precond(wrap_with_condition(stmts_to_expr(perMatchStmts), whereCond), preCondStmts) // Collect all body statements into one list so they share scope when spliced via $b. @@ -789,6 +796,7 @@ def private emit_early_exit_lane( var topExprs : array; var projection : Expression?; var whereCond : Expression?; + var postTakeWhereCond : Expression?; var intermediateBinds : array; var preCondStmts : array; var elementType : TypeDeclPtr; @@ -1088,6 +1096,12 @@ def private emit_early_exit_lane( return null } prepend_binds(perMatchStmts, intermediateBinds) + // Theme 2 5c: post-take where wraps the per-match work BEFORE the take cap so the cap still ticks unconditionally per iteration. + if (postTakeWhereCond != null) { + var gated = wrap_with_condition(stmts_to_expr(perMatchStmts), postTakeWhereCond) + perMatchStmts |> clear + perMatchStmts |> push(gated) + } wrap_with_ranges(perMatchStmts, skipExpr, takeExpr, skipWhileCond, takeWhileCond, names) var loopBody = prepend_precond(wrap_with_condition(stmts_to_expr(perMatchStmts), whereCond), preCondStmts) // Single-$b body so all stmts (skip/take counters + prelude + for + tail) share scope @@ -1243,6 +1257,8 @@ def private plan_order_family(var expr : Expression?) : Expression? { var firstName : string var firstDefaultExpr : Expression? var hasOrder = false + var selectLam : Expression? + var selectElemType : TypeDeclPtr let at = calls[0]._0.at let itName = qn("it", at) for (i in 0 .. length(calls)) { @@ -1274,13 +1290,20 @@ def private plan_order_family(var expr : Expression?) : Expression? 
{ if (arg == null || arg._type == null || arg._type.baseType != Type.tInt) return null takeExpr = clone_expression(arg) } elif (name == "first" || name == "first_or_default") { - // order + first → min/max (O(N) instead of sort + index). Must be terminal. + // order + first → min/max (O(N) instead of sort + index). Must be terminal (no select after). if (!hasOrder || takeExpr != null || firstName != "" || i != length(calls) - 1) return null firstName = name if (name == "first_or_default") { if ((cll._0.arguments |> length) < 2) return null firstDefaultExpr = clone_expression(cll._0.arguments[1]) } + } elif (name == "select") { + // Terminal _select after take/first: project at return, heap cmp stays on source type. + if (i != length(calls) - 1 || !hasOrder + || cll._0._type == null || cll._0._type.firstType == null) return null + selectLam = cll._0.arguments[1] + if (selectLam == null) return null + selectElemType = clone_type(cll._0._type.firstType) } else { return null } @@ -1299,6 +1322,8 @@ def private plan_order_family(var expr : Expression?) : Expression? { // Streaming-min / bounded-heap fast paths (mirror of plan_decs_order_family). When the key is inline-able, skip the materialize-all + min_by/top_n* dispatch in favor of a per-walk state (single best for first[_or_default], heap of size N for take). For first[_or_default]: avoids the per-element `invoke(keyLambda, x)` cost in min_by_impl (~28 ns/op win on 100K-row sort_first). For take(N): avoids materializing the full filtered set before top_n_by (~7-9 ns/op win on sort_take / select_where_order_take). let useBoundedHeap = takeExpr != null && inlineCmp != null && firstName == "" let useStreamingMin = firstName != "" && inlineCmp != null + // Terminal _select only splices on inline-cmp / where_+order paths; direct calls would re-emit the cascade. 
+ if (selectLam != null && !useStreamingMin && !useBoundedHeap && whereCond == null) return null if (useStreamingMin) { let bestName = qn("order_best", at) let seenName = qn("order_seen", at) @@ -1325,27 +1350,43 @@ def private plan_order_family(var expr : Expression?) : Expression? { } } var emission : Expression? + let outElemType = (selectLam != null) ? selectElemType : elemType if (firstName == "first") { - emission = qmacro(invoke($($i(srcName) : $t(srcParamType)) : $t(elemType) { + var firstRetExpr : Expression? + if (selectLam != null) { + firstRetExpr = peel_lambda_replace_var(selectLam, qmacro($i(bestName))) + } else { + firstRetExpr = qmacro($i(bestName)) + } + emission = qmacro(invoke($($i(srcName) : $t(srcParamType)) : $t(outElemType) { var $i(bestName) = default<$t(elemType)> var $i(seenName) = false for ($i(itName) in $i(srcName)) { $e(perElement) } panic("sequence contains no elements") if (!$i(seenName)) - return $i(bestName) + return $e(firstRetExpr) }, $e(topExpr))) } else { let dBindName = qn("order_d", at) - emission = qmacro(invoke($($i(srcName) : $t(srcParamType)) : $t(elemType) { + var bestRetExpr : Expression? + var dRetExpr : Expression? + if (selectLam != null) { + bestRetExpr = peel_lambda_replace_var(selectLam, qmacro($i(bestName))) + dRetExpr = peel_lambda_replace_var(selectLam, qmacro($i(dBindName))) + } else { + bestRetExpr = qmacro($i(bestName)) + dRetExpr = qmacro($i(dBindName)) + } + emission = qmacro(invoke($($i(srcName) : $t(srcParamType)) : $t(outElemType) { let $i(dBindName) = $e(firstDefaultExpr) var $i(bestName) = default<$t(elemType)> var $i(seenName) = false for ($i(itName) in $i(srcName)) { $e(perElement) } - return $i(bestName) if ($i(seenName)) - return $i(dBindName) + return $e(bestRetExpr) if ($i(seenName)) + return $e(dRetExpr) }, $e(topExpr))) } return finalize_invoke(emission, at) @@ -1378,16 +1419,39 @@ def private plan_order_family(var expr : Expression?) : Expression? 
{ } } // No `reserve(takeN)` on the bounded buf — matches the upstream top_n_by_with_cmp iterator-variant policy (linq.das:482-484). Caller may pass takeN >> actual source size, so pre-reserving N risks a large upfront allocation for no win. - var emission : Expression? = qmacro(invoke($($i(srcName) : $t(srcParamType)) : array<$t(bufElemType)> { - let $i(takeNName) = $e(takeExpr) - var $i(bhBufName) : array<$t(bufElemType)> - return <- $i(bhBufName) if ($i(takeNName) <= 0) - for ($i(itName) in $i(srcName)) { - $e(perElement) - } - _::order_inplace($i(bhBufName), $e(inlineCmp)) - return <- $i(bhBufName) - }, $e(topExpr))) + var emission : Expression? + if (selectLam != null) { + // Terminal _select projects ≤K heap survivors at return (heap holds raw type for cmp). + let outBufName = qn("order_proj_buf", at) + let elemName = qn("order_proj_e", at) + var projBody = peel_lambda_replace_var(selectLam, qmacro($i(elemName))) + emission = qmacro(invoke($($i(srcName) : $t(srcParamType)) : array<$t(selectElemType)> { + let $i(takeNName) = $e(takeExpr) + var $i(bhBufName) : array<$t(bufElemType)> + var $i(outBufName) : array<$t(selectElemType)> + return <- $i(outBufName) if ($i(takeNName) <= 0) + for ($i(itName) in $i(srcName)) { + $e(perElement) + } + _::order_inplace($i(bhBufName), $e(inlineCmp)) + $i(outBufName) |> reserve(length($i(bhBufName))) + for ($i(elemName) in $i(bhBufName)) { + $i(outBufName) |> push_clone($e(projBody)) + } + return <- $i(outBufName) + }, $e(topExpr))) + } else { + emission = qmacro(invoke($($i(srcName) : $t(srcParamType)) : array<$t(bufElemType)> { + let $i(takeNName) = $e(takeExpr) + var $i(bhBufName) : array<$t(bufElemType)> + return <- $i(bhBufName) if ($i(takeNName) <= 0) + for ($i(itName) in $i(srcName)) { + $e(perElement) + } + _::order_inplace($i(bhBufName), $e(inlineCmp)) + return <- $i(bhBufName) + }, $e(topExpr))) + } if (needIterWrap) { emission = qmacro($e(emission).to_sequence_move()) } @@ -1490,6 +1554,13 @@ def private 
plan_order_family(var expr : Expression?) : Expression? { $e(loopBody) } } + // Terminal _select projects at return; buffer/scalar carries source type so cmp/sort sees raw. + let elemName = qn("order_proj_e", at) + let outBufName = qn("order_proj_buf", at) + var projBody : Expression? + if (selectLam != null) { + projBody = peel_lambda_replace_var(selectLam, qmacro($i(elemName))) + } if (firstName == "first") { // where + order + first → min/max on prefilter buffer. Empty buf must panic to match eager `first()` semantics; min/max return uninitialized refs on empty. stmts |> push <| qmacro_expr() { @@ -1501,8 +1572,15 @@ def private plan_order_family(var expr : Expression?) : Expression? { } else { minMaxCall = qmacro($c(minMaxName)($i(bufName))) } - stmts |> push <| qmacro_expr() { - return $e(minMaxCall) + if (selectLam != null) { + stmts |> push_from <| qmacro_block_to_array() { + let $i(elemName) = $e(minMaxCall) + return $e(projBody) + } + } else { + stmts |> push <| qmacro_expr() { + return $e(minMaxCall) + } } } elif (firstName == "first_or_default") { // No min_by_or_default helper exists; route through top_n*(_, 1, _) + first_or_default for the empty-buf case. @@ -1514,8 +1592,18 @@ def private plan_order_family(var expr : Expression?) : Expression? { } else { topNCall = qmacro($c(topNName)($i(bufName), 1)) } - stmts |> push <| qmacro_expr() { - return _::first_or_default($e(topNCall), $e(firstDefaultExpr)) + if (selectLam != null) { + // first_or_default + select: bind default once (side-effect order), project both branches. + let dBindName = qn("order_d", at) + stmts |> push_from <| qmacro_block_to_array() { + let $i(dBindName) = $e(firstDefaultExpr) + let $i(elemName) = _::first_or_default($e(topNCall), $i(dBindName)) + return $e(projBody) + } + } else { + stmts |> push <| qmacro_expr() { + return _::first_or_default($e(topNCall), $e(firstDefaultExpr)) + } } } elif (takeExpr == null) { // Sort the prefilter buffer in place and return it. 
order*_inplace is void @@ -1529,8 +1617,19 @@ def private plan_order_family(var expr : Expression?) : Expression? { sortCall = qmacro($c(inplaceName)($i(bufName))) } stmts |> push(sortCall) - stmts |> push <| qmacro_expr() { - return <- $i(bufName) + if (selectLam != null) { + stmts |> push_from <| qmacro_block_to_array() { + var $i(outBufName) : array<$t(selectElemType)> + $i(outBufName) |> reserve(length($i(bufName))) + for ($i(elemName) in $i(bufName)) { + $i(outBufName) |> push_clone($e(projBody)) + } + return <- $i(outBufName) + } + } else { + stmts |> push <| qmacro_expr() { + return <- $i(bufName) + } } } else { // top_n* on the prefilter buffer. @@ -1542,8 +1641,21 @@ def private plan_order_family(var expr : Expression?) : Expression? { } else { topNCall = qmacro($c(topNName)($i(bufName), $e(takeExpr))) } - stmts |> push <| qmacro_expr() { - return <- $e(topNCall) + if (selectLam != null) { + let topResName = qn("order_top_res", at) + stmts |> push_from <| qmacro_block_to_array() { + var $i(topResName) <- $e(topNCall) + var $i(outBufName) : array<$t(selectElemType)> + $i(outBufName) |> reserve(length($i(topResName))) + for ($i(elemName) in $i(topResName)) { + $i(outBufName) |> push_clone($e(projBody)) + } + return <- $i(outBufName) + } + } else { + stmts |> push <| qmacro_expr() { + return <- $e(topNCall) + } } } var bodyBlock = new ExprBlock(at = at) @@ -1578,6 +1690,8 @@ def private plan_loop_or_count(var expr : Expression?) : Expression? { let accName = qn("acc", at) let names <- make_range_names(at) var whereCond : Expression? + // postTakeWhereCond — Theme 2 5c: gates per-element contribution AFTER the take cap fires. Distinct from whereCond (which wraps the entire take/skip body); this preserves take.where semantics ("first N elements, then filter") that auto-rewriting can't reproduce. + var postTakeWhereCond : Expression? var projection : Expression? 
var intermediateBinds : array // preConditionStmts evaluate UNCONDITIONALLY per element, BEFORE the where filter — @@ -1599,8 +1713,8 @@ def private plan_loop_or_count(var expr : Expression?) : Expression? { var cll & = unsafe(calls[i]) let opName = cll._1.name if (opName == "where_") { - // skip/take/skip_while/take_while-after-where is rejected — canonical chain order is - if (seenSkip || seenSkipWhile || seenTakeWhile || seenTake) return null + // Theme 2 5c — `take(N)._where(p)` allowed (routed to postTakeWhereCond, gates contribution only); other prior range ops still bail; single post-take where in v1. + if (seenSkip || seenSkipWhile || seenTakeWhile || (seenTake && postTakeWhereCond != null)) return null var predicate : Expression? if (seenSelect) { // Phase 3d / single-eval: where-after-select. Bind the current projection @@ -1621,7 +1735,9 @@ def private plan_loop_or_count(var expr : Expression?) : Expression? { } else { predicate = peel_lambda_rename_var(cll._0.arguments[1], itName) } - if (whereCond == null) { + if (seenTake) { + postTakeWhereCond = predicate + } elif (whereCond == null) { whereCond = predicate } else { whereCond = qmacro($e(whereCond) && $e(predicate)) @@ -1694,7 +1810,7 @@ def private plan_loop_or_count(var expr : Expression?) : Expression? { laneTops |> push(top) var laneSrcs : array laneSrcs |> push(srcName) - return emit_accumulator_lane(lastName, laneTops, projection, whereCond, + return emit_accumulator_lane(lastName, laneTops, projection, whereCond, postTakeWhereCond, intermediateBinds, preCondStmts, elementType, laneSrcs, accName, itName, names, skipExpr, takeExpr, skipWhileCond, takeWhileCond, at) } @@ -1709,7 +1825,7 @@ def private plan_loop_or_count(var expr : Expression?) : Expression? 
{ laneTops |> push(top) var laneSrcs : array laneSrcs |> push(srcName) - return emit_early_exit_lane(lastName, laneTops, projection, whereCond, + return emit_early_exit_lane(lastName, laneTops, projection, whereCond, postTakeWhereCond, intermediateBinds, preCondStmts, elementType, terminatorCall, laneSrcs, itName, names, skipExpr, takeExpr, skipWhileCond, takeWhileCond, at) } @@ -1724,29 +1840,34 @@ def private plan_loop_or_count(var expr : Expression?) : Expression? { var $i(finalBindName) = $e(projection) } } - stmts |> push <| qmacro_expr() { + // Theme 2 5c: when postTakeWhereCond is set, gate JUST the acc++ — the take cap still ticks unconditionally above. + var incExpr = qmacro_expr() { $i(accName) ++ } + stmts |> push(wrap_with_condition(incExpr, postTakeWhereCond)) prepend_binds(stmts, intermediateBinds) wrap_with_ranges(stmts, skipExpr, takeExpr, skipWhileCond, takeWhileCond, names) loopBody = prepend_precond(wrap_with_condition(stmts_to_expr(stmts), whereCond), preCondStmts) } else { // Array lane. `push_clone` is the safe append everywhere: for workhorse types it's a var stmts : array + var pushExpr : Expression? if (projection != null) { - stmts |> push <| qmacro_expr() { + pushExpr = qmacro_expr() { $i(accName) |> push_clone($e(projection)) } - } elif (whereCond != null || skipExpr != null || takeExpr != null + } elif (whereCond != null || postTakeWhereCond != null || skipExpr != null || takeExpr != null || skipWhileCond != null || takeWhileCond != null) { // Identity push: `it` aliases the source element. Reached when chain is bare - stmts |> push <| qmacro_expr() { + pushExpr = qmacro_expr() { $i(accName) |> push_clone($i(itName)) } } else { // identity chain — nothing to fuse; let the caller fall through. return null } + // Theme 2 5c: postTakeWhereCond gates JUST the push — same shape as counter lane. 
+ stmts |> push(wrap_with_condition(pushExpr, postTakeWhereCond)) prepend_binds(stmts, intermediateBinds) wrap_with_ranges(stmts, skipExpr, takeExpr, skipWhileCond, takeWhileCond, names) loopBody = prepend_precond(wrap_with_condition(stmts_to_expr(stmts), whereCond), preCondStmts) @@ -1781,6 +1902,8 @@ def private plan_reverse(var expr : Expression?) : Expression? { var hasReverse = false var seenSelect = false var takeExpr : Expression? + var terminalSelectLam : Expression? + var terminalSelectElemType : TypeDeclPtr let at = calls[0]._0.at let srcName = qn("source", at) let itName = qn("it", at) @@ -1797,9 +1920,19 @@ def private plan_reverse(var expr : Expression?) : Expression? { if (hasReverse || seenSelect) return null whereCond = merge_where_cond(whereCond, peel_lambda_rename_var(cll._0.arguments[1], itName)) } elif (name == "select") { - if (hasReverse || seenSelect) return null - seenSelect = true - projection = peel_lambda_rename_var(cll._0.arguments[1], itName) + if (!hasReverse && !seenSelect) { + // Pre-reverse select: existing path (buffer holds projected values). + seenSelect = true + projection = peel_lambda_rename_var(cll._0.arguments[1], itName) + } elif (hasReverse && !seenSelect && terminalSelectLam == null && i == length(calls) - 1) { + // Terminal post-reverse select: project at return (R1-R4 buf or first scalar). + terminalSelectLam = cll._0.arguments[1] + if (terminalSelectLam == null + || cll._0._type == null || cll._0._type.firstType == null) return null + terminalSelectElemType = clone_type(cll._0._type.firstType) + } else { + return null + } } elif (name == "reverse") { if (hasReverse) return null hasReverse = true @@ -1813,7 +1946,9 @@ def private plan_reverse(var expr : Expression?) : Expression? { return null } } - if (!hasReverse || (takeExpr != null && terminatorName != "")) return null + // count + terminal _select would drop projection side effects (count ≡ count after pure select). Defer. 
+ if (!hasReverse || (takeExpr != null && terminatorName != "") + || (terminalSelectLam != null && terminatorName == "count")) return null var body : Expression? if (terminatorName == "count") { // Reverse is identity for count — counter loop, no buffer. Side-effecting projection still fires per match. @@ -1853,6 +1988,16 @@ def private plan_reverse(var expr : Expression?) : Expression? { $i(foundName) = true } var perElement = wrap_with_condition(matchBlock, whereCond) + // Terminal _select: `last` stays source-typed; project (and the default) at return. + var lastRetExpr : Expression? + var dRetExpr : Expression? + if (terminalSelectLam != null) { + lastRetExpr = peel_lambda_replace_var(terminalSelectLam, qmacro($i(lastName))) + dRetExpr = peel_lambda_replace_var(terminalSelectLam, qmacro($i(dBindName))) + } else { + lastRetExpr = qmacro($i(lastName)) + dRetExpr = qmacro($i(dBindName)) + } if (terminatorName == "first") { body = qmacro_block() { var $i(foundName) = false @@ -1863,7 +2008,7 @@ def private plan_reverse(var expr : Expression?) : Expression? { if (!$i(foundName)) { panic("sequence contains no elements") } - return $i(lastName) + return $e(lastRetExpr) } } else { body = qmacro_block() { @@ -1873,14 +2018,22 @@ def private plan_reverse(var expr : Expression?) : Expression? { for ($i(itName) in $i(srcName)) { $e(perElement) } - return $i(foundName) ? $i(lastName) : $i(dBindName) + return $i(foundName) ? $e(lastRetExpr) : $e(dRetExpr) } } } else { // R1-R4 path: buffer + reverse_inplace + optional resize + return buffer. let needIterWrap = expr._type.isIterator var bufElemType = strip_const_ref(clone_type(reverseCall._type.firstType)) + // Terminal _select projects buffer survivors at return (after resize trims to take(N)). + let outBufName = qn("rev_proj_buf", at) + let elemName = qn("rev_proj_e", at) + var projBody : Expression? 
+ if (terminalSelectLam != null) { + projBody = peel_lambda_replace_var(terminalSelectLam, qmacro($i(elemName))) + } let canBackwardIndex = (takeExpr != null && projection == null && whereCond == null + && terminalSelectLam == null && (top._type.isGoodArrayType || top._type.isArray)) if (canBackwardIndex) { // R6: visit only the last takeN indices — skips full-source push + O(length) reverse_inplace. @@ -1923,16 +2076,36 @@ def private plan_reverse(var expr : Expression?) : Expression? { $i(bufName) |> resize($e(takeExpr) <= 0 ? 0 : ($e(takeExpr) < length($i(bufName)) ? $e(takeExpr) : length($i(bufName)))) } } - var returnExpr = buffer_return(bufName, needIterWrap) - body = qmacro_block() { - var $i(bufName) : array<$t(bufElemType)> - $b(reserveStmts) - for ($i(itName) in $i(srcName)) { - $e(pushExpr) + if (terminalSelectLam != null) { + // Post-reverse projection: outBuf returned in place of bufName. + var returnExpr = buffer_return(outBufName, needIterWrap) + body = qmacro_block() { + var $i(bufName) : array<$t(bufElemType)> + $b(reserveStmts) + for ($i(itName) in $i(srcName)) { + $e(pushExpr) + } + _::reverse_inplace($i(bufName)) + $b(resizeStmts) + var $i(outBufName) : array<$t(terminalSelectElemType)> + $i(outBufName) |> reserve(length($i(bufName))) + for ($i(elemName) in $i(bufName)) { + $i(outBufName) |> push_clone($e(projBody)) + } + $e(returnExpr) + } + } else { + var returnExpr = buffer_return(bufName, needIterWrap) + body = qmacro_block() { + var $i(bufName) : array<$t(bufElemType)> + $b(reserveStmts) + for ($i(itName) in $i(srcName)) { + $e(pushExpr) + } + _::reverse_inplace($i(bufName)) + $b(resizeStmts) + $e(returnExpr) } - _::reverse_inplace($i(bufName)) - $b(resizeStmts) - $e(returnExpr) } } } @@ -2730,6 +2903,7 @@ def private plan_group_by_core(var calls : array>; var keyBlock : Expression?; var groupProjCall : ExprCall?; var havingCall : ExprCall?; + var trailingWhereCall : ExprCall?; terminatorName : string; exprIsIterator : bool; at : LineInfo; 
@@ -2745,6 +2919,7 @@ def private plan_group_by_core(var calls : array>; let bindName = qn("{prefix}gpb", at) let cntName = qn("{prefix}cnt", at) let dummyName = qn("{prefix}dummy", at) + let outName = qn("{prefix}out", at) // Walk upstream where_/select* into segments. Each where guards everything AFTER it; a select after a where flushes a new segment so the projection bind lives inside the where's guard. var segBinds : array> var segWheres : array @@ -2816,6 +2991,12 @@ def private plan_group_by_core(var calls : array>; havingPred = rewrite_having_pred(rawPred, hbName, kvName, specs) if (havingPred == null || expr_uses_var(havingPred, hbName)) return null } + // Peel optional trailing _where (Theme 2). Predicate references the post-aggregate output tuple via outName. + var trailingWherePred : Expression? + if (trailingWhereCall != null) { + trailingWherePred = peel_lambda_rename_var(trailingWhereCall.arguments[1], outName) + if (trailingWherePred == null) return null + } let hasHidden = (specs |> length) > userVisibleSlotCount // Bare reducer + hidden slot needs typedecl(invoke(...)) growth inside a qmacro — can't grow that dynamically. Cascade. if (hasHidden && !usesNamedTuple) return null @@ -2929,28 +3110,11 @@ def private plan_group_by_core(var calls : array>; } // Adapter-specific source loop emission (for(it in src) for array, for_each_archetype + inner for for decs). stmts |> push(adapter_emit_source_loop(adapter, body, at)) - // Terminator emission + retType derivation. 
- var retType : TypeDeclPtr - if (terminatorName == "count") { - retType = new TypeDecl(baseType = Type.tInt) - if (havingPred != null) { - stmts |> push <| qmacro_block() { - var $i(cntName) = 0 - for ($i(kvName) in values($i(tabName))) { - if ($e(havingPred)) { - $i(cntName) ++ - } - } - return $i(cntName) - } - } else { - stmts |> push <| qmacro_expr() { - return length($i(tabName)) - } - } - } else { - // to_array lane (implicit when no count terminator): build result buffer by walking the table. - var outputExpr : Expression? + // Compute output tuple + bufElemType — needed by to_array always, and by count when trailingWherePred is present (predicate is bound against the constructed output). + let needOutput = terminatorName != "count" || trailingWherePred != null + var outputExpr : Expression? + var bufElemType : TypeDeclPtr + if (needOutput) { if (!usesNamedTuple) { if (hasAvg) { outputExpr = mk_avg_divide_expr(at, kvName, 1) @@ -2989,12 +3153,57 @@ def private plan_group_by_core(var calls : array>; $i(kvName) } } - var bufElemType = clone_type(groupProjBody._type) + bufElemType = clone_type(groupProjBody._type) if (bufElemType != null) { bufElemType.flags.constant = false bufElemType.flags.ref = false } - // retType matches exprIsIterator: iterator via to_sequence_move(buf) tail, or array via raw buf return. Both paths use buffer_return(bufName, exprIsIterator) for the final stmt. + } + // Terminator emission + retType derivation. 
+ var retType : TypeDeclPtr + if (terminatorName == "count") { + retType = new TypeDecl(baseType = Type.tInt) + if (havingPred != null && trailingWherePred != null) { + stmts |> push <| qmacro_block() { + var $i(cntName) = 0 + for ($i(kvName) in values($i(tabName))) { + if ($e(havingPred)) { + let $i(outName) : $t(bufElemType) = $e(outputExpr) + if ($e(trailingWherePred)) { + $i(cntName) ++ + } + } + } + return $i(cntName) + } + } elif (trailingWherePred != null) { + stmts |> push <| qmacro_block() { + var $i(cntName) = 0 + for ($i(kvName) in values($i(tabName))) { + let $i(outName) : $t(bufElemType) = $e(outputExpr) + if ($e(trailingWherePred)) { + $i(cntName) ++ + } + } + return $i(cntName) + } + } elif (havingPred != null) { + stmts |> push <| qmacro_block() { + var $i(cntName) = 0 + for ($i(kvName) in values($i(tabName))) { + if ($e(havingPred)) { + $i(cntName) ++ + } + } + return $i(cntName) + } + } else { + stmts |> push <| qmacro_expr() { + return length($i(tabName)) + } + } + } else { + // to_array lane: walk table → buf; iterator-typed context wraps via buffer_return(..., true). 
if (exprIsIterator) { retType = new TypeDecl(baseType = Type.tIterator) retType.firstType = clone_type(bufElemType) @@ -3006,7 +3215,27 @@ def private plan_group_by_core(var calls : array>; var $i(bufName) : array<$t(bufElemType)> $i(bufName) |> reserve(length($i(tabName))) } - if (havingPred != null) { + if (havingPred != null && trailingWherePred != null) { + stmts |> push <| qmacro_expr() { + for ($i(kvName) in values($i(tabName))) { + if ($e(havingPred)) { + let $i(outName) : $t(bufElemType) = $e(outputExpr) + if ($e(trailingWherePred)) { + $i(bufName) |> push_clone($i(outName)) + } + } + } + } + } elif (trailingWherePred != null) { + stmts |> push <| qmacro_expr() { + for ($i(kvName) in values($i(tabName))) { + let $i(outName) : $t(bufElemType) = $e(outputExpr) + if ($e(trailingWherePred)) { + $i(bufName) |> push_clone($i(outName)) + } + } + } + } elif (havingPred != null) { stmts |> push <| qmacro_expr() { for ($i(kvName) in values($i(tabName))) { if ($e(havingPred)) { @@ -3042,6 +3271,13 @@ def private plan_group_by(var expr : Expression?) : Expression? { calls |> pop } } + // Optional: trailing _where AFTER _select(reducer) — SQL HAVING shape, predicate on post-aggregate tuple. Theme 2 (closes audit 4a). Distinct from `having_` (which lives between group_by_lazy and select and can lift hidden reducer slots): trailing _where binds the constructed output tuple and gates the buf-emit loop. + var trailingWhereCall : ExprCall? + if (!empty(calls) && calls.back()._1.name == "where_") { + trailingWhereCall = calls.back()._0 + if ((trailingWhereCall.arguments |> length) < 2) return null + calls |> pop + } // Required tail: select(group_proj) — without it the chain yields raw buckets (no fusion). if (empty(calls) || calls.back()._1.name != "select") return null var groupProjCall = calls.back()._0 @@ -3068,7 +3304,7 @@ def private plan_group_by(var expr : Expression?) : Expression? 
{ arraySrcName := qn("source", at), decsBridge = null ) - return plan_group_by_core(calls, keyBlock, groupProjCall, havingCall, terminatorName, expr._type.isIterator, at, adapter) + return plan_group_by_core(calls, keyBlock, groupProjCall, havingCall, trailingWhereCall, terminatorName, expr._type.isIterator, at, adapter) } // ── decs eager-bridge unroll (Approach Z — for_each_archetype + nested _fold) ─────── @@ -4512,6 +4748,13 @@ def private plan_decs_group_by(var expr : Expression?) : Expression? { calls |> pop } } + // Optional: trailing _where AFTER _select(reducer) — SQL HAVING shape on post-aggregate tuple. Theme 2 (closes audit 4e). Decs mirror of the array-side pop. + var trailingWhereCall : ExprCall? + if (!empty(calls) && calls.back()._1.name == "where_") { + trailingWhereCall = calls.back()._0 + if ((trailingWhereCall.arguments |> length) < 2) return null + calls |> pop + } // Required tail: select(group_proj) — without it the chain yields raw buckets (no fusion). if (empty(calls) || calls.back()._1.name != "select") return null var groupProjCall = calls.back()._0 @@ -4538,7 +4781,7 @@ def private plan_decs_group_by(var expr : Expression?) : Expression? { arraySrcName := "", decsBridge = bridge ) - return plan_group_by_core(calls, keyBlock, groupProjCall, havingCall, terminatorName, expr._type.isIterator, at, adapter) + return plan_group_by_core(calls, keyBlock, groupProjCall, havingCall, trailingWhereCall, terminatorName, expr._type.isIterator, at, adapter) } // ── decs order family splice (Slice 5d — buffer + order_inplace/top_n/min_by/max_by) ─────── @@ -4559,6 +4802,8 @@ def private plan_decs_order_family(var expr : Expression?) : Expression? { var takeExpr : Expression? var firstName : string = "" var firstDefaultExpr : Expression? + var selectLam : Expression? + var selectElemType : TypeDeclPtr for (i in 0 .. 
length(calls)) { var cll & = unsafe(calls[i]) let name = cll._1.name @@ -4591,6 +4836,13 @@ def private plan_decs_order_family(var expr : Expression?) : Expression? { if ((cll._0.arguments |> length) < 2) return null firstDefaultExpr = clone_expression(cll._0.arguments[1]) } + } elif (name == "select") { + // Terminal _select after take(N): heap/sort sees raw tuple, project at return. + if (i != length(calls) - 1 || !hasOrder + || cll._0._type == null || cll._0._type.firstType == null) return null + selectLam = cll._0.arguments[1] + if (selectLam == null) return null + selectElemType = clone_type(cll._0._type.firstType) } else { return null } @@ -4681,21 +4933,46 @@ def private plan_decs_order_family(var expr : Expression?) : Expression? { } var forExprNode = build_decs_inner_for_pruned(bridge, tupName, perElement, at) var bhStmts : array - bhStmts |> reserve(7) + bhStmts |> reserve(10) // No `reserve(takeN)` on the bounded buf — matches the policy in linq.das top_n_by_with_cmp iterator variant. Caller may pass takeN >> actual source size, and the decs cardinality is unknown ahead of the walk; pre-reserving N slots would risk a large upfront allocation for no win (fill phase grows geometrically to min(N, M) in O(log) reallocs anyway). - bhStmts |> push_from <| qmacro_block_to_array() { - let $i(takeNName) = $e(takeExpr) - var $i(bufName) : array<$t(elemType)> - return <- $i(bufName) if ($i(takeNName) <= 0) - for_each_archetype($e(bridge.reqHashExpr), $e(bridge.erqExpr), $($i(archName) : Archetype) { - $e(forExprNode) - }) - _::order_inplace($i(bufName), $e(inlineCmp)) - return <- $i(bufName) + if (selectLam != null) { + // Terminal _select projects ≤K heap survivors at return (heap holds raw tuples). 
+ let outBufName = qn("decs_proj_buf", at) + let elemName = qn("decs_proj_e", at) + var projBody = peel_lambda_replace_var(selectLam, qmacro($i(elemName))) + bhStmts |> push_from <| qmacro_block_to_array() { + let $i(takeNName) = $e(takeExpr) + var $i(bufName) : array<$t(elemType)> + var $i(outBufName) : array<$t(selectElemType)> + return <- $i(outBufName) if ($i(takeNName) <= 0) + for_each_archetype($e(bridge.reqHashExpr), $e(bridge.erqExpr), $($i(archName) : Archetype) { + $e(forExprNode) + }) + _::order_inplace($i(bufName), $e(inlineCmp)) + $i(outBufName) |> reserve(length($i(bufName))) + for ($i(elemName) in $i(bufName)) { + $i(outBufName) |> push_clone($e(projBody)) + } + return <- $i(outBufName) + } + emission = qmacro(invoke($() : array<$t(selectElemType)> { + $b(bhStmts) + })) + } else { + bhStmts |> push_from <| qmacro_block_to_array() { + let $i(takeNName) = $e(takeExpr) + var $i(bufName) : array<$t(elemType)> + return <- $i(bufName) if ($i(takeNName) <= 0) + for_each_archetype($e(bridge.reqHashExpr), $e(bridge.erqExpr), $($i(archName) : Archetype) { + $e(forExprNode) + }) + _::order_inplace($i(bufName), $e(inlineCmp)) + return <- $i(bufName) + } + emission = qmacro(invoke($() : array<$t(elemType)> { + $b(bhStmts) + })) } - emission = qmacro(invoke($() : array<$t(elemType)> { - $b(bhStmts) - })) return finalize_decs_emission(emission, at, needIterWrap) } var perElement : Expression? = qmacro_expr() { @@ -4769,12 +5046,29 @@ def private plan_decs_order_family(var expr : Expression?) : Expression? 
{ sortCall = qmacro($c(inplaceName)($i(bufName))) } bodyStmts |> push(sortCall) - bodyStmts |> push <| qmacro_expr() { - return <- $i(bufName) + if (selectLam != null) { + let outBufName = qn("decs_proj_buf", at) + let elemName = qn("decs_proj_e", at) + var projBody = peel_lambda_replace_var(selectLam, qmacro($i(elemName))) + bodyStmts |> push_from <| qmacro_block_to_array() { + var $i(outBufName) : array<$t(selectElemType)> + $i(outBufName) |> reserve(length($i(bufName))) + for ($i(elemName) in $i(bufName)) { + $i(outBufName) |> push_clone($e(projBody)) + } + return <- $i(outBufName) + } + emission = qmacro(invoke($() : array<$t(selectElemType)> { + $b(bodyStmts) + })) + } else { + bodyStmts |> push <| qmacro_expr() { + return <- $i(bufName) + } + emission = qmacro(invoke($() : array<$t(elemType)> { + $b(bodyStmts) + })) } - emission = qmacro(invoke($() : array<$t(elemType)> { - $b(bodyStmts) - })) } else { // order + take → top_n* dispatch on the buffer. var topNCall : Expression? @@ -4785,12 +5079,31 @@ def private plan_decs_order_family(var expr : Expression?) : Expression? 
{ } else { topNCall = qmacro($c(topNName)($i(bufName), $e(takeExpr))) } - bodyStmts |> push <| qmacro_expr() { - return <- $e(topNCall) + if (selectLam != null) { + let topResName = qn("decs_top_res", at) + let outBufName = qn("decs_proj_buf", at) + let elemName = qn("decs_proj_e", at) + var projBody = peel_lambda_replace_var(selectLam, qmacro($i(elemName))) + bodyStmts |> push_from <| qmacro_block_to_array() { + var $i(topResName) <- $e(topNCall) + var $i(outBufName) : array<$t(selectElemType)> + $i(outBufName) |> reserve(length($i(topResName))) + for ($i(elemName) in $i(topResName)) { + $i(outBufName) |> push_clone($e(projBody)) + } + return <- $i(outBufName) + } + emission = qmacro(invoke($() : array<$t(selectElemType)> { + $b(bodyStmts) + })) + } else { + bodyStmts |> push <| qmacro_expr() { + return <- $e(topNCall) + } + emission = qmacro(invoke($() : array<$t(elemType)> { + $b(bodyStmts) + })) } - emission = qmacro(invoke($() : array<$t(elemType)> { - $b(bodyStmts) - })) } // Bare order + take both return array; wrap to iterator when the user's outer context demands it. first/first_or_default return scalar — no wrap. return finalize_decs_emission(emission, at, needIterWrap && firstName == "") @@ -4826,6 +5139,8 @@ def private plan_decs_reverse(var expr : Expression?) : Expression? { var hasReverse = false var seenSelect = false var takeExpr : Expression? + var terminalSelectLam : Expression? + var terminalSelectElemType : TypeDeclPtr for (i in 0 .. length(calls)) { var cll & = unsafe(calls[i]) let name = cll._1.name @@ -4835,10 +5150,18 @@ def private plan_decs_reverse(var expr : Expression?) : Expression? 
{ if (pred == null) return null whereCond = merge_where_cond(whereCond, pred) } elif (name == "select") { - if (hasReverse || seenSelect) return null - seenSelect = true - projection = peel_lambda_rename_var(cll._0.arguments[1], tupName) - if (projection == null) return null + if (!hasReverse && !seenSelect) { + seenSelect = true + projection = peel_lambda_rename_var(cll._0.arguments[1], tupName) + if (projection == null) return null + } elif (hasReverse && !seenSelect && terminalSelectLam == null && i == length(calls) - 1) { + terminalSelectLam = cll._0.arguments[1] + if (terminalSelectLam == null + || cll._0._type == null || cll._0._type.firstType == null) return null + terminalSelectElemType = clone_type(cll._0._type.firstType) + } else { + return null + } } elif (name == "reverse") { if (hasReverse) return null hasReverse = true @@ -4851,7 +5174,8 @@ def private plan_decs_reverse(var expr : Expression?) : Expression? { return null } } - if (!hasReverse || (takeExpr != null && terminatorName != "")) return null + if (!hasReverse || (takeExpr != null && terminatorName != "") + || (terminalSelectLam != null && terminatorName == "count")) return null let archName = bridge.archName if (terminatorName == "count") { // Reverse is identity for count — counter loop, no buffer. Side-effecting projection still fires per match. @@ -4914,9 +5238,20 @@ def private plan_decs_reverse(var expr : Expression?) : Expression? { } } var forExprNode = build_decs_inner_for_pruned(bridge, tupName, perElement, at) + // Terminal _select: `last` stays source-typed; project at return. + let outElemType = (terminalSelectLam != null) ? terminalSelectElemType : lastType + var lastRetExpr : Expression? + var dRetExpr : Expression? 
+ if (terminalSelectLam != null) { + lastRetExpr = peel_lambda_replace_var(terminalSelectLam, qmacro($i(lastName))) + dRetExpr = peel_lambda_replace_var(terminalSelectLam, qmacro($i(dBindName))) + } else { + lastRetExpr = qmacro($i(lastName)) + dRetExpr = qmacro($i(dBindName)) + } var emission : Expression? if (terminatorName == "first") { - emission = qmacro(invoke($() : $t(lastType) { + emission = qmacro(invoke($() : $t(outElemType) { var $i(foundName) = false var $i(lastName) : $t(lastType) = default<$t(lastType)> for_each_archetype($e(bridge.reqHashExpr), $e(bridge.erqExpr), $($i(archName) : Archetype) { @@ -4925,17 +5260,17 @@ def private plan_decs_reverse(var expr : Expression?) : Expression? { if (!$i(foundName)) { panic("sequence contains no elements") } - return $i(lastName) + return $e(lastRetExpr) })) } else { - emission = qmacro(invoke($() : $t(lastType) { + emission = qmacro(invoke($() : $t(outElemType) { let $i(dBindName) = $e(terminatorCall.arguments[1]) var $i(foundName) = false var $i(lastName) : $t(lastType) = default<$t(lastType)> for_each_archetype($e(bridge.reqHashExpr), $e(bridge.erqExpr), $($i(archName) : Archetype) { $e(forExprNode) }) - return $i(foundName) ? $i(lastName) : $i(dBindName) + return $i(foundName) ? $e(lastRetExpr) : $e(dRetExpr) })) } emission.force_at(at) @@ -4947,7 +5282,7 @@ def private plan_decs_reverse(var expr : Expression?) : Expression? { let needIterWrap = expr._type.isIterator var bufElemType = strip_const_ref(clone_type(projection != null ? projection._type : bridge.elementType)) // Skip-into-tail fast path: `reverse |> take(N) |> to_array` with no where/select. Walk archetypes once to sum `arch.size` (cheap, no entity load), compute skip = total - takeN, then for_each_archetype_find skips whole archetypes whose size still fits below the skip threshold and short-circuits once the buffer reaches takeN. 
`where` would invalidate the size-based skip (count after filter is unknown without iterating); `select` would only affect element shape, not count, but is skipped here to keep v1 minimal. - if (takeExpr != null && whereCond == null && projection == null) { + if (takeExpr != null && whereCond == null && projection == null && terminalSelectLam == null) { let takeNName = qn("take_n", at) let totalName = qn("decs_total", at) let actualName = qn("decs_actual", at) @@ -5031,15 +5366,36 @@ def private plan_decs_reverse(var expr : Expression?) : Expression? { $i(bufName) |> resize($i(takeNName) <= 0 ? 0 : ($i(takeNName) < length($i(bufName)) ? $i(takeNName) : length($i(bufName)))) } } - var emission : Expression? = qmacro(invoke($() : array<$t(bufElemType)> { - var $i(bufName) : array<$t(bufElemType)> - for_each_archetype($e(bridge.reqHashExpr), $e(bridge.erqExpr), $($i(archName) : Archetype) { - $e(forExprNode) - }) - _::reverse_inplace($i(bufName)) - $b(resizeStmts) - return <- $i(bufName) - })) + var emission : Expression? 
+ if (terminalSelectLam != null) { + let outBufName = qn("decs_rev_proj_buf", at) + let elemName = qn("decs_rev_proj_e", at) + var projBody = peel_lambda_replace_var(terminalSelectLam, qmacro($i(elemName))) + emission = qmacro(invoke($() : array<$t(terminalSelectElemType)> { + var $i(bufName) : array<$t(bufElemType)> + for_each_archetype($e(bridge.reqHashExpr), $e(bridge.erqExpr), $($i(archName) : Archetype) { + $e(forExprNode) + }) + _::reverse_inplace($i(bufName)) + $b(resizeStmts) + var $i(outBufName) : array<$t(terminalSelectElemType)> + $i(outBufName) |> reserve(length($i(bufName))) + for ($i(elemName) in $i(bufName)) { + $i(outBufName) |> push_clone($e(projBody)) + } + return <- $i(outBufName) + })) + } else { + emission = qmacro(invoke($() : array<$t(bufElemType)> { + var $i(bufName) : array<$t(bufElemType)> + for_each_archetype($e(bridge.reqHashExpr), $e(bridge.erqExpr), $($i(archName) : Archetype) { + $e(forExprNode) + }) + _::reverse_inplace($i(bufName)) + $b(resizeStmts) + return <- $i(bufName) + })) + } return finalize_decs_emission(emission, at, needIterWrap) } @@ -5277,10 +5633,28 @@ def private plan_decs_join(var expr : Expression?) : Expression? { calls |> pop } } + // Trailing _select composes with the join's result lambda: bind once, project once. Element type is derived later from expr._type.firstType (set by the user's downstream typer) — no need to record it here. + var selectLam : Expression? + if (terminatorName != "count" && !empty(calls) && calls.back()._1.name == "select") { + var selCall = calls.back()._0 + selectLam = selCall.arguments[1] + if (selectLam == null || selCall._type == null || selCall._type.firstType == null) return null + calls |> pop + } + // Trailing _where (Theme 2 — closes audit 8a + C6). Predicate references join-result fields, so we bind the result once per pair and gate the push/incr. Comes BEFORE select in chain order, so pop AFTER select. + var whereLam : Expression? 
+ if (!empty(calls) && calls.back()._1.name == "where_") { + var wCall = calls.back()._0 + if (wCall.arguments |> length < 2) return null + whereLam = wCall.arguments[1] + if (whereLam == null) return null + calls |> pop + } // Must end on a single `join` call now — interleaved where/select unsupported in v1. if (empty(calls) || calls.back()._1.name != "join") return null var joinCall = calls.back()._0 calls |> pop + // Iterator-typed context bails regardless of selectLam: the emission below returns `array` and is not wrapped to iterator. (Currently user-unreachable when selectLam != null — `_select` after `_join` can't infer its result_selector without a downstream terminator — but kept for defensive splice hygiene.) if (!empty(calls) || (terminatorName == "" && !expr._type.isGoodArrayType)) return null // Both sides must be from_decs_template eager bridges. var bridgeA = extract_decs_bridge(top) @@ -5330,8 +5704,28 @@ def private plan_decs_join(var expr : Expression?) : Expression? { preludeStmts |> push <| qmacro_expr() { var $i(cntName) : int = 0 } - probeStmts |> push <| qmacro_expr() { - $i(cntName) += length($i(arrName)) + if (whereLam == null) { + // Fast path: bucket-length sum. + probeStmts |> push <| qmacro_expr() { + $i(cntName) += length($i(arrName)) + } + } else { + // HAVING-shape: bind result, evaluate predicate, conditional incr. 
+ var resultLam = joinCall.arguments[4] + if (resultLam == null || resultLam._type == null || resultLam._type.firstType == null) return null + var resultBody = peel_lambda_rename_2vars(resultLam, tupAName, bElemName) + if (resultBody == null) return null + let joinResultType = strip_const_ref(clone_type(resultLam._type.firstType)) + let resBindName = qn("decs_jres", at) + var wherePred = peel_lambda_replace_var(whereLam, qmacro($i(resBindName))) + probeStmts |> push <| qmacro_expr() { + for ($i(bElemName) in $i(arrName)) { + let $i(resBindName) : $t(joinResultType) = $e(resultBody) + if ($e(wherePred)) { + $i(cntName) += 1 + } + } + } } returnStmt = qmacro_expr() { return $i(cntName) @@ -5346,9 +5740,41 @@ def private plan_decs_join(var expr : Expression?) : Expression? { preludeStmts |> push <| qmacro_expr() { var $i(bufName) : array<$t(resultType)> } - probeStmts |> push <| qmacro_expr() { - for ($i(bElemName) in $i(arrName)) { - $i(bufName) |> push_clone($e(resultBody)) + let needBind = selectLam != null || whereLam != null + if (needBind) { + // Bind join result once per pair (side effects once), then optionally filter / project. + let joinResultType = strip_const_ref(clone_type(resultLam._type.firstType)) + let resBindName = qn("decs_jres", at) + var pushExpr : Expression? 
+ if (selectLam != null) { + var projBody = peel_lambda_replace_var(selectLam, qmacro($i(resBindName))) + pushExpr = qmacro($i(bufName) |> push_clone($e(projBody))) + } else { + pushExpr = qmacro($i(bufName) |> push_clone($i(resBindName))) + } + if (whereLam != null) { + var wherePred = peel_lambda_replace_var(whereLam, qmacro($i(resBindName))) + probeStmts |> push <| qmacro_expr() { + for ($i(bElemName) in $i(arrName)) { + let $i(resBindName) : $t(joinResultType) = $e(resultBody) + if ($e(wherePred)) { + $e(pushExpr) + } + } + } + } else { + probeStmts |> push <| qmacro_expr() { + for ($i(bElemName) in $i(arrName)) { + let $i(resBindName) : $t(joinResultType) = $e(resultBody) + $e(pushExpr) + } + } + } + } else { + probeStmts |> push <| qmacro_expr() { + for ($i(bElemName) in $i(arrName)) { + $i(bufName) |> push_clone($e(resultBody)) + } } } returnStmt = qmacro_expr() { @@ -5398,8 +5824,8 @@ def private plan_zip(var expr : Expression?) : Expression? { if (empty(calls) || calls[0]._1.name != "zip") return null var zipCall = calls[0]._0 let zipArgCount = zipCall.arguments |> length - // Z6 bail: result-selector form (3-arg zip = 2 sources + selector) yields scalar element stream — different splice shape, defer. - if (zipArgCount != 2) return null + // 3-arg zip(a, b, sel): pre-lower the selector into `projection` (peeled with it._0/_1 binds). + if (zipArgCount != 2 && zipArgCount != 3) return null // Identify recognized terminator. Counter: count/long_count. Accumulator: sum/min/max/average. Early-exit: first/first_or_default/any/all/contains. Anything else: treat as no-terminator (bare → ARRAY lane); unrecognized chain op bails inside the chain walk. var lastName = "" var intermediateEnd = length(calls) @@ -5467,6 +5893,29 @@ def private plan_zip(var expr : Expression?) : Expression? { var seenTakeWhile = false var seenTake = false var allProjectionsPure = true + // Pre-lower 3-arg zip(a,b,sel) → seeded projection (2-arg lambda replaced with it._0/_1). 
+ if (zipArgCount == 3) { + var resultLam = zipCall.arguments[2] + if (resultLam == null || !(resultLam is ExprMakeBlock)) return null + var mblk = resultLam as ExprMakeBlock + var blk = mblk._block as ExprBlock + if (blk == null || blk.arguments |> length != 2 || blk.list |> length != 1 + || !(blk.list[0] is ExprReturn)) return null + var ret = blk.list[0] as ExprReturn + if (ret.subexpr == null || ret.subexpr._type == null) return null + var projBody = clone_expression(ret.subexpr) + var projElemType = strip_const_ref(clone_type(ret.subexpr._type)) + var zipRules : Template + zipRules |> replaceVariable(string(blk.arguments[0].name), qmacro($i(itName)._0)) + zipRules |> replaceVariable(string(blk.arguments[1].name), qmacro($i(itName)._1)) + projBody = apply_template(zipRules, projBody.at, projBody) + projection = projBody + elementType = projElemType + seenSelect = true + if (has_sideeffects(projection)) { + allProjectionsPure = false + } + } // Z3 chain walk: fuse where_/select/take/skip/take_while/skip_while between zip and terminator. Predicates/projections receive the tuple element via peel_lambda_replace_var substitution with `(itA, itB)` — typer collapses `t._0/_1` to the raw iter vars. for (i in 1 .. intermediateEnd) { var cll & = unsafe(calls[i]) @@ -5546,7 +5995,7 @@ def private plan_zip(var expr : Expression?) : Expression? { var intermediateBinds : array var laneTops <- [srcAExpr, srcBExpr] let laneSrcs <- [srcAName, srcBName] - return emit_accumulator_lane(lastName, laneTops, projection, whereCond, + return emit_accumulator_lane(lastName, laneTops, projection, whereCond, null, intermediateBinds, preCondStmts, elementType, laneSrcs, accName, itName, names, skipExpr, takeExpr, skipWhileCond, takeWhileCond, at) } @@ -5559,7 +6008,7 @@ def private plan_zip(var expr : Expression?) : Expression? 
{ let terminatorCall = calls.back()._0 var laneTops <- [srcAExpr, srcBExpr] let laneSrcs <- [srcAName, srcBName] - return emit_early_exit_lane(lastName, laneTops, projection, whereCond, + return emit_early_exit_lane(lastName, laneTops, projection, whereCond, null, intermediateBinds, preCondStmts, elementType, terminatorCall, laneSrcs, itName, names, skipExpr, takeExpr, skipWhileCond, takeWhileCond, at) } diff --git a/doc/source/reference/linq_fold_patterns.rst b/doc/source/reference/linq_fold_patterns.rst index aa15680cf..4696368a2 100644 --- a/doc/source/reference/linq_fold_patterns.rst +++ b/doc/source/reference/linq_fold_patterns.rst @@ -67,9 +67,9 @@ Source-side entry points * - ``each(array)`` - ``peel_each`` - Strips the ``each`` wrapper; subsequent chain plans see the raw ``array`` source. - * - ``zip(a, b)`` / ``zip(a, b, c)`` + * - ``zip(a, b)`` / ``zip(a, b, sel)`` - ``plan_zip`` - - Two- or three-source zip. Splice fuses zip + select + aggregate. + - Two-source zip. The three-argument form ``zip(a, b, sel)`` is pre-lowered to ``zip(a, b) |> _select(sel-as-tuple)`` so the standard zip+select fusion fires (closes the dot-product idiom). * - ``from_decs_template(type)`` - ``plan_decs_unroll`` etc. - Surfaces a ``[decs_template]`` schema. Decs splices fire. @@ -108,6 +108,9 @@ Array-source patterns * - ``._where(P).take(N).count()`` / ``.sum()`` - ``plan_loop_or_count`` (counter / accumulator with ``takeExpr``) - Bounded counter/accumulator; loop exits at N matches. + * - ``.take(N)._where(P).`` (counter / accumulator / early-exit / array) + - ``plan_loop_or_count`` (``postTakeWhereCond`` gate) + - Take cap ticks unconditionally; ``where`` gates only the per-element contribution. Preserves the "first N elements, then keep matching" semantic that ``where.take`` cannot express. Single trailing ``where`` only — skip / skip_while / take_while + where still cascade. 
* - ``._where(P).take_while(P2).<...>`` / ``.skip_while(P2).<...>`` - ``plan_loop_or_count`` (predicate-driven ranges) - ``take_while`` exits on first non-match; ``skip_while`` toggles state. @@ -117,6 +120,9 @@ Array-source patterns * - ``._order_by(K).take(N).to_array()`` - ``plan_order_family`` (bounded-heap) - ``spliced_push_heap`` fill + replace, ``spliced_pop_heap`` on replace, ``order_inplace`` at end. Buffer of size N. + * - ``._order_by(K).take(N)._select(F).to_array()`` / ``.first()._select(F)`` / ``.first_or_default()._select(F)`` + - ``plan_order_family`` (terminal ``_select``) + - Bounded-heap / streaming-min holds the raw element; projection ``F`` runs ≤K times at return. Closes the natural "take top-K then project" idiom. * - ``._order_by(K).to_array()`` / ``.order_by_descending(K).to_array()`` / ``.order(K).to_array()`` / ``.order_descending(K).to_array()`` - ``plan_order_family`` (full-sort fallback) - Materializes + sorts. No bounded-heap shortcut. @@ -128,10 +134,16 @@ Array-source patterns - Per-key bucket reducer; single hash, one entry per group. * - ``._group_by(K)._having(P)._select(...).to_array()`` - ``plan_group_by`` → ``plan_group_by_core`` - - HAVING filter applied after the per-key reduce. + - HAVING filter on the bucket reference (pre-aggregate); can lift hidden reducer slots referenced by ``P`` but absent from the select. + * - ``._group_by(K)._select(reduce)._where(P).to_array()`` / ``.count()`` + - ``plan_group_by`` → ``plan_group_by_core`` (trailing ``where`` as HAVING) + - HAVING filter on the constructed post-aggregate tuple (predicate references ``_.AggField`` by name). Distinct from ``_having(P)`` and orthogonal — both can fire on the same chain. * - ``.reverse().take(N).to_array()`` (with no ``where`` / ``select``) - ``plan_reverse`` (two-pass) - Sum archetype sizes, then walk tail-first with skip-counter and early-exit. 
+ * - ``.reverse().take(N)._select(F).to_array()`` / ``.reverse()._select(F).first()`` + - ``plan_reverse`` (terminal ``_select``) + - Projection runs ≤K times at return on the R1-R4 buffer or on the surviving ``last`` value. NOT accepted: ``reverse._select.take`` — user must reorder to ``reverse.take._select``. Decs-source patterns ==================== @@ -168,6 +180,9 @@ identical — only the source iteration changes. * - ``from_decs_template(...)._order_by(K).take(N).to_array()`` - ``plan_decs_order_family`` (bounded-heap) - Same heap pattern as the array variant; buffer size N. + * - ``from_decs_template(...)._order_by(K).take(N)._select(F).to_array()`` + - ``plan_decs_order_family`` (terminal ``_select``) + - Decs mirror of ``plan_order_family``'s terminal ``_select`` — heap holds raw element, projection runs ≤K times at return. * - ``from_decs_template(...).min_by(K)`` / ``.max_by(K)`` - ``plan_decs_unroll`` → ``emit_decs_min_max_by`` - Streaming-min/max with key. @@ -177,13 +192,49 @@ identical — only the source iteration changes. * - ``from_decs_template(...).reverse().take(N).to_array()`` - ``plan_decs_reverse`` - Whole-archetype skip + partial-archetype skip-counter + early-exit. + * - ``from_decs_template(...).reverse().take(N)._select(F).to_array()`` / ``.reverse()._select(F).first()`` + - ``plan_decs_reverse`` (terminal ``_select``) + - Decs mirror of ``plan_reverse``'s terminal ``_select``. Skip-into-tail fast path is gated off when ``_select`` is present. * - ``from_decs_template(...)._group_by(K)._select(reduce).to_array()`` - ``plan_decs_group_by`` → ``plan_group_by_core`` - Shared bucket-reducer with the array path; differs only in the per-element source. + * - ``from_decs_template(...)._group_by(K)._select(reduce)._where(P).to_array()`` / ``.count()`` + - ``plan_decs_group_by`` → ``plan_group_by_core`` (trailing ``where`` as HAVING) + - Decs mirror of the array-side post-aggregate HAVING. Same predicate-on-output-tuple semantics. 
* - ``from_decs_template(...)._take_while(P).<...>`` / ``._skip_while(P).<...>`` - ``plan_decs_unroll`` (predicate-driven ranges) - Hoists ``skippingName`` state across archetypes. +Decs-decs equi-join +------------------- + +``plan_decs_join`` is the hashed equi-join splice over two +``from_decs_template`` sources. It collects the right side into a +``table>`` in one ``for_each_archetype`` pass, then +walks the left side and probes via ``table.get``. The key must be a +primitive (``int*`` / ``uint*`` / ``float`` / ``double`` / ``bool`` / +``string``); tuple keys cascade to the standard ``join_impl``. + +.. list-table:: + :header-rows: 1 + :widths: 35 25 40 + + * - Chain shape + - Splice arm + - Notes + * - ``from_decs_template(A) |> _join(from_decs_template(B), ka, kb, result) |> count()`` + - ``plan_decs_join`` + - Hash-fill + probe; ``count`` bumped by bucket length per hit. No per-pair invoke. + * - ``from_decs_template(A) |> _join(...) |> to_array()`` + - ``plan_decs_join`` + - Hash-fill + probe; ``result`` lambda inlined at the push site (no per-pair invoke into ``join_impl``). + * - ``from_decs_template(A) |> _join(...) |> _select(F) |> to_array()`` + - ``plan_decs_join`` (terminal ``_select``) + - Single bind of the join result per matched pair, then projection. + * - ``from_decs_template(A) |> _join(...) |> _where(P) |> count() / to_array()`` + - ``plan_decs_join`` (trailing ``_where``) + - Bind join result, evaluate predicate, gate ``count++`` / ``push_clone``. Composes with the trailing ``_select`` form (filter then project, single bind per pair). + Zip patterns ============ @@ -219,9 +270,13 @@ Common cases that fall back: - **Mixed-source operators** like ``union(a, b)``, ``except(a, b)``, ``intersect(a, b)``, ``concat(a, b)`` after the first source has been transformed (e.g. ``each(a)._select(F).union(b)``). -- **Join terminators**: ``_join`` / ``_left_join`` / ``_right_join`` / - ``_full_outer_join`` / ``_cross_join``. 
The join itself does not yet - splice; downstream ``.count()`` / ``.sum()`` chains fall back. +- **Joins other than decs-decs equi-join**: ``_left_join`` / + ``_right_join`` / ``_full_outer_join`` / ``_cross_join`` don't splice; + array-source ``_join`` also falls back. Only the decs-decs primitive-key + ``_join`` shape catalogued above splices (via ``plan_decs_join``); + tuple keys, non-primitive keys, mixed array/decs sources, or chain ops + beyond a single trailing ``_where`` / ``_select`` all cascade to + ``join_impl``. - **Aggregations on lazy groupings**: ``_group_by_lazy(K)._select(F)`` with a non-bucket-reducing ``_select``. - **Materialization-only chains** that the standard linq surface diff --git a/tests/linq/test_linq_fold_terminal_select.das b/tests/linq/test_linq_fold_terminal_select.das new file mode 100644 index 000000000..0b735732b --- /dev/null +++ b/tests/linq/test_linq_fold_terminal_select.das @@ -0,0 +1,195 @@ +options gen2 + +require math +require strings +require daslib/linq +require daslib/linq_boost +require daslib/linq_fold +require daslib/decs +require daslib/decs_boost +require dastest/testing_boost public + +struct Sound { + id : int + x : float + rank : int +} + +[decs_template(prefix = "ds_")] +struct DecsSound { + id : int + x : float +} + +[decs_template(prefix = "dc_")] +struct DecsCar { + id : int + dealer_id : int + name : string +} + +[decs_template(prefix = "dd_")] +struct DecsDealer { + id : int + name : string +} + +def make_sounds() : array<Sound> { + return <- [ + Sound(id = 1, x = 3.0, rank = 1), + Sound(id = 2, x = 1.0, rank = 2), + Sound(id = 3, x = 4.0, rank = 3), + Sound(id = 4, x = 1.0, rank = 4), + Sound(id = 5, x = 5.0, rank = 5) + ] +} + +[test] +def test_order_take_select_array(t : T?) { + t |> run("plan_order_family: take + terminal _select") @(tt : T?) { + let sounds <- make_sounds() + // Closest 3 by |x|; return their ids.
+ unsafe { + let ids <- _fold(each(sounds)._order_by(abs(_.x)).take(3)._select(_.id).to_array()) + // Sounds with smallest |x|: id 2 (1.0), id 4 (1.0), id 1 (3.0). + tt |> equal(length(ids), 3) + tt |> equal(true, ids[0] == 2 || ids[0] == 4) + tt |> equal(true, ids[2] == 1) + } + } +} + +[test] +def test_where_order_take_select_array(t : T?) { + t |> run("plan_order_family: where + take + terminal _select") @(tt : T?) { + let sounds <- make_sounds() + unsafe { + let ids <- _fold(each(sounds)._where(_.rank >= 2)._order_by(_.x).take(2)._select(_.id).to_array()) + // After filter rank>=2: ids 2,3,4,5 (x = 1,4,1,5). Top 2 by x: ids 2 (1.0), 4 (1.0). + tt |> equal(length(ids), 2) + for (id in ids) { + tt |> equal(true, id == 2 || id == 4) + } + } + } +} + +[test] +def test_where_order_bare_select_array(t : T?) { + t |> run("plan_order_family: where + bare order + terminal _select") @(tt : T?) { + let sounds <- make_sounds() + unsafe { + let ranks <- _fold(each(sounds)._where(_.id != 3)._order_by(_.x)._select(_.rank).to_array()) + // After filter id!=3: ids 1,2,4,5 (x = 3,1,1,5). Sorted by x: (2 or 4, 2 or 4, 1, 5). + tt |> equal(length(ranks), 4) + tt |> equal(ranks[3], 5) + } + } +} + +[test] +def test_order_take_select_decs(t : T?) { + t |> run("plan_decs_order_family: take + terminal _select") @(tt : T?) { + restart() + create_entities(5) $(eid : EntityId; i : int; var cmp : ComponentMap) { + apply_decs_template(cmp, DecsSound(id = i + 1, x = float((i % 2) * 5 + 1))) + } + unsafe { + let ids <- _fold(from_decs_template(type<DecsSound>)._order_by(_.x).take(2)._select(_.id).to_array()) + tt |> equal(length(ids), 2) + } + restart() + } +} + +[test] +def test_reverse_take_select_array(t : T?) { + t |> run("plan_reverse: where + reverse + take + terminal _select") @(tt : T?) { + let sounds <- make_sounds() + unsafe { + let ids <- _fold(each(sounds)._where(_.rank > 0).reverse().take(2)._select(_.id).to_array()) + // After filter (all 5), reverse: 5,4,3,2,1. take 2: 5,4.
+ tt |> equal(length(ids), 2) + tt |> equal(ids[0], 5) + tt |> equal(ids[1], 4) + } + } +} + +[test] +def test_reverse_select_first_array(t : T?) { + t |> run("plan_reverse: where + reverse + _select + first") @(tt : T?) { + let sounds <- make_sounds() + unsafe { + let id = _fold(each(sounds)._where(_.rank > 0).reverse()._select(_.id).first()) + // Reverse of ids 1..5 is 5,4,3,2,1; first = 5. + tt |> equal(id, 5) + } + } +} + +[test] +def test_reverse_take_select_decs(t : T?) { + t |> run("plan_decs_reverse: reverse + take + terminal _select") @(tt : T?) { + restart() + create_entities(4) $(eid : EntityId; i : int; var cmp : ComponentMap) { + apply_decs_template(cmp, DecsSound(id = i + 1, x = float(i))) + } + unsafe { + let ids <- _fold(from_decs_template(type<DecsSound>).reverse().take(2)._select(_.id).to_array()) + tt |> equal(length(ids), 2) + } + restart() + } +} + +[test] +def test_join_select_to_array(t : T?) { + t |> run("plan_decs_join: join + terminal _select") @(tt : T?) { + restart() + create_entities(3) $(eid : EntityId; i : int; var cmp : ComponentMap) { + apply_decs_template(cmp, DecsCar(id = i + 1, dealer_id = i % 2 + 1, name = "Car{i}")) + } + create_entities(2) $(eid : EntityId; i : int; var cmp : ComponentMap) { + apply_decs_template(cmp, DecsDealer(id = i + 1, name = "Dealer{i}")) + } + unsafe { + let names <- _fold(from_decs_template(type<DecsCar>) |> _join(from_decs_template(type<DecsDealer>), + $(l, r) => l.dealer_id == r.id, + $(l, r) => (CarName = l.name, DealerName = r.name)) + |> _select(_.CarName) + |> to_array()) + // 3 cars × 1 matching dealer each = 3 results. + tt |> equal(length(names), 3) + } + restart() + } +} + +[test] +def test_zip_3arg_sum(t : T?) { + t |> run("plan_zip: 3-arg zip + sum") @(tt : T?) { + let a <- [1, 2, 3, 4] + let b <- [10, 20, 30, 40] + unsafe { + let total = _fold(each(a) |> zip(each(b), $(x, y : int) => x * y) |> sum()) + // 1*10 + 2*20 + 3*30 + 4*40 = 10 + 40 + 90 + 160 = 300.
+ tt |> equal(total, 300) + } + } +} + +[test] +def test_zip_3arg_to_array(t : T?) { + t |> run("plan_zip: 3-arg zip + to_array") @(tt : T?) { + let a <- [1, 2, 3] + let b <- [10, 20, 30] + unsafe { + let r <- _fold(each(a) |> zip(each(b), $(x, y : int) => x + y) |> to_array()) + tt |> equal(length(r), 3) + tt |> equal(r[0], 11) + tt |> equal(r[1], 22) + tt |> equal(r[2], 33) + } + } +} diff --git a/tests/linq/test_linq_fold_theme2_trailing_where.das b/tests/linq/test_linq_fold_theme2_trailing_where.das new file mode 100644 index 000000000..41c3ddfd7 --- /dev/null +++ b/tests/linq/test_linq_fold_theme2_trailing_where.das @@ -0,0 +1,257 @@ +options gen2 + +require math +require strings +require daslib/linq +require daslib/linq_boost +require daslib/linq_fold +require daslib/decs +require daslib/decs_boost +require dastest/testing_boost public + +// Theme 2 (audit `benchmarks/sql/linq_fold_chain_audit.md`): trailing `_where` extensions +// across plan_decs_join (8a, C6), plan_group_by_core (4a, 4e), plan_loop_or_count (5c). 
+ +struct Item { + category : string + price : int +} + +struct ActItem { + active : bool + score : int +} + +[decs_template(prefix = "dc_")] +struct DecsCar { + id : int + dealer_id : int + name : string +} + +[decs_template(prefix = "dd_")] +struct DecsDealer { + id : int + name : string +} + +[decs_template(prefix = "di_")] +struct DecsItem { + category : string + price : int +} + +def make_items() : array<Item> { + return <- [ + Item(category = "A", price = 100), + Item(category = "A", price = 300), + Item(category = "B", price = 200), + Item(category = "B", price = 800), + Item(category = "B", price = 200), + Item(category = "C", price = 50) + ] +} + +def make_act_items() : array<ActItem> { + return <- [ + ActItem(active = true, score = 10), + ActItem(active = false, score = 20), + ActItem(active = true, score = 30), + ActItem(active = false, score = 40), + ActItem(active = true, score = 50), + ActItem(active = false, score = 60), + ActItem(active = true, score = 70), + ActItem(active = true, score = 80) + ] +} + +// ── plan_decs_join trailing _where (closes 8a, C6) ───────────────────────────── + +[test] +def test_join_where_count(t : T?) { + t |> run("plan_decs_join: trailing _where + count (probe 8a)") @(tt : T?) { + restart() + create_entities(4) $(eid : EntityId; i : int; var cmp : ComponentMap) { + apply_decs_template(cmp, DecsCar(id = i + 1, dealer_id = i % 2 + 1, name = "Car{i}")) + } + create_entities(2) $(eid : EntityId; i : int; var cmp : ComponentMap) { + apply_decs_template(cmp, DecsDealer(id = i + 1, name = "Dealer{i}")) + } + unsafe { + let filtered = _fold(from_decs_template(type<DecsCar>) |> _join(from_decs_template(type<DecsDealer>), + $(l, r) => l.dealer_id == r.id, + $(l, r) => (CarName = l.name, DealerName = r.name)) + |> _where(_.DealerName == "Dealer0") + |> count()) + // 4 cars × dealer_id 1,2,1,2; matching Dealer0 (id=1) → cars i=0,2 → 2 results. + tt |> equal(filtered, 2) + } + restart() + } +} + +[test] +def test_join_where_to_array(t : T?)
{ + t |> run("plan_decs_join: trailing _where + to_array") @(tt : T?) { + restart() + create_entities(4) $(eid : EntityId; i : int; var cmp : ComponentMap) { + apply_decs_template(cmp, DecsCar(id = i + 1, dealer_id = i % 2 + 1, name = "Car{i}")) + } + create_entities(2) $(eid : EntityId; i : int; var cmp : ComponentMap) { + apply_decs_template(cmp, DecsDealer(id = i + 1, name = "Dealer{i}")) + } + unsafe { + let rows <- _fold(from_decs_template(type<DecsCar>) |> _join(from_decs_template(type<DecsDealer>), + $(l, r) => l.dealer_id == r.id, + $(l, r) => (CarName = l.name, DealerName = r.name)) + |> _where(_.DealerName == "Dealer1") + |> to_array()) + tt |> equal(length(rows), 2) + } + restart() + } +} + +[test] +def test_join_where_select_to_array(t : T?) { + t |> run("plan_decs_join: trailing _where + _select + to_array (combination)") @(tt : T?) { + restart() + create_entities(4) $(eid : EntityId; i : int; var cmp : ComponentMap) { + apply_decs_template(cmp, DecsCar(id = i + 1, dealer_id = i % 2 + 1, name = "Car{i}")) + } + create_entities(2) $(eid : EntityId; i : int; var cmp : ComponentMap) { + apply_decs_template(cmp, DecsDealer(id = i + 1, name = "Dealer{i}")) + } + unsafe { + let names <- _fold(from_decs_template(type<DecsCar>) |> _join(from_decs_template(type<DecsDealer>), + $(l, r) => l.dealer_id == r.id, + $(l, r) => (CarName = l.name, DealerName = r.name)) + |> _where(_.DealerName == "Dealer0") + |> _select(_.CarName) + |> to_array()) + tt |> equal(length(names), 2) + } + restart() + } +} + +// ── plan_group_by_core trailing _where as HAVING (closes 4a, 4e) ─────────────── + +[test] +def test_groupby_having_where_to_array(t : T?) { + t |> run("plan_group_by: trailing _where (HAVING) on post-aggregate tuple (probe 4a)") @(tt : T?) { + let items <- make_items() + unsafe { + let rows <- _fold(each(items) + ._group_by(_.category) + ._select((Cat = _._0, Total = _._1 |> select(@(i : Item) => i.price) |> sum)) + ._where(_.Total > 500) |> to_array()) + // Category totals: A=400, B=1200, C=50.
Filter Total > 500 → only B. + tt |> equal(length(rows), 1) + } + } +} + +[test] +def test_groupby_having_where_count(t : T?) { + t |> run("plan_group_by: trailing _where + count") @(tt : T?) { + let items <- make_items() + unsafe { + let n = _fold(each(items) + ._group_by(_.category) + ._select((Cat = _._0, Total = _._1 |> select(@(i : Item) => i.price) |> sum)) + ._where(_.Total > 100) + |> count()) + // Buckets with Total > 100: A (400), B (1200) → 2. + tt |> equal(n, 2) + } + } +} + +[test] +def test_groupby_having_where_decs(t : T?) { + t |> run("plan_decs_group_by: trailing _where (probe 4e)") @(tt : T?) { + restart() + create_entities(6) $(eid : EntityId; i : int; var cmp : ComponentMap) { + let cats = ["A", "A", "B", "B", "B", "C"] + let prices = [100, 300, 200, 800, 200, 50] + apply_decs_template(cmp, DecsItem(category = cats[i], price = prices[i])) + } + unsafe { + let rows <- _fold(from_decs_template(type<DecsItem>) + ._group_by(_.category) + ._select((Cat = _._0, Total = _._1 |> select(@(i : tuple) => i.price) |> sum)) + ._where(_.Total > 500) + |> to_array()) + // Same data as array: A=400, B=1200, C=50. Filter keeps B. + tt |> equal(length(rows), 1) + } + restart() + } +} + +// ── plan_loop_or_count take.where (closes 5c, all 4 lanes) ───────────────────── + +[test] +def test_take_where_count(t : T?) { + t |> run("plan_loop_or_count: take(N)._where(p).count() (counter lane, probe 5c)") @(tt : T?) { + let items <- make_act_items() + unsafe { + // First 5: active = [T,F,T,F,T]. Count active = 3. + // Semantic distinction from _where(p).take(5): would be 5 (5 active in items). + let r1 = _fold(each(items).take(5)._where(_.active).count()) + tt |> equal(r1, 3) + let r2 = _fold(each(items)._where(_.active).take(5).count()) + tt |> equal(r2, 5) + } + } +} + +[test] +def test_take_where_sum(t : T?) { + t |> run("plan_loop_or_count: take(N)._where(p).sum() (accumulator lane)") @(tt : T?) { + let items <- make_act_items() + unsafe { + // _select projects scores.
take(5) → [10,20,30,40,50]. _where(>0) keeps all. sum=150. + let s = _fold(each(items)._select(_.score).take(5)._where(_ > 0).sum()) + tt |> equal(s, 150) + } + } +} + +[test] +def test_take_where_first(t : T?) { + t |> run("plan_loop_or_count: take(N)._where(p).first_or_default() (early-exit lane)") @(tt : T?) { + let items <- make_act_items() + unsafe { + // take(3) → first 3 (T,F,T). First active = 10. + let f = _fold(each(items).take(3)._where(_.active).first_or_default(ActItem(active = false, score = -1))) + tt |> equal(f.score, 10) + } + } +} + +[test] +def test_take_where_to_array(t : T?) { + t |> run("plan_loop_or_count: take(N)._where(p).to_array() (array lane)") @(tt : T?) { + let items <- make_act_items() + unsafe { + // take(5) → first 5 (T,F,T,F,T). where(active) → 3 items, scores 10,30,50. + let arr <- _fold(each(items).take(5)._where(_.active).to_array()) + tt |> equal(length(arr), 3) + tt |> equal(arr[0].score, 10) + tt |> equal(arr[2].score, 50) + } + } +} + +[test] +def test_take_zero_where(t : T?) { + t |> run("plan_loop_or_count: take(0)._where edge case") @(tt : T?) { + let items <- make_act_items() + unsafe { + let r = _fold(each(items).take(0)._where(_.active).count()) + tt |> equal(r, 0) + } + } +}