Skip to content

[pull] master from GaijinEntertainment:master#1010

Merged
pull[bot] merged 12 commits into
forksnd:masterfrom
GaijinEntertainment:master
May 19, 2026
Merged

[pull] master from GaijinEntertainment:master#1010
pull[bot] merged 12 commits into
forksnd:masterfrom
GaijinEntertainment:master

Conversation

@pull
Copy link
Copy Markdown

@pull pull Bot commented May 19, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

borisbat and others added 12 commits May 18, 2026 14:56
Three new planners slot into the _fold cascade between plan_order_family
and plan_loop_or_count:

- plan_reverse: src [|> where_(p)]* [|> select(proj)]? |> reverse
                 [|> take(N)]? [|> {to_array, count, first, first_or_default}]
  Buffer-required terminators emit single prefilter pass into buffer +
  reverse_inplace + optional resize(min(N, length(buf))). count elides the
  buffer (counter loop only); first / first_or_default emit a last-survivor
  tracker.

- plan_distinct: src [|> where_(p)]* [|> select(proj)]? |> distinct[_by(key)]
                 [|> take(N)]? [|> terminator]
  Streaming dedup via table<unique_key> integrated into the prefilter loop.
  take(N) → break-after-N early-exit (guard placed BEFORE the consume site
  so non-positive N short-circuits). count terminator elides the buffer and
  returns length(seen); sum / long_count fold inline.

- plan_group_by: src |> group_by[_lazy](key) |> select(group_proj)
                 [|> {to_array, count}]?
  Incremental-aggregate fusion via table<unique_key; tuple<KT; AccT>>.
  Recognized reducers: _._1 |> {length, count, sum, long_count}. Bare +
  named-tuple wrap (K=_._0, V=_._1|>reducer) both supported; named-tuple
  preserves user field-name hints by using group_proj._type as the table
  value type. Bails to tier 2 for inner-select-sum, multi-reducer, having_,
  side-effectful keys.

Headlines (100K rows INTERP):
- distinct_count m3f: 33 → 15 ns/op (2.2× over prev; beats SQL m1=41)
- distinct_take m3f: new — 0 ns/op (early-exit at 3 unique brands)
- groupby_count m3f: 76 → 33 ns/op (2.3× over prev; 4.3× over SQL m1=141)
- reverse_take m3f: regression vs m3 (lazy reverse iterator wins for small N)
- groupby_sum m3f: unchanged (inner-select-sum deferred — see follow-up plan)

is_buffer_required_op trimmed to {zip, join, left_join, group_join} —
the named-marker arm in plan_loop_or_count is now pure defense-in-depth
because the three new planners catch their ops first.

New benchmarks: benchmarks/sql/reverse_take.das, benchmarks/sql/distinct_take.das.
LINQ.md updated with phase status + new section + coverage table.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `options _comment_hygiene = true` so daslib/style_lint enforces
STYLE014 (3-line cap on module/public //) and STYLE015 (1-line cap on
private //) at compile time. Stale comment blocks accumulated across
the three-tier cascade landings (PRs #2687, #2697, #2712, #2714, #2721).

Cleanup performed by utils/hygiene/ for //! private-docstring stripping,
then a lint-driven script that keeps the first WHY line of each
multi-line `//` block flagged by STYLE014/STYLE015 and drops the rest.

Net: 383 lines removed, 10 added — file shrunk 2294 → 1919 lines.

All 867 tests/linq still pass (INTERP + AOT).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds four small helpers used by plan_reverse / plan_distinct / plan_group_by:

- merge_where_cond(curr, pred) — collapses the 5-line "if-null-else-&&"
  pattern that fused chained where_ predicates into a one-liner.

- strip_const_ref(t) — clears constant + ref flags on a cloned TypeDecl;
  collapses the 3-line guarded assignment pattern (was repeated 5×).

- finalize_emission(top, srcName, at, body) — wraps a single body
  Expression in the canonical `invoke($($i(src) : T) { $e(body) }, src)`
  emission. Used by plan_reverse where the body is a qmacro_block.

- finalize_emission_stmts(top, srcName, at, stmts) — same wrap but
  splices a stmts array via $b. Used by plan_distinct + plan_group_by
  (their nested stmt construction is parse-fragile inside qmacro_block).

- buffer_return(bufName, isIterator) — emits the `return <- buf` /
  `return <- buf.to_sequence_move()` tail; collapses the 7-line
  if/else pattern (was repeated 3×).

plan_reverse also gets a structural refactor to qmacro_block bodies
per branch (count, first/first_or_default, to_array), reading top-to-
bottom as the emission shape instead of N × "stmts |> push <| qmacro_expr".

plan_distinct + plan_group_by keep their stmts-array structure (cross-
qmacro $i tag resolution fails on nested qmacro_block in those shapes —
the inner-block tag scope doesn't unify with the outer block's), but
adopt the shared helpers where applicable.

Net: 180 lines removed, 102 added — file shrunk 1919 → 1841 lines.

All 867 tests/linq pass interpret + AOT.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…fixes

Addresses Copilot review on PR #2721:

1. plan_distinct: select(impure) → distinct evaluated projection twice per
   source element (once for the unique_key, once for buffer push or += acc),
   firing observable side effects twice vs once in the original LINQ chain.
   Fix: when projection has side effects, bind projected value once per
   element to a fresh `pv` local; key + push/acc share the bind. Pure
   projections still inline (matches plan_loop_or_count's discipline).

2. plan_distinct: take(N) limit was cloned into the consume body, so a
   side-effecting take arg fired once per fresh distinct key. Fix: bind
   the limit once at outer scope (alongside `taken = 0`); guard reads the
   local. Same single-eval discipline as the projection bind.

3. LINQ.md: corrected stale text claiming `is_buffer_required_op` was
   trimmed to {zip, join, left_join, group_join} — it retains all buffer-
   required names, but distinct/reverse/group_by are now defense-in-depth
   (caught earlier by dedicated planners).

4. test_linq_fold_ast.das: trimmed stale comment claiming named-tuple
   field names collapse to positional access — splice DOES preserve K/N
   field names via the user's group_proj body type as table value.

New test: test_select_distinct_impure_correctness pins the single-eval
behavior (projection fires N times for N source elements, not 2N).

All 869 tests/linq pass interpret + AOT (+2 new vs prior).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ening

Bundle addressing Copilot review pass on PR #2721:

- Formatter fix (3 lines in linq_fold.das): `- const - &` → `-const -&` (no
  spaces). darwin15 extended_checks formatter verify pass was failing CI.
- Per-key group_by test assertions: `test_group_by_count_named_tuple_preserves_keys`
  and `test_group_by_sum_named_tuple_parity` now check per-bucket count/sum (not
  just total). Catches wrong-bucket regressions that preserve aggregate totals.
- LINQ.md doc tightening:
  - distinct terminator list: drop `first`/`first_or_default` (not implemented;
    those cascade to tier 2)
  - distinct pseudo-code: take-guard BEFORE seen.insert/buf.push_clone (matches
    actual emission order — `take(0)` doesn't mutate seen-table or buffer)
  - reverse_take regression explanation: rewrite root cause. Both m3 and m3f
    materialize the full source buffer; m3's iterator-form `reverse()` yields
    backwards lazily via index so `take(N)` only triggers N yields; m3f does
    a full reverse_inplace before resize(N). For N << length the lazy approach
    wins. Future optimization: backward-index loop bounded to last N — deferred.
  - reverse_take.das header comment: drop "headline win" wording (it's a
    regression vs m3, not a win)
  - BufferGroupBy phase row + bail list: narrow to `src |> group_by_lazy(...)`
    (no upstream where_/select* fusion in v1 — deferred). Also clarify that
    side-effectful key bodies are NOT a bail — the emitted code evaluates the
    key once per element (same as plain group_by_lazy), so no re-eval hazard
    exists.

Verification: 95/95 AST tests, 234/234 functional tests, 329/329 AOT, lint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…d comments

Round-2 Copilot review on PR #2721 — 4 doc/comment-only fixes:

- LINQ.md:341 (distinct buffer-not-required terminators): clarify long_count
  is count-shaped (returns int64(length(seen))), not an inline accumulator —
  only sum folds inline at the freshly-inserted site with acc += <projection>.
- LINQ.md:345 (plan_group_by recognizer description): drop reference to a
  non-existent `try_recognize_group_reducer` rewriter and return tuple. Actual
  recognizer is is_bucket_reducer_call (linq_fold.das:1666), inline accumulator
  update — no separate two-pass rewriter.
- linq_fold.das:1694 — finish truncated "yields bucket" comment ("yields raw
  buckets (no fusion)").
- linq_fold.das:1755 — finish truncated "UniqKeyType wraps with" comment
  ("key is typedecl(keyBlock(default<elemT>)) wrapped in _::unique_key for
  the hash slot").

No code changes. Lint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…d comment

Round-3 Copilot review on PR #2721 — one real bug + one trailing nit:

- plan_group_by previously assigned entry._0 = k on EVERY element. Reference
  group_by_lazy_impl (linq.das:1946-1950) only stores the key when the bucket
  is first created. For non-workhorse keys where unique_key (= "{a}") happens
  to collapse two observationally-different objects to the same string, the
  splice was storing the LAST key per bucket instead of the FIRST — divergent
  from reference semantics. Fix: gate the assignment on !key_exists before the
  tab[uk] subscript materializes the entry. One extra hash lookup per element,
  but correctness over speed for the rare-collision case. For workhorse keys
  (int/float/etc. — the canonical _ % N idiom) unique_key is a bijection so
  the assignment was harmless overwriting; behavior unchanged.

- linq_fold.das:1817 — finish truncated "(it's already" comment ("already shaped
  as the named-tuple / reducer result").

Verification: 234 functional + 95 AST + 329 AOT all pass; lint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-wins regression

Per Boris's suggestion: replace `key_exists(uk) + tab[uk]` (2 hash ops per element)
with `tab?[uk] ?? dummy` (1 hash op per element) + addr-compare for new-bucket detection.

Pattern:
    var dummy : <tab value type>           # OUT OF LOOP
    for it in src:
        let k = invoke(keyBlk, it)
        let uk = _::unique_key(k)
        unsafe:
            var entry & = tab?[uk] ?? dummy   # one hash op: ref to existing entry OR dummy
            <accUpdate>                       # entry._1++ or entry._1 += it
            if addr(entry) == addr(dummy):    # dummy path = new bucket
                dummy._0 = k
                tab[uk] = dummy
                dummy = default<...>

For canonical _ % N int key with small bucket count (groupby_count's 5 brands in 100K
rows = ~99.995% hits), this is ~1 hash op/element vs the prior 2 ops/element.

Headlines (100K rows INTERP):
  groupby_count m3f:  43 ns/op (prev fix) -> 37 ns/op (this fix)
  groupby_sum m3f:    105 -> 108 (within noise; sum cascades to tier 2 via inner-select-sum)
  distinct_count m3f: 15 -> 16  (within noise)

Still slightly above the pre-correctness-fix 33 ns/op baseline due to the addr-compare,
the predictable if-branch, and the dummy reset on miss. Net: 3.8x over SQL m1 (141),
1.9x over m3 (71). All 234 functional + 95 AST + 329 AOT tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…h clamp

CI lint chain follows my new test_linq_fold.das generic instantiations into
linq.das:1071-1072 and trips PERF015 ("ternary min — use 'min(a, b)' from
math module"). Pre-existing on master but per the standing "fix all lint
issues, even pre-existing" rule, swap the clamping ternaries to math::clamp
which is exactly equivalent (max(lo, min(hi, value))). linq.das already
requires math.

Unblocks extended_checks (linux, 64).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address Copilot round-3 review on PR #2721:

- distinct_take.das: m3 and m3f now BOTH use each(arr) (iterator source) for
  apples-to-apples comparison. Prior bench had m3 on arr (eager distinct_impl)
  vs m3f on arr.to_sequence() (lazy distinct generator). The "0 ns/op" m3f win
  partly reflected lazy iterator stop-on-take semantics, not the splice.
  Updated numbers: m3 30 -> m3f 0 ns/op. The real win is splice eliminating
  per-yield generator-dispatch overhead in the lazy chain.

- LINQ.md:35 BufferDistinct row: drop trailing "..." from terminator list;
  spell out exact spliced set {to_array, count, sum, long_count} and note that
  first/first_or_default/min/max/average cascade to tier 2.

- LINQ.md:343 plan_distinct section: add new "Intentional divergence" note
  explaining that distinct[_by] |> take(N) is streaming (splice breaks on first
  N distinct keys) vs the runtime's eager materialize-then-take. For pure code
  observationally identical; for side-effecting keys/upstream, splice fires
  side-effects on fewer source elements. This is by design — the splice gives
  array sources the same iterator-form semantics already provided by linq.das:668
  (the iterator-form distinct() is a lazy generator that stops under take).

- LINQ.md headline table: refresh distinct_take (0 ns/op, "splice eliminates
  generator-dispatch overhead"), groupby_count (37 ns/op from Boris's dummy-bind
  pattern), and benchmark "Shape" column to show each(arr) source.

234 functional tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…treaming exit

Per-element take-guard was previously inside the `if (!seen.key_exists(uk))` fresh-key
branch, which meant the loop only checked `taken >= takeLim` when scanning a fresh key.
For inputs like `[0,1,2,0,0,0,...,3]` (first N unique, then many duplicates of seen
brands, then an (N+1)th fresh at the end), the loop kept firing key_exists / where /
projection on every duplicate until the next fresh key triggered the inner break.

Move the guard to the top of the for-loop. Now `taken >= takeLim` fires at the start of
any iteration past the Nth distinct consume — duplicates after the Nth distinct skip the
key computation entirely. True streaming-take semantics with O(N)-source upper bound.

Headline benchmark numbers unchanged (distinct_take.das uses BRAND_COUNT=5 / TAKE_N=3
with cycling brands, so the Nth fresh and (N+1)th fresh are back-to-back at source
positions 2 and 3 — both inner and outer guards fire at position 3). The fix is for the
worst case (many duplicates between Nth and (N+1)th fresh).

Also in this commit:
- LINQ.md: "should never fire" wording — corrected to describe `is_buffer_required_op`
  as a fallback guard for shapes the dedicated planners don't cover (e.g. `distinct().first()`,
  `order_by(...).first()` cascade via the marker since the spliced terminator set is finite).
- linq_fold.das: two truncated "Inner-select" / "Named-tuple wrap" comments finished.
- distinct_take.das: header comment synced with the each(arr) source change from db3c392
  (was still describing the old eager-array shape).

234 functional + 95 AST + 329 AOT tests pass. Lint clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…uffer-distinct-groupby-reverse

linq_fold: buffer-required splice arms — reverse, distinct, group_by
@pull pull Bot locked and limited conversation to collaborators May 19, 2026
@pull pull Bot added the ⤵️ pull label May 19, 2026
@pull pull Bot merged commit f9ba7a4 into forksnd:master May 19, 2026
1 of 5 checks passed
@pull pull Bot had a problem deploying to github-pages May 19, 2026 02:58 Error
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant