linq_fold: extract _fold + add _old_fold baseline; 24-benchmark 4-way suite#2687

Merged: borisbat merged 1 commit into master from bbatkin/linq-fold-foundation on May 16, 2026
@borisbat
Collaborator

Summary

Foundation work (Phase 0+1) for the planner-driven splice-mode rewrite of the LINQ _fold macro. This PR is purely additive and behavior-preserving: _fold produces the exact same expansion as before. The splice work lands in later, per-operator PRs that turn the m3f_old baseline into a real comparison.

  • Extracts _fold and the dispatch infrastructure (linqCalls dict, g_foldSeq, fold_* helpers, flatten_linq, fold_linq_default) from daslib/linq_boost.das (1231 → 724 lines) into a new daslib/linq_fold.das module (561 lines). linq_boost does require daslib/linq_fold public, so every existing call site continues to resolve _fold(...) without change.
  • Adds _old_fold — same fold_linq_default code path, parameterized over the recursive-call macro name (_fold recurses into _fold, _old_fold into _old_fold). This freezes today's behavior as the benchmark baseline once _fold diverges in Phase 2+.
  • 24 4-way benchmark files at 100K rows under benchmarks/sql/, modeled on the existing count_aggregate.das. Each compares m1 (_sql over :memory:), m3 (plain LINQ), m3f_old (_old_fold), m3f (_fold). 21 new files plus 4 existing extended to the 4-way shape.
  • benchmarks/sql/LINQ.md — project notes, phase status, baseline table, operator-coverage checklist against the broader tests/linq/ suite, and design decisions. Subsequent splice PRs update this file with deltas and tick off the parity matrix.
  • Doc updates: linq_fold registered in doc/reflections/das2rst.das with a minimal document_module_linq_fold block; module-linq_fold.rst handmade description; toctree entry in sec_algorithms.rst. Sphinx -W builds clean (no warnings, no errors); no // stub or Uncategorized remaining.
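
The parameterization described for _old_fold can be pictured with a small Python model. This is an illustrative sketch, not daslang and not the code in linq_fold.das: `expand`, the list-of-tuples pipeline encoding, and the rendered string format are all hypothetical. It only shows how threading the macro name through one shared expander keeps each macro's nested sub-folds pointing back at itself.

```python
# Hypothetical model: a pipeline is a list of (operator, argument) steps.
# An argument may itself be a pipeline (a list), which stands in for the
# inner-pipeline sub-fold point where the expander must recurse.

def expand(pipeline, recursive_macro_name):
    """Render a pipeline, wrapping nested pipelines in recursive_macro_name."""
    parts = []
    for op, arg in pipeline:
        if isinstance(arg, list):
            # Nested pipeline: route back to whichever macro called us.
            inner = expand(arg, recursive_macro_name)
            arg = f"{recursive_macro_name}({inner})"
        parts.append(f"{op}({arg})")
    return " |> ".join(parts)

def fold(pipeline):
    # Models _fold passing "_fold" as the recursive macro name.
    return expand(pipeline, "_fold")

def old_fold(pipeline):
    # Models _old_fold passing "_old_fold" as the recursive macro name.
    return expand(pipeline, "_old_fold")

nested = [("where", "x > 0"), ("select", [("where", "y < 3"), ("count", "")])]
print(fold(nested))
print(old_fold(nested))
```

In this model the two outputs differ only in the name wrapped around the nested pipeline, which mirrors the "identical by construction" property the PR relies on for the baseline.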

Baseline highlights (100K rows, ns/op per element, INTERP)

| Pattern | Current _fold vs plain LINQ | SQL position |
| --- | --- | --- |
| where + count | 6× faster (29 → 5) | parity |
| where + select + sum | 3.5× (43 → 12) | parity (33) |
| chained where + count | 2.6× (45 → 17) | parity (36) |
| where + select + to_array | (43 → 11) | m3f wins (70 SQL) |
| PK indexed lookup | 6× over plain scan; both lose to SQL | ~1400× (b-tree) |
| sort + first / take | parity with plain (full sort) | ~60× (partial-sort + LIMIT) |
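
The sort + first / take row contrasts a full sort with a partial sort plus LIMIT. A minimal Python sketch of that difference follows, using the standard-library `heapq.nsmallest` as a stand-in for a partial sort. The data and `k` are made up, and this is not how SQLite or _fold is implemented.

```python
import heapq
import random

random.seed(0)
rows = [random.randint(0, 1_000_000) for _ in range(100_000)]
k = 10

# Full sort, then take: O(n log n) work regardless of k.
full = sorted(rows)[:k]

# Partial sort: tracks only k candidates, roughly O(n log k).
partial = heapq.nsmallest(k, rows)

print(full == partial)
```

Both forms return the same k smallest values in ascending order; only the amount of sorting work differs.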

_fold already brings substantial wins on the 6 patterns it explicitly recognizes (where+count, where+select, select+where, order+distinct, bare where, bare select). The Phase 2+ work targets the parity rows — where _fold falls through to the default emitter and matches plain LINQ. Each new splice path collapses one of those m3=m3f_old=m3f rows. Full table in benchmarks/sql/LINQ.md.
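
To make the m3 versus m3f distinction concrete, here is a rough Python analogy for the where + count row. It assumes that the plain form builds an intermediate array per operator while the fused form runs a single loop; that assumption and the function names are hypothetical, and the real expansion is daslang code generated by the macro.

```python
def count_where_plain(rows, pred):
    # One operator at a time: build the filtered array, then count it.
    filtered = [r for r in rows if pred(r)]
    return len(filtered)

def count_where_fused(rows, pred):
    # Single pass with no intermediate array: the shape a recognized
    # where+count pattern could collapse into.
    n = 0
    for r in rows:
        if pred(r):
            n += 1
    return n

rows = list(range(100_000))
is_even = lambda r: r % 2 == 0
print(count_where_plain(rows, is_even), count_where_fused(rows, is_even))
```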

Notes for review

  • Verbatim extraction. The linq_fold.das infrastructure is a verbatim move of the same code that lived in linq_boost.das on master, with one targeted change: fold_linq_default now takes a recursiveMacroName : string parameter so the recursive make_call(..., recursiveMacroName) at the inner-pipeline sub-fold point routes back to the calling macro. _fold passes "_fold", _old_fold passes "_old_fold". This is the only knob that lets the two macros stay in lockstep today and diverge cleanly in Phase 2+.
  • A few benchmark variants are 3-way, not 4-way, where sqlite_linq has no clean form (zip — not a relational op; join with inner select_from — wiring not exposed without more plumbing). LINQ.md documents each case. Other workarounds: _any() → _first_opt() |> is_some, _all(p) → count(where ¬p) == 0, take/skip/distinct → terminate in to_array (LIMIT/OFFSET/DISTINCT can't precede an aggregate in sqlite_linq today).
  • AOT/JIT verification deferred to the per-operator splice PRs. Pure-extraction + new-benchmarks shouldn't break AOT, and CI runs the full matrix; if it surfaces something I'll fix in the PR.
  • PERF009 suppression in linq_fold.das on the qmacro_expr that emits var pass_N = call — the macro's single-pass output pattern triggers the rule at user call sites; rewriting it would change _old_fold's baseline. Inline // nolint:PERF009 with a comment explaining why.
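
The _any and _all workarounds listed in the notes rest on simple equivalences. A small Python check is sketched below, with hypothetical helper names standing in for the LINQ operators; it says nothing about how sqlite_linq translates them.

```python
def first_opt(rows, pred):
    # Models _first_opt: the first match, or None when there is none.
    for r in rows:
        if pred(r):
            return r
    return None

def any_via_first(rows, pred):
    # _any() rewritten as _first_opt() |> is_some
    return first_opt(rows, pred) is not None

def all_via_count(rows, pred):
    # _all(p) rewritten as count(where not p) == 0
    return sum(1 for r in rows if not pred(r)) == 0

is_big = lambda r: r > 5
for sample in ([], [1, 2, 3], [6, 7, 8], [1, 9]):
    print(sample, any_via_first(sample, is_big), all_via_count(sample, is_big))
```

The empty input is the edge case worth checking: the rewritten forms give False for any and True for all, matching the usual definitions.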

Test plan

  • mcp__daslang__compile_check on all 26 touched .das files — clean
  • mcp__daslang__lint + format_file — 0 issues
  • tests/linq/ (all 13 files, 491 tests) — pass unchanged
  • tests/linq/test_linq_fold.das + test_linq_fold_ast.das (98 tests) — pass unchanged
  • tests/dasSQLITE/test_05_sql_macro.das (19 tests) — pass
  • Sanity script: _fold and _old_fold produce identical results
  • All 24 benchmarks run end-to-end via ./bin/daslang dastest/dastest.das -- --bench --test benchmarks/sql --test-names none (full suite, ~86s)
  • ./bin/daslang doc/reflections/das2rst.das — no stubs left, no Uncategorized
  • Clean sphinx-build -b html — succeeds with 0 warnings / 0 errors

Plan file: ~/.claude/plans/keen-hopping-balloon.md.

🤖 Generated with Claude Code

… suite

Phase 0+1 of the planner-driven splice-mode rewrite. Extracts _fold and the
dispatch infrastructure (linqCalls dict, g_foldSeq, fold_* helpers,
flatten_linq, fold_linq_default) from daslib/linq_boost.das into a new
daslib/linq_fold.das module; linq_boost requires linq_fold public so the
macro stays visible at every existing call site. Adds _old_fold — same
fold_linq_default code path, parameterized so its recursive sub-folds keep
targeting _old_fold once _fold diverges in later PRs. This freezes today's
behavior as the benchmark baseline.

Adds 24 4-way benchmark files at 100K rows under benchmarks/sql/, modeled
on count_aggregate.das. Each compares an in-memory SQLite query against
the LINQ chain in m1 (_sql), m3 (plain linq), m3f_old (_old_fold), m3f
(_fold) variants. Baselines, operator-coverage tracking, and phase status
live in benchmarks/sql/LINQ.md alongside the design notes. m3f and m3f_old
are identical by construction in this PR; the delta becomes meaningful as
splice paths land per operator family.

All linq + dasSQLITE tests pass; all touched .das files lint+format clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 16, 2026 15:49
Contributor

Copilot AI left a comment

Pull request overview

Foundation refactor (Phase 0+1) that extracts the LINQ _fold macro and its dispatch infrastructure from daslib/linq_boost.das into a new daslib/linq_fold.das module, adds a frozen-baseline twin macro _old_fold sharing the same code path (parameterized only by the macro name used in the recursive sub-fold call), and lands a 24-file benchmark suite under benchmarks/sql/ comparing m1 (_sql) / m3 (plain LINQ) / m3f_old / m3f at 100K rows. linq_boost re-exports the new module via require ... public, so existing _fold call sites are unaffected.

Changes:

  • Move _fold machinery (linqCalls, flatten_linq, g_foldSeq, all fold_* helpers, fold_linq_default) out of linq_boost.das into a new linq_fold.das; thread a recursiveMacroName : string parameter through fold_linq_default so _fold and the new _old_fold recurse into themselves.
  • Register linq_fold in doc/reflections/das2rst.das plus a handmade RST description and a toctree entry.
  • Add 21 new + extend 4 existing benchmark files (4-way m1/m3/m3f_old/m3f shape, 100K rows) and a benchmarks/sql/LINQ.md design/progress document.

Reviewed changes

Copilot reviewed 30 out of 30 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| daslib/linq_fold.das | New module: extracted _fold infra + new _old_fold macro; adds recursiveMacroName param and inline // nolint:PERF009. |
| daslib/linq_boost.das | Removes the moved code; adds require daslib/linq_fold public to preserve resolution of _fold(...). |
| doc/reflections/das2rst.das | Registers linq_fold module documentation generator. |
| doc/source/stdlib/handmade/module-linq_fold.rst | Handmade module description (purpose, usage, example) for the new module. |
| doc/source/stdlib/sec_algorithms.rst | Adds linq_fold to algorithms toctree. |
| benchmarks/sql/_common.das | Extends fixture schema with brand, year, dealer_id, adds Dealer table and fixture_dealers_array() helper. |
| benchmarks/sql/LINQ.md | Project notes, baseline table, operator-coverage matrix, design decisions. |
| benchmarks/sql/count_aggregate.das, select_where.das, select_where_order_take.das, indexed_lookup.das | Extended to the 4-way shape, renamed benchmark functions, normalized correctness gate to empty(...). |
| benchmarks/sql/{sum_aggregate,sum_where,min_aggregate,max_aggregate,average_aggregate,first_match,any_match,all_match,to_array_filter,take_count,skip_take,distinct_count,sort_first,sort_take,groupby_count,groupby_sum,chained_where,zip_dot_product,join_count}.das | New 4-way (or 3-way where SQL has no clean form) benchmarks at 100K rows. |


// take(N) |> to_array — bounded materialization (SQL can't put count() after take()
// because LIMIT-then-aggregate collapses to one row; we measure the array result
// and use length() as the correctness gate). Expected length: TAKE_N (assuming N >= TAKE_N).
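
The LIMIT-then-aggregate behavior described in the comment above can be reproduced directly in SQLite. The sketch below uses Python's standard `sqlite3` module with a made-up table `t`; it is independent of sqlite_linq and only illustrates the SQL semantics.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (x INTEGER)")
con.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10)])

# LIMIT applies to the aggregate's single output row, so it changes nothing.
limit_after = con.execute("SELECT COUNT(*) FROM t LIMIT 3").fetchone()[0]

# Counting only the first 3 rows needs the LIMIT inside a subquery.
limit_before = con.execute(
    "SELECT COUNT(*) FROM (SELECT x FROM t LIMIT 3)"
).fetchone()[0]

print(limit_after, limit_before)  # 10 3
```

Expressing "take then count" in SQL therefore requires the subquery form, which is consistent with the benchmark choosing to materialize the array and check its length instead.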
borisbat merged commit c629611 into master on May 16, 2026
32 checks passed