linq_fold: extract _fold + add _old_fold baseline; 24-benchmark 4-way suite #2687
… suite

Phase 0+1 of the planner-driven splice-mode rewrite.

Extracts _fold and the dispatch infrastructure (linqCalls dict, g_foldSeq, fold_* helpers, flatten_linq, fold_linq_default) from daslib/linq_boost.das into a new daslib/linq_fold.das module; linq_boost requires linq_fold public so the macro stays visible at every existing call site.

Adds _old_fold: same fold_linq_default code path, parameterized so its recursive sub-folds keep targeting _old_fold once _fold diverges in later PRs. This freezes today's behavior as the benchmark baseline.

Adds 24 4-way benchmark files at 100K rows under benchmarks/sql/, modeled on count_aggregate.das. Each compares an in-memory SQLite query against the LINQ chain in m1 (_sql), m3 (plain linq), m3f_old (_old_fold), m3f (_fold) variants.

Baselines, operator-coverage tracking, and phase status live in benchmarks/sql/LINQ.md alongside the design notes. m3f and m3f_old are identical by construction in this PR; the delta becomes meaningful as splice paths land per operator family.

All linq + dasSQLITE tests pass; all touched .das files lint+format clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pull request overview
Foundation refactor (Phase 0+1) that extracts the LINQ _fold macro and its dispatch infrastructure from daslib/linq_boost.das into a new daslib/linq_fold.das module, adds a frozen-baseline twin macro _old_fold sharing the same code path (parameterized only by the macro name used in the recursive sub-fold call), and lands a 24-file benchmark suite under benchmarks/sql/ comparing m1 (_sql) / m3 (plain LINQ) / m3f_old / m3f at 100K rows. linq_boost re-exports the new module via require ... public, so existing _fold call sites are unaffected.
Changes:
- Move `_fold` machinery (`linqCalls`, `flatten_linq`, `g_foldSeq`, all `fold_*` helpers, `fold_linq_default`) out of `linq_boost.das` into a new `linq_fold.das`; thread a `recursiveMacroName : string` parameter through `fold_linq_default` so `_fold` and the new `_old_fold` recurse into themselves.
- Register `linq_fold` in `doc/reflections/das2rst.das` plus a handmade RST description and a toctree entry.
- Add 21 new + extend 4 existing benchmark files (4-way `m1/m3/m3f_old/m3f` shape, 100K rows) and a `benchmarks/sql/LINQ.md` design/progress document.
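The `recursiveMacroName` threading in the first bullet can be pictured with a toy model. This is a minimal sketch in Python rather than daslang, and it is not the code in `linq_fold.das`: pipelines are nested lists, "macros" are plain functions, and `MACROS`, `fold_default`, `fold`, and `old_fold` are hypothetical stand-ins. It only shows how one shared code path can recurse into whichever entry point called it.

```python
# Toy model: a pipeline is a list of stage names; a nested list is an
# inner pipeline that needs its own sub-fold. All names are hypothetical.
MACROS = {}


def fold_default(stages, recursive_macro_name):
    """Shared expansion path; nested pipelines recurse via the given macro name."""
    expanded = []
    for stage in stages:
        if isinstance(stage, list):
            # Inner pipeline: route the sub-fold back to the calling macro by name.
            expanded.append(MACROS[recursive_macro_name](stage))
        else:
            # Tag each stage with the macro that handled it, to make routing visible.
            expanded.append(f"{recursive_macro_name}:{stage}")
    return expanded


def fold(stages):
    return fold_default(stages, "_fold")


def old_fold(stages):
    return fold_default(stages, "_old_fold")


MACROS["_fold"] = fold
MACROS["_old_fold"] = old_fold

pipeline = ["where", ["select", "count"], "to_array"]
print(fold(pipeline))
print(old_fold(pipeline))
```

In this model, replacing the body of `fold` with a different expander would leave `old_fold` and all of its nested sub-folds untouched, which is the property a frozen baseline needs.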
Reviewed changes
Copilot reviewed 30 out of 30 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| daslib/linq_fold.das | New module: extracted _fold infra + new _old_fold macro; adds recursiveMacroName param and inline // nolint:PERF009. |
| daslib/linq_boost.das | Removes the moved code; adds require daslib/linq_fold public to preserve resolution of _fold(...). |
| doc/reflections/das2rst.das | Registers linq_fold module documentation generator. |
| doc/source/stdlib/handmade/module-linq_fold.rst | Handmade module description (purpose, usage, example) for the new module. |
| doc/source/stdlib/sec_algorithms.rst | Adds linq_fold to algorithms toctree. |
| benchmarks/sql/_common.das | Extends fixture schema with brand, year, dealer_id, adds Dealer table and fixture_dealers_array() helper. |
| benchmarks/sql/LINQ.md | Project notes, baseline table, operator-coverage matrix, design decisions. |
| benchmarks/sql/count_aggregate.das, select_where.das, select_where_order_take.das, indexed_lookup.das | Extended to the 4-way shape, renamed benchmark functions, normalized correctness gate to empty(...). |
| benchmarks/sql/{sum_aggregate,sum_where,min_aggregate,max_aggregate,average_aggregate,first_match,any_match,all_match,to_array_filter,take_count,skip_take,distinct_count,sort_first,sort_take,groupby_count,groupby_sum,chained_where,zip_dot_product,join_count}.das | New 4-way (or 3-way where SQL has no clean form) benchmarks at 100K rows. |
// take(N) |> to_array — bounded materialization (SQL can't put count() after take()
// because LIMIT-then-aggregate collapses to one row; we measure the array result
// and use length() as the correctness gate). Expected length: TAKE_N (assuming N >= TAKE_N).
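The SQL behavior this comment refers to can be reproduced with Python's built-in `sqlite3` module. The sketch below is illustrative only: the `cars` table, its columns, and the row count are assumptions, not the benchmark fixture. It shows that `LIMIT` attached to an aggregate query limits the single aggregate row rather than the input rows, which is why the benchmark materializes the bounded result and checks its length instead.

```python
import sqlite3

TAKE_N = 10      # mirrors the TAKE_N name in the comment; the value is arbitrary
TOTAL_ROWS = 100  # hypothetical fixture size, kept small for the demo

conn = sqlite3.connect(":memory:")
# Hypothetical table; the real fixture schema lives in benchmarks/sql/_common.das.
conn.execute("CREATE TABLE cars (id INTEGER PRIMARY KEY, price INTEGER)")
conn.executemany(
    "INSERT INTO cars (price) VALUES (?)", [(p,) for p in range(TOTAL_ROWS)]
)

# LIMIT applies to the aggregate's one output row, so every input row is counted.
naive = conn.execute("SELECT COUNT(*) FROM cars LIMIT ?", (TAKE_N,)).fetchall()

# Bounded materialization: fetch at most TAKE_N rows and measure the result length.
rows = conn.execute("SELECT id FROM cars LIMIT ?", (TAKE_N,)).fetchall()

print(naive)      # one row holding the full-table count
print(len(rows))  # the bounded length used as the correctness gate
conn.close()
```

Plain SQL can express the bounded count with a subquery (`SELECT COUNT(*) FROM (SELECT id FROM cars LIMIT 10)`); the benchmark's choice of `to_array` plus `length()` reflects what the LINQ-to-SQL layer supports, per the comment, not a limit of SQLite itself.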
Summary
Foundation work (Phase 0+1) for the planner-driven splice-mode rewrite of the LINQ `_fold` macro. This PR is purely additive / behavior-preserving: `_fold` produces the exact same expansion as before. The splice work lands in later, per-operator PRs that turn the `m3f_old` baseline into a real comparison.

- Extracts `_fold` and the dispatch infrastructure (`linqCalls` dict, `g_foldSeq`, `fold_*` helpers, `flatten_linq`, `fold_linq_default`) from `daslib/linq_boost.das` (1231 → 724 lines) into a new `daslib/linq_fold.das` module (561 lines). `linq_boost` does `require daslib/linq_fold public`, so every existing call site continues to resolve `_fold(...)` without change.
- Adds `_old_fold`: same `fold_linq_default` code path, parameterized over the recursive-call macro name (`_fold` recurses into `_fold`, `_old_fold` into `_old_fold`). This freezes today's behavior as the benchmark baseline once `_fold` diverges in Phase 2+.
- Adds 24 4-way benchmark files at 100K rows under `benchmarks/sql/`, modeled on the existing `count_aggregate.das`. Each compares m1 (`_sql` over `:memory:`), m3 (plain LINQ), m3f_old (`_old_fold`), m3f (`_fold`). 21 new files plus 4 existing extended to the 4-way shape.
- `benchmarks/sql/LINQ.md`: project notes, phase status, baseline table, operator-coverage checklist against the broader `tests/linq/` suite, and design decisions. Subsequent splice PRs update this file with deltas and tick off the parity matrix.
- `linq_fold` registered in `doc/reflections/das2rst.das` with a minimal `document_module_linq_fold` block; `module-linq_fold.rst` handmade description; toctree entry in `sec_algorithms.rst`. Sphinx `-W` builds clean (no warnings, no errors); no `// stub` or `Uncategorized` remaining.

Baseline highlights (100K rows, ns/op per element, INTERP)
`_fold` vs plain LINQ

`_fold` already brings substantial wins on the 6 patterns it explicitly recognizes (`where+count`, `where+select`, `select+where`, `order+distinct`, bare `where`, bare `select`). The Phase 2+ work targets the parity rows, where `_fold` falls through to the default emitter and matches plain LINQ. Each new splice path collapses one of those `m3=m3f_old=m3f` rows. Full table in benchmarks/sql/LINQ.md.

Notes for review
- `linq_fold.das` infrastructure is a verbatim move of the same code that lived in `linq_boost.das` on master, with one targeted change: `fold_linq_default` now takes a `recursiveMacroName : string` parameter so the recursive `make_call(..., recursiveMacroName)` at the inner-pipeline sub-fold point routes back to the calling macro. `_fold` passes `"_fold"`, `_old_fold` passes `"_old_fold"`. This is the only knob that lets the two macros stay in lockstep today and diverge cleanly in Phase 2+.
- Benchmarks are 3-way where SQL has no clean form (`zip`: not a relational op; `join` with inner `select_from`: wiring not exposed without more plumbing). `LINQ.md` documents each case. Other workarounds: `_any() → _first_opt() |> is_some`, `_all(p) → count(where ¬p) == 0`, `take/skip/distinct → terminate in to_array` (LIMIT/OFFSET/DISTINCT can't precede an aggregate in `sqlite_linq` today).
- PERF009 fires in `linq_fold.das` on the `qmacro_expr` that emits `var pass_N = call`: the macro's single-pass output pattern triggers the rule at user call sites; rewriting it would change `_old_fold`'s baseline. Inline `// nolint:PERF009` with a comment explaining why.

Test plan
- `mcp__daslang__compile_check` on all 26 touched `.das` files: clean
- `mcp__daslang__lint` + `format_file`: 0 issues
- `tests/linq/` (all 13 files, 491 tests): pass unchanged
- `tests/linq/test_linq_fold.das` + `test_linq_fold_ast.das` (98 tests): pass unchanged
- `tests/dasSQLITE/test_05_sql_macro.das` (19 tests): pass
- `_fold` and `_old_fold` produce identical results
- `./bin/daslang dastest/dastest.das -- --bench --test benchmarks/sql --test-names none` (full suite, ~86s)
- `./bin/daslang doc/reflections/das2rst.das`: no stubs left, no Uncategorized
- `sphinx-build -b html`: succeeds with 0 warnings / 0 errors

Plan file: `~/.claude/plans/keen-hopping-balloon.md`.

🤖 Generated with Claude Code
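
The `_any` and `_all` workarounds listed in the notes for review rest on two logical equivalences: some row matches exactly when a first match exists, and every row matches exactly when no row fails the predicate. A minimal Python sketch of both follows; the helper names (`first_opt`, `any_via_first`, `all_via_count`) are hypothetical and model only the logic, not the daslang or SQL forms used in the benchmarks.

```python
_MISSING = object()  # sentinel so that a legitimate None row is not mistaken for "no match"


def first_opt(rows, pred):
    """Return the first row satisfying pred, or _MISSING if there is none."""
    return next((r for r in rows if pred(r)), _MISSING)


def any_via_first(rows, pred):
    # _any() -> _first_opt() |> is_some
    return first_opt(rows, pred) is not _MISSING


def all_via_count(rows, pred):
    # _all(p) -> count(where not p) == 0
    return sum(1 for r in rows if not pred(r)) == 0


def is_even(n):
    return n % 2 == 0


# Compare both rewrites against Python's built-in any/all on a few inputs.
for sample in ([], [1, 3, 5], [2, 4, 6], [1, 2, 3]):
    assert any_via_first(sample, is_even) == any(is_even(n) for n in sample)
    assert all_via_count(sample, is_even) == all(is_even(n) for n in sample)
```

The empty input is the case worth checking: `all` over no rows is true and `any` is false, and both rewrites agree with that.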