
[pull] master from GaijinEntertainment:master#1002

Merged
pull[bot] merged 25 commits into forksnd:master from GaijinEntertainment:master
May 17, 2026


@pull pull Bot commented May 17, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)


borisbat and others added 25 commits May 16, 2026 10:51
Replaces _fold's comprehension emitter with a planner that walks the chain
and emits a plain for-loop inside invoke($block, $src). Two terminator
lanes:

- array lane: [_where*][_select?] → loop + push_clone (identity) or
  emplace-of-bound-projection (workhorse choice made at macro time from
  the projection's _type.isWorkhorseType, no runtime static_if).
- counter lane: same intermediates + _count → counter loop with `n++`.

Chained _where|_where fuse into a single && predicate; chained
_select|_select fall through (needs ExprRef2Value-aware substitution,
deferred to Phase 2B). Anything outside the two lanes (_select|_where,
_sum, _min, _max, _first, _any, _all, _long_count, _order, _distinct,
_take, _skip, _zip, _reverse, ...) returns the raw chain unfolded —
no dispatch to _old_fold or fold_linq_default.
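
The lane rules described above can be modelled outside the macro. The sketch below is a Python model, not daslang and not the planner's actual code; the name `classify_chain` and the handling of an empty chain are assumptions made for illustration.

```python
def classify_chain(ops):
    """Return "array", "counter", or None (chain left unfolded).

    Models the two Phase-2A lanes: [_where*][_select?] for the array lane,
    and the same intermediates followed by _count for the counter lane.
    """
    if not ops:
        return None  # assumption: an empty chain is not planned
    i = 0
    # Any number of leading _where calls fuse into one && predicate.
    while i < len(ops) and ops[i] == "_where":
        i += 1
    # At most one _select is accepted here; _select|_select falls through.
    if i < len(ops) and ops[i] == "_select":
        i += 1
    rest = ops[i:]
    if not rest:
        return "array"
    if rest == ["_count"]:
        return "counter"
    return None  # _select|_where, _sum, _take, ... stay as the raw chain


print(classify_chain(["_where", "_where", "_count"]))  # counter
print(classify_chain(["_where", "_select"]))           # array
print(classify_chain(["_select", "_where"]))           # None
```

Returning `None` stands in for "return the raw chain unfolded": the model never falls back to another folding strategy, which mirrors the no-dispatch rule stated in the commit message.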

_old_fold and fold_linq_default are untouched; the comprehension contract
now lives solely on _old_fold (10 AST tests retargeted; 8 new AST tests +
6 behavioral tests cover the new loop emission).

Benchmark deltas (100K, INTERP, ns/op per element):
  count_aggregate (where|count):       5 → 5    parity
  chained_where (where|where|count):  17 → 8    2.1× faster
  select_count (select|count):        15 → 2    7.5× faster
  to_array_filter (where|select):     11 → 13   ~18% slower vs comprehension
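
The ratios quoted in the table can be re-derived from the before/after columns. The snippet below is only a Python arithmetic check on the numbers as printed; `speedup` and `slowdown_percent` are hypothetical helper names.

```python
# (before, after) ns/op per element, copied from the table above.
deltas = {
    "count_aggregate": (5, 5),
    "chained_where": (17, 8),
    "select_count": (15, 2),
    "to_array_filter": (11, 13),
}


def speedup(before, after):
    """Ratio above 1 means the new loop is faster, below 1 means slower."""
    return before / after


def slowdown_percent(before, after):
    """Relative cost increase of the new loop, in percent."""
    return (after / before - 1) * 100


for name, (before, after) in deltas.items():
    print(f"{name}: {speedup(before, after):.2f}x")
```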

Out-of-scope shapes regress to m3 (plain linq) — accepted as the
forcing function for Phase 2B (sum/min/max/first/any/all + chained
selects + take/skip).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ilter parity

The first Phase-2A cut was ~18% slower than the _old_fold comprehension on
where|select|to_array. Four small fixes brought it to 11 ns/op parity:

1. Workhorse decision at macro time, not runtime. The projection's _type is
   resolved when the planner runs, so the macro reads
   projection._type.isWorkhorseType directly and emits exactly one branch
   instead of a runtime static_if.

2. Pre-reserve when the source has a known length. The planner emits
   acc |> reserve(length(src)) when top._type isn't an iterator — matches
   what ExprArrayComprehension lowering does internally.

3. Peel each(<array>) at macro time. each(arr) reports as iterator<T> so
   (2) wouldn't fire on benchmark sources like each(arr)._where(...). The
   planner now detects each(<expr>) where the inner has length and unwraps
   it — the emitted loop iterates the array directly.

4. Drop the intermediate var binding for workhorse projections. Workhorse
   values copy cheaply, so the planner emits acc |> push(projection)
   directly. Non-workhorse keeps the bind-then-emplace dance because <- is
   a statement, not an expression.

Phase 2A benchmark deltas (100K, INTERP, ns/op per element):
  count_aggregate (where|count):       5 → 5    parity
  chained_where (where|where|count):  17 → 8    2.1× faster
  select_count (select|count):        15 → 2    7.5× faster
  to_array_filter (where|select):     11 → 11   parity (was 13 pre-fix)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two follow-up improvements on top of the Phase-2A loop planner:

1. Chained _select|_select|... now fuses (for workhorse projections).
   The planner emits intermediate `var v_N = projection_N` let-bindings
   inside the loop body; each next lambda's `_` is renamed straight to
   the prior binding's name via fold_linq_cond. No expression substitution
   = no ExprRef2Value-wrapper trap. Non-workhorse chained selects still
   fall through (needs `:=` clone semantics — Phase 2B).

2. Drop emplace from emission. emplace moves out of its argument and
   can corrupt the source when the projection returns a ref into it
   (e.g. `_._field`). The planner now emits `push` for workhorse and
   `push_clone` for non-workhorse — no intermediate `var v <- proj;
   emplace(v)` dance, which both simplifies the AST and is safer.

The chained-select AST test (previously asserting fall-through) now
asserts invoke emission. All 118 fold + ast tests pass; benchmark
deltas held vs the previous commit:
  count_aggregate:    5  parity
  chained_where:      8  2.1× faster
  select_count:       2  7.5× faster
  to_array_filter:   11  parity

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR #2689 review fixes (Copilot):

1. Counter lane drop-projection bug. `_fold(src._select(f).count())` was
   skipping the projection entirely, which diverges from raw LINQ
   `count(select(src, f))` when `f` has side effects. Counter lane now
   binds the final projection to a discardable local per matched element
   so user-visible side effects fire. The optimizer dead-code-eliminates
   the binding for pure projections (the common case — `_.x * 2`,
   `_.price` etc.), so the 7.5× select_count speedup is preserved.

2. Vacuous comprehension assertion in two AST tests. Pass `body_expr`
   (the full ExprInvoke wrapper) to `qm_resolve_comprehension` instead
   of `inv.arguments[0]` (the inner ExprMakeBlock, which can never match
   either branch of the resolver). The fixed form actually verifies the
   loop output is not the `fromComprehension=true` shape.
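
The divergence fixed in item 1 can be reproduced in a small model. The following is a Python sketch, not the emitted daslang; both counter functions and `noisy_double` are hypothetical stand-ins for the pre-fix and post-fix loop shapes.

```python
def count_skipping_projection(src, f):
    """Pre-fix counter lane: counts elements, never calls f."""
    n = 0
    for _ in src:
        n += 1
    return n


def count_with_projection(src, f):
    """Post-fix counter lane: evaluates f per element, discards the value."""
    n = 0
    for x in src:
        _discard = f(x)  # side effects fire; only the value is dropped
        n += 1
    return n


calls = []


def noisy_double(x):
    calls.append(x)  # user-visible side effect
    return x * 2


data = [1, 2, 3]

# Reference behaviour: raw count(select(src, f)) runs f on every element.
reference = sum(1 for _ in map(noisy_double, data))
calls_reference = len(calls)

calls.clear()
skipped = count_skipping_projection(data, noisy_double)
calls_skipped = len(calls)

calls.clear()
fixed = count_with_projection(data, noisy_double)
calls_fixed = len(calls)
```

All three variants return the same count; they differ only in how many times the projection runs, which is exactly what the two new behavioral tests are described as checking.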

Adds 2 behavioral tests for the side-effects invariant (single
`select|count` and `where|select|count`). All Phase 2A benchmarks held:
count_aggregate 5/5, chained_where 8/17 (2.1×), select_count 2/15
(7.5×), to_array_filter 11/11.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reflect counter-lane semantics fix: projection is now evaluated per
matched element (side effects fire); optimizer DCEs pure projections.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR #2689 review fixes (Copilot, round 2):

1. Peel-each + reserve guard. The inline `each(<x>)` peel + `sourceHasLength`
   gate previously accepted any non-iterator inner type, including
   `each(lambda)` (a lambda iterable per builtin.das:1351). That would peel
   to a lambda, then emit `reserve(length(lambda))` which has no overload
   and would fail to compile inside the macro output. Phase 2A never hit
   this in practice because the test suite only uses array sources, but
   it's a latent trap.

   Extracted `peel_each_length_source` and `type_has_length` helpers.
   Peel now triggers only when the inner type satisfies `isGoodArrayType
   || isGoodTableType || isString || isArray (T[N]) || isRange`. Same
   predicate gates the array-lane reserve emission, so the two stay in
   sync. Lambdas / custom user iterables fall through unfolded.

2. Reworded `test_select_count_fold_result` assertion message: the old
   "(projection ignored by counter)" wording was outdated after the
   counter-lane fix in 6226a1e — the planner now evaluates the
   projection per iteration (for side effects); only the value is
   discarded. Reads "(projection does not affect count value)" now.
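
The shared gate from item 1 can be pictured as one predicate used in two places. This is a Python model under stated assumptions: `Each` is a hypothetical wrapper standing in for `each(<x>)`, and an `isinstance` check stands in for the daslang type predicates; only the helper names `peel_each_length_source` and `type_has_length` come from the commit message.

```python
class Each:
    """Stand-in for each(<expr>): wraps a source and iterates it."""

    def __init__(self, inner):
        self.inner = inner

    def __iter__(self):
        return iter(self.inner)


def type_has_length(value):
    # Positive allowlist, modelled with Python types that support len().
    return isinstance(value, (list, dict, str, tuple, range))


def peel_each_length_source(src):
    """Unwrap Each(inner) only when inner passes the length gate."""
    if isinstance(src, Each) and type_has_length(src.inner):
        return src.inner
    return src


def plan_source(src):
    """Return (source to iterate, whether reserve(length(src)) is safe)."""
    peeled = peel_each_length_source(src)
    return peeled, type_has_length(peeled)


arr = [1, 2, 3]
lazy = Each(x for x in arr)  # generator: iterable, but has no length
```

Because both decisions go through `type_has_length`, a source that is not peeled can never reach the reserve emission, which is the "stay in sync" property the commit describes.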

select_count benchmark held at 2 ns/op (vs 15 for old fold), to_array_filter
held at 11/11 parity. AST + behavioral tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PR #2685 normalized ref_time_ticks() to nanoseconds across every
platform (Windows used to return raw QPC ticks at the underlying
counter's frequency — typically 10 MHz). The fix shipped without a
unit test that would have caught a units regression.

Add four tests under tests/fio/perf_time.das (sleep() lives in fio,
so this is the right neighborhood):

  - monotonic — 1000 successive reads never go backwards. Catches
    any signed/unsigned mixup or wrap-around bug in the ns conversion
    arithmetic.

  - sleep_roundtrip — sleep(100 ms) -> delta_ns must land in
    [80 ms, 500 ms]. The 80 ms lower bound is the load-bearing
    assertion: if Windows reverted to raw QPC ticks (10 MHz counter
    on the typical box -> a 100 ms wall-clock sleep would surface as
    1000000 "ticks" interpreted as ns, i.e. 1 ms), the test would
    trip. Wide upper bound covers CI runner scheduler jitter.

  - get_time_usec_agrees — the get_time_usec(t0) helper agrees with
    (ref_time_ticks() - t0) / 1000 within 5 ms. Two helpers reading
    the same underlying clock should not drift; if one ever ends up
    on a different code path, this notices.

  - units_are_nanoseconds — three back-to-back sleep(100 ms) deltas
    stay within 200 ms spread. If the unit accidentally changed
    mid-run (think: thread-local frequency cache going stale), the
    deltas would diverge wildly.
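
The arithmetic behind the 80 ms lower bound, and the shape of the monotonic check, can be sketched as follows. This is a Python model, not the contents of tests/fio/perf_time.das; `ticks_to_ns`, `in_sleep_window`, and `is_monotonic` are hypothetical names, and `time.monotonic_ns` stands in for `ref_time_ticks()`.

```python
import time

NS_PER_MS = 1_000_000


def ticks_to_ns(ticks, frequency_hz):
    """Convert raw counter ticks to nanoseconds using integer arithmetic."""
    return ticks * 1_000_000_000 // frequency_hz


def in_sleep_window(delta_ns, lo_ms=80, hi_ms=500):
    """The sleep_roundtrip bound, [80 ms, 500 ms], expressed in ns."""
    return lo_ms * NS_PER_MS <= delta_ns <= hi_ms * NS_PER_MS


def is_monotonic(read_clock, samples=1000):
    """Successive reads must never go backwards."""
    prev = read_clock()
    for _ in range(samples):
        cur = read_clock()
        if cur < prev:
            return False
        prev = cur
    return True


# A 100 ms sleep on a 10 MHz counter yields this many raw ticks.
raw_ticks = 10_000_000 * 100 // 1000

# Misreading the raw ticks as nanoseconds gives 1 ms, below the 80 ms bound.
regressed_ok = in_sleep_window(raw_ticks)
# After normalization the same interval lands inside the window.
normalized_ok = in_sleep_window(ticks_to_ns(raw_ticks, 10_000_000))
```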

The test runs cleanly in both interpreter and AOT mode on Windows
(Win11 local): sleep(100 ms) -> 102-109 ms delta, get_time_usec
agrees to within microseconds. tests/aot/CMakeLists.txt:224 already
covers tests/fio/*.das via FILE(GLOB CONFIGURE_DEPENDS); cmake
reconfigure picks the new file up automatically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…hase-2a-loop-planner

linq_fold: Phase 2A loop planner (where|select array + counter lanes)
…oundtrip-test

tests/fio: regression coverage for ref_time_ticks() ns normalization
Adds a reusable conservative `has_sideeffects(expr) : bool` predicate to
daslib/macro_boost. Returns true if an expression has — or might have —
side effects; false ONLY when provably pure. Intended for macro-time
elision of discardable evaluations.

Classification:
- Safe leaves: ExprVar, all ExprConst*, ExprAddr, ExprTypeInfo/Decl/Tag.
- Safe via recursion: ExprField/SafeField/Swizzle, ExprRef2Value/Ptr,
  ExprPtr2Ref, ExprAddr, ExprIs/IsVariant/AsVariant/SafeAsVariant,
  ExprCast, ExprNullCoalescing, ExprStringBuilder (string heap is
  no-op per compiler), ExprKeyExists (pure container read).
- ExprAt: safe when subexpr type is NOT isGoodTableType (tables auto-
  insert on missing key); ExprSafeAt always safe.
- ExprOp1/Op2/Op3: op-name allowlist for pure ops on workhorse types
  (bypasses func==null artifacts from partial folding); falls back to
  the function-flag check. `/` and `%` blacklisted (div-by-zero panic).
- ExprCall: allowlist `func.flags.builtIn && !knownSideEffects &&
  !unsafeOperation`, recurse args.
- Everything else: conservative true.
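
The shape of this classification (safe leaves, safe-via-recursion nodes, an operator allowlist, and a conservative default) can be illustrated on Python's own AST. The sketch below is a model of the technique only, not the daslib/macro_boost implementation: it treats name lookups and attribute reads as pure, which is a simplification for Python, and the operator sets are illustrative.

```python
import ast

# Operators treated as pure in this model. Div and Mod are left out on
# purpose because they can raise on a zero divisor.
SAFE_BINOPS = (ast.Add, ast.Sub, ast.Mult)
SAFE_UNARYOPS = (ast.USub, ast.UAdd, ast.Not)


def has_sideeffects(node):
    """Return False only when the expression is provably pure in this model."""
    if isinstance(node, (ast.Constant, ast.Name)):
        return False  # safe leaves
    if isinstance(node, ast.Attribute):
        return has_sideeffects(node.value)  # safe via recursion
    if isinstance(node, ast.UnaryOp):
        if not isinstance(node.op, SAFE_UNARYOPS):
            return True
        return has_sideeffects(node.operand)
    if isinstance(node, ast.BinOp):
        if not isinstance(node.op, SAFE_BINOPS):
            return True
        return has_sideeffects(node.left) or has_sideeffects(node.right)
    return True  # everything else: conservative


def expr_has_sideeffects(source):
    """Parse a single expression and classify it."""
    return has_sideeffects(ast.parse(source, mode="eval").body)
```

The key design property is the asymmetry: a wrong `True` only costs a missed optimization, while a wrong `False` would silently drop a user-visible effect, so every unrecognized node kind answers `True`.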

Counter-lane integration in daslib/linq_fold.das:

1. Discardable `var vfinal = projection` bind is now emitted only when
   `has_sideeffects(projection)` returns true. Pure projections like
   `_._field * 2` produce a bare-loop counter at macro time, no
   optimizer DCE required.

2. count→length shortcut: when the counter lane has no where-filter
   AND every projection in the chain is pure AND the source has a known
   length (array/table/string/range/fixed-array), the planner emits
   `length(src)` directly — the loop is elided entirely. select_count
   benchmark drops from 2 ns/op to 0 ns/op.

3. peel_each fix: `each` is a daslang generic, so the resolved
   `func.name` on a typed call is the mangled instance. The original
   peel only matched `func.name == "each"` and never fired for typed
   chains. Now also checks `func.fromGeneric.name == "each"`. Gated to
   array-shaped arguments (isGoodArrayType || isArray) so iterator-
   yielding sources like `each(range(10))` keep their wrapper.

4. Block-parameter typedecl branched on source shape: iterator sources
   keep `-const` (rvalue, must be consumable); array sources keep the
   source's `const&` modifier (peeled `let arr <-` is const-ref).
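
Items 1 and 2 combine into a single decision at plan time. The following Python model is a sketch under assumptions: `fold_count` is a hypothetical name, purity is supplied by the caller through `is_pure`, and the returned tag only labels which path the model took.

```python
def fold_count(src, predicates, projections, is_pure):
    """Model of the counter lane with the count -> length shortcut."""
    has_length = isinstance(src, (list, dict, str, tuple, range))
    all_pure = all(is_pure(p) for p in projections)
    if not predicates and has_length and all_pure:
        return len(src), "length"  # loop elided entirely
    n = 0
    for x in src:
        if all(pred(x) for pred in predicates):
            if not all_pure:
                # Impure chain: evaluate it so side effects still fire.
                v = x
                for proj in projections:
                    v = proj(v)
            n += 1
    return n, "loop"


log = []


def double(x):
    return x * 2


def noisy(x):
    log.append(x)  # side effect
    return x


def is_pure(f):
    # Assumption for the demo: only `noisy` has side effects.
    return f is not noisy


shortcut = fold_count([1, 2, 3], [], [double], is_pure)
impure = fold_count([1, 2, 3], [], [noisy], is_pure)
lazy = fold_count(iter([1, 2, 3]), [], [double], is_pure)
filtered = fold_count([1, 2, 3, 4], [lambda x: x % 2 == 0], [double], is_pure)
```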

Tests:
- tests/macro_boost/test_has_sideeffects.das — 24 cases (17 safe + 5
  unsafe + 2 conservative-unsafe) wired via a `_test_has_sideeffects`
  probe call_macro that emits ExprConstBool at macro time.
- tests/linq/test_linq_fold_ast.das — 5 new tests:
  * test_pure_projection_uses_length_shortcut — invoke body returns
    `length(src)` directly, no for loop.
  * test_bare_count_uses_length_shortcut — same for `each(arr).count()`.
  * test_impure_projection_keeps_bind — for-body has bind + ++acc.
  * test_peel_each_on_array_source / _on_bare_count — assert peel fires.
  * test_peel_each_skips_non_array_source — `each(range(...))` keeps
    its wrapper (gate prevents iterator-source peeling).
  * test_target_each_range_count_runs — behavioral check for
    iterator-source chains.

Benchmarks (100K rows, INTERP, vs Phase 2A baseline):
- select_count: 2 → 0 ns/op (length shortcut elides loop entirely)
- chained_where: 8 → 6 ns/op (peel + const-ref param)
- count_aggregate: 5 → 4 ns/op (1ns from peel)
- to_array_filter: 11 → 10 ns/op (1ns from peel)

569/569 linq tests + 51/51 fold-AST + 24/24 has_sideeffects pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ssion

Cards added in the course of the linq_fold splice rewrite + PR #2691
(has_sideeffects + counter-lane elision). Topics:

linq_fold / macro-emission patterns:
- daslang-generic-instance-detect-via-fromgeneric — func.fromGeneric is
  the canonical "which generic was this instantiated from?" link;
  func.name on typed instances is mangled.
- daslib-macro-boost-has-sideeffects-predicate — new public predicate,
  full classification table, known limitations, test plumbing.
- qmacro-invoke-source-bind-typedecl-modifier-iter-vs-array — typedecl
  block-param const/ref handling differs between iterator and array
  sources; the two diagnostic error messages tell you which branch you
  picked wrong.
- qmacro-gensym-per-callsite-via-lineinfo — backtick-prefixed names +
  line+column suffix, force_at / force_generated / can_shadow.
- my-fold-macro-emits-a-loop-with-for-it-in-source-... (UPDATED) —
  peel_each pattern corrected for generic-instance detection + positive
  array gate + block-param typedecl handling.

LINQ semantics:
- are-there-parity-tests-in-tests-linq-that-compare-fold-output-to-...
- which-typedecl-predicates-identify-types-where-length-expr-is-...
- why-does-each-arr-fail-with-unsafe-when-not-source-of-for-loop-...
- what-s-the-right-sqlite-linq-chain-form-for-aggregates-sum-min-max-...
- my-macro-substitutes-it-for-a-projection-expression-via-template-...
- when-a-call-macro-needs-to-pick-copy-vs-move-init-for-a-projection-...
- where-does-nolint-rule-go-when-a-lint-warning-is-emitted-from-inside-...

Tooling / ops:
- how-do-i-run-dastest-in-benchmark-only-mode-and-what-s-the-command-...
- cpp-profiler-macos-samply-instruments.md
- what-s-the-end-to-end-checklist-for-adding-a-new-daslib-das-module-...
- how-do-i-call-a-dasimgui-or-any-managed-c-method-on-a-struct-field-...

Updated:
- why-does-my-dastest-integration-test-hang-at-readiness-gate-failed-...
  — original card pointed at a require-order red herring; real cause
  was ref_time_ticks() returning ns on POSIX while wait_until_ready's
  deadline math assumed μs. Fix landed in PR #2685.

No code changes — docs only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Five cards captured while landing dasImgui PR #38 — the dastest
integration suite running across the 3-OS GitHub Actions matrix.
Docs-only — no code changes.

  - imgui-playwright-windows-ci-16-post-libhv-stall — dasHV/libhv on
    GitHub-hosted windows-latest stalls after exactly 16 POST
    /command per subprocess. Empirically counted from
    DASLIVE_HV_LOG=stderr logs. Local Win11 unaffected. Workaround:
    1-POST polling helpers + Windows-only --exclude of 7 high-POST
    tests with a 4-call safety margin.

  - imgui-harness-headless-timeout-sec-cascade-guard — new
    --headless-timeout-sec=N CLI flag on the imgui_harness 5-helper
    surface. Playwright passes (test_timeout - 5) so a panicked test
    can't leave a zombie daslang-live holding port 9090
    (daslang's `finally` is skipped on panic). Cascade-prevention
    pattern generalizes to any spawned-subprocess-owning-a-port
    layout.

  - imgui-macos-configmacosxbehaviors-shortcut-is-super-not-ctrl —
    ImGui sets io.ConfigMacOSXBehaviors = true on macOS; the
    "shortcut key" becomes Super (Cmd), not Ctrl. Synth-IO tests
    must branch on snap?["io"]?["config_macos_behaviors"] and use
    ["Super"] mods on macOS. Surfaced after the cascade-fix made
    the symptom observable; Copilot diagnosed it via
    config_macos_behaviors snapshot extension.

  - why-does-daslang-live-s-post-shutdown-return-200-ok-but-the-
    subprocess-never-actually-exits-on-linux-macos — libhv v1.3.4
    `pathHandlers` is std::unordered_map; ANY("*") catch-all
    enumerates BEFORE specific paths on Linux libstdc++,
    intermittently on Windows MSVC. Every request hits the help
    handler, /shutdown's request_exit() never runs. Workaround
    landed via daslang PR #2688 (drops ANY, serves help from
    GET("/")). Upstream libhv bug unreported as of 2026-05-16.

  - why-does-my-lint-macro-fire-on-the-wrapper-module-that-
    legitimately-uses-the-forbidden-symbols-even-though-i-scope-
    visit-module — [lint_macro] runs PER MODULE during the require
    chain; getThisModule rebinds on each per-module pass, so
    visit_module(prog.getThisModule) ALSO walks the wrapper that
    legitimately uses the forbidden symbols. Wrapper modules must
    carry the defensive opt-out (options _allow_xxx_calls = true),
    same convention as imgui_lint.das's _allow_imgui_legacy.
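
The libhv routing card above describes a catch-all shadowing specific paths when it happens to be enumerated first. The Python sketch below models that failure mode under one assumption, namely that dispatch is a first-match scan in enumeration order; `dispatch` and the route names are hypothetical and this is not libhv code.

```python
def dispatch(path, handlers_in_enumeration_order):
    """Return the name of the first handler whose pattern matches path."""
    for pattern, name in handlers_in_enumeration_order:
        if pattern == "*" or pattern == path:
            return name
    return None


routes = [("/shutdown", "shutdown"), ("/command", "command"), ("*", "help")]

specific_first = dispatch("/shutdown", routes)
# Same registrations, but the catch-all is enumerated first, as an
# unordered container is free to do.
catch_all_first = dispatch("/shutdown", list(reversed(routes)))
```

Dropping the catch-all and serving help from a concrete path, as the workaround is described, removes the dependence on enumeration order altogether.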

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…owlist

Two correctness fixes from Copilot review on PR #2691:

1. **Mutation operators bypass.** `++`, `--`, and compound-assignment ops
   (`+=`, `-=`, `*=`, `/=`, `%=`, `&=`, `|=`, `^=`, `<<=`, `>>=`,
   `&&=`, `||=`, `^^=`) fall through the `is_safe_op{1,2}` allowlist
   check, but the fallback through `func_has_sideeffects` only catches
   them if the resolved C++ builtin happens to carry the right flag. If
   the builtin missed the flag (or there is no resolved builtin), `x++`
   classifies as pure. Add `is_mutation_op1` / `is_mutation_op2`
   blacklists invoked up front, before any flag check. Note the AST
   op-string convention: postfix `++` / `--` are `"+++"` / `"---"`.

2. **User operator overloads bypass.** When `e2.op` is in the safe
   allowlist (`+`, `-`, `*`, `==`, etc.), the old code skipped the
   `func_has_sideeffects(e2.func)` check entirely. A user-defined
   `def operator +(...)` on a custom type would then classify as pure
   regardless of side effects. Restructure: `func != null` → trust the
   func flags (non-builtin defaults to unsafe via `func_has_sideeffects`);
   `func == null` → fall back to op-name allowlist for partial-folding
   artifacts.
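
The restructured decision order from the two fixes can be sketched as below. This is a Python model, not the daslib code: the dictionary shape for `func`, the illustrative `SAFE_OPS` subset, and the prefix spellings `"++"` / `"--"` are assumptions (the commit message only confirms the postfix spellings `"+++"` / `"---"`).

```python
MUTATION_OPS = {
    "++", "--", "+++", "---",
    "+=", "-=", "*=", "/=", "%=", "&=", "|=", "^=",
    "<<=", ">>=", "&&=", "||=", "^^=",
}
SAFE_OPS = {"+", "-", "*", "=="}  # illustrative subset; "/" and "%" excluded


def func_has_sideeffects(func):
    """Non-builtin functions default to "has side effects"."""
    return (not func["built_in"]) or func["known_side_effects"]


def op_has_sideeffects(op, func):
    if op in MUTATION_OPS:
        return True  # blacklist runs before any flag check
    if func is not None:
        return func_has_sideeffects(func)  # resolved: trust the flags
    return op not in SAFE_OPS  # unresolved: fall back to op-name allowlist


builtin_add = {"built_in": True, "known_side_effects": False}
user_add = {"built_in": False, "known_side_effects": False}
```

With this ordering, a user-defined `operator +` is classified by its function flags rather than by the operator's name, so it can no longer slip through the allowlist.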

Tests:
- `test_postfix_increment_unsafe`, `test_postfix_decrement_unsafe`
- `test_user_op_overload_unsafe` (defines `operator +` on a private
  struct with a global-counter side effect)

CI fix: register `tests/macro_boost/` in `tests/aot/CMakeLists.txt`
(missed when the test directory was created in the parent commit).
Mirrors the `tests/linq/` pattern: a test-files glob + a module-files
list for the `_has_sideeffects_probe.das` helper.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
… post-PR #2685 ns normalization

Agent-Logs-Url: https://github.com/GaijinEntertainment/daScript/sessions/97224dec-45d1-4968-a3dd-8e5f37274983

Co-authored-by: borisbat <272689+borisbat@users.noreply.github.com>
…ects-counter-lane-elision

macro_boost: add has_sideeffects + counter-lane elision
daslang.exe accepts -project_root <path> to override the script-dir
fallback used to scan <root>/modules/*/.das_module. daslang-live was
silently lacking the same flag — it only had -project (project file).

Workflow that surfaced this: running a tutorial via daslang-live from
inside a module clone at D:\DASPKG\dasImgui where the script path was
"../../../../examples/tutorial/X.das". With no -project_root, the
fallback at lines 722-724 sets project_root = examples/tutorial — which
has no modules/ underneath, so `require imgui` fails. Same scenario
works for daslang.exe with -project_root .

3-line patch: arg-parse arm + help line. The project_root static is
already declared at line 37 and consumed at line 727; the script-dir
fallback still kicks in when the flag isn't passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mirrors daslang.exe's silent alias at utils/daScript/main.cpp:190 where
both -project_root and -project-root are accepted by the same arm.
Help text still only shows the underscore form (same convention as
daslang.exe — alias is silent).
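
The flag handling described in these two commits (one parsing arm for both spellings, with the script-directory fallback kept) can be modelled as follows. This is a Python sketch rather than the C++ in the daslang-live sources; `parse_project_root` is a hypothetical name and the positional-argument handling is simplified.

```python
import os


def parse_project_root(argv):
    """Return (project_root, script) for a daslang-live style command line."""
    project_root = None
    script = None
    i = 0
    while i < len(argv):
        arg = argv[i]
        # Both spellings share one arm; only the underscore form is documented.
        if arg in ("-project_root", "-project-root") and i + 1 < len(argv):
            project_root = argv[i + 1]
            i += 2
            continue
        if not arg.startswith("-") and script is None:
            script = arg
        i += 1
    if project_root is None and script is not None:
        # Fallback: the directory that contains the script.
        project_root = os.path.dirname(script) or "."
    return project_root, script
```

The fallback branch is what produced the failure in the tutorial workflow: without the flag, the root resolves to the script's own directory, which has no `modules/` underneath it.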

Addresses PR #2693 review feedback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
…-linq-fold-session

mouse-data/docs: 21 new + 1 updated card from linq_fold + dasImgui PR #38
…e-project-root

daslang-live: add -project_root flag (mirror daslang.exe)
The example pre-dated PR #33's default-on imgui_lint, PR #38's headless
harness, and PR #39's daslang theme. It hand-rolled its own imgui_app()
shim, called raw Begin/End/Checkbox/InputFloat with addr-of dance, and
set FontGlobalScale=1.0 with a "BBATKIN: note - my monitor is HUGE"
comment.

Rewrite onto the current surface:

* require imgui/imgui_harness — single import for the boost-v2 widget
  stack, daslang theme + JetBrains Mono via live_imgui_init, harness
  lifecycle (harness_init / harness_begin_frame / harness_new_frame /
  harness_shutdown).
* window(SETUP_WIN, ...) { ... } container instead of raw Begin/End.
* edit_checkbox / edit_input_float / edit_input_float2 against
  safe_addr(global) instead of unsafe(addr(field)). Collapsed C0.x/C0.y
  → edit_input_float2(safe_addr(c0), ...) — same applies to C1/C-1/C2/C-2.
* text("...") narrative widget instead of raw Text().
* separator(SEP_C0/C1/C2) instead of raw Separator().
* Drop FontGlobalScale shim — theme picks a sensible 14px default.

The per-frame loop splits harness_end_frame manually so that the custom OpenGL
draws run between glClear and ImGui_ImplOpenGL3_RenderDrawData. options
_allow_glfw_calls = true opts out of imgui_harness_lint for the GLFW/GL
calls the example legitimately owns (live_get_framebuffer_size,
glViewport, glClear, draw_fourier()).

To run: `daspkg install` from examples/graphics/ to fetch dasImgui into
the local modules/, then `daslang.exe -project_root .
furier_opengl_imgui_example.das`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…p-update

site/blog: roadmap update + are-we-there-yet post
…aphics-modernize

examples/graphics: modernize Fourier viz to dasImgui boost-v2 + harness
@pull pull Bot locked and limited conversation to collaborators May 17, 2026
@pull pull Bot added the ⤵️ pull label May 17, 2026
@pull pull Bot merged commit 39b1a45 into forksnd:master May 17, 2026
@pull pull Bot had a problem deploying to github-pages May 17, 2026 02:58 Error
