diff --git a/benchmarks/sql/LINQ.md b/benchmarks/sql/LINQ.md index 5a956110f..54506d59c 100644 --- a/benchmarks/sql/LINQ.md +++ b/benchmarks/sql/LINQ.md @@ -22,8 +22,9 @@ See `~/.claude/plans/keen-hopping-balloon.md` for the long-form plan. |---|---|---| | 0 | Rename `_fold` → `_old_fold` in linq_boost; extract `_fold` and `_old_fold` into new `daslib/linq_fold.das` module; `linq_boost` `require linq_fold public` for re-export | ✅ done | | 1 | Benchmark suite: 24 files under `benchmarks/sql/`, each 4-way (m1 `_sql` / m3 plain linq / m3f_old `_old_fold` / m3f `_fold`) at 100K rows; baseline numbers captured | ✅ done | -| 2 | Splice planner + initial operators (`count`, `sum`, `to_array`, `where` with literal-lambda inlining); pattern tests for "spliced" vs "fell back" | ⏳ next | -| 3+ | Per-operator splice PRs: `select`, terminal aggregates with early-exit (`first`, `any`, `all`, `min`, `max`, `average`), `take`/`skip`/chained `where`, then buffer-required ops (`distinct`, `sort`, `groupby`, `zip`, `join`) | ⏳ | +| 2A | Loop planner — `_fold` emits explicit for-loops for `[where_*][select?]` (array lane) and `[where_*][select?] |> count` (counter lane); anything else falls through unfolded. No comprehensions, no dispatch back to `_old_fold`. | ✅ done | +| 2B | Aggregate accumulators: `sum`, `min`, `max`, `average`, `first`, `any`, `all`, `long_count`. Also `take`/`skip` in counter/array lane and chained-`_select|_select` fusion (needs `ExprRef2Value`-aware projection substitution) | ⏳ next | +| 3+ | Buffer-required operators: `distinct`, `sort`, `reverse`, `groupby`, `zip`, `join`. Once we go array, we stay array | ⏳ | | 4 | Final coverage pass + docs; full 4-way comparison table refresh; parity-test sweep | ⏳ | ## Baselines (100K rows, INTERP mode) @@ -69,7 +70,36 @@ Notation: `—` means the variant is not applicable for this benchmark (operator - **m1 vs m3** shows the SQLite-vs-in-memory-LINQ cost gap. 
SQL wins on `indexed_lookup` (b-tree) and on sorted-take patterns (engine partial-sort + LIMIT). Arrays win on raw aggregates where the SQL overhead exceeds the in-memory work. - **m3 vs m3f_old** shows what the *current* `_fold` macro already achieves. Big wins on the patterns it explicitly recognizes (`where+count` 6×, `where+select+to_array` ~4×, `chained_where+count` 2.6×). Negligible difference where it falls through to the default emitter. -- **m3f vs m3f_old** is the target of Phase 2+. Currently identical by construction. Each PR in the splice series adds a splice path for one operator family and updates this table with the new ratio. +- **m3f vs m3f_old** is the target of Phase 2+. Each PR in the splice series adds a path for one operator family and updates this table with the new ratio. + +## Phase 2A — Loop planner (2026-05-16) + +`_fold` now emits explicit for-loops for two narrow shape families instead of comprehensions. Anything outside scope falls through unfolded to raw linq (no dispatch to `_old_fold` or `fold_linq_default`). + +**In scope:** `[where_*][select*]` (array lane) and `[where_*][select*] |> count` (counter lane). Chained `_where|_where|...` fuses via `&&`. Chained `_select|_select|...` fuses via intermediate `var v_N = projection_N` let-bindings — each next lambda's `_` is renamed straight to the prior binding's name, no expression substitution needed (which would have hit the ExprRef2Value-wrapper problem documented in `skills/das_macros.md`). Chained selects currently require all projections to be workhorse; non-workhorse intermediates would need `:=` (clone) since `<-` (move) can corrupt source for lvalue projections — deferred to Phase 2B. + +**Out of scope (falls through):** `_select|_where`, `sum`, `min`, `max`, `average`, `first`, `any`, `all`, `long_count`, `_order`, `_distinct`, `_take`, `_skip`, `_zip`, `_reverse`, etc. 
+ +### Phase 2A deltas (100K rows, INTERP) + +| Benchmark | Shape | m3f_old | m3f (Phase 2A) | Delta | +|---|---|---:|---:|---| +| count_aggregate | `where → count` | 5 | 4 | parity-ish (1ns improvement from `each()` peel) | +| chained_where | `where → where → count` | 17 | 6 | **2.8× faster** (fuses chained wheres into single `&&` predicate; small gain from peel + const-ref param) | +| select_count | `select → count` | 15 | 0 | **∞ faster** — when the projection is pure (`has_sideeffects == false`) and the source has length, the counter lane shortcuts to `length(src)` and elides the loop entirely. See [macro_boost::has_sideeffects](../../daslib/macro_boost.das) and `linq_fold.das:plan_loop_or_count` | +| to_array_filter | `where → select → to_array` | 11 | 10 | parity (after `each()` peel + reserve + workhorse `push`) | + +Shapes outside Phase 2A scope now compile to plain linq (`m3f ≈ m3`). This is an intentional regression vs the historical `_old_fold` numbers — Boris's call ("we let it fall through unfolded, and we see performance issues. im ok being slower until we fix") as the forcing function for Phase 2B+. The previous "m3f = m3f_old (identical by construction)" baseline assumed `_fold` would dispatch to `_old_fold` on the unmatched path; Phase 2A drops that dispatch. + +### Three small things that closed the to_array_filter gap + +The first cut was 18% slower than the comprehension. Three independent fixes brought it to parity: + +1. **Workhorse decision at macro time, not runtime.** The first emission used `static_if (typeinfo is_workhorse(projection))` inside the qmacro so the compiler picked copy- vs move-init. The projection's `_type` is already resolved when the planner runs, so the macro now reads `projection._type.isWorkhorseType` directly and emits exactly one branch — less AST, no static_if to fold away. +2. 
**Pre-reserve when the source has a known length.** ExprArrayComprehension lowering reserves the result array to the source's length to avoid growth reallocs; the explicit loop must do the same by hand. The planner emits `acc |> reserve(length(src))` when the source isn't an iterator. +3. **Peel `each()` at macro time.** The benchmark source `each(arr)` reports as `iterator`, so the reserve from (2) wouldn't fire. The planner now detects `each()` where the inner expression has length and unwraps it — the emitted loop iterates the array directly. `for (it in arr)` and `for (it in each(arr))` yield the same element refs; the wrapper iterator is incidental in fold context. + +A fourth simplification dropped `emplace` from the emission entirely. emplace **moves** out of its argument and can corrupt the source when the projection returns a ref into it (e.g. `_._field`). The safe pattern is `push` for workhorse (cheap copy) and `push_clone` for non-workhorse (deep clone). No intermediate `var v = projection; emplace(v)` is needed in either case — the planner pushes the projection expression directly. ## Operator-coverage checklist (parity tests) diff --git a/benchmarks/sql/select_count.das b/benchmarks/sql/select_count.das new file mode 100644 index 000000000..84e225342 --- /dev/null +++ b/benchmarks/sql/select_count.das @@ -0,0 +1,75 @@ +options gen2 +options persistent_heap + +require _common public + +// _select |> count — projection followed by counter. The final count value doesn't depend +// on the projection, but plain LINQ `count(select(src, f))` still evaluates `f` per element +// so user-visible side effects fire. Phase-2A `_fold` matches that for impure projections: +// the counter lane binds the final projection to a discardable local per matched element +// (side effects preserved) and skips array materialization. Pure projections like +// `_.price * 2` over a source with a known length skip the loop and return `length(src)`. 
`_old_fold` lacks a +// [select, count] pattern in g_foldSeq so it falls to the default nested-pass form +// (pass_0 = select(...); count(pass_0)) — materializing the same way m3 does. + +def run_m1(b : B?; n : int) { + with_sqlite(":memory:") $(db) { + fixture_db(db, n) + b |> run("m1_sql/{n}", n) { + let c = _sql(db |> select_from(type) |> count()) + if (c == 0) { + b->failNow() + } + } + } +} + +def run_m3(b : B?; n : int) { + let arr <- fixture_array(n) + b |> run("m3_array/{n}", n) { + let c = arr |> _select(_.price * 2) |> count() + if (c == 0) { + b->failNow() + } + } +} + +def run_m3f_old(b : B?; n : int) { + let arr <- fixture_array(n) + b |> run("m3f_old_array_fold/{n}", n) { + let c = _old_fold(each(arr)._select(_.price * 2).count()) + if (c == 0) { + b->failNow() + } + } +} + +def run_m3f(b : B?; n : int) { + let arr <- fixture_array(n) + b |> run("m3f_array_fold/{n}", n) { + let c = _fold(each(arr)._select(_.price * 2).count()) + if (c == 0) { + b->failNow() + } + } +} + +[benchmark] +def select_count_m1(b : B?) { + run_m1(b, 100000) +} + +[benchmark] +def select_count_m3(b : B?) { + run_m3(b, 100000) +} + +[benchmark] +def select_count_m3f_old(b : B?) { + run_m3f_old(b, 100000) +} + +[benchmark] +def select_count_m3f(b : B?) { + run_m3f(b, 100000) +} diff --git a/daslib/linq_fold.das b/daslib/linq_fold.das index 8975aad43..39ebbde3f 100644 --- a/daslib/linq_fold.das +++ b/daslib/linq_fold.das @@ -522,6 +522,301 @@ def private fold_linq_default(var expr : Expression?; recursiveMacroName : strin return res } +[macro_function] +def private type_has_length(t : TypeDecl?) : bool { + // True for types where `length()` is statically resolvable: arrays, tables, + // strings, fixed-arrays (T[N]), and the range family. Lambdas (`def each(lam : + // lambda<...>)`) and custom user iterables are excluded — they have no length() + // overload and would make a macro-emitted `reserve(length(src))` fail to compile. 
+ if (t == null) return false + return (t.isGoodArrayType || t.isGoodTableType || t.isString + || t.isArray || t.isRange) +} + +[macro_function] +def private is_each_call(call : ExprCall?) : bool { + //! `each` in daslib/builtin.das is generic, so the resolved `func.name` on a typed + //! call is the mangled instance name (e.g. `builtin\`each\`30908...`). The generic's + //! original name lives in `func.fromGeneric.name`. Match either. + if (call == null || call.func == null) return false + return (call.func.name == "each" + || (call.func.fromGeneric != null && call.func.fromGeneric.name == "each")) +} + +[macro_function] +def private peel_each(var top : Expression?) : Expression? { + // Unwrap `each(<arr>)` to `<arr>` when `<arr>` is a true array (or fixed-size array). + // Iteration semantics are preserved: `for it in <arr>` implicitly re-wraps via the + // same `each` overload. We gate on array-ness because peeling an iterator-typed + // argument (e.g. `each(range(10))`, `each(generator())`) would put the iterator in + // place — the downstream length shortcut and reserve-by-length hints assume an + // indexable source. Only peel when we can prove that's true. + if (!(top is ExprCall)) return top + var topCall = top as ExprCall + if (!is_each_call(topCall) || topCall.arguments |> length != 1) return top + let argExpr = topCall.arguments[0] + if ((argExpr == null || argExpr._type == null) + || (!argExpr._type.isGoodArrayType && !argExpr._type.isArray)) return top + return clone_expression(argExpr) +} + +[macro_function] +def private plan_loop_or_count(var expr : Expression?) : Expression? { + // Phase-2A loop planner. Recognizes chains of shape `[where_*][select?]` (array lane) + // and `[where_*][select?] |> count` (counter lane). Fuses chained wheres into `&&` and + // chained selects via intermediate `var` bindings; emits one inline `invoke($block, $src)` + // with a plain for-loop. Returns null for anything else — caller falls through unfolded. 
+ var (top, calls) = flatten_linq(expr) + if (empty(calls)) return null + top = peel_each(top) + let lastName = calls.back()._1.name + if (lastName != "count" && lastName != "where_" && lastName != "select") return null + let counterLane = lastName == "count" + let intermediateCount = counterLane ? length(calls) - 1 : length(calls) + let at = calls[0]._0.at + let srcName = "`source`{at.line}`{at.column}" + let itName = "`it`{at.line}`{at.column}" + let accName = "`acc`{at.line}`{at.column}" + var whereCond : Expression? + var projection : Expression? + var intermediateBinds : array<Expression?> + var seenSelect = false + var allProjectionsPure = true + var elementType = clone_type(top._type.firstType) + var lastBindName = itName + for (i in 0 .. intermediateCount) { + var cll & = unsafe(calls[i]) + let opName = cll._1.name + if (opName == "where_") { + if (seenSelect) return null // where-after-select not in Phase 2A + var predicate = fold_linq_cond(cll._0.arguments[1], itName) + if (whereCond == null) { + whereCond = predicate + } else { + whereCond = qmacro($e(whereCond) && $e(predicate)) + } + } elif (opName == "select") { + // Chained selects: bind the previous projection to a fresh local now so the next + // lambda's `_` can be renamed straight to that name — avoids the + // ExprRef2Value-substitution trap that plain `Template.replaceVariable` hits when + // splicing a typed expression into another typed expression. Phase 2A only + // chains workhorse projections; a non-workhorse intermediate binding would need + // a clone (`:=`) since `<-` (move) can corrupt source for lvalue projections + // like `_._field`. Deferred to Phase 2B. 
+ if (projection != null) { + let prevWorkhorse = projection._type != null && projection._type.isWorkhorseType + if (!prevWorkhorse) return null // chained non-workhorse selects — Phase 2B + if (has_sideeffects(projection)) { + allProjectionsPure = false + } + let bindName = "`v`{at.line}`{at.column}`{length(intermediateBinds)}" + intermediateBinds |> push <| qmacro_expr() { + var $i(bindName) = $e(projection) + } + lastBindName = bindName + } + projection = fold_linq_cond(cll._0.arguments[1], lastBindName) + elementType = clone_type(cll._0._type.firstType) + seenSelect = true + } else { + return null + } + } + if (projection != null && has_sideeffects(projection)) { + allProjectionsPure = false + } + // Counter-lane shortcut: when there's no filter and every projection in the chain is + // pure, the count is simply `length(source)`. Skip the loop entirely — no per-element + // increments, no per-element side-effect evaluation. Gated on `type_has_length` so we + // only emit `length(src)` when it's statically resolvable. + if (counterLane && whereCond == null && allProjectionsPure + && type_has_length(top._type)) { + var topExpr = clone_expression(top) + topExpr.genFlags.alwaysSafe = true + var res = qmacro(invoke($($i(srcName) : typedecl($e(topExpr))) { + return length($i(srcName)) + }, $e(topExpr))) + res.force_at(at) + res.force_generated(true) + let blk = (res as ExprInvoke).arguments[0] as ExprMakeBlock + (blk._block as ExprBlock).arguments[0].flags.can_shadow = true + return res + } + // Build the per-element loop body. + var loopBody : Expression? + if (counterLane) { + // Counter lane must evaluate the projection (and any chained intermediates) per + // matched element so user-visible side effects fire — `count(select(src, f))` in + // plain LINQ invokes f per element, and our fold must match. Bind the final + // projection to a discardable local; daslang macro output bypasses LINT002. 
+ var sideEffectStmts : array<Expression?> + sideEffectStmts |> reserve(length(intermediateBinds) + 2) + for (b in intermediateBinds) { + sideEffectStmts |> push(b) + } + // Bind the final projection only when it might have side effects. Pure projections + // (the common case — `_._field * 2`) can be elided entirely; no need to rely on + // the optimizer to DCE a dead store afterwards. + if (projection != null && has_sideeffects(projection)) { + let finalBindName = "`vfinal`{at.line}`{at.column}" + sideEffectStmts |> push <| qmacro_expr() { + var $i(finalBindName) = $e(projection) + } + } + sideEffectStmts |> push <| qmacro_expr() { + $i(accName) ++ + } + var incBlock : Expression? + if (length(sideEffectStmts) == 1) { + incBlock = sideEffectStmts[0] + } else { + incBlock = qmacro_block() { + $b(sideEffectStmts) + } + } + if (whereCond != null) { + loopBody = qmacro_expr() { + if ($e(whereCond)) { + $e(incBlock) + } + } + } else { + loopBody = incBlock + } + } else { + // array lane + if (projection != null) { + // push for workhorse (cheap copy), push_clone for non-workhorse (deep clone, + // never mutates source). emplace would move out of the projection's value, + // which is unsafe when the projection returns a ref into the source. + // For chained selects, `intermediateBinds` carries N-1 prior bindings; splice + // them in before the push so each lambda body can resolve its renamed parameter + // to the correct binding name. + let workhorseProj = projection._type != null && projection._type.isWorkhorseType + var pushStmt : Expression? + if (workhorseProj) { + pushStmt = qmacro_expr() { + $i(accName) |> push($e(projection)) + } + } else { + pushStmt = qmacro_expr() { + $i(accName) |> push_clone($e(projection)) + } + } + var perElem : Expression? 
+ if (empty(intermediateBinds)) { + perElem = pushStmt + } else { + var perElemStmts : array<Expression?> + perElemStmts |> reserve(length(intermediateBinds) + 1) + for (b in intermediateBinds) { + perElemStmts |> push(b) + } + perElemStmts |> push(pushStmt) + perElem = qmacro_block() { + $b(perElemStmts) + } + } + if (whereCond != null) { + loopBody = qmacro_expr() { + if ($e(whereCond)) { + $e(perElem) + } + } + } else { + loopBody = perElem + } + } elif (whereCond != null) { + // Identity case (no projection): `it` aliases the source element. Workhorse + // types can `push` (cheap copy); non-workhorse needs `push_clone` to avoid + // mutating the source via a move. + let elemWorkhorse = elementType != null && elementType.isWorkhorseType + if (elemWorkhorse) { + loopBody = qmacro_expr() { + if ($e(whereCond)) { + $i(accName) |> push($i(itName)) + } + } + } else { + loopBody = qmacro_expr() { + if ($e(whereCond)) { + $i(accName) |> push_clone($i(itName)) + } + } + } + } else { + // identity chain — nothing to fuse; let the caller fall through. + return null + } + } + var topExpr = clone_expression(top) + topExpr.genFlags.alwaysSafe = true + var res : Expression? + // Pick the block-parameter typedecl modifier by source shape: + // - iterator (rvalue, e.g. `each(range(10))`) — strip `-const` so the body can + // consume the iterator. Without the strip, daslang's typer reports + // "can't iterate over const iterator". + // - container with length (array/table/string/range/fixed-array) — keep modifiers + // so a `const&` source (e.g. `let arr <-`) matches the param exactly. 
+ let topIsIter = top._type != null && top._type.isIterator + if (counterLane) { + if (topIsIter) { + res = qmacro(invoke($($i(srcName) : typedecl($e(topExpr)) - const) { + var $i(accName) = 0 + for ($i(itName) in $i(srcName)) { + $e(loopBody) + } + return $i(accName) + }, $e(topExpr))) + } else { + res = qmacro(invoke($($i(srcName) : typedecl($e(topExpr))) { + var $i(accName) = 0 + for ($i(itName) in $i(srcName)) { + $e(loopBody) + } + return $i(accName) + }, $e(topExpr))) + } + } else { + let isIter = expr._type.isIterator + // Pre-reserve the accumulator to the source's length when the source has a known + // length (array, table, range — anything that isn't an iterator). Avoids realloc + // walks during growth; matches what ExprArrayComprehension lowering does. + let sourceHasLength = type_has_length(top._type) + if (isIter) { + res = qmacro(invoke($($i(srcName) : typedecl($e(topExpr)) - const) { + var $i(accName) : array<$t(elementType)> + for ($i(itName) in $i(srcName)) { + $e(loopBody) + } + return <- $i(accName).to_sequence_move() + }, $e(topExpr))) + } elif (sourceHasLength) { + res = qmacro(invoke($($i(srcName) : typedecl($e(topExpr))) { + var $i(accName) : array<$t(elementType)> + $i(accName) |> reserve(length($i(srcName))) + for ($i(itName) in $i(srcName)) { + $e(loopBody) + } + return <- $i(accName) + }, $e(topExpr))) + } else { + res = qmacro(invoke($($i(srcName) : typedecl($e(topExpr)) - const) { + var $i(accName) : array<$t(elementType)> + for ($i(itName) in $i(srcName)) { + $e(loopBody) + } + return <- $i(accName) + }, $e(topExpr))) + } + } + res.force_at(at) + res.force_generated(true) + let blk = (res as ExprInvoke).arguments[0] as ExprMakeBlock + (blk._block as ExprBlock).arguments[0].flags.can_shadow = true + return res +} + [call_macro(name="_fold")] class private LinqFold : AstCallMacro { //! implements _fold(expression) that folds LINQ expressions into optimized sequnences @@ -534,12 +829,9 @@ class private LinqFold : AstCallMacro { //! 
Visits the _fold macro call and folds LINQ expressions into optimized sequences. macro_verify(call.arguments |> length == 1, prog, call.at, "expecting _fold(expression)") macro_verify(call.arguments[0]._type != null, prog, call.at, "expecting linq expression") - var res : Expression? = fold_linq_default(call.arguments[0], "_fold") - if (res == null) { - prog |> macro_error(call.at, "cannot fold LINQ expression\n{describe(call.arguments[0])}") - return res - } - return res + var res : Expression? = plan_loop_or_count(call.arguments[0]) + if (res != null) return res + return clone_expression(call.arguments[0]) } } diff --git a/daslib/macro_boost.das b/daslib/macro_boost.das index 02cb2923b..bac8e2c2f 100644 --- a/daslib/macro_boost.das +++ b/daslib/macro_boost.das @@ -149,3 +149,166 @@ def public collect_labels(expr : ExpressionPtr) { return <- res } +[macro_function] +def public has_sideeffects(expr : Expression?) : bool { + //! Conservative side-effect detection. Returns true when the expression has — or + //! might have — side effects. Returns false ONLY when provably pure (no function + //! calls, no heap allocation, no container mutation). + //! + //! Intended for macro-time elision of discardable evaluations. + //! Callers treat false as a promise; true is the safe default — when in doubt, true. + // null / compiler-tagged-pure / variable reads / constant literals — leaf, safe. 
+ if (expr == null || expr.flags.noSideEffects + || expr is ExprVar + || expr is ExprConstInt || expr is ExprConstInt8 || expr is ExprConstInt16 + || expr is ExprConstInt64 || expr is ExprConstUInt || expr is ExprConstUInt8 + || expr is ExprConstUInt16 || expr is ExprConstUInt64 || expr is ExprConstFloat + || expr is ExprConstDouble || expr is ExprConstBool || expr is ExprConstString + || expr is ExprConstPtr || expr is ExprConstRange || expr is ExprConstURange + || expr is ExprConstRange64 || expr is ExprConstURange64 + || expr is ExprConstEnumeration || expr is ExprConstBitfield) return false + // Member access — recurse into operand. + if (expr is ExprField) return has_sideeffects((expr as ExprField).value) + if (expr is ExprSafeField) return has_sideeffects((expr as ExprSafeField).value) + if (expr is ExprSwizzle) return has_sideeffects((expr as ExprSwizzle).value) + // Pointer / reference artifacts. + if (expr is ExprRef2Value) return has_sideeffects((expr as ExprRef2Value).subexpr) + if (expr is ExprRef2Ptr) return has_sideeffects((expr as ExprRef2Ptr).subexpr) + if (expr is ExprPtr2Ref) return has_sideeffects((expr as ExprPtr2Ref).subexpr) + if (expr is ExprAddr) return false + // Type / variant checks. + if (expr is ExprIs) return has_sideeffects((expr as ExprIs).subexpr) + if (expr is ExprIsVariant) return has_sideeffects((expr as ExprIsVariant).value) + if (expr is ExprAsVariant) return has_sideeffects((expr as ExprAsVariant).value) + if (expr is ExprSafeAsVariant) return has_sideeffects((expr as ExprSafeAsVariant).value) + // Cast — recurse. + if (expr is ExprCast) return has_sideeffects((expr as ExprCast).subexpr) + // Compile-time meta. + if (expr is ExprTypeInfo || expr is ExprTypeDecl || expr is ExprTag) return false + // Subscripts. + if (expr is ExprAt) { + let at_e = expr as ExprAt + // tables auto-insert on missing key — unsafe; arrays/strings safe (read-only). 
+ if (at_e.subexpr == null || at_e.subexpr._type == null + || at_e.subexpr._type.isGoodTableType) return true + return has_sideeffects(at_e.subexpr) || has_sideeffects(at_e.index) + } + if (expr is ExprSafeAt) { + let sat = expr as ExprSafeAt + return has_sideeffects(sat.subexpr) || has_sideeffects(sat.index) + } + // Null coalescing. + if (expr is ExprNullCoalescing) { + let nc = expr as ExprNullCoalescing + return has_sideeffects(nc.subexpr) || has_sideeffects(nc.defaultValue) + } + // String builder — string heap allocation is no-op by compiler; recurse into operands. + if (expr is ExprStringBuilder) { + let sb = expr as ExprStringBuilder + for (e in sb.elements) { + if (has_sideeffects(e)) return true + } + return false + } + // key_exists is a pure container read. + if (expr is ExprKeyExists) { + let ke = expr as ExprKeyExists + for (a in ke.arguments) { + if (has_sideeffects(a)) return true + } + return false + } + // Function-call-shaped expressions: ExprCall (regular call) and ExprOp1/ExprOp2/ExprOp3 + // (operators, which also resolve to a function). Two-layer check: + // + // 1. Mutation ops (`++`, `--`, `+=`, `-=`, …) are unconditionally unsafe — + // blacklisted up front, regardless of how the resolved builtin happens to be + // flagged. Catches builtins that the C++ side forgot to mark with + // `knownSideEffects`/`unsafeOperation`. + // 2. Trust `func.flags` when `func != null` — covers user-defined operator + // overloads (e.g. `struct Foo { def operator +(...) }`), which fall through + // `func_has_sideeffects` as non-builtin → unsafe. Fall back to the op-name + // allowlist only when `func == null` (typer left it unresolved, e.g. after + // partial constant folding). `/` and `%` stay UNSAFE (div-by-zero panic; + // design decision). + // + // `is`/`as` on handled types is EXACT-rtti (see CLAUDE.md), so each shape needs its + // own branch — can't cast ExprOp2 to ExprCallFunc even though the C++ class inherits. 
+ if (expr is ExprOp1) { + let e1 = expr as ExprOp1 + // func != null → trust func flags (catches user overloads); func == null → fall + // back to op-name allowlist (handles partial-folding artifacts). Mutation ops + // are unconditionally unsafe (in case a C++ builtin missed the side-effect flag). + if (is_mutation_op1(e1.op) + || (e1.func != null && func_has_sideeffects(e1.func)) + || (e1.func == null && !is_safe_op1(e1.op))) return true + return has_sideeffects(e1.subexpr) + } + if (expr is ExprOp2) { + let e2 = expr as ExprOp2 + if (is_mutation_op2(e2.op) || e2.op == "/" || e2.op == "%" + || (e2.func != null && func_has_sideeffects(e2.func)) + || (e2.func == null && !is_safe_op2(e2.op))) return true + return has_sideeffects(e2.left) || has_sideeffects(e2.right) + } + if (expr is ExprOp3) { + let e3 = expr as ExprOp3 + // ExprOp3 is the only ternary `?:` in daslang — pure if operands pure. + return has_sideeffects(e3.subexpr) || has_sideeffects(e3.left) || has_sideeffects(e3.right) + } + if (expr is ExprCall) { + let ec = expr as ExprCall + if (func_has_sideeffects(ec.func)) return true + for (a in ec.arguments) { + if (has_sideeffects(a)) return true + } + return false + } + // Default: unknown → unsafe. + return true +} + +[macro_function] +def private func_has_sideeffects(f : Function?) : bool { + //! True when calling `f` may have side effects. Allowlists builtins + //! (`flags.builtIn`) without `knownSideEffects` or `unsafeOperation`. + return (f == null || !f.flags.builtIn + || f.flags.knownSideEffects || f.flags.unsafeOperation) +} + +[macro_function] +def private is_safe_op1(op : das_string) : bool { + //! Unary operators that are pure on workhorse types — no overflow trap, no mutation. + //! Excludes `++` / `--` (handled by is_mutation_op1). + return op == "-" || op == "!" || op == "~" || op == "+" +} + +[macro_function] +def private is_safe_op2(op : das_string) : bool { + //! Binary operators that are pure on workhorse types. 
Excludes `/`, `%` (div-by-zero + //! panic — design decision) and all compound-assignment ops (handled by is_mutation_op2). + return (op == "+" || op == "-" || op == "*" + || op == "==" || op == "!=" || op == "<" || op == "<=" || op == ">" || op == ">=" + || op == "&" || op == "|" || op == "^" || op == "<<" || op == ">>" + || op == "&&" || op == "||") +} + +[macro_function] +def private is_mutation_op1(op : das_string) : bool { + //! Unary operators that mutate their operand. Unconditionally unsafe — bypasses any + //! flag check on the resolved builtin (in case the C++ side forgot to mark it). + //! `++` / `--` are prefix; `+++` / `---` are the daslang AST op-strings for postfix + //! increment/decrement (the trailing-plus / trailing-minus naming). + return op == "++" || op == "--" || op == "+++" || op == "---" +} + +[macro_function] +def private is_mutation_op2(op : das_string) : bool { + //! Compound-assignment operators (mutate the left operand). Same unconditional-unsafe + //! treatment as is_mutation_op1. + return (op == "+=" || op == "-=" || op == "*=" || op == "/=" || op == "%=" + || op == "&=" || op == "|=" || op == "^=" + || op == "<<=" || op == ">>=" + || op == "&&=" || op == "||=" || op == "^^=") +} + diff --git a/examples/graphics/furier_opengl_imgui_example.das b/examples/graphics/furier_opengl_imgui_example.das index 4063ccea0..e25f05f7c 100644 --- a/examples/graphics/furier_opengl_imgui_example.das +++ b/examples/graphics/furier_opengl_imgui_example.das @@ -1,15 +1,22 @@ options gen2 +options indenting = 4 options persistent_heap = true +options _allow_glfw_calls = true -require imgui_app -require glfw/glfw_boost -require imgui/imgui_boost +require imgui/imgui_harness require opengl/opengl_boost -require daslib/math_boost +require glfw/glfw_boost +require live/glfw_live +require imgui_app require daslib/safe_addr +require daslib/math_boost require daslib/strings -var window : GLFWwindow? 
+// Custom 3D + ImGui hybrid: the harness gives us the boost-v2 widget surface +// (window/edit_*/text) and the daslang theme via live_imgui_init, but we +// split harness_end_frame manually because our Fourier viz draws between +// the GL clear and ImGui's RenderDrawData. _allow_glfw_calls = true opts +// out of imgui_harness_lint for the OpenGL/GLFW calls we own. let NGRAPH = 1000 @@ -17,149 +24,124 @@ var { rotating : bool = true rps : float = 0.1f tt : float = 0.0f -// c0 c0 : float2 = float2(-0.2f, 0.05f) -// cp1, cn1 enable_1 : bool = true cp1 : float2 = float2(0.27f, 0.0f) cn1 : float2 = float2(0.0f, 0.10f) -// cp2, cn2 enable_2 : bool = false cp2 : float2 = float2(-0.07f, 0.08f) cn2 : float2 = float2(0.03f, -0.02f) } -def imgui_app(title : string; blk : block) { - if (glfwInit() == 0) { - panic("can't init glfw") - } - glfwInitOpenGL(3, 3, false, false) - window = glfwCreateWindow(1024, 1024, title, null, null) - if (window == null) { - panic("can't create window") - } - glfwMakeContextCurrent(window) - glfwSwapInterval(1) - CreateContext(null) - var io & = unsafe(GetIO()) - io.FontGlobalScale = 1.0 // BBATKIN: note - my monitor is HUGE - StyleColorsDark(null) - ImGui_ImplGlfw_InitForOpenGL(window, true) - ImGui_ImplOpenGL3_Init("#version 330") - var clear_color = float4(0.85f, 0.85f, 0.90f, 1.00f) - create_gl_objects() - while (glfwWindowShouldClose(window) == 0) { - glfwPollEvents() - ImGui_ImplOpenGL3_NewFrame() - ImGui_ImplGlfw_NewFrame() - invoke(blk) - var display_w, display_h : int - glfwGetFramebufferSize(window, display_w, display_h) - glViewport(0, 0, display_w, display_h) - glClearColor(clear_color.x, clear_color.y, clear_color.z, clear_color.w) - glClear(GL_COLOR_BUFFER_BIT) - let time = rotating ? 
glfwGetTime() * double(rps) % 1.0lf : double(tt) - let t = float(time) - // compute vectors - let p0 = c0 - let pp1 = mul_complex(cp1, rot_complex(1.0 * t * 2.0 * PI)) - let pn1 = mul_complex(cn1, rot_complex(-1.0 * t * 2.0 * PI)) - let pp2 = mul_complex(cp2, rot_complex(2.0 * t * 2.0 * PI)) - let pn2 = mul_complex(cn2, rot_complex(-2.0 * t * 2.0 * PI)) - // c0 - draw_arrow(float2(0.0f, 0.0f), p0, float3(1.0f, 0.0f, 0.0f)) - if (enable_1) { - // cp1 - var p = p0 - draw_circle(p, length(cp1), float3(0.0f, 0.0f, 0.0f)) - draw_arrow(p, pp1, float3(1.0f, 0.0f, 0.0f)) - p += pp1 - // cn1 - draw_circle(p, length(cn1), float3(0.0f, 0.0f, 0.0f)) - draw_arrow(p, pn1, float3(1.0f, 0.0f, 0.0f)) - p += pn1 - if (enable_2) { - // cp2 - draw_circle(p, length(cp2), float3(0.0f, 0.0f, 0.0f)) - draw_arrow(p, pp2, float3(1.0f, 0.0f, 0.0f)) - p += pp2 - // cn2 - draw_circle(p, length(cn2), float3(0.0f, 0.0f, 0.0f)) - draw_arrow(p, pn2, float3(1.0f, 0.0f, 0.0f)) - p += pn2 - } - } - // graph - draw_graph(float3(1.0f, 1.0f, 0.0f)) - // and done - ImGui_ImplOpenGL3_RenderDrawData(GetDrawData()) - glfwMakeContextCurrent(window) - glfwSwapBuffers(window) - } - ImGui_ImplOpenGL3_Shutdown() - ImGui_ImplGlfw_Shutdown() - DestroyContext(null) - glfwDestroyWindow(window) - glfwTerminate() -} - def angle(c : float2) { let C = normalize(c) return atan2(C.y, C.x) } -def format_angle(c : float2) { +def format_angle(c : float2) : string { return fmt(":.2f", angle(c) / PI) + "*PI" } -def format_length(c : float2) { +def format_length(c : float2) : string { return fmt(":.2f", length(c)) } -def editCoefficientsWindow(p_open : bool? 
implicit) { - if (!Begin("Setup vectors and circles", p_open, ImGuiWindowFlags.None)) { - End() - return +def edit_coefficients_window() { + window(SETUP_WIN, (text = "Setup vectors and circles", closable = false, + flags = ImGuiWindowFlags.None)) { + text("Speed of PHI, rotations per second") + edit_checkbox(safe_addr(rotating), (id = "ROTATING", text = "Rotating")) + edit_input_float(safe_addr(tt), (id = "TIME", text = "time", step = 0.1f)) + edit_input_float(safe_addr(rps), (id = "RPS", text = "RPS", step = 0.1f)) + separator(SEP_C0) + text("C0 A={format_length(c0)} PHI={format_angle(c0)}") + edit_input_float2(safe_addr(c0), (id = "C0", text = "C0 (x,y)")) + separator(SEP_C1) + edit_checkbox(safe_addr(enable_1), (id = "ENABLE_1", text = "Enable C1,C-1")) + if (enable_1) { + text("C1 A={format_length(cp1)} PHI={format_angle(cp1)}") + edit_input_float2(safe_addr(cp1), (id = "CP1", text = "C1 (x,y)")) + text("C-1 A={format_length(cn1)} PHI={format_angle(cn1)}") + edit_input_float2(safe_addr(cn1), (id = "CN1", text = "C-1 (x,y)")) + separator(SEP_C2) + edit_checkbox(safe_addr(enable_2), (id = "ENABLE_2", text = "Enable C2,C-2")) + if (enable_2) { + text("C2 A={format_length(cp2)} PHI={format_angle(cp2)}") + edit_input_float2(safe_addr(cp2), (id = "CP2", text = "C2 (x,y)")) + text("C-2 A={format_length(cn2)} PHI={format_angle(cn2)}") + edit_input_float2(safe_addr(cn2), (id = "CN2", text = "C-2 (x,y)")) + } + } } - Text("Speed of PHI, rotations per second") - Checkbox("Rotating", unsafe(addr(rotating))) - InputFloat("time", unsafe(addr(tt)), 0.1f) - InputFloat("RPS", unsafe(addr(rps)), 0.1f) - Separator() - Text("C0 A={format_length(c0)} PHI={format_angle(c0)}") - InputFloat("C0.x", unsafe(addr(c0.x)), 0.1f) - InputFloat("C0.y", unsafe(addr(c0.y)), 0.1f) - Separator() - Checkbox("Enable C1,C-1", unsafe(addr(enable_1))) +} + +def draw_fourier() { + let time = rotating ? 
glfwGetTime() * double(rps) % 1.0lf : double(tt) + let t = float(time) + let p0 = c0 + let pp1 = mul_complex(cp1, rot_complex(1.0 * t * 2.0 * PI)) + let pn1 = mul_complex(cn1, rot_complex(-1.0 * t * 2.0 * PI)) + let pp2 = mul_complex(cp2, rot_complex(2.0 * t * 2.0 * PI)) + let pn2 = mul_complex(cn2, rot_complex(-2.0 * t * 2.0 * PI)) + draw_arrow(float2(0.0f, 0.0f), p0, float3(1.0f, 0.0f, 0.0f)) if (enable_1) { - Text("C1 A={format_length(cp1)} PHI={format_angle(cp1)}") - InputFloat("C1.x", unsafe(addr(cp1.x)), 0.1f) - InputFloat("C1.y", unsafe(addr(cp1.y)), 0.1f) - Text("C-1 A={format_length(cn1)} PHI={format_angle(cn1)}") - InputFloat("C-1.x", unsafe(addr(cn1.x)), 0.1f) - InputFloat("C-1.y", unsafe(addr(cn1.y)), 0.1f) - Separator() - Checkbox("Enable C2,C-2", unsafe(addr(enable_2))) + var p = p0 + draw_circle(p, length(cp1), float3(0.0f, 0.0f, 0.0f)) + draw_arrow(p, pp1, float3(1.0f, 0.0f, 0.0f)) + p += pp1 + draw_circle(p, length(cn1), float3(0.0f, 0.0f, 0.0f)) + draw_arrow(p, pn1, float3(1.0f, 0.0f, 0.0f)) + p += pn1 if (enable_2) { - Text("C2 A={format_length(cp2)} PHI={format_angle(cp2)}") - InputFloat("C2.x", unsafe(addr(cp2.x)), 0.1f) - InputFloat("C2.y", unsafe(addr(cp2.y)), 0.1f) - Text("C-2 A={format_length(cn2)} PHI={format_angle(cn2)}") - InputFloat("C-2.x", unsafe(addr(cn2.x)), 0.1f) - InputFloat("C-2.y", unsafe(addr(cn2.y)), 0.1f) + draw_circle(p, length(cp2), float3(0.0f, 0.0f, 0.0f)) + draw_arrow(p, pp2, float3(1.0f, 0.0f, 0.0f)) + p += pp2 + draw_circle(p, length(cn2), float3(0.0f, 0.0f, 0.0f)) + draw_arrow(p, pn2, float3(1.0f, 0.0f, 0.0f)) + p += pn2 } } - End() + draw_graph(float3(1.0f, 1.0f, 0.0f)) +} + +[export] +def init() { + harness_init("Vectors & Circles", 1024, 1024) + create_gl_objects() +} + +[export] +def update() { + if (!harness_begin_frame()) return + harness_new_frame() + + edit_coefficients_window() + + var display_w, display_h : int + live_get_framebuffer_size(display_w, display_h) + glViewport(0, 0, display_w, display_h) + 
glClearColor(0.85f, 0.85f, 0.90f, 1.0f) + glClear(GL_COLOR_BUFFER_BIT) + draw_fourier() + + end_of_frame() + Render() + ImGui_ImplOpenGL3_RenderDrawData(GetDrawData()) + live_end_frame() } [export] -def main { - var f = 0.0 - imgui_app("Vectors & Circles") { - NewFrame() - editCoefficientsWindow(null) - Render() +def shutdown() { + harness_shutdown() +} + +[export] +def main() { + init() + while (!exit_requested()) { + update() } + shutdown() } var @in @location = 0 v_position : float2 @@ -214,7 +196,7 @@ def draw_arrow(origin : float2; c : float2; color : float3) { f_color = color v_offset = origin v_rot = c - v_scale = float2(1., 1.) + v_scale = float2(1.0f, 1.0f) vs_main_bind_uniform(program) fs_main_bind_uniform(program) glBindVertexArray(vao_arrow) @@ -260,16 +242,16 @@ def create_gl_objects { glBindVertexArray(vao_arrow) glGenBuffers(1, safe_addr(vbo_arrow)) glBindBuffer(GL_ARRAY_BUFFER, vbo_arrow) - var vertices <- [Vertex( - xy = float2(0.0f, 0.0f)), Vertex( - xy = float2(1.0f, 0.0f)), Vertex( - xy = float2(0.95f, 0.025f)), Vertex( - xy = float2(0.95f, -0.025f)), Vertex( - xy = float2(1.0f, 0.0f) - )] + var vertices <- [ + Vertex(xy = float2(0.0f, 0.0f)), + Vertex(xy = float2(1.0f, 0.0f)), + Vertex(xy = float2(0.95f, 0.025f)), + Vertex(xy = float2(0.95f, -0.025f)), + Vertex(xy = float2(1.0f, 0.0f)) + ] glBufferData(GL_ARRAY_BUFFER, vertices, GL_STATIC_DRAW) bind_vertex_buffer(null, type) - // graph + // graph glGenVertexArrays(1, safe_addr(vao_graph)) glBindVertexArray(vao_graph) glGenBuffers(1, safe_addr(vbo_graph)) @@ -280,7 +262,6 @@ def create_gl_objects { } def mul_complex(a, b : float2) { - // (a.x+i*a.y)*(b.x+i*b.y) = a.x*b.x - a.y*b.y + i*(a.x*b.y + a.y*b.x) return float2(a.x * b.x - a.y * b.y, a.x * b.y + a.y * b.x) } @@ -288,7 +269,7 @@ def rot_complex(phi : float) { return float2(cos(phi), sin(phi)) } -def compute_fn(t : float) {// 0..1 +def compute_fn(t : float) { let p0 = c0 let pp1 = mul_complex(cp1, rot_complex(1.0 * t * 2.0 * PI)) let pn1 = 
mul_complex(cn1, rot_complex(-1.0 * t * 2.0 * PI)) @@ -312,4 +293,4 @@ def compute_graph { vertices |> push(Vertex(xy = p)) } glBufferData(GL_ARRAY_BUFFER, vertices, GL_DYNAMIC_DRAW) -} \ No newline at end of file +} diff --git a/mouse-data/docs/are-there-parity-tests-in-tests-linq-that-compare-fold-output-to-the-underlying-linq-operators.md b/mouse-data/docs/are-there-parity-tests-in-tests-linq-that-compare-fold-output-to-the-underlying-linq-operators.md new file mode 100644 index 000000000..551cbb2b0 --- /dev/null +++ b/mouse-data/docs/are-there-parity-tests-in-tests-linq-that-compare-fold-output-to-the-underlying-linq-operators.md @@ -0,0 +1,37 @@ +--- +slug: are-there-parity-tests-in-tests-linq-that-compare-fold-output-to-the-underlying-linq-operators +title: Are there parity tests in tests/linq/ that compare `_fold` output to the underlying linq operators? +created: 2026-05-16 +last_verified: 2026-05-16 +links: [] +--- + +There's no file named "parity" or similar. The parity-test surface IS the broader [tests/linq/](tests/linq/) directory: + +- `test_linq.das` — comprehension basics +- `test_linq_aggregation.das` — count/sum/min/max/avg +- `test_linq_querying.das` — any/all/contains +- `test_linq_transform.das` — select / select_many / zip +- `test_linq_sorting.das` — order / reverse +- `test_linq_group_by.das` — group_by / having +- `test_linq_join.das` — joins +- `test_linq_partition.das` — take / skip / chunk / take_while / skip_while +- `test_linq_set.das` — distinct / union / except / intersect / unique +- `test_linq_element.das` — first / last / single / element_at +- `test_linq_concat.das` — concat / prepend / append +- `test_linq_generation.das` — range / repeat +- `test_linq_bugs.das` — regressions + +Each file uses `[test]` functions with `t |> run("name") @(t) { ... }` blocks asserting `t |> equal(actual, expected)`. These exercise the regular linq operators (`where_`, `select`, `count`, ...) 
directly — they're not split into "fold-on" vs "fold-off" variants. + +Dedicated `_fold` tests live in `test_linq_fold.das` (functional output) and `test_linq_fold_ast.das` (AST-shape verification — pattern-matches the macro expansion). These DO compare `_fold(chain)` output against the plain `chain` output for the shapes the macro recognizes. + +When the user says "parity tests" in linq context, treat the full `test_linq_*.das` suite as the operator-coverage map. Phase-2+ benchmark/splice PRs should add a `benchmarks/sql/` entry for each shape exercised here that isn't already covered (tracked as a checklist in `benchmarks/sql/LINQ.md`). + +## Questions +- Are there parity tests in tests/linq/ that compare `_fold` output to the underlying linq operators? +- What's the "parity test" coverage surface for linq? +- Where are tests for linq operators? + +## Questions +- Are there parity tests in tests/linq/ that compare `_fold` output to the underlying linq operators? diff --git a/mouse-data/docs/cpp-profiler-macos-samply-instruments.md b/mouse-data/docs/cpp-profiler-macos-samply-instruments.md new file mode 100644 index 000000000..db57cabc9 --- /dev/null +++ b/mouse-data/docs/cpp-profiler-macos-samply-instruments.md @@ -0,0 +1,68 @@ +--- +slug: cpp-profiler-macos-samply-instruments +title: What C++ sampling profiler should I use on macOS for daslang (and how do I run it)? +created: 2026-05-16 +last_verified: 2026-05-16 +links: [] +--- + +# C++ sampling profiler on macOS (Apple Silicon) + +VS Code has **no first-class C++ profiler integration on macOS** — the "Performance Profiler" / similar extensions wrap Linux `perf` and don't help here. Skip them. Run a sampler from the integrated terminal and view results in browser/Instruments. + +## samply (default choice) + +Rust-built, Firefox-Profiler frontend, zero config. + +```bash +cargo install samply +samply record ./build/daslang script.das +``` + +- Opens flamegraph in browser automatically. 
+- Symbolicates Mach-O cleanly if you build `-DCMAKE_BUILD_TYPE=RelWithDebInfo` (do NOT use plain `Release` — symbols are stripped). +- Works without sudo on Apple Silicon. +- Good for "where does the CPU go" questions. + +## Xcode Instruments — Time Profiler (second opinion) + +Native macOS sampler, kernel-assisted, best symbolication on Apple Silicon. Use when samply's view is ambiguous or you want call-tree + timeline together. + +```bash +xcrun xctrace record --template 'Time Profiler' --launch -- ./build/daslang script.das +``` + +Then open the resulting `.trace` bundle (Instruments launches). UI is outside VS Code. + +## daslang-specific recipe + +Pair the sampler with the per-module compile-time logging (`-log-compile-time` CLI flag, added on branch `bbatkin/log-compile-time-cli`): + +```bash +cmake --build build --config RelWithDebInfo -j 64 +samply record ./build/daslang -log-compile-time path/to/script.das +``` + +- `-log-compile-time` tells you which module is slow. +- Sampling tells you which function inside that module is hot. +- Together they narrow "compile is slow" to a specific phase + symbol. + +## What NOT to use + +- `perf` — Linux only, doesn't exist on Darwin. +- Intel VTune — x86-mostly, ignore on Apple Silicon. +- `gprof` — instrumenting, not sampling; ancient. +- VS Code C++ profiler extensions — see above, all are Linux/perf wrappers or toys. +- `hyperfine` / `poop` — benchmarking (whole-program timing), not profiling (per-function hotspots). Different question. + +## Build flag reminder + +Both samply and Instruments need symbols. The two viable build types on this repo: + +- `RelWithDebInfo` — fast code + symbols. Use this for profiling. +- `Debug` — slow code; profile reflects debug overhead, not real hotspots. Avoid. + +Plain `Release` strips symbols and you'll get `???` everywhere in the flamegraph. + +## Questions +- What C++ sampling profiler should I use on macOS for daslang (and how do I run it)? 
diff --git a/mouse-data/docs/daslang-generic-instance-detect-via-fromgeneric.md b/mouse-data/docs/daslang-generic-instance-detect-via-fromgeneric.md new file mode 100644 index 000000000..922061d8e --- /dev/null +++ b/mouse-data/docs/daslang-generic-instance-detect-via-fromgeneric.md @@ -0,0 +1,33 @@ +--- +slug: daslang-generic-instance-detect-via-fromgeneric +title: How do I detect that an ExprCall is to a daslang generic (e.g. each, length, find) when func.name is the mangled instance name and not the original generic's name? +created: 2026-05-16 +last_verified: 2026-05-16 +links: [] +--- + +When a daslang generic function (`def each(a : array) : iterator`, `def length(a : auto | #) : int`, etc.) is resolved against a concrete type at infer time, the resolved `Function?` instance gets a **mangled name** like `` `builtin`each`30908`12345 ``. Macro code that compares `call.func.name == "each"` will never match a typed instance. + +The original generic's identity lives in `call.func.fromGeneric`: + +```das +[macro_function] +def private is_each_call(call : ExprCall?) : bool { + if (call == null || call.func == null) return false + return (call.func.name == "each" + || (call.func.fromGeneric != null && call.func.fromGeneric.name == "each")) +} +``` + +The `name == "each"` branch covers the unusual case where you see the call before the typer has specialized it (e.g. inside a custom call_macro that runs early). The `fromGeneric.name` branch is the normal case for any post-infer chain. + +**When this bites:** writing a `[macro_function]` that pattern-matches on a stdlib helper by name — `each`, `length`, `key_exists`, `find`, `set_insert`, all the generic `to_array`/`to_table` variants. Without the `fromGeneric` check, every typed chain silently falls through your match and your macro behaves as if the helper wasn't there. + +**Generalizes beyond function calls:** same applies to method overload resolution. 
`call.func.fromGeneric` is the canonical "which generic was this instantiated from?" link. There's no `originalName` field — the chain is `func → func.fromGeneric → fromGeneric.name`. + +**Doesn't apply to:** C++ builtins from `addExtern<>` (no fromGeneric, the `func.name` is the bound name directly). Builtins also have `func.flags.builtIn = true` if you need to distinguish. + +See [[my-fold-macro-emits-a-loop-with-for-it-in-source-acc-reserve-length-source-but-the-reserve-doesn-t-fire-when-the-chain-starts-wi]] for the concrete case where this broke `peel_each` in `daslib/linq_fold.das`. + +## Questions +- How do I detect that an ExprCall is to a daslang generic (e.g. each, length, find) when func.name is the mangled instance name and not the original generic's name? diff --git a/mouse-data/docs/daslib-macro-boost-has-sideeffects-predicate.md b/mouse-data/docs/daslib-macro-boost-has-sideeffects-predicate.md new file mode 100644 index 000000000..5d551d415 --- /dev/null +++ b/mouse-data/docs/daslib-macro-boost-has-sideeffects-predicate.md @@ -0,0 +1,43 @@ +--- +slug: daslib-macro-boost-has-sideeffects-predicate +title: Is there a conservative side-effect detector for Expression nodes in daslib macro_boost — something I can call from a call_macro to know if it's safe to elide an evaluation at macro time? +created: 2026-05-16 +last_verified: 2026-05-16 +links: [] +--- + +Yes — `has_sideeffects(expr : Expression?) : bool` in `daslib/macro_boost` (added in PR #2691, follow-up to Phase 2A loop planner). Returns `true` if the expression has or **might have** side effects; `false` ONLY when provably pure. + +```das +require daslib/macro_boost public + +if (has_sideeffects(projection)) { + // Emit the bind — projection must run for its side effects. + sideEffectStmts |> push <| qmacro_expr() { + var $i(finalBindName) = $e(projection) + } +} else { + // Skip the bind — pure projection, no observable effect. 
+} +``` + +**Conservative — false is a promise:** + +- SAFE leaves: `ExprVar`, all `ExprConst*`, `ExprAddr`, `ExprTypeInfo/Decl/Tag`. +- SAFE via recursion: `ExprField`, `ExprSafeField`, `ExprSwizzle`, `ExprRef2Value/Ptr`, `ExprPtr2Ref`, `ExprIs`, `ExprAsVariant`, `ExprIsVariant`, `ExprSafeAsVariant`, `ExprCast`, `ExprNullCoalescing`, `ExprStringBuilder` (string heap is no-op per compiler), `ExprKeyExists` (pure container read). +- `ExprAt`: safe when `subexpr._type` is NOT `isGoodTableType` (tables auto-insert default on missing key — a write). `ExprSafeAt` (`?[...]`) always safe. +- `ExprOp1/Op2/Op3`: op-name allowlist for pure ops on workhorse types — `+ - * == != < <= > >= & | ^ << >> && || ?:` (Op2), `- ! ~ +` (Op1). Falls back to `func.flags.builtIn && !knownSideEffects && !unsafeOperation`. `/` and `%` BLACKLISTED (div-by-zero panic). +- `ExprCall`/`ExprCallFunc`: allowed when `func.flags.builtIn && !knownSideEffects && !unsafeOperation`, then recurse args. +- Everything else (including `ExprNew`, all `ExprMake*`, user-defined calls, `ExprInvoke`, `ExprYield`, statement-context exprs): UNSAFE. + +**Known limitations / when it returns conservative-unsafe:** + +- daslang-generic helpers like `length(arr)` and `key_exists(tab, k)` — the resolved `func.name` is the mangled instance, and the typer hasn't always reached the `flags.builtIn=true` C++ overload before the call_macro fires. They show up as user-call shapes and get rejected. Workaround: don't rely on this for length/key_exists in projections (they appear in `has_sideeffects` tests as `target_generic_length_unresolved` / `target_key_exists_unresolved` returning `true`). +- User-defined pure helpers — there's no `[no_side_effects]` annotation yet. The compiler's `expr.flags.noSideEffects` fast path catches some cases (set during infer), but anything the typer didn't tag falls through to UNSAFE. 
+ +**Tests:** `tests/macro_boost/test_has_sideeffects.das` has 24 cases (17 safe + 5 unsafe + 2 conservative-unsafe) wired via a `_test_has_sideeffects` probe `call_macro` ([`tests/macro_boost/_has_sideeffects_probe.das`](../../tests/macro_boost/_has_sideeffects_probe.das)) that runs the predicate at macro time and emits `ExprConstBool` of the result. Use the same probe pattern when testing any new predicate that needs to run at macro time but be exercised via runtime tests. + +**Real use:** `daslib/linq_fold.das` `plan_loop_or_count` uses it for three optimizations: discardable `var vfinal =` bind elision, count→length shortcut gate (whole loop elided when no filter + all projections pure + source has length), and tracking `allProjectionsPure` across chained selects. select_count benchmark went from 2 → 0 ns/op. + +## Questions +- Is there a conservative side-effect detector for Expression nodes in daslib macro_boost — something I can call from a call_macro to know if it's safe to elide an evaluation at macro time? diff --git a/mouse-data/docs/how-do-i-call-a-dasimgui-or-any-managed-c-method-on-a-struct-field-that-s-bound-as-a-raw-pointer-e-g-addfontfromfilettf-on-getio.md b/mouse-data/docs/how-do-i-call-a-dasimgui-or-any-managed-c-method-on-a-struct-field-that-s-bound-as-a-raw-pointer-e-g-addfontfromfilettf-on-getio.md new file mode 100644 index 000000000..450226dba --- /dev/null +++ b/mouse-data/docs/how-do-i-call-a-dasimgui-or-any-managed-c-method-on-a-struct-field-that-s-bound-as-a-raw-pointer-e-g-addfontfromfilettf-on-getio.md @@ -0,0 +1,66 @@ +--- +slug: how-do-i-call-a-dasimgui-or-any-managed-c-method-on-a-struct-field-that-s-bound-as-a-raw-pointer-e-g-addfontfromfilettf-on-getio +title: How do I call a dasImgui (or any managed C++) method on a struct field that's bound as a raw pointer — e.g. AddFontFromFileTTF on GetIO().Fonts? 
+created: 2026-05-16 +last_verified: 2026-05-16 +links: [] +--- + +## TL;DR + +When a managed struct's field is bound as a pointer (`T?`) and the method on that pointed-to struct expects the value by-ref (`T implicit`), you must explicitly **dereference**. Plain `field |> method(...)` errors with mismatched types. + +## The error you'll hit + +``` +error[30341]: no matching functions or generics: AddFontFromFileTTF(imgui::ImFontAtlas?&, string const&, ...) + candidate function: ImFontAtlas implicit ... + invalid argument 'self' (0). expecting 'imgui::ImFontAtlas implicit', passing 'imgui::ImFontAtlas?&' +``` + +The `?` is the giveaway — `GetIO().Fonts` is `ImFontAtlas?` (raw pointer; field bound via `addField` against C++ `ImFontAtlas* Fonts`), but the method binding `das_call_member< ImFont * (ImFontAtlas::*)(...) >` takes the receiver by-value/ref. + +## The fix + +Bind a local ref through `unsafe(*ptr)`, then call as usual: + +```daslang +var atlas & = unsafe(*GetIO().Fonts) +let f = atlas |> AddFontFromFileTTF(ttf, 14.0f, null, null) +``` + +Equivalent inline form: + +```daslang +unsafe(*GetIO().Fonts) |> AddFontFromFileTTF(ttf, 14.0f, null, null) +``` + +## Why each part + +- **`*ptr`** is daslang's pointer-deref syntax (see `daslib/if_not_null.rst`: *"a dereferenced call: ``if (ptr != null) { call(*ptr, args) }``"*). The alternative `deref(ptr)` exists too but is rarer in modules; `*` is the idiom. +- **`unsafe(...)`** is required because dereferencing a raw `T?` is unsafe (no null check, no lifetime guarantee). +- **`var atlas &`** binds a local *reference* — without `&` you'd be copying the whole `ImFontAtlas` struct into a stack temporary, which (a) wastes memory and (b) means any mutation the method does (font atlas builds, glyph rasterization) hits the copy and is lost. +- **The pipe `|>` works fine on the local ref** — `atlas |> method(x, y)` desugars to `method(atlas, x, y)` and the `implicit` first-param accepts the ref directly. 
+ +## Why NOT the other shapes + +- `GetIO().Fonts.AddFontFromFileTTF(...)` — `.method()` sugar is sugar for `method(self, ...)` only when `self` is a struct value. CLAUDE.md explicitly: *"Does NOT work on: primitives, tuples/arrays, and lambda typedefs"* — and (this case) raw pointers. Field *access* on a pointer auto-derefs (`GetIO().Fonts.TexID` works); method dispatch does not. +- `GetIO().Fonts->AddFontFromFileTTF(...)` — `->` is for class instances (smart_ptr / class types), not raw C-struct pointers from `ManagedStructureAnnotation`. +- `deref(GetIO().Fonts) |> AddFontFromFileTTF(...)` — works but the pipe gets a temporary value not a ref; mutations on the receiver disappear. Use `var x & = unsafe(*p)` instead. + +## When this comes up + +Anywhere a C++ binding exposes a struct field as `T*` (typical for "owns-an-atlas" or "owns-a-context" patterns): +- `ImGuiIO::Fonts` → `ImFontAtlas?` +- `ImDrawData::CmdLists` → indirection on lists +- anything bound via raw `addField` where the C++ type is `Foo*` + +If the C++ field were a value (`ImFontAtlas Fonts;` instead of `ImFontAtlas* Fonts;`), it'd bind as the struct directly and the pipe would just work. + +## Related + +- [[dasimgui-new-state-struct-widget-auto-emit-just-works]] — different topic (state-struct registration) but same module family. +- [[how-do-i-pack-an-im-col32-color-from-dasimgui-v2-code-without-depending-on-the-v1-daslib-imgui-boost-path]] — sibling dasImgui idiom. + +## Questions +- How do I call a dasImgui (or any managed C++) method on a struct field that's bound as a raw pointer — e.g. AddFontFromFileTTF on GetIO().Fonts? 
diff --git a/mouse-data/docs/how-do-i-run-dastest-in-benchmark-only-mode-and-what-s-the-command-line-syntax.md b/mouse-data/docs/how-do-i-run-dastest-in-benchmark-only-mode-and-what-s-the-command-line-syntax.md new file mode 100644 index 000000000..014873dae --- /dev/null +++ b/mouse-data/docs/how-do-i-run-dastest-in-benchmark-only-mode-and-what-s-the-command-line-syntax.md @@ -0,0 +1,45 @@ +--- +slug: how-do-i-run-dastest-in-benchmark-only-mode-and-what-s-the-command-line-syntax +title: How do I run dastest in benchmark-only mode and what's the command-line syntax? +created: 2026-05-16 +last_verified: 2026-05-16 +links: [] +--- + +Benchmarks are functions annotated with `[benchmark]` from `dastest/testing_boost.das`. Run them via the dastest harness with `--bench`: + +```bash +# All benchmarks in a directory (skip the regular tests) +./bin/daslang dastest/dastest.das -- --bench --test benchmarks/sql --test-names none + +# Just one file +./bin/daslang dastest/dastest.das -- --bench --test benchmarks/sql/count_aggregate.das --test-names none + +# Filter by [benchmark] function-name prefix (substring match on the function name) +./bin/daslang dastest/dastest.das -- --bench --bench-names sum_ --test benchmarks/sql --test-names none + +# Collect N samples for variance / averaging +./bin/daslang dastest/dastest.das -- --bench --test benchmarks/sql/count_aggregate.das --test-names none --count 5 +``` + +Key flags: +- `--bench` — enable benchmark execution +- `--test ` — folder or single file (NOT positional) +- `--test-names none` — skip regular `[test]` discovery (benchmarks only) +- `--bench-names ` — filter benchmarks by function-name prefix +- `--bench-format ` — output format +- `--count ` — repeat all benchmarks N times + +Benchmarks only run after all module **tests** have passed; that's why `--test-names none` is the canonical "skip tests, run benchmarks" combo. + +Output is ` N ns/op /op /op /op /op`. 
If the benchmark `b |> run(name, chunk_size, op)` form passes a chunk_size (typically the dataset size), the displayed ns/op is **divided by that chunk_size** — i.e. per-element time, not per-op-call time. Sub-nanosecond results (`0 ns/op`) usually mean early-exit hit the answer in O(1) regardless of dataset size. + +Reference: `dastest/README.md` and `dastest/dastest_clargs.das`. + +## Questions +- How do I run dastest in benchmark-only mode and what's the command-line syntax? +- What's the dastest --bench command line? +- How do I filter dastest benchmarks by name? + +## Questions +- How do I run dastest in benchmark-only mode and what's the command-line syntax? diff --git a/mouse-data/docs/imgui-harness-headless-timeout-sec-cascade-guard.md b/mouse-data/docs/imgui-harness-headless-timeout-sec-cascade-guard.md new file mode 100644 index 000000000..cbe4a813a --- /dev/null +++ b/mouse-data/docs/imgui-harness-headless-timeout-sec-cascade-guard.md @@ -0,0 +1,57 @@ +--- +slug: imgui-harness-headless-timeout-sec-cascade-guard +title: How do I add a wall-clock self-exit timer to a daslang-live harness so a panicked test doesn't leave a zombie subprocess on the live-API port? +created: 2026-05-16 +last_verified: 2026-05-16 +links: [] +--- + +**Problem**: daslang's `finally` block is skipped on panic. A test that does `with_imgui_app(F) $(d) { ... expect_value(...) ... }` and panics inside `expect_value`'s timeout never reaches the `/shutdown` POST that `with_imgui_app` would have sent in cleanup. The spawned `daslang-live` subprocess keeps running, holding port 9090. The next test that spawns errors out with "another instance of daslang-live is already running" (macOS) or hangs at drain for the full popen watchdog (Ubuntu) because its `/status` polls hit the zombie instead. + +**Fix shape**: give the subprocess a wall-clock self-exit timer. 
Even if the parent never sends `/shutdown`, the script self-exits before the popen parent gives up, the port releases, and the next test starts cleanly. + +**Implementation in dasImgui PR #38** (`widgets/imgui_harness.das` + `widgets/imgui_playwright.das`): + +1. New CLI flag `--headless-timeout-sec=N` parsed via clargs alongside `--headless-frames=N`: + ```daslang + let raw_timeout = find_flag_raw_value(args, "--headless-timeout-sec") + g_headless_max_uptime_sec = to_float(raw_timeout |> unwrap_or("0")) + ``` + Default 0 = disabled (preserves standalone `daslang.exe foo.das -- --headless` usage). + +2. Wall-clock check inside `harness_begin_frame()`, right next to the existing `--headless-frames` cap: + ```daslang + let now = get_uptime() + if (g_headless_first_uptime < 0.0f) { + g_headless_first_uptime = now + } + if (g_headless_max_uptime_sec > 0.0f && (now - g_headless_first_uptime) >= g_headless_max_uptime_sec) { + print("[harness] headless timeout {g_headless_max_uptime_sec}s reached at uptime {now - g_headless_first_uptime}s — request_exit()\n") + request_exit() + return false + } + ``` + The print is the **only** log line kept in the cleaned-up harness — it fires at most once per subprocess and only when the safety net actually trips. Healthy runs are silent. + +3. Playwright's `with_imgui_app_opt` appends `--headless-timeout-sec=(test_timeout_sec - 5)` to the spawned argv whenever `--headless` is forwarded. The −5 s margin gives the script time to finish the current frame, run `shutdown()`, and close the live-API port before the parent's popen watchdog fires: + ```daslang + if (playwright_wants_headless()) { + argv |> push("--") + argv |> push("--headless") + let harness_budget = test_timeout_sec - 5.0f + if (harness_budget > 5.0f) { + argv |> push("--headless-timeout-sec={harness_budget}") + } + } + ``` + +**Why it works on the cascade**: even when `expect_value` panics inside the body block, the daslang-live subprocess continues running its update loop. 
The next `harness_begin_frame()` call (called every frame from `update()`) notices `uptime > budget`, calls `request_exit()`, and the main loop terminates cleanly via `while (!exit_requested())`. `shutdown()` runs, port 9090 releases, popen's parent reads EOF, exit code is 0 — no cascade for the next test. + +**Sizing rule**: set timeout less than the popen watchdog by enough margin to cover one frame + shutdown. 5 seconds is comfortable. If the popen watchdog is 120 s, harness timeout = 115 s. + +**Limitation**: only fires from `harness_begin_frame`. If the script's `update()` is stuck inside something that never returns to the main loop, harness timeout never gets a chance. This is the right tradeoff — a stuck `update()` is a different bug class (real deadlock), and popen still kills the process at its watchdog. + +**Cascade-guard pattern generalizes**: any long-running subprocess that owns a port (HTTP server, RPC endpoint, anything) and is spawned for a bounded test/check should have a wall-clock self-exit set slightly below the parent's kill-timeout. Cleanup-via-script always beats cleanup-via-SIGKILL. + +## Questions +- How do I add a wall-clock self-exit timer to a daslang-live harness so a panicked test doesn't leave a zombie subprocess on the live-API port? diff --git a/mouse-data/docs/imgui-macos-configmacosxbehaviors-shortcut-is-super-not-ctrl.md b/mouse-data/docs/imgui-macos-configmacosxbehaviors-shortcut-is-super-not-ctrl.md new file mode 100644 index 000000000..44a09202b --- /dev/null +++ b/mouse-data/docs/imgui-macos-configmacosxbehaviors-shortcut-is-super-not-ctrl.md @@ -0,0 +1,39 @@ +--- +slug: imgui-macos-configmacosxbehaviors-shortcut-is-super-not-ctrl +title: Why does synth Ctrl+A do nothing on macOS but works on Linux/Windows in an ImGui InputText test? +created: 2026-05-16 +last_verified: 2026-05-16 +links: [] +--- + +**ImGui sets `io.ConfigMacOSXBehaviors = true` automatically on macOS** (`#ifdef __APPLE__` in `ImGuiIO` ctor). 
When that flag is true, the **"shortcut key" is Super (Cmd), not Ctrl**. + +So on macOS: +- `Ctrl+A` → "move to start of line" (text-editing convention) +- `Cmd+A` → "select all" +- `Ctrl+Backspace` → does not delete word +- `Cmd+Backspace` → "delete to start of line" + +A synth-IO test that drives `Ctrl+A` to clear an InputText buffer **fails silently on macOS** — the assertion message will look like "NAME_INPUT.value == "" after Ctrl+A then Backspace" with the buffer still holding `"abc"` (because Ctrl+A moved cursor to start, then Backspace deleted nothing). + +**Detection from a playwright test**: snapshot `io.config_macos_behaviors` and branch the chord. dasImgui's `widgets/imgui_boost_runtime.das` exposes the flag via the `io_jv()` snapshot key `"config_macos_behaviors"`: + +```daslang +let macos = snap0?["io"]?["config_macos_behaviors"] ?? false +if (macos) { + post_command(d, "imgui_key_chord", JV((mods = ["Super"], key = "A"))) +} else { + post_command(d, "imgui_key_chord", JV((mods = ["Ctrl"], key = "A"))) +} +``` + +**Why we don't unconditionally use Super**: on Linux/Windows there's no Super-key binding for select-all in InputText; the wrong choice silently fails on those platforms instead. Branch on the actual `config_macos_behaviors` flag — works everywhere. + +Discovered in dasImgui PR #38 `test_click_then_ctrl_a_clears_input` failing only on macOS CI after the playwright cascade fix made other failures observable. Copilot diagnosed and fixed in commit `42b7292` + the IO-snapshot extension. Caveat: Copilot drafted the fix with `if macos { ... }` (gen1 syntax) — gen2 requires `if (macos) { ... }`, watch for that pattern when ferrying AI suggestions through. + +**Related ImGui flags worth knowing**: +- `io.KeyCtrl` reflects either Ctrl OR (Cmd on macOS when `ConfigMacOSXBehaviors`) — the "shortcut" lookups go through `io.KeySuper` instead on macOS. 
+- `ImGui::Shortcut(ImGuiMod_Ctrl | ImGuiKey_A)` automatically maps to Cmd on macOS — but `Shortcut(ImGuiMod_Super | ImGuiKey_A)` does NOT remap, only `ImGuiMod_Ctrl` is the platform-aware "primary modifier". + +## Questions +- Why does synth Ctrl+A do nothing on macOS but works on Linux/Windows in an ImGui InputText test? diff --git a/mouse-data/docs/imgui-playwright-windows-ci-16-post-libhv-stall.md b/mouse-data/docs/imgui-playwright-windows-ci-16-post-libhv-stall.md new file mode 100644 index 000000000..0b5afc966 --- /dev/null +++ b/mouse-data/docs/imgui-playwright-windows-ci-16-post-libhv-stall.md @@ -0,0 +1,38 @@ +--- +slug: imgui-playwright-windows-ci-16-post-libhv-stall +title: Why do imgui playwright tests hang at 120 seconds on Windows CI when they pass locally and on POSIX? +created: 2026-05-16 +last_verified: 2026-05-16 +links: [] +--- + +**dasHV's libhv build on Windows CI stalls after exactly 16 POST /command connections per subprocess.** Local Windows works fine; only the GitHub-hosted `windows-latest` runner trips it. Discovered while resurrecting dasImgui integration tests in PR #38 (May 2026). + +**Symptom**: a test does `with_imgui_app` → spawns daslang-live → fast burst of 1 GET /status + 16 POST /command (all 200 OK in <300 ms), then the libhv event loop **stops accepting new connections**. The 17th HTTP request hangs ~60 s, the test body never makes progress, popen kills the subprocess at the 120 s watchdog (`DAS_POPEN_TIMEOUT = 0x7FFFFF01 = 2147483393`). + +**Confirm**: run with `DASLIVE_HV_LOG=stderr DASLIVE_HV_LOG_LEVEL=DEBUG` env vars. Count `[POST /command]=>[200 OK]` lines per subprocess pid in the captured stderr — if exactly 16, you've hit it. Healthy paths show many more. + +**Verified-locally counterproof**: same dasImgui suite under `D:\Work\daScript\bin\Release\daslang.exe ... dastest.das -- --test ... --headless` on Win11 box runs 96/96 in ~6 min, including the tests that hang on CI. 
So it's CI-runner-specific (Windows Server, different scheduler, different IOCP / TCP loopback tuning, possibly Defender). NOT a libhv bug per se — the upstream libhv code on master is byte-identical to v1.3.4 and works fine on `libDaScriptDyn` linkage everywhere except GitHub `windows-latest`. + +**Workarounds in dasImgui PR #38**: + +1. **Halve HTTP traffic in polling helpers**. Old idiom `wait_until(d, 240) $(var snap) { let s = post_command(d, "imgui_key_status", null); return !(s?["playing"] ?? true) }` does **2 HTTP requests per iteration** (snapshot inside `wait_until` + the inner `post_command`). New helpers `wait_for_key_idle(d, 4.0f)` and `wait_for_mouse_idle(d, 4.0f)` in `widgets/imgui_playwright.das` do **1 POST per iteration** (status only). Use them whenever you only need "playing == false", not a full snapshot. + +2. **Exclude high-POST tests on Windows-only** in `.github/workflows/tests.yml`. Conservative cutoff: any test estimated at ≥12 POSTs over its lifetime, leaves a 4-call safety margin under the 16-connection limit. Pattern: + ```yaml + EXTRA_EXCLUDES="" + if [ "${{ matrix.os }}" = "windows-latest" ]; then + EXTRA_EXCLUDES="--exclude inputs_drag --exclude inputs_numeric --exclude inputs_slider \ + --exclude inputs_color --exclude inputs_choice --exclude inputs_text \ + --exclude indexed_dynamic" + fi + ``` + +**Heuristic for "POST count"**: each `set_value(...)` is 1 POST. Each `wait_for_payload_value(...)` / `wait_for_int_value(...)` / `wait_until { post_command }` is 1-3 POSTs depending on how fast the answer converges. Tests with ≥10 `set_value + wait` pairs typically exceed 16 POSTs. + +**Pre-existing "finally skipped on panic" cascade**: a panicking test in the body block was already known to skip `with_imgui_app`'s `/shutdown` cleanup → zombie subprocess on port 9090 → next test cascades. 
PR #38 also added `--headless-timeout-sec=N` self-exit to `imgui_harness` (see related card `imgui-harness-headless-timeout-sec-cascade-guard`) so a panicked subprocess can't outlive the popen watchdog. + +**Proper fix is upstream** in daslang's libhv build for Windows IOCP. Track + re-include all 7 excluded tests when it lands. + +## Questions +- Why do imgui playwright tests hang at 120 seconds on Windows CI when they pass locally and on POSIX? diff --git a/mouse-data/docs/my-fold-macro-emits-a-loop-with-for-it-in-source-acc-reserve-length-source-but-the-reserve-doesn-t-fire-when-the-chain-starts-wi.md b/mouse-data/docs/my-fold-macro-emits-a-loop-with-for-it-in-source-acc-reserve-length-source-but-the-reserve-doesn-t-fire-when-the-chain-starts-wi.md new file mode 100644 index 000000000..456c78e34 --- /dev/null +++ b/mouse-data/docs/my-fold-macro-emits-a-loop-with-for-it-in-source-acc-reserve-length-source-but-the-reserve-doesn-t-fire-when-the-chain-starts-wi.md @@ -0,0 +1,56 @@ +--- +slug: my-fold-macro-emits-a-loop-with-for-it-in-source-acc-reserve-length-source-but-the-reserve-doesn-t-fire-when-the-chain-starts-wi +title: My fold macro emits a loop with `for (it in source); acc |> reserve(length(source))` but the reserve doesn't fire when the chain starts with `each(arr)`. How do I make it work? +created: 2026-05-16 +last_verified: 2026-05-16 +links: [[daslang-generic-instance-detect-via-fromgeneric]] +--- + +Peel `each()` at macro time. `each(arr)` reports as `iterator`, so any "is the source an iterator?" check (e.g. `top._type.isIterator`) sees `true` and the array-only reserve path is skipped. But the iteration semantics of `for (it in each(arr))` and `for (it in arr)` are identical — the wrapper iterator is incidental in fold context. + +Pattern (corrected version, from `daslib/linq_fold.das` after Phase 2A bind-elision PR): + +```das +[macro_function] +def private is_each_call(call : ExprCall?) 
: bool { + // `each` in daslib/builtin.das is generic — the resolved `func.name` + // on a typed instance is mangled (e.g. `builtin`each`30908...`). + // The original generic's name lives in `func.fromGeneric.name`. + if (call == null || call.func == null) return false + return (call.func.name == "each" + || (call.func.fromGeneric != null && call.func.fromGeneric.name == "each")) +} + +[macro_function] +def private peel_each(var top : Expression?) : Expression? { + if (!(top is ExprCall)) return top + var topCall = top as ExprCall + if (!is_each_call(topCall) || topCall.arguments |> length != 1) return top + let argExpr = topCall.arguments[0] + // Only peel when the argument is a true array (or fixed-size array). + // Don't peel iterator-typed args like `each(range(10))` — replacing the + // each call with the raw range would break length-shortcut + reserve + // hints that assume an indexable source. + if ((argExpr == null || argExpr._type == null) + || (!argExpr._type.isGoodArrayType && !argExpr._type.isArray)) return top + return clone_expression(argExpr) +} +``` + +**Two gotchas the original version missed:** + +1. `func.name == "each"` never matched typed instances — generic-instance detection requires `fromGeneric.name`. See [[daslang-generic-instance-detect-via-fromgeneric]]. +2. Peel gate must be **positive** (`is good array`) not negative (`isn't iterator`). `each(range(N))` returns an iterator but its argument `range(N)` is also iterator-shaped (`isRange`) and would otherwise pass `!isIterator`. The positive `isGoodArrayType || isArray` gate cleanly excludes range/string/lambda sources. + +**Block-parameter typedecl needs branching on source shape after peel.** When peel fires, the source goes from iterator (rvalue, no modifiers) to array (`array const&` for `let arr <-` chains). 
The block parameter type: +- iterator source: `typedecl($e(topExpr)) - const` — strip rvalue const so body can iterate +- array source: `typedecl($e(topExpr))` (no modifier) — keep `const&` so const-ref source matches + +Both wrong → either `array const& vs array` mismatch or `can't iterate over const iterator`. + +**What this is worth:** brought `linq_fold`'s `each(arr)._where(...)._select(_.price).to_array()` benchmark from 13 → 10 ns/op (parity with comprehension baseline). The count→length shortcut built on top brings pure `each(arr)._select(_.x).count()` from 2 → 0 ns/op (loop entirely elided). + +**Generalizes:** any fused-loop emitter that needs the source's length (reserve, two-pass, length-aware operators like `take_last`), peel inner-array-yielding wrappers — but use `fromGeneric` for generic helpers and a positive array gate, not a negative iterator gate. + +## Questions +- My fold macro emits a loop with `for (it in source); acc |> reserve(length(source))` but the reserve doesn't fire when the chain starts with `each(arr)`. How do I make it work? diff --git a/mouse-data/docs/my-macro-substitutes-it-for-a-projection-expression-via-template-replacevariable-it-proj-apply-template-but-the-result-fails-to.md b/mouse-data/docs/my-macro-substitutes-it-for-a-projection-expression-via-template-replacevariable-it-proj-apply-template-but-the-result-fails-to.md new file mode 100644 index 000000000..9652ba343 --- /dev/null +++ b/mouse-data/docs/my-macro-substitutes-it-for-a-projection-expression-via-template-replacevariable-it-proj-apply-template-but-the-result-fails-to.md @@ -0,0 +1,24 @@ +--- +slug: my-macro-substitutes-it-for-a-projection-expression-via-template-replacevariable-it-proj-apply-template-but-the-result-fails-to +title: My macro substitutes `it` for a projection expression via `Template.replaceVariable("it", proj) + apply_template`, but the result fails to compile with "can only dereference a reference". What's going wrong? 
+created: 2026-05-16 +last_verified: 2026-05-16 +links: [] +--- + +Post-typer, reads of a `var` local appear wrapped as `ExprRef2Value(ExprVar(name))` — the invisible adapter the typer inserts to dereference a reference for its value. `templates_boost.TemplateVisitor.visitExprVar` (the engine behind `Template.replaceVariable + apply_template`) only matches the inner `ExprVar` and replaces IT with a clone of the substitute. The outer `ExprRef2Value` wrapper stays, but now it wraps a non-reference value — compile error `30921: can only dereference a reference`. + +This is the same `ExprRef2Value`-transparency problem `daslib/ast_match.das` documents for `qmatch` — they solve it on both pattern and source sides via `qm_peel_ref2value`. `apply_template` does NOT auto-peel. + +Two fixes for substitution: + +1. **Pre-peel the destination** before `apply_template`: walk `dst` and replace every `ExprRef2Value(ExprVar(name))` with the inner `ExprVar(name)` first. After substitution, the result is clean. Drawback: removes wrappers globally (around other identifiers too) — if other refs still need the wrapper, the typer will re-insert them, but you've added a pass. + +2. **Use a custom visitor instead of `Template.replaceVariable`**: override `visitExprRef2Value` to detect `ExprRef2Value(ExprVar(name))` and return `clone_expression(replacement)` directly (stripping the wrapper as part of the substitution). Override `visitExprVar` as a fallback for bare ExprVars. The pattern mirrors `qm_peel_ref2value`'s "peel both sides" approach. + +Concrete repro: daslang `linq_fold`'s Phase 2A planner tried to fuse chained `_select|_select` via `substitute_it_for(proj2, "it", proj1)`. proj1 was `it * 2` (where `it` is the typed-and-wrapped loop var), proj2 was `it + 1`. Substituting via Template replaced the inner ExprVar in proj2 but left `ExprRef2Value(it * 2) + 1` — type error. The fix was deferred (chained-select falls through unfolded in Phase 2A) but Phase 2B needs option 2. 
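
The wrapper problem above can be reproduced outside the compiler. The following is a toy Python model, not daslang's AST API: `Var`, `Const`, `BinOp`, `Ref2Value`, and both substitute functions are hypothetical stand-ins invented for illustration. It assumes only what the card states: a `Ref2Value` is valid only around a reference (here, a `Var`), and a `visitExprVar`-style pass replaces the inner variable while leaving the wrapper in place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class BinOp:
    op: str
    left: object
    right: object


@dataclass(frozen=True)
class Ref2Value:
    inner: object  # assumption: only valid when `inner` is a reference (a Var)


def naive_substitute(expr, name, replacement):
    """Replace bare Var nodes only, leaving any Ref2Value wrapper untouched."""
    if isinstance(expr, Var):
        return replacement if expr.name == name else expr
    if isinstance(expr, Ref2Value):
        return Ref2Value(naive_substitute(expr.inner, name, replacement))
    if isinstance(expr, BinOp):
        return BinOp(expr.op,
                     naive_substitute(expr.left, name, replacement),
                     naive_substitute(expr.right, name, replacement))
    return expr


def peeling_substitute(expr, name, replacement):
    """Option 2: treat Ref2Value(Var(name)) as one unit and drop the wrapper."""
    if (isinstance(expr, Ref2Value) and isinstance(expr.inner, Var)
            and expr.inner.name == name):
        return replacement
    if isinstance(expr, Var):  # fallback for bare variables
        return replacement if expr.name == name else expr
    if isinstance(expr, Ref2Value):
        return Ref2Value(peeling_substitute(expr.inner, name, replacement))
    if isinstance(expr, BinOp):
        return BinOp(expr.op,
                     peeling_substitute(expr.left, name, replacement),
                     peeling_substitute(expr.right, name, replacement))
    return expr


def is_well_formed(expr):
    """A Ref2Value may only wrap a Var; everything else is checked recursively."""
    if isinstance(expr, Ref2Value):
        return isinstance(expr.inner, Var)
    if isinstance(expr, BinOp):
        return is_well_formed(expr.left) and is_well_formed(expr.right)
    return True


proj1 = BinOp("*", Ref2Value(Var("it")), Const(2))  # it * 2
proj2 = BinOp("+", Ref2Value(Var("it")), Const(1))  # it + 1

naive = naive_substitute(proj2, "it", proj1)    # Ref2Value(it * 2) + 1
fused = peeling_substitute(proj2, "it", proj1)  # (it * 2) + 1
```

In this model the naive pass leaves a wrapper around a non-reference, which is the shape that triggers the dereference error, while the peeling pass produces a tree in which every remaining wrapper still sits on a variable.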
+ +See `skills/das_macros.md` "Peel ExprRef2Value before qmatch" for the matcher-side analog. The substitution side has no in-tree helper yet. + +## Questions +- My macro substitutes `it` for a projection expression via `Template.replaceVariable("it", proj) + apply_template`, but the result fails to compile with "can only dereference a reference". What's going wrong? diff --git a/mouse-data/docs/qmacro-gensym-per-callsite-via-lineinfo.md b/mouse-data/docs/qmacro-gensym-per-callsite-via-lineinfo.md new file mode 100644 index 000000000..515d4d51d --- /dev/null +++ b/mouse-data/docs/qmacro-gensym-per-callsite-via-lineinfo.md @@ -0,0 +1,43 @@ +--- +slug: qmacro-gensym-per-callsite-via-lineinfo +title: How do I generate a uniquely-named gensym inside an AstCallMacro for a per-call-site variable, using LineInfo? +created: 2026-05-16 +last_verified: 2026-05-16 +links: [] +--- + +Use the call site's `LineInfo.line` + `.column` interpolated into a backtick-prefixed identifier string. Backtick-prefixed names live in a separate namespace so they don't collide with user-typed identifiers and they survive lint/style passes. + +```das +def override visit(prog : ProgramPtr; mod : Module?; var call : ExprCallMacro?) : Expression? { + let at = call.at // LineInfo of the call site + let accName = "`acc`{at.line}`{at.column}" + let itName = "`it`{at.line}`{at.column}" + let srcName = "`src`{at.line}`{at.column}" + + var res = qmacro(invoke($($i(srcName) : typedecl($e(src))) { + var $i(accName) = 0 + for ($i(itName) in $i(srcName)) { + // ... + } + return $i(accName) + }, $e(src))) + res.force_at(at) + res.force_generated(true) + return res +} +``` + +Two follow-up steps you almost always want: + +1. `res.force_at(at)` + `res.force_generated(true)` — sets `at = call.at` on every emitted node and marks them macro-generated. The latter bypasses lint rules that would otherwise fire on synthesized code (e.g. STYLE001, LINT002 "unused variable"). +2. 
`(blk._block as ExprBlock).arguments[0].flags.can_shadow = true` on the bound let-variable — quiets shadow warnings if the user already has an `acc`/`it`/`src` in scope. Reach for `.flags.can_shadow` on any qmacro-bound name that might collide with caller context. + +**Why include both line AND column:** macros emitted from nested helpers can have several emission sites on the same line (e.g. piped chains where each `|>` step emits a separate gensym). Line alone is not unique. + +**Why backtick prefix:** the backtick is a daslang lexer hint that this is an internal/synthesized name. Without it, very-long generated names sometimes clash with user identifiers or trip naming rules (the formatter, the auto-rename tools). + +**Worked example:** `daslib/linq_fold.das` `plan_loop_or_count` — multiple gensyms per emission site (accumulator, iterator, source, bound projection). Variants per fold-helper too (`fold_where_count` uses `nName` over `accName`). + +## Questions +- How do I generate a uniquely-named gensym inside an AstCallMacro for a per-call-site variable, using LineInfo? diff --git a/mouse-data/docs/qmacro-invoke-source-bind-typedecl-modifier-iter-vs-array.md b/mouse-data/docs/qmacro-invoke-source-bind-typedecl-modifier-iter-vs-array.md new file mode 100644 index 000000000..a1193075a --- /dev/null +++ b/mouse-data/docs/qmacro-invoke-source-bind-typedecl-modifier-iter-vs-array.md @@ -0,0 +1,46 @@ +--- +slug: qmacro-invoke-source-bind-typedecl-modifier-iter-vs-array +title: In a call_macro that emits an `invoke($($i(src) : typedecl($e(topExpr)) ) { ... }, $e(topExpr))` wrapper, what `` do I use so the param matches both array and iterator sources without const/ref mismatches? +created: 2026-05-16 +last_verified: 2026-05-16 +links: [] +--- + +There is no single modifier that works for both — branch on `top._type.isIterator`: + +```das +if (top._type != null && top._type.isIterator) { + // Iterator source — rvalue from a function call like each(range(10)). 
+ // typedecl() picks up the function-return type which carries const; + // -const strips it so the body can `for (it in src)` (otherwise + // daslang complains "can't iterate over const iterator"). + res = qmacro(invoke($($i(srcName) : typedecl($e(topExpr)) - const) { + // ... body uses $i(srcName) ... + }, $e(topExpr))) +} else { + // Container source with length — array/table/string/range/fixed-array. + // `let arr <- [...]` is `array const&`. Stripping -const would + // produce a non-const-ref param; passing the const-ref source then + // fails with `array const& vs array` ("can't ref types + // can only add constness"). Keep modifiers — typedecl() preserves + // them and the const-ref source matches exactly. + res = qmacro(invoke($($i(srcName) : typedecl($e(topExpr))) { + // ... body uses $i(srcName) ... + }, $e(topExpr))) +} +``` + +The two error messages are diagnostic — they tell you which branch you're on: +- `can't iterate over const iterator` → you forgot `-const` on an iterator path +- `array const& vs array ... can't ref types can only add constness` → you have `-const` on an array path + +**Why this is needed in the first place:** the block param is your way to bind the source expression to a stable name so the loop body can reference it once without re-evaluating side effects. The "right" param type is "whatever the source actually is" — but qmacro `typedecl(expr)` produces the raw type-of including const-ref from the call return, which only sometimes matches what the consumer needs. + +**Use `top._type != null` guard** — `_type` is null for freshly cloned expressions that haven't gone through the typer yet. Treating null as "not iterator" (default to array branch) is wrong if you're past the typer; pick conservatively and call out the assumption. 
+ +**See `daslib/linq_fold.das` `plan_loop_or_count`** for a working example with five emission sites — counter lane, array-lane iter/sourceHasLength/else, and the length-shortcut path that's only reachable when the source has length (so it always uses the no-modifier form). + +**Fast path for length-shortcut:** if you can emit `length($e(topExpr))` directly without the invoke wrapper, do that — no source-bind problem. Works when the entire body is one expression and the source's evaluation cost is "you'd evaluate it once anyway." + +## Questions +- In a call_macro that emits an `invoke($($i(src) : typedecl($e(topExpr)) ) { ... }, $e(topExpr))` wrapper, what `` do I use so the param matches both array and iterator sources without const/ref mismatches? diff --git a/mouse-data/docs/what-s-the-end-to-end-checklist-for-adding-a-new-daslib-das-module-so-docs-build-cleanly.md b/mouse-data/docs/what-s-the-end-to-end-checklist-for-adding-a-new-daslib-das-module-so-docs-build-cleanly.md new file mode 100644 index 000000000..bba99ca5e --- /dev/null +++ b/mouse-data/docs/what-s-the-end-to-end-checklist-for-adding-a-new-daslib-das-module-so-docs-build-cleanly.md @@ -0,0 +1,46 @@ +--- +slug: what-s-the-end-to-end-checklist-for-adding-a-new-daslib-das-module-so-docs-build-cleanly +title: What's the end-to-end checklist for adding a new daslib/*.das module so docs build cleanly? +created: 2026-05-16 +last_verified: 2026-05-16 +links: [] +--- + +Four things to update, in order: + +**1. `doc/reflections/das2rst.das`** — add a `require daslib/` near the other daslib requires, write a `document_module_(root : string)` function modeled on a sibling (e.g. `document_module_linq_boost`), and call it from the dispatcher block near the end. 
Minimal form for a module with mostly-private internals: + +```daslang +def document_module_my_new_module(root : string) { + var mod = find_module("my_new_module") + var groups : array + document("Short description", mod, "my_new_module.rst", groups) +} +``` + +For modules with many public functions, copy the `linq_boost` pattern and add `group_by_regex(...)` entries for each named group — anything left over lands in "Uncategorized" and **fails CI**. + +**2. `doc/source/stdlib/handmade/module-.rst`** — `das2rst` auto-creates this as `// stub\nModule `. Replace the **whole file** with a plain-text description (1-2 paragraphs, with a `.. code-block:: das` require + minimal example). See `module-linq.rst` / `module-linq_boost.rst` for the convention. + +**3. `doc/source/stdlib/sec_*.rst`** — find the section your module belongs in (e.g. `sec_algorithms.rst` for linq family, `sec_strings.rst` for strings, etc.) and add `generated/.rst` to its `.. toctree::`. Without this the page builds but isn't linked. + +**4. Regenerate + verify:** + +```bash +./bin/daslang doc/reflections/das2rst.das # picks up new module + handmade stub +grep -rl "// stub" doc/source/stdlib/handmade/ # must be empty after step 2 +grep -c Uncategorized doc/source/stdlib/generated/*.rst | grep -v ':0$' # must be empty +rm -rf doc/sphinx-build site/doc # clean cache (cached builds hide warnings) +sphinx-build -b html -d doc/sphinx-build doc/source site/doc 2>&1 | tee /tmp/sphinx_out.txt +grep -iE "warning:|error:" /tmp/sphinx_out.txt # must be empty +``` + +`doc/source/stdlib/generated/*.rst` and `generated/detail/*.rst` are **gitignored** — only commit (1) das2rst.das, (2) the handmade module-.rst, and (3) the sec_*.rst toctree update. + +## Questions +- What's the end-to-end checklist for adding a new daslib/*.das module so docs build cleanly? +- Where do I register a new daslib module in das2rst.das? +- Why does my new module appear as `// stub` in the generated RST? 
+ +## Questions +- What's the end-to-end checklist for adding a new daslib/*.das module so docs build cleanly? diff --git a/mouse-data/docs/what-s-the-right-sqlite-linq-chain-form-for-aggregates-sum-min-max-average-and-what-operators-aren-t-supported-as-sql-chain-term.md b/mouse-data/docs/what-s-the-right-sqlite-linq-chain-form-for-aggregates-sum-min-max-average-and-what-operators-aren-t-supported-as-sql-chain-term.md new file mode 100644 index 000000000..c37c00c27 --- /dev/null +++ b/mouse-data/docs/what-s-the-right-sqlite-linq-chain-form-for-aggregates-sum-min-max-average-and-what-operators-aren-t-supported-as-sql-chain-term.md @@ -0,0 +1,39 @@ +--- +slug: what-s-the-right-sqlite-linq-chain-form-for-aggregates-sum-min-max-average-and-what-operators-aren-t-supported-as-sql-chain-term +title: What's the right sqlite_linq chain form for aggregates (sum/min/max/average), and what operators aren't supported as `_sql` chain terminals? +created: 2026-05-16 +last_verified: 2026-05-16 +links: [] +--- + +Column aggregates in `_sql` chains use the **regular linq function name** after a `_select`, NOT an `_aggregate(_.Col)` macro: + +```daslang +// CORRECT — _sql analyzer recognizes `_select(_.Col) |> sum()` and emits SELECT SUM(price) +let s = _sql(db |> select_from(type) |> _select(_.price) |> sum()) +let m = _sql(db |> select_from(type) |> _select(_.price) |> min()) +let a = _sql(db |> select_from(type) |> _select(_.price) |> average()) // promotes to double +``` + +There is no `_sum` / `_min` / `_max` / `_average` chain macro. The error if you try one is `error[30838]: can't locate variable '_'` because `_sum` doesn't dispatch as a call macro. + +The full set of `_sql` chain terminals is **`_to_array()`, `_first()`, `_first_opt()`, `count()`, and `sum()`/`min()`/`max()`/`average()` after a 1-column `_select`**. 
These are NOT supported as chain terminals: + +| Chain | Why not | Workaround | +|---|---|---| +| `_any()` (no args, terminal) | not implemented | `_first_opt() \|> is_some` | +| `_all(pred)` | no SQL idiom recognized | invert: `_where(NOT pred) \|> count() == 0` | +| `take(N) \|> count()` | LIMIT-after-aggregate has no effect (aggregate collapses to 1 row) | drop count, materialize: `take(N)` returns array, take `length()` | +| `skip(M) \|> take(N) \|> count()` | same | same — terminate in to_array | +| `distinct() \|> count()` | `COUNT(DISTINCT col)` not yet implemented | `distinct()` alone, then `length()` of result array | +| `_sql(... \|> _join(select_from(type), ...))` | inner `select_from` needs db handle wired inside the analyzer | omit m1 / use raw SQL string for join benchmarks | + +The error messages from `sqlite_linq.das` are explicit — read them, they spell out the alternative form. Pattern matching for these lives in `modules/dasSQLITE/daslib/sqlite_linq.das` `peel_column_aggregate` and `analyze_chain`. + +## Questions +- What's the right sqlite_linq chain form for aggregates (sum/min/max/average), and what operators aren't supported as `_sql` chain terminals? +- Why does `_sum(_.price)` fail in `_sql` with "can't locate variable '_'"? +- How do I express `any`/`all`/distinct-count/take-count in `_sql`? + +## Questions +- What's the right sqlite_linq chain form for aggregates (sum/min/max/average), and what operators aren't supported as `_sql` chain terminals? 
diff --git a/mouse-data/docs/when-a-call-macro-needs-to-pick-copy-vs-move-init-for-a-projection-should-i-emit-static-if-typeinfo-is-workhorse-e-proj-or-decid.md b/mouse-data/docs/when-a-call-macro-needs-to-pick-copy-vs-move-init-for-a-projection-should-i-emit-static-if-typeinfo-is-workhorse-e-proj-or-decid.md new file mode 100644 index 000000000..50512f050 --- /dev/null +++ b/mouse-data/docs/when-a-call-macro-needs-to-pick-copy-vs-move-init-for-a-projection-should-i-emit-static-if-typeinfo-is-workhorse-e-proj-or-decid.md @@ -0,0 +1,33 @@ +--- +slug: when-a-call-macro-needs-to-pick-copy-vs-move-init-for-a-projection-should-i-emit-static-if-typeinfo-is-workhorse-e-proj-or-decid +title: When a call_macro needs to pick copy-vs-move-init for a projection, should I emit `static_if (typeinfo is_workhorse($e(proj)))` or decide at macro time? +created: 2026-05-16 +last_verified: 2026-05-16 +links: [] +--- + +Decide at macro time. By the time a `[call_macro]` `visit()` fires, inner macros have expanded and the typer has run, so every sub-expression carries a resolved `_type`. Read `projection._type.isWorkhorseType` directly and emit exactly one branch — no `static_if`, no `typeinfo is_workhorse` at runtime, less AST for the typer to fold away later. + +Pattern: + +```das +let workhorseProj = projection._type != null && projection._type.isWorkhorseType +var perElem : Expression? +if (workhorseProj) { + perElem = qmacro_expr() { $i(accName) |> push($e(projection)) } +} else { + perElem = qmacro_block() { + var $i(valName) <- $e(projection) + $i(accName) |> emplace($i(valName)) + } +} +``` + +For workhorse types (`int`, `float`, `bool`, `string`, …, anything `isWorkhorseType` returns true for) you can push the expression directly with no intermediate `var v = expr`. For non-workhorse, `<-` is a statement not an expression — you need `var v <- proj; acc |> emplace(v)`. The two-step is only required there. 
+ +This trick brought daslang `linq_fold`'s `where|select|to_array` emission from 13 → 11 ns/op (parity with the `_old_fold` comprehension baseline) at 100K rows. See [daslib/linq_fold.das](daslib/linq_fold.das) `plan_loop_or_count` (the array lane). The previous version had a `static_if` inside the qmacro (resolved later by the compiler rather than at macro time): correct, but it generated 2× the AST and lost the temp-binding optimization opportunity. + +Other `TypeDecl` predicates available at macro time: `isIterator`, `isGoodArrayType`, `isConst`, `isPod`, plus `firstType` / `secondType` / `argTypes` for compound types. Use them; the typer has already done the work. + +## Questions +- When a call_macro needs to pick copy-vs-move-init for a projection, should I emit `static_if (typeinfo is_workhorse($e(proj)))` or decide at macro time? diff --git a/mouse-data/docs/where-does-nolint-rule-go-when-a-lint-warning-is-emitted-from-inside-a-qmacro-expr-and-fires-at-the-user-s-call-site-rather-than.md b/mouse-data/docs/where-does-nolint-rule-go-when-a-lint-warning-is-emitted-from-inside-a-qmacro-expr-and-fires-at-the-user-s-call-site-rather-than.md new file mode 100644 index 000000000..e73aff48a --- /dev/null +++ b/mouse-data/docs/where-does-nolint-rule-go-when-a-lint-warning-is-emitted-from-inside-a-qmacro-expr-and-fires-at-the-user-s-call-site-rather-than.md @@ -0,0 +1,36 @@ +--- +slug: where-does-nolint-rule-go-when-a-lint-warning-is-emitted-from-inside-a-qmacro-expr-and-fires-at-the-user-s-call-site-rather-than +title: Where does `// nolint:RULE` go when a lint warning is emitted from inside a `qmacro_expr` and fires at the user's call site rather than at the macro source? +created: 2026-05-16 +last_verified: 2026-05-16 +links: [] +--- + +The nolint comment must be **inline at the end of the offending line**, inside the `qmacro_expr {...}` block, NOT on a separate comment line above and NOT at the user call site.
+ +When a macro emits code via `qmacro_expr { var $i(name) = $e(expr) }`, lint analyzes the expansion at the user's call site but **reports the source position** as the line inside the qmacro_expr body. To suppress, the comment must travel with the emitted line: + +```daslang +} else { + blk.list |> emplace_new <| qmacro_expr() { + var $i(newArgName) = $e(newCall) // nolint:PERF009 + } + ... +} +``` + +What DOESN'T work: +- `// nolint:PERF009` on a comment line above the qmacro_expr block — suppresses nothing. +- `// nolint:PERF009` on the user-side `let x = _fold(...)` line — the lint engine reports against the macro source position, not the user site. + +The placement rule generalizes: `nolint:RULE` must be **on the literal line** that the lint output points at. For macro-quoted code, that's inside the `qmacro_expr { ... }` body. + +Concrete example: PERF009 ("redundant move into variable immediately returned") fired at `daslib/linq_fold.das:490:24` (a line inside `fold_linq_default`'s qmacro_expr) when called via `benchmarks/sql/take_count.das`'s single-pass chain. Inline `// nolint:PERF009` on the emitted `var = expr` line suppresses it cleanly. + +## Questions +- Where does `// nolint:RULE` go when a lint warning is emitted from inside a `qmacro_expr` and fires at the user's call site rather than at the macro source? +- nolint for macro-generated lint warnings +- How to suppress a lint rule that fires only at certain user call sites? + +## Questions +- Where does `// nolint:RULE` go when a lint warning is emitted from inside a `qmacro_expr` and fires at the user's call site rather than at the macro source? 
diff --git a/mouse-data/docs/which-typedecl-predicates-identify-types-where-length-expr-is-statically-resolvable-in-daslang-macros.md b/mouse-data/docs/which-typedecl-predicates-identify-types-where-length-expr-is-statically-resolvable-in-daslang-macros.md new file mode 100644 index 000000000..b847fb36f --- /dev/null +++ b/mouse-data/docs/which-typedecl-predicates-identify-types-where-length-expr-is-statically-resolvable-in-daslang-macros.md @@ -0,0 +1,63 @@ +--- +slug: which-typedecl-predicates-identify-types-where-length-expr-is-statically-resolvable-in-daslang-macros +title: Which TypeDecl predicates identify types where length(expr) is statically resolvable in daslang macros? +created: 2026-05-16 +last_verified: 2026-05-16 +links: [] +--- + +# Length-supporting types in daslang macros + +When a `[macro_function]` / `[call_macro]` needs to emit `length($e(src))` and have it compile, the source's `TypeDecl` must be one of: + +- `isGoodArrayType` — `array` (the dynamic array, including `array#`) +- `isGoodTableType` — `table` +- `isString` — `string` / `string#` +- `isArray` — fixed array `T[N]` (NOT `array` — that's `isGoodArrayType`; the naming is confusing) +- `isRange` — `range` / `urange` / `range64` / `urange64` + +**Excluded** (no `length()` overload — emitting `length(src)` will fail to compile inside macro output): + +- `isIterator` — iterators don't carry length, even when wrapping a length-having source. Use the underlying container. +- `isGoodLambdaType` — `def each(lam : lambda<...>)` makes lambdas iterable, but they have no `length()`. This is a common trap when peeling `each()` based solely on "not an iterator." +- Custom user `def each(MyType)` types — depends on whether the user also defined `length(MyType)`; assume no. + +## Canonical predicate + +```das +[macro_function] +def private type_has_length(t : TypeDecl?) 
: bool { + if (t == null) return false + return (t.isGoodArrayType || t.isGoodTableType || t.isString + || t.isArray || t.isRange) +} +``` + +Note the parenthesization: a bare `||`-chain split across lines hits a `gen2` parse error at the leading `||`. Wrap the chain in `(...)`. + +## Why this matters for `each()` peeling + +A common optimization: when a chain starts `each()._where(...)...`, peel the `each` and iterate `` directly so reserve/length work. The peel must be gated on `type_has_length(._type)`; checking only `!isIterator` would silently accept `each(lambda)` and emit broken `reserve(length(lambda))`. + +Example from `daslib/linq_fold.das` (PR #2689, Phase 2A): + +```das +[macro_function] +def private peel_each_length_source(var top : Expression?) : Expression? { + if (!(top is ExprCall)) return top + var topCall = top as ExprCall + if (topCall.func == null || topCall.func.name != "each" + || topCall.arguments |> length != 1 + || !type_has_length(topCall.arguments[0]._type)) return top + return clone_expression(topCall.arguments[0]) +} +``` + +The `clone_expression` is needed because `topCall.arguments[0]` is `Expression? const` (the args vector entry is const-typed even when the outer call is `var`); the planner stores `top` as `var Expression?` so the clone drops the const. Note that this version matches on `func.name` alone. Per the `each()`-peeling card in this change set, typed instances of the generic `each` carry a mangled `func.name`, so a robust check also compares `func.fromGeneric.name`. + +## Discovery + +The set of `length()`-supporting types is not advertised as a single predicate anywhere; it was assembled from `mcp__daslang__describe_type TypeDecl` (the `isXxx` method list) cross-referenced with the `def length(...)` overloads in `daslib/builtin.das` and the `def each(...)` overloads. Lambda iterables surfaced as a Copilot review finding on PR #2689. + +## Questions +- Which TypeDecl predicates identify types where length(expr) is statically resolvable in daslang macros?
diff --git a/mouse-data/docs/why-does-daslang-live-s-post-shutdown-return-200-ok-but-the-subprocess-never-actually-exits-on-linux-macos.md b/mouse-data/docs/why-does-daslang-live-s-post-shutdown-return-200-ok-but-the-subprocess-never-actually-exits-on-linux-macos.md new file mode 100644 index 000000000..e05565324 --- /dev/null +++ b/mouse-data/docs/why-does-daslang-live-s-post-shutdown-return-200-ok-but-the-subprocess-never-actually-exits-on-linux-macos.md @@ -0,0 +1,41 @@ +--- +slug: why-does-daslang-live-s-post-shutdown-return-200-ok-but-the-subprocess-never-actually-exits-on-linux-macos +title: Why does daslang-live's POST /shutdown return 200 OK but the subprocess never actually exits on Linux/macOS? +created: 2026-05-16 +last_verified: 2026-05-16 +links: [] +--- + +**Root cause is in vendored libhv (v1.3.4), not daslang.** `live_api.das` registers an `ANY("*")` catch-all alongside specific routes like `GET /status` and `POST /shutdown`. libhv's `Any(path)` (`include/hv/HttpService.h:268-277`) expands internally to one `Handle("METHOD", path, h)` per HTTP verb on the same path key. + +libhv stores all `(path -> method_handlers)` pairs in `pathHandlers`, declared as **`std::unordered_map`** at `include/hv/HttpService.h:105`. `HttpService::GetRoute` (`http/server/HttpService.cpp:72-127`) iterates that map in **container order** and takes the first wildcard or path match it finds. + +Because `std::unordered_map`'s iteration order is implementation- and bucket-defined, `"*"` happens to enumerate BEFORE specific paths like `/status` on Linux libstdc++ (deterministic), and intermittently on Windows MSVC depending on rehash timing. When that happens, every request — including `POST /shutdown` — hits the wildcard handler, which returns the help JSON (200 OK). The real `/shutdown` handler that calls `request_exit()` is never invoked. The main loop spins forever, and the parent times out the subprocess with `popen_exit_code = DAS_POPEN_TIMEOUT = 0x7FFFFF01`. 
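The ordering hazard described above can be reproduced in isolation. The following is a minimal sketch, not libhv's actual code: `pattern_matches`, `resolve_first_match`, and `resolve_exact_first` are hypothetical names, and the container's iteration order is modelled as an explicit vector so the outcome is deterministic. It assumes only what the card states, namely that the router takes the first wildcard or path match in container order.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical route entry: (path pattern, handler name). Not libhv's types.
using Route = std::pair<std::string, std::string>;
using RouteTable = std::unordered_map<std::string, std::string>;

// A pattern matches when it ends in '*' and its prefix matches the path
// (so a bare "*" matches everything), or when it equals the path exactly.
bool pattern_matches(const std::string& pattern, const std::string& path) {
    if (!pattern.empty() && pattern.back() == '*') {
        const std::size_t n = pattern.size() - 1;
        return path.compare(0, n, pattern, 0, n) == 0;
    }
    return pattern == path;
}

// First-match-wins over whatever order the container happens to yield.
// The vector stands in for std::unordered_map's unspecified iteration order.
std::string resolve_first_match(const std::vector<Route>& iteration_order,
                                const std::string& path) {
    for (const auto& route : iteration_order) {
        if (pattern_matches(route.first, path)) return route.second;
    }
    return "404";
}

// Order-independent alternative: exact lookup first, wildcards only as fallback.
// (With several overlapping wildcards the fallback would still be order-dependent.)
std::string resolve_exact_first(const RouteTable& routes, const std::string& path) {
    const auto exact = routes.find(path);
    if (exact != routes.end()) return exact->second;
    for (const auto& route : routes) {
        if (pattern_matches(route.first, path)) return route.second;
    }
    return "404";
}
```

With `"*"` enumerated first, `resolve_first_match` hands `/shutdown` to the help handler; with the order flipped, the real handler wins. `resolve_exact_first` gives the same answer either way, which is the property the `GET("/")` workaround recovers by removing the wildcard entirely.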
+ +**How to confirm you're hitting this:** while the daslang-live subprocess is running, `curl -s http://127.0.0.1:9090/status`. If the body contains `"endpoints"` (the help JSON), the wildcard is winning. If it returns real status JSON with `"has_error"` / `"paused"` / `"fps"` fields, routing works. + +**Workaround (landed as daslang PR #2688, `bbatkin/live-api-drop-any-catchall`):** drop `ANY("*")` from `live_api.das` and serve the help JSON from a specific `GET("/")` handler instead. `/` is an exact-match path, so it never collides with `/status` etc. — the ordering hazard is gone regardless of how libhv stores routes. Unknown paths return libhv's default 404; programmatic clients (playwright, MCP `live_*` tools) never relied on the catch-all. + +**Follow-up work (deferred until libhv-side fix lands):** +1. File the libhv upstream issue — unreported per 2026-05-16 web research; we'd be first. +2. Wire libhv's built-in `errorHandler` hook through dasHV (`WebServer_Adapter` setter), rewrite `live_api.das` to drop the `GET("/")` workaround and use `errorHandler` for the help fallback. That's the idiomatic shape on libhv's terms. +3. Address Copilot review comments on PR #2688 as part of that rewrite: (a) `live_api.das:25` module-level `//!` endpoint list should mention `GET /`; (b) `live_api.das:217` "curl :9090/" example — actually valid curl shorthand for localhost, but rephrasing to `curl http://localhost:9090/` is friendlier. +4. Consider a lint rule against `ANY("*")` registered alongside specific paths. + +**Upstream fix status (researched 2026-05-16):** +- **Unreported** — exhaustive search of `ithewei/libhv` issues for `route|wildcard|Any|GetRoute|HttpService|pathHandlers` returned zero hits on this bug. We'd be the first to file. 
+- **Not fixed in master** — libhv's `http/server/HttpService.cpp` last touched 2023-07-29 (before v1.3.4); same `std::unordered_map` for `pathHandlers` at `http/server/HttpService.h:107`; same first-match-wins loop in `GetRoute` at line ~72. v1.3.4 IS the latest release (2025-10-25). +- **Known wart upstream**: `docs/PLAN.md` lists "Path router: optimized matching via trie?" as an open question — maintainers know the router is suboptimal, just haven't prioritized it. +- **Industry comparison**: Crow uses a trie; cpp-httplib preserves registration order so static paths always precede regex catch-alls. This is a libhv-specific defect, not industry norm. +- **Documented behavior**: libhv docs (`README.md`, `docs/API.md`, `docs/cn/HttpServer.md`) say NOTHING about route precedence or wildcard semantics. Behavior is implementation-defined. + +**Better script-side option (not yet exercised): `errorHandler` hook.** libhv has a built-in fallback handler that fires after the processor chain runs and status ≥ 400 with empty body (`http/server/HttpService.h:133` + `http/server/HttpHandler.cpp:476-486`). Official example wiring at `examples/httpd/router.cpp:15` + `examples/httpd/handler.cpp:46`. Drop `ANY("*")`, assign `errorHandler` instead — `/status` / `/shutdown` win deterministically, everything else falls through to the help dump. **dasHV does NOT currently expose `errorHandler`**, so this needs C++ glue (a setter on `WebServer_Adapter`) before live_api can use it. PR #2688's `GET("/")` workaround is the interim fix. + +**Also worth knowing**: official libhv examples never register bare `Any("*")` alongside specific paths — they use prefix wildcards like `GET("/wildcard*", ...)` (see `examples/httpd/router.cpp:70`). The maintainers don't exercise our usage pattern, so the bug has presumably never surfaced for upstream users. 
+ +**Why Windows seems to work most of the time:** MSVC's `std::unordered_map` bucket layout for the specific paths daslang-live registers happens to enumerate specific paths before `"*"`. It's not a guarantee — different route counts (e.g. adding more endpoints) trigger rehashes and can flip the order. Not "Windows-correct, Linux-broken" — both are subject to the same hazard. + +**Symptom to watch for in CI logs:** `popen_exit_code=2147483393` (0x7FFFFF01 = `DAS_POPEN_TIMEOUT`) on POSIX, with the playwright drain step taking the full `DEFAULT_TEST_TIMEOUT_SEC` (120s default). Subprocess gets SIGKILLed at the watchdog. + +## Questions +- Why does daslang-live's POST /shutdown return 200 OK but the subprocess never actually exits on Linux/macOS? diff --git a/mouse-data/docs/why-does-each-arr-fail-with-unsafe-when-not-source-of-for-loop-outside-a-for-and-what-s-the-alternative-in-a-linq-chain.md b/mouse-data/docs/why-does-each-arr-fail-with-unsafe-when-not-source-of-for-loop-outside-a-for-and-what-s-the-alternative-in-a-linq-chain.md new file mode 100644 index 000000000..b0371ef2b --- /dev/null +++ b/mouse-data/docs/why-does-each-arr-fail-with-unsafe-when-not-source-of-for-loop-outside-a-for-and-what-s-the-alternative-in-a-linq-chain.md @@ -0,0 +1,36 @@ +--- +slug: why-does-each-arr-fail-with-unsafe-when-not-source-of-for-loop-outside-a-for-and-what-s-the-alternative-in-a-linq-chain +title: Why does `each(arr)` fail with "unsafe when not source of for-loop" outside a for, and what's the alternative in a linq chain? +created: 2026-05-16 +last_verified: 2026-05-16 +links: [] +--- + +`each(arr)` returns an iterator that walks the array. Daslang's safety rules say that iterator is unsafe **unless it's directly consumed by a for-loop in the same scope** — passing it through `|>` chains, capturing it in a `let`, or handing it to a function argument all trip: + +``` +error[31013]: '__::builtin`each`...' 
is unsafe, when not source of the for-loop; + must be inside the 'unsafe' block +``` + +**Inside `_fold(...)`** the error doesn't fire because `_fold` is a macro that expands to a for-loop body where `each(arr)` IS the source. So `_fold(each(arr)._where(...).count())` compiles cleanly. + +**Outside a fold macro**, in a plain pipe chain, use the array directly — most `_` call-macros (`AstCallMacro_LinqPred2`) accept both `iterator` and `array` for arg 0: + +| Doesn't work | Use instead | +|---|---| +| `let prices <- (each(arr) \|> _where(...) \|> _select_to_array(_.price))` (iterator outside `_fold`) | `let prices <- (arr \|> _where(...) \|> _select(_.price))` — array+macros chains as array; result is `array`, no `_to_array` suffix needed | +| `let c = each(cars)._join(each(dealers), ...)` inside `_fold` (two `each()`s, one not the chain source) | `let c = _fold(cars \|> _join(dealers, ..., ...) \|> count())` — pass arrays directly | +| `let r = each(arr) \|> ...` outside any fold | wrap in `unsafe(each(arr))`, OR start the chain with `arr` directly and let the macro handle iterator promotion | + +**Heuristic:** if the chain ends in a `_fold(...)` / `_old_fold(...)` wrapper or a for-loop, `each(arr)` works as the source. If the chain produces a value (or array) that escapes the expression — a `let`, a function return, the second arg to a macro — drop the `each()` and pass the array directly. + +The compiler error points at the **specific** `each(arr)` call that escapes, so for multi-each chains (`_join`, `_zip`), check which side is the issue. + +## Questions +- Why does `each(arr)` fail with "unsafe when not source of for-loop" outside a for, and what's the alternative in a linq chain? +- error[31013] '__::builtin`each`' is unsafe — how to fix? +- When can I use `each(arr)` in a linq pipe chain? + +## Questions +- Why does `each(arr)` fail with "unsafe when not source of for-loop" outside a for, and what's the alternative in a linq chain? 
diff --git a/mouse-data/docs/why-does-my-dastest-integration-test-hang-at-readiness-gate-failed-when-external-curl-to-status-works-fine-is-it-a-require-order.md b/mouse-data/docs/why-does-my-dastest-integration-test-hang-at-readiness-gate-failed-when-external-curl-to-status-works-fine-is-it-a-require-order.md index efa990fc8..ff75ae379 100644 --- a/mouse-data/docs/why-does-my-dastest-integration-test-hang-at-readiness-gate-failed-when-external-curl-to-status-works-fine-is-it-a-require-order.md +++ b/mouse-data/docs/why-does-my-dastest-integration-test-hang-at-readiness-gate-failed-when-external-curl-to-status-works-fine-is-it-a-require-order.md @@ -15,61 +15,82 @@ links: [] [imgui_playwright] readiness gate FAILED ``` -(30s `wait_until_ready` timeout, then 120s popen drain timeout. External `curl http://localhost:9090/status` from a sibling shell returns 200 with proper status JSON throughout — only the popen parent's request loop can't see it.) +External `curl http://localhost:9090/status` from a sibling shell returns 200 with proper status JSON throughout — only the popen parent's poll loop "can't see it". Reproduces on macOS and Linux; pre-PR #2685 appeared to NOT reproduce on Windows (which was the trap — raw QPC tick math accidentally masked the bug; post-PR #2685 Windows also returns nanoseconds and shows the same failure). # Root cause -`live/live_api` was required BEFORE `imgui_app + glfw/glfw_boost + opengl/* + glfw_live + opengl_live` somewhere in the requirer chain (usually a wrapper module like `imgui/imgui_harness`). The `[_macro] installing` in `live_api.das` calls `fork_debug_agent_context(@@debug_agent)` at compile time. If that fork happens before GLFW is initialized in the live runtime, the resulting LiveApiServer becomes unreachable from a popen parent on Windows. 
+**`ref_time_ticks()` returns nanoseconds on all platforms (post-PR #2685), but the wait-loop math assumed microseconds.** -Filed: [#2677](https://github.com/GaijinEntertainment/daScript/issues/2677). Distinct from #2675 (`ANY("*")` route shadowing). +`src/hal/performance_time.cpp` defines `ref_time_ticks()` per platform: -# Fix (mechanical) +| Platform | Returns | +|---|---| +| Linux | `tv_sec * 1e9 + tv_nsec` — **nanoseconds** | +| macOS | `clock_gettime_nsec_np(CLOCK_MONOTONIC_RAW)` — **nanoseconds** | +| Windows | QPC ticks converted to **nanoseconds** via `ticks × (1e9 / freq)`; fast path at 10 MHz QPF = 100 ns/tick multiply (PR #2685). Pre-PR #2685: raw `QuadPart` ticks, ~10 MHz, accidentally close to μs scaling. | -In the requirer module (yours or a wrapper you control), reorder requires so the **windowed backend stack comes first**: +`imgui_playwright`'s `wait_until_ready` (and other deadline loops) used: ```das -// Windowed backend FIRST (correctness, not aesthetics). -require imgui -require imgui_app -require glfw/glfw_boost -require opengl/opengl_boost -require live/glfw_live -require live/opengl_live - -// Live-host + boost-runtime stack AFTER. -require live/live_api -require live/live_commands -require live/live_vars -require live_host -require imgui/imgui_live -require imgui/imgui_boost_runtime -require imgui/imgui_boost_v2 -require imgui/imgui_widgets_builtin -require imgui/imgui_containers_builtin -require imgui/imgui_visual_aids +let deadline = ref_time_ticks() + int64(timeout_sec * 1000000.0f) +while (ref_time_ticks() < deadline) { + GET("{base_url}/status") $(resp) { ... } + sleep(READY_POLL_INTERVAL_MS) +} ``` -This mirrors the canonical order every pre-`imgui_harness` example/test used verbatim. Reordering is a no-op for visibility / re-export semantics — purely a workaround for the install-time ordering bug. +That `* 1000000.0f` assumes ref-time is in microseconds. 
So: +- **Linux/macOS**: a "30s" deadline is `30 * 1e6 = 30 million nanoseconds = 30 milliseconds`. Loop fires 0-1 polls and exits. The `connect 127.0.0.1:9090 failed!` line is the one in-flight libhv connect attempt timing out — server health is fine; the loop just budgeted itself out of existence. +- **Windows (post-PR #2685)**: `ref_time_ticks()` now also returns nanoseconds on Windows, so the same 30 ms budget applies — the bug is equally visible. +- **Windows (pre-PR #2685)**: raw QPC `QuadPart` ticks at ~10 MHz worked out near enough to 1 MHz that `* 1e6` landed in the "seconds" ballpark by accident, masking the bug on Windows CI. + +# The Windows-only "require order" workaround is misleading + +[#2677](https://github.com/GaijinEntertainment/daScript/issues/2677) and a prior version of this card blamed require-order — windowed-backend stack vs. live-host stack — claiming `[_macro] installing` in `live_api.das` calling `fork_debug_agent_context(@@debug_agent)` before GLFW init was the issue. That diagnosis was wrong. The reorder happened to nudge timings just enough on Windows for the (already-too-short) loop to occasionally win the race, which read as "fix". On POSIX, the same reorder changes nothing — the loop still exits in 30 ms regardless of require order. + +If you see code in `imgui_harness.das` carrying a `// NOTE on require ordering` comment about live_api needing to come after the windowed stack: that comment is load-bearing only by accident on Windows. The real fix is in the timing math. + +# Fix + +Replace any `ref_time_ticks() + int64(seconds * 1000000.0f)` deadline pattern with platform-correct math. Two options: + +```das +// Option A — use the elapsed-microsec helper (always microseconds, all platforms) +let t_start = ref_time_ticks() +let timeout_us = int(timeout_sec * 1000000.0f) +while (get_time_usec(t_start) < timeout_us) { + ... 
+} + +// Option B — compute deadline in nanoseconds (safe on all platforms after PR #2685) +let deadline = ref_time_ticks() + int64(timeout_sec * 1000000000.0f) +// (On pre-PR #2685 Windows builds, ref_time_ticks() returned raw QPC ticks, so this +// would be wrong there. Prefer Option A if you need to support older builds.) +``` + +**Option A is the right one.** `get_time_usec(reft)` is defined per-platform in `performance_time.cpp` and always returns microseconds. Audit any other `ref_time_ticks() + ... * 1000000.0f` patterns in your codebase the same way. # How to recognize this gotcha - Test hangs at `readiness gate FAILED` (not at `body did not converge` or similar). -- External `curl` to `localhost:9090/status` works while the test hangs (proves the server is up — the popen parent specifically can't reach it). -- Always reproduces — not a flaky timing issue. -- ONLY triggers when run via `popen` (via `with_imgui_app` in `imgui_playwright`, or any `dastest` integration test). Direct `bin/Release/daslang-live.exe