macro_boost: add has_sideeffects + counter-lane elision#2691
Conversation
There was a problem hiding this comment.
Pull request overview
Adds a reusable conservative has_sideeffects(expr) predicate to daslib/macro_boost.das and wires it into the linq_fold counter-lane planner so that the discardable var vfinal = projection bind is elided at macro time when the projection is provably pure. Comes with a probe-based test module exercising 24 expression shapes plus two AST-level integration tests asserting the elision (and its preservation for impure projections). Follow-up to PR #2689 (Phase 2A loop planner).
Changes:
- Introduces
has_sideeffects,func_has_sideeffects,is_safe_op1,is_safe_op2([macro_function]) indaslib/macro_boost.das. - Gates the counter-lane
vfinalbind indaslib/linq_fold.dasonhas_sideeffects(projection). - Adds
tests/macro_boost/test_has_sideeffects.das+_has_sideeffects_probe.dasand two new AST tests intests/linq/test_linq_fold_ast.das; updates theselect_countrow inbenchmarks/sql/LINQ.md.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| daslib/macro_boost.das | Adds the conservative has_sideeffects predicate and helpers. |
| daslib/linq_fold.das | Uses has_sideeffects to skip emitting the discardable vfinal bind for pure projections. |
| tests/macro_boost/test_has_sideeffects.das | 24 classification test cases for the new predicate. |
| tests/macro_boost/_has_sideeffects_probe.das | Helper module exposing a _test_has_sideeffects call_macro that emits ExprConstBool. |
| tests/linq/test_linq_fold_ast.das | Adds an impure-projection target and two AST tests asserting counter-lane bind elision behavior. |
| benchmarks/sql/LINQ.md | Updates the select_count note to reference the new macro-time elision path. |
Comments suppressed due to low confidence (4)
daslib/macro_boost.das:244
func_has_sideeffectstreats any builtin function whose flags don't setknownSideEffectsorunsafeOperationas side-effect-free. Combined with the operator paths above, this can misclassify mutation operators that aren't on the safe-op allowlist. In particular:
ExprOp1with op++,--,+++,---falls through!is_safe_op1(op)and then throughfunc_has_sideeffects(func). If the builtin overload for these increment/decrement operators on workhorse types is not flaggedknownSideEffects/unsafeOperation,has_sideeffectsreturnsfalseand reportsx++as pure.ExprOp2compound assignment operators (+=,-=,*=,/=,%=,&=,|=,^=,<<=,>>=,||=,&&=,^^=,<<<=,>>>=) are parsed asExprOp2(seeInferTypes::isAssignmentOperatorinsrc/ast/ast_infer_type_op.cpp:116). None are inis_safe_op2, but the samefunc_has_sideeffectsfallback applies — if the resolved builtin doesn't carry one of the two flags, the mutation is classified as pure.
Since has_sideeffects is a public, general-purpose predicate (not just used for the linq counter lane), and the contract is "false is a promise of purity", these operators must be on the unsafe side unconditionally regardless of how the resolved builtin happens to be flagged. Consider blacklisting any ExprOp1 op ending in ++/-- and any ExprOp2 op ending in = (other than the already-handled relational ops) up front, before consulting func.flags. The test suite in tests/macro_boost/test_has_sideeffects.das doesn't cover these mutation operators, so a regression here would go undetected.
if (expr is ExprOp1) {
let e1 = expr as ExprOp1
if (!is_safe_op1(e1.op) && func_has_sideeffects(e1.func)) return true
return has_sideeffects(e1.subexpr)
}
if (expr is ExprOp2) {
let e2 = expr as ExprOp2
// Unsafe: division/modulo (div-by-zero panic, design decision); or op not in the
// safe allowlist AND the resolved func indicates side effects. The allowlist also
// bypasses func==null artifacts from partial folding.
if (e2.op == "/" || e2.op == "%"
|| (!is_safe_op2(e2.op) && func_has_sideeffects(e2.func))) return true
return has_sideeffects(e2.left) || has_sideeffects(e2.right)
}
daslib/macro_boost.das:268
func_has_sideeffectsreturnstrue(i.e., "has side effects") whenf == null OR !f.flags.builtIn OR f.flags.knownSideEffects OR f.flags.unsafeOperation. The parenthesization is correct, but the function name and the way callers use it (if (... && func_has_sideeffects(e1.func)) return true) make the polarity hard to follow — particularly the!f.flags.builtInterm, which says "any non-builtin function is unsafe". A short docstring clarification or renaming to e.g.func_may_have_sideeffects(matching the conservative semantics ofhas_sideeffectsitself) would make this less error-prone for future maintainers. The fact thatf == nullis classified as "unsafe" here is also worth noting explicitly, since the surrounding comments at lines 222–225 describe the null-func case as something the allowlist "bypasses" — a reader could reasonably expectfunc_has_sideeffects(null)to be a safer default.
[macro_function]
def private func_has_sideeffects(f : Function?) : bool {
//! True when calling `f` may have side effects. Allowlists builtins
//! (`flags.builtIn`) without `knownSideEffects` or `unsafeOperation`.
return (f == null || !f.flags.builtIn
|| f.flags.knownSideEffects || f.flags.unsafeOperation)
}
daslib/macro_boost.das:185
ExprCastrecurses only intosubexpr, which is correct, but casts that invoke a user-defined conversion function (cast through[generic]/overloaded conversion) can still carry side effects. In daslang,ExprCastis largely a typing/representation change, but if a future cast path lowers into a call expression, that would surface as anExprCallrather than staying insideExprCast. This is informational — no change required if the current AST guarantees no embedded calls inExprCast— but worth confirming.
if (expr is ExprCast) return has_sideeffects((expr as ExprCast).subexpr)
tests/macro_boost/test_has_sideeffects.das:160
- The test file covers 24 cases as advertised, but doesn't exercise the mutation-operator paths (
++,--,+=,-=, etc.) that flow through thefunc_has_sideeffectsfallback inExprOp1/ExprOp2. Given that the design ofhas_sideeffectsexplicitly comments out compound-assignment ops fromis_safe_op2(line 280: "compound-assignment ops are not in the allowlist (mutation)"), adding a test liket |> equal(_test_has_sideeffects((_x += 1, _x)), true)(or whatever construction the comma/sequence operator supports) would lock in the contract and catch regressions if the builtin flags change. Same for unary_x++. This is the most consequential gap in coverage for a "conservative side-effect predicate".
// ── UNSAFE cases — has_sideeffects must return true ──────────────────────
[test]
def test_user_call_unsafe(t : T?) {
var _x = 5
t |> equal(_test_has_sideeffects(side_effect_fn(_x)), true)
}
[test]
def test_table_insert_subscript(t : T?) {
var _tab : table<string; int>
// _tab[k] auto-inserts a default value if k is missing — side effect.
t |> equal(_test_has_sideeffects(_tab["k"]), true)
}
[test]
def test_division_unsafe(t : T?) {
var _x = 10
var _y = 2
// `/` can panic on div-by-zero — kept on the unsafe side by explicit blacklist.
t |> equal(_test_has_sideeffects(_x / _y), true)
}
[test]
def test_modulo_unsafe(t : T?) {
var _x = 10
var _y = 3
t |> equal(_test_has_sideeffects(_x % _y), true)
}
[test]
def test_array_literal_alloc(t : T?) {
t |> equal(_test_has_sideeffects([1, 2, 3]), true)
}
[test]
def test_struct_construct_alloc(t : T?) {
t |> equal(_test_has_sideeffects(Foo(a = 1, b = 2)), true)
}
[test]
def test_string_builder_unsafe_part(t : T?) {
var _x = 5
// The string interpolation itself is safe, but a side-effecting operand propagates.
t |> equal(_test_has_sideeffects("hello {side_effect_fn(_x)}"), true)
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Adds a reusable conservative `has_sideeffects(expr) : bool` predicate to
daslib/macro_boost. Returns true if an expression has — or might have —
side effects; false ONLY when provably pure. Intended for macro-time
elision of discardable evaluations.
Classification:
- Safe leaves: ExprVar, all ExprConst*, ExprAddr, ExprTypeInfo/Decl/Tag.
- Safe via recursion: ExprField/SafeField/Swizzle, ExprRef2Value/Ptr,
ExprPtr2Ref, ExprAddr, ExprIs/IsVariant/AsVariant/SafeAsVariant,
ExprCast, ExprNullCoalescing, ExprStringBuilder (string heap is
no-op per compiler), ExprKeyExists (pure container read).
- ExprAt: safe when subexpr type is NOT isGoodTableType (tables auto-
insert on missing key); ExprSafeAt always safe.
- ExprOp1/Op2/Op3: op-name allowlist for pure ops on workhorse types
(bypasses func==null artifacts from partial folding); falls back to
the function-flag check. `/` and `%` blacklisted (div-by-zero panic).
- ExprCall: allowlist `func.flags.builtIn && !knownSideEffects &&
!unsafeOperation`, recurse args.
- Everything else: conservative true.
Counter-lane integration in daslib/linq_fold.das:
1. Discardable `var vfinal = projection` bind is now emitted only when
`has_sideeffects(projection)` returns true. Pure projections like
`_._field * 2` produce a bare-loop counter at macro time, no
optimizer DCE required.
2. count→length shortcut: when the counter lane has no where-filter
AND every projection in the chain is pure AND the source has a known
length (array/table/string/range/fixed-array), the planner emits
`length(src)` directly — the loop is elided entirely. select_count
benchmark drops from 2 ns/op to 0 ns/op.
3. peel_each fix: `each` is a daslang generic, so the resolved
`func.name` on a typed call is the mangled instance. The original
peel only matched `func.name == "each"` and never fired for typed
chains. Now also checks `func.fromGeneric.name == "each"`. Gated to
array-shaped arguments (isGoodArrayType || isArray) so iterator-
yielding sources like `each(range(10))` keep their wrapper.
4. Block-parameter typedecl branched on source shape: iterator sources
keep `-const` (rvalue, must be consumable); array sources keep the
source's `const&` modifier (peeled `let arr <-` is const-ref).
Tests:
- tests/macro_boost/test_has_sideeffects.das — 24 cases (17 safe + 5
unsafe + 2 conservative-unsafe) wired via a `_test_has_sideeffects`
probe call_macro that emits ExprConstBool at macro time.
- tests/linq/test_linq_fold_ast.das — 5 new tests:
* test_pure_projection_uses_length_shortcut — invoke body returns
`length(src)` directly, no for loop.
* test_bare_count_uses_length_shortcut — same for `each(arr).count()`.
* test_impure_projection_keeps_bind — for-body has bind + ++acc.
* test_peel_each_on_array_source / _on_bare_count — assert peel fires.
* test_peel_each_skips_non_array_source — `each(range(...))` keeps
its wrapper (gate prevents iterator-source peeling).
* test_target_each_range_count_runs — behavioral check for
iterator-source chains.
Benchmarks (100K rows, INTERP, vs Phase 2A baseline):
- select_count: 2 → 0 ns/op (length shortcut elides loop entirely)
- chained_where: 8 → 6 ns/op (peel + const-ref param)
- count_aggregate: 5 → 4 ns/op (1ns from peel)
- to_array_filter: 11 → 10 ns/op (1ns from peel)
569/569 linq tests + 51/51 fold-AST + 24/24 has_sideeffects pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
19c7869 to
c6a9d79
Compare
…owlist Two correctness fixes from Copilot review on PR #2691: 1. **Mutation operators bypass.** `++`, `--`, and compound-assignment ops (`+=`, `-=`, `*=`, `/=`, `%=`, `&=`, `|=`, `^=`, `<<=`, `>>=`, `&&=`, `||=`, `^^=`) fall through the `is_safe_op{1,2}` allowlist check, but the fallback through `func_has_sideeffects` only catches them if the resolved C++ builtin happens to carry the right flag. If the builtin missed the flag (or there is no resolved builtin), `x++` classifies as pure. Add `is_mutation_op1` / `is_mutation_op2` blacklists invoked up front, before any flag check. Note the AST op-string convention: postfix `++` / `--` are `"+++"` / `"---"`. 2. **User operator overloads bypass.** When `e2.op` is in the safe allowlist (`+`, `-`, `*`, `==`, etc.), the old code skipped the `func_has_sideeffects(e2.func)` check entirely. A user-defined `def operator +(...)` on a custom type would then classify as pure regardless of side effects. Restructure: `func != null` → trust the func flags (non-builtin defaults to unsafe via `func_has_sideeffects`); `func == null` → fall back to op-name allowlist for partial-folding artifacts. Tests: - `test_postfix_increment_unsafe`, `test_postfix_decrement_unsafe` - `test_user_op_overload_unsafe` (defines `operator +` on a private struct with a global-counter side effect) CI fix: register `tests/macro_boost/` in `tests/aot/CMakeLists.txt` (missed when the test directory was created in the parent commit). Mirrors the `tests/linq/` pattern: a test-files glob + a module-files list for the `_has_sideeffects_probe.das` helper. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ssion Cards added in the course of the linq_fold splice rewrite + PR GaijinEntertainment#2691 (has_sideeffects + counter-lane elision). Topics: linq_fold / macro-emission patterns: - daslang-generic-instance-detect-via-fromgeneric — func.fromGeneric is the canonical "which generic was this instantiated from?" link; func.name on typed instances is mangled. - daslib-macro-boost-has-sideeffects-predicate — new public predicate, full classification table, known limitations, test plumbing. - qmacro-invoke-source-bind-typedecl-modifier-iter-vs-array — typedecl block-param const/ref handling differs between iterator and array sources; the two diagnostic error messages tell you which branch you picked wrong. - qmacro-gensym-per-callsite-via-lineinfo — backtick-prefixed names + line+column suffix, force_at / force_generated / can_shadow. - my-fold-macro-emits-a-loop-with-for-it-in-source-... (UPDATED) — peel_each pattern corrected for generic-instance detection + positive array gate + block-param typedecl handling. LINQ semantics: - are-there-parity-tests-in-tests-linq-that-compare-fold-output-to-... - which-typedecl-predicates-identify-types-where-length-expr-is-... - why-does-each-arr-fail-with-unsafe-when-not-source-of-for-loop-... - what-s-the-right-sqlite-linq-chain-form-for-aggregates-sum-min-max-... - my-macro-substitutes-it-for-a-projection-expression-via-template-... - when-a-call-macro-needs-to-pick-copy-vs-move-init-for-a-projection-... - where-does-nolint-rule-go-when-a-lint-warning-is-emitted-from-inside-... Tooling / ops: - how-do-i-run-dastest-in-benchmark-only-mode-and-what-s-the-command-... - cpp-profiler-macos-samply-instruments.md - what-s-the-end-to-end-checklist-for-adding-a-new-daslib-das-module-... - how-do-i-call-a-dasimgui-or-any-managed-c-method-on-a-struct-field-... Updated: - why-does-my-dastest-integration-test-hang-at-readiness-gate-failed-... — original card pointed at a require-order red herring; real cause was ref_time_ticks() returning ns on POSIX while wait_until_ready's deadline math assumed μs. Fix landed in PR GaijinEntertainment#2685. No code changes — docs only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Adds
has_sideeffects(expr) : boolto daslib/macro_boost.das — a reusable conservative side-effect predicate. Returnstruewhen an expression has or might have side effects;falseonly when provably pure. Wired into daslib/linq_fold.das for three counter-lane optimizations: discardable-bind elision, count→length shortcut, andeach(<array>)peeling.Follow-up to PR #2689 (Phase 2A loop planner).
Classification
SAFE — recurse operands:
ExprVar, allExprConst*,ExprAddr,ExprTypeInfo/Decl/TagExprField,ExprSafeField,ExprSwizzleExprRef2Value,ExprRef2Ptr,ExprPtr2Ref,ExprAddrExprIs,ExprIsVariant,ExprAsVariant,ExprSafeAsVariantExprCast,ExprNullCoalescingExprStringBuilder(string heap is no-op per compiler — recurse into.elements)ExprKeyExists(pure container read)ExprAtwhen.subexpr._typeis NOTisGoodTableType(tables auto-insert on missing key)ExprSafeAtalwaysExprOp1/Op2/Op3via pure-op-name allowlist on workhorse types (bypassesfunc==nullartifacts from partial folding); falls back tofunc.flags.builtIn && !knownSideEffects && !unsafeOperation./and%blacklisted (div-by-zero panic).ExprCallwhen func is a C++ builtin with no side-effect flagsUNSAFE (return true): everything else, including allocations (
ExprNew,ExprMakeArray,ExprMakeStruct, etc.), user-defined calls, lambda invocations.Counter-lane integration
Three planner-level uses in linq_fold.das:
1. Discardable bind elision — the per-element
var vfinal = projectionis now emitted only whenhas_sideeffects(projection)is true. Pure projections (_._field * 2) produce a bare-loop counter directly in the macro output, no optimizer DCE needed. Impure projections keep the bind, preserving the per-element evaluation invariant from test_counter_lane_projection_side_effects.2. count→length shortcut — when the counter lane has no where-filter AND every projection in the chain is pure AND the source has known length (array/table/string/range/fixed-array), the planner emits
length(src)directly. The loop is elided entirely.3. peel_each fix —
eachis a daslang generic; the resolvedfunc.nameis the mangled instance, so the original peel never fired for typed chains. Now also checksfunc.fromGeneric.name == "each". Gated to array-shaped arguments only (isGoodArrayType || isArray) so iterator-yielding sources likeeach(range(10))keep their wrapper. Block-parameter typedecl is branched on source shape: iterator sources keep-const(rvalue, must be consumable); array sources keep the source'sconst&modifier solet arr <-chains type-check.Tests
_test_has_sideeffectsprobe call_macro (tests/macro_boost/_has_sideeffects_probe.das) that runs the predicate at macro time and emitsExprConstBoolof the result.test_pure_projection_uses_length_shortcut— invoke body returnslength(src)directly, no for-looptest_bare_count_uses_length_shortcut— same foreach(arr).count()test_impure_projection_keeps_bind— for-body has bind +++acctest_peel_each_on_array_source/_on_bare_count— assert peel fires for arraystest_peel_each_skips_non_array_source—each(range(...))keeps its wrapper (gate prevents iterator-source peeling)test_target_each_range_count_runs— behavioral check for iterator-source chainsBenchmark deltas (100K rows, INTERP, vs Phase 2A baseline)
select → countwhere → where → countwhere → countwhere → select → to_arrayTest plan
mcp__daslang__lintclean on new/edited filesmcp__daslang__compile_checkpasses on new/edited filesmcp__daslang__run_testontests/macro_boost/test_has_sideeffects.das— 24/24 passmcp__daslang__run_testontests/linq/test_linq_fold_ast.das— 51/51 passmcp__daslang__run_testontests/linq/test_linq_fold.das— 77/77 pass (incl. existing side-effects test)tests/linq/*suite green (569 tests across 14 files)Future work
[no_side_effects]annotation so daslang-defined pure helpers (length,key_exists, etc.) can opt in to the allowlist. Currently only C++ builtins withbuiltIn=trueare caught.count(distinct(x))and similar — needs distinct/sorted/etc. handling in the planner.🤖 Generated with Claude Code