[pull] master from GaijinEntertainment:master#1022
Merged
The JIT (wasm) radio button on daslang.io/playground/ has been disabled in
production because the per-sample wasm artifacts never made it into the
deployed _site. This change wires up the staging step, ships two benchmark
samples (Dictionary, SHA-256) built through the cross-compile pipeline,
and along the way unblocks the host-LLVM wasm32 JIT path that several
samples actually need.
Playground side:
- .github/workflows/pages.yml: overlay web/output/samples/examples/*.wasm
onto _site/playground/samples/examples/ so the HEAD probe in main.js
(updateEngineAvailability) stops 404'ing.
- web/examples/ui/samples/{data.json,examples/}: rename the old
random_sequence sample to dict.das (closer to dasProfile naming),
bump n 5000 -> 200000 so it's a real benchmark, and add sha256.das
adapted from dasProfile (standalone, no config.das/testProfile deps).
- web/CMakeLists.txt: cross-compile dict + sha256 in the foreach.
- site/tests/playground/dropdowns.spec.js: update the e2e selector for
the renamed sample, add SHA-256 dropdown coverage.
dasLLVM wasm32 cross-compile fixes (host LLVM vs emcc-built runtime
archive ABI alignment, surfaced cross-compiling dict/sha256 to wasm):
* Signature drifts between modules/dasLLVM/daslib/{llvm_jit_common,
llvm_exe}.das declarations and src/builtin/module_jit.cpp:
- jit_prologue: add the 6th arg (LineInfoArg* at).
- jit_array_lock/unlock, jit_free_heap/persistent, jit_iterator_delete,
jit_simnode_interop, jit_register_standalone_variable: fix return
type (void in C++, was voidptr/int1 in the IR declaration).
- get_jit_table_at/find/erase: add the missing 2 args
(Context*, LineInfoArg*); call sites pass null since the C++
dispatcher only looks at baseType.
- llvm_jit.das jit_prologue call site: pass the 6th at arg.
* For-loop array iteration: stop ptrtoint'ing both ends to LLVMIntPtrType
(statically i64 on the 64-bit host even when the codegen target is
wasm32) and compare pointers directly. The old form let host pointer
width leak into the IR, so wasm32 lowering produced a never-true
termination compare; array<T> for-loops printed elements then ran
off the end of the buffer forever.
* init_jit now takes an optional target_triple and pins the module's
data layout BEFORE any IR is built when cross-compiling. Otherwise
LLVM eagerly constant-folds GEPs using the host's data layout and
bakes host-size strides into the IR.
* write_wasm passes +simd128,+nontrapping-fptoint to with_target_machine
so vec4f returns get lowered the same way as the emcc-built runtime
archive (which compiles with -msimd128 per web/CMakeLists.txt).
Without the feature flag, host LLVM sret-lowered vec4f returns while
the runtime returned them as native v128 - unreachable trap on every
jit_invoke_block_* / jit_call_*.
* g_target_is_wasm flag (set in init_jit, read by process_function_hints,
build_noalias_list, attach_loop_metadata): bypass user [hint(...)]
application for wasm32 - alwaysinline over complex inlined bodies
triggers a host-LLVM wasm32 backend miscompile that native JIT does
not hit. The unsafe_range_check / unsafe_alias / unsafe_capture
options also get default-false on wasm, which keeps bounds checks
in (safer; tiny perf cost).
* LLVM_JIT_CODEGEN_VERSION 0x04 -> 0x08 to invalidate native JIT DLL
caches across these changes.
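The init_jit item above concerns constants that depend on pointer width. A minimal sketch of that effect follows, assuming 8-byte host pointers and 4-byte wasm32 pointers; `LayoutModel` and `pointer_array_offset` are hypothetical names for illustration and do not exist in dasLLVM:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a data layout: only the pointer width matters here.
struct LayoutModel {
    std::uint32_t pointer_bytes;
};

constexpr LayoutModel kHost64{8};  // assumed 64-bit host
constexpr LayoutModel kWasm32{4};  // wasm32 target

// Byte offset of element `index` in an array of pointers, i.e. the constant
// an element-address computation folds to under the given layout.
constexpr std::uint64_t pointer_array_offset(LayoutModel layout, std::uint64_t index) {
    return index * layout.pointer_bytes;
}

// The same index folds to different byte offsets depending on the layout
// that is active when the constant is computed.
static_assert(pointer_array_offset(kHost64, 3) == 24, "host stride is 8 bytes");
static_assert(pointer_array_offset(kWasm32, 3) == 12, "wasm32 stride is 4 bytes");
```

If the constant is folded while the host layout is still active, the 24 stays in the IR even though the wasm32 runtime expects 12, which is why the layout has to be pinned first.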
Local browser timings on the staged playground (Chromium):
Dictionary (n=200000, 10 iters): interp 8.9 ms/iter, JIT 5 ms/iter
SHA-256 (1024 KB, 10 iters): interp 2.6 MB/sec, JIT 200 MB/sec
hello/loop/func also run cleanly in both engines after the fixes.
Residual: the sha256 JIT build hits an out-of-bounds memory access at main
exit, after the timing output. The bench output is correct and the trap
fires post-print; tracked separately. It does not affect the demo.
Lint clean on all 6 changed .das files. 285 jit_tests pass on the
interpreter; native -jit on dict + sha256 also green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ime helper
Fixes the CI wasm_cross sha256 trap surfaced after the previous commit
(memory access OOB at jit_shutdown, inside das::gc_root::~gc_root after
correct timing output is printed).
Root cause
----------
The IR for reading ctx->globals / ctx->shared computes a byte offset
from Context*:
    %gp  = GEP Context*, CONTEXT_OFFSET_OF_GLOBALS   ; e.g. + 152
    %glb = load ptr from %gp
CONTEXT_OFFSET_OF_GLOBALS comes from src/builtin/module_jit.cpp:
    addConstant<uint32_t>(*this, "CONTEXT_OFFSET_OF_GLOBALS",
        uint32_t(offsetof(Context, globals)));
`offsetof` is evaluated by the C++ compiler that builds module_jit.cpp,
i.e. the HOST compiler. On 64-bit MSVC / clang that's 152. The wasm32
runtime archive (libDaScript_runtime.a) is compiled by emcc with
wasm32 layout — 4-byte pointers — and Context::globals lives at a
different (smaller) offset. The JIT'd code reads from Context+152,
which on wasm32 lands past the field, returning a garbage "globals"
pointer. init_globals then writes the program's `let primes = ...`
constants (and any other globals) into the wrong memory, which doesn't
surface until module shutdown reads the corrupted gc_root.
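The layout mismatch described above can be reproduced in isolation. The sketch below is not the real Context class: `ContextHost64` and `ContextWasm32` are hypothetical three-field stand-ins, with pointers modelled as fixed-width integers so that both layouts can be inspected from one host compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for Context as a 64-bit host compiler lays it out.
struct ContextHost64 {
    std::uint64_t stack;    // 8-byte "pointer"
    std::uint64_t heap;     // 8-byte "pointer"
    std::uint64_t globals;  // field the JIT wants to read
};

// The same fields as a wasm32 compiler would lay them out.
struct ContextWasm32 {
    std::uint32_t stack;    // 4-byte "pointer"
    std::uint32_t heap;     // 4-byte "pointer"
    std::uint32_t globals;
};

// offsetof is evaluated by whichever compiler builds the translation unit,
// so a constant exported from a host build carries the host value.
constexpr std::size_t kHostOffset = offsetof(ContextHost64, globals);
constexpr std::size_t kWasmOffset = offsetof(ContextWasm32, globals);

static_assert(kHostOffset == 16, "two 8-byte fields precede globals");
static_assert(kWasmOffset == 8, "two 4-byte fields precede globals");
```

In this toy layout, code generated with the host constant would read 8 bytes past the field the wasm32 runtime actually populated, which is the same class of failure as the Context+152 read.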
Fix
---
* Add two C++ helpers in src/builtin/module_jit.cpp:
    DAS_API void * jit_get_globals_base(Context * ctx) { return ctx->globals; }
    DAS_API void * jit_get_shared_base (Context * ctx) { return ctx->shared; }
Registered as JIT externs alongside the existing get_jit_get_*_mnh
pair. Because they live in the runtime archive, each side compiles
them with the right Context layout for that target.
* In modules/dasLLVM/daslib/llvm_jit.das, gate the two global-pointer
resolution sites on g_target_is_wasm:
    if (g_target_is_wasm) {
        // call jit_get_globals_base / jit_get_shared_base
    } else {
        // inline GEP + load using CONTEXT_OFFSET_OF_GLOBALS (unchanged)
    }
Native JIT keeps the inlined GEP+load (no extra call), so there is
no perf cost for the JIT path that already worked. Only the
wasm-cross-compile target goes through the helper — and on wasm
LLVM inlines it through the runtime archive at link time anyway.
* LLVM_JIT_CODEGEN_VERSION 0x08 -> 0x09 (caches that bake the IR
shape must invalidate).
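The two resolution paths in the fix can be sketched in plain C++. This is an illustration under assumptions, not the code from module_jit.cpp or llvm_jit.das: the reduced `Context`, the `sketch_*` accessors, and `resolve_globals` are all hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical, much-reduced Context; field names follow the commit message.
struct Context {
    void * stack;
    void * globals;
    void * shared;
};

// Accessors compiled in the same build as Context, so they always use the
// layout of the target this translation unit is built for.
extern "C" void * sketch_get_globals_base(Context * ctx) { return ctx->globals; }
extern "C" void * sketch_get_shared_base(Context * ctx) { return ctx->shared; }

// Models what generated code does on each path.
void * resolve_globals(Context * ctx, bool target_is_wasm, std::size_t baked_offset) {
    if (target_is_wasm) {
        // Cross-compile path: ask the runtime, never trust a host offset.
        return sketch_get_globals_base(ctx);
    }
    // Native path: "GEP + load" with an offset baked in at codegen time.
    void * value = nullptr;
    std::memcpy(&value, reinterpret_cast<const char *>(ctx) + baked_offset, sizeof(value));
    return value;
}
```

On the native path the baked offset and the runtime come from the same compiler, so the inline read is safe; the accessor only matters when the two can disagree.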
Verification
------------
Local repro with wasmtime 44.0.1 (same version CI uses):
    wasmtime -W exceptions=y sha256.wasm
    "sha256", 0.006423000, 10
    155.69049 mb/sec
    -> exit 0 (was: out of bounds memory access at jit_shutdown)
Browser playground also clean for all five samples (hello / loop /
func / dict / sha256).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* init_jit features string: only set +simd128/+nontrapping-fptoint when
the cross-compile target is wasm. Previously applied to ANY non-empty
triple, which would break a future non-wasm cross-compile (e.g. aarch64,
riscv) at LLVMCreateTargetMachine time. Caught by Copilot.
* sha256 sample header: clarify up front that this is the SHA-256
COMPRESSION FUNCTION (no padding finalization), kept identical to the
dasProfile cross-language bench shape so the numbers compare apples to
apples. Input is always 1024 bytes (16 full blocks), so the
no-trailing-bytes case doesn't arise. Caught by Copilot.
* pages.yml playground samples staging: cp from site/playground/samples
(the directory both stage_site_playground AND
stage_site_playground_wasm populate as their canonical output) instead
of the previous two-step from web/examples/ui/samples plus the .wasm
overlay from web/output/samples/examples. Drops one cp step and one
source path. Caught by aleksisch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
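The features-string gating from the first item can be sketched as a small predicate on the target triple. The helper name `features_for_triple` and the prefix test are assumptions made for illustration; the actual check lives in init_jit and may be written differently.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: pick the LLVM feature string for a cross-compile
// target. Only wasm triples get the wasm-specific features; any other
// triple (including an empty one for native JIT) gets none.
std::string features_for_triple(std::string_view triple) {
    const bool is_wasm = triple.substr(0, 4) == "wasm";
    return is_wasm ? "+simd128,+nontrapping-fptoint" : "";
}
```

With this shape, a triple such as aarch64-unknown-linux-gnu yields an empty feature string instead of wasm-only flags that the target would reject.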
…inalize_decs_emission extract, qn sweep
- A1: collapse `finalize_emission` into `finalize_emission_stmts`. Single
caller (`plan_reverse`) now extracts its qmacro_block body via
`push_block_list` before delegating. -8 LOC.
- A2: extract `finalize_decs_emission(emission, at, wrapToIter)` helper.
Three callers (`plan_decs_order_family`, `plan_decs_reverse`,
`plan_decs_distinct`) consolidate the `force_at + force_generated +
conditional iter wrap` tail into a single call. The two wrap conditions
(`needIterWrap && returnsArray` for order_family, `needIterWrap &&
needBuffer` for distinct) merge into a pre-multiplied `wrapToIter`
boolean at the call site. -9 LOC net.
- qname sweep: 124 of 132 `"`{prefix}`{at.line}`{at.column}"` sites
collapse to `qn("prefix", at)`. The 8 remaining sites carry an extra
backtick-segment after `{at.column}` (e.g. `{length(preCondStmts)}`,
`{spec.slot}`) that doesn't fit `qn`'s signature; left as-is.
- Category A inline-collapse / Category B push_block_list adoption:
SKIPPED after fresh audit. The plan's targeted sites have shifted to
conditional-push shapes (after prior slices) that no longer fit clean
inline-collapse. Defer to a future cleanup if it surfaces again.
Codegen invariant: `qn()` returns the byte-identical string as the inline
form, so all AOT baselines pass without refresh. Bench smoke 7 reps × 7
benches: m3f figures byte-identical to PR #2796 baselines (count 4 / sum 2
/ average 5 / aggregate_match 6 / groupby_count 36 / zip_dot_product 7 /
distinct_count 15 ns/op).
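The byte-identical claim can be illustrated with a C++ analogue of the helper. This is not the daslang implementation: `LineInfoSketch` and `qn_sketch` are hypothetical, and the sketch assumes the name is a backtick, the prefix, a backtick, the line, a backtick, and the column, as the inline form quoted above suggests.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical stand-in for the source location the macro receives.
struct LineInfoSketch {
    unsigned line;
    unsigned column;
};

// Builds a location-qualified name from a prefix and a source location.
std::string qn_sketch(std::string_view prefix, const LineInfoSketch & at) {
    std::string name = "`";
    name.append(prefix.data(), prefix.size());
    name += '`';
    name += std::to_string(at.line);
    name += '`';
    name += std::to_string(at.column);
    return name;
}

// The inline form the helper replaces, written out for comparison.
std::string inline_form(const std::string & prefix, const LineInfoSketch & at) {
    return "`" + prefix + "`" + std::to_string(at.line) + "`" + std::to_string(at.column);
}
```

Because both functions produce the same bytes for the same inputs, swapping one for the other cannot change generated names, which is the property the AOT baselines rely on.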
Validation matrix (9 lanes):
- Interp: linq 1332/1332, decs 245/245, ast_match 371/371, dasSQLITE 782/782
- AOT: same
- JIT: same modulo test_capture_cfb.das pre-existing failure on master
Net: -16 LOC (139 ins / 155 del). MCP lint + CI lint + format all clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…fixes Playground JIT toggle: ship benchmark samples + fix wasm32 cross-compile
…echanical-cleanup linq_fold: mechanical cleanups (PR 3) — finalize_emission collapse + qn sweep
…_from + qmacro_block_to_array
Replace runs of two-or-more `stmts |> push <| qmacro_expr() { ... }` calls into
the same array with a single `stmts |> push_from <| qmacro_block_to_array() { ... }`
emission. Each cluster body reads as one multi-statement emission unit instead of
N separate per-statement pushes.
Pattern is composed entirely from existing pieces: `push_from` from builtin.das
plus the `qmacro_block_to_array` macro already used in 15 init sites in this
file (the `var stmts <- qmacro_block_to_array() { ... }` form), so consumers
familiar with that idiom recognize the shape immediately.
Net: -50 LOC (21 ins / 71 del), single file. Audit covered decs_boost
(0 raw sites), sqlite_linq (6 isolated sites, no clusters), and ast_match
(97 sites but all in the single-line `push <| qmacro_expr(${...})` paren form,
already as compact as collapsing achieves) — only linq_fold has the multi-line
cluster pattern.
Validation: 9-lane matrix (interp+AOT+JIT × linq 1332 / decs 245 / ast_match 371 /
dasSQLITE 782) all green; bench smoke 7×7 m3f byte-identical to master across
count/sum/average/aggregate_match/groupby_count/zip_dot_product/distinct_count.
MCP lint + CI lint + format_file all clean.
…ush-cluster-collapse linq_fold: collapse 21 consecutive qmacro_expr push clusters
…ush cluster consolidation
Codifies the patterns established by PRs #2793-#2799 so future macro work
follows the documented forms instead of rediscovering them.
Three new sections:
- "Shared AST-match helpers": table of 11 public helpers in
daslib/ast_match.das + daslib/templates_boost.das (match_call_in_module,
match_call_in_linq, peel_lambda_*, peel_tuple_field_read,
extract_const_string, qn, qm_peel_ref2value, push_block_list) with
signatures + when-to-reach-for-each. Includes a "when patterns apply vs
don't" note: introspection-heavy files (linq_fold, sqlite_linq,
ast_match) benefit; emit-only files (decs_boost, the emitter half of
templates_boost) don't.
- "qmatch — predicate-style pattern matching": anti-pattern (hand-rolled
is X / as X cascades) vs preferred predicate form with $e/$f/$v/$i tags
bound to PRE-DECLARED outer variables (not result-struct fields).
Documents the QMatchResult shape, points to sqlite_linq for 37+ adoption
sites + tests/ast_match for grammar exercises.
- "Push cluster consolidation" (new subsection under "qmacro vs quote"):
consecutive `arr |> push <| qmacro_expr() { ... }` runs into the same
array collapse into a single emission via either Form A (push_from +
qmacro_block_to_array, preferred, no clone) or Form B (push_block_list +
qmacro_block, clones, use when the source block stays alive). Includes
"when NOT to collapse" guard.
One section updated:
- "Peel ExprRef2Value before qmatch" now routes through qm_peel_ref2value
(single source of truth) instead of showing the manual if-peel snippet.
Adds note on why the helper still uses while-peel (conservative until
ast_block_folding.cpp synthesis paths are audited).
PR 6 (decs_boost migration from the original ladder plan) intentionally
skipped: audit confirmed decs_boost has zero hand-rolled is_*_call
helpers, zero qname construction, zero ExprRef2Value while-loops, zero
push qmacro_expr clusters, and zero peel_lambda candidates. The file is
already lean; the migration would manufacture work.
+100 / -15 LOC. Doc-only; no code changes.
…skill-helpers-doc skills/das_macros: document AST-match helpers, qmatch idiom, push cluster consolidation
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)