[pull] master from GaijinEntertainment:master by pull[bot] · Pull Request #1007 · forksnd/daScript

pull · 2026-05-18T08:58:27Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)


Closes most of the gap to libc++ std::sort and beats libstdc++ std::sort in 19/20 cells. Header moves to include/daScript/simulate/ so aot.h can wire the typed das_sort<T> into the workhorse binding path. Bench gets a parallel libstdc++ build target so we can A/B both stdlibs from one tree.

Algorithm changes in include/daScript/simulate/das_qsort_r.h (promoted from src/builtin/):
- size_t indices throughout (>2G element support)
- byte_swap sized dispatch for w ∈ {4,8,16,32,64,128,256}, chunked memcpy fallback for the rest. 30-140× faster than the generic loop at common widths (micro-bench in examples/sort/bench_byte_swap.cpp)
- New das_block_partition_r: byte-pointer port of libc++ __bitset_partition. Populate a uint64_t mask of comparison outcomes for 64 elements branchlessly, then drive swaps with countr_zero. Cuts mispredictions from ~32/partition to ~1/64 on random data
- das_qsort_r is now hybrid: block partition for hi-lo ≥ 128, Hoare for smaller ranges. Median-of-3 pivot placed at data[lo] for both paths
- New das_sort<T, Compare> + das_sort_block<T, Compare>: typed mirrors of the byte-pointer impls. Same algorithm shape using std::swap and typed indexing. Provides the apples-to-apples peer for std::sort and the daslang typed-binding entry point
- sized_memcpy helper for hole-sliding sift_down (das_sift_down_r) inner loop. Per-level memcpy at known struct widths lowers to a single SIMD load/store pair
- das_heapsort_helper_r / das_make_heap_r / das_push_heap_r / das_pop_heap_r unchanged (Phase 0 winners: hole-sliding sift was already the bake-off champion for those)

Daslang binding (include/daScript/simulate/aot.h): the 10 typed-sort call sites in scblk / scblk_array / builtin_sort_cblock switch from unqualified sort() (== std::sort via using namespace std) to das_sort. Linux/libstdc++ users gain ~1.5× on typed sorts; Mac/libc++ users see no regression because compile-time constant propagation through sizeof(T) already specializes our template to match libc++ performance on workhorse types and beats it on struct types.

Bench infrastructure (examples/sort/):
- bench_sort_family.cpp: 5-arm sort deep-dive table (std::sort, C qsort, das_qsort_r, das_qsort_block_r, das_sort<T>, das_sort_block<T>), correctness verification on every candidate, stdlib + compiler print
- bench_byte_swap.cpp: new standalone micro-bench for the byte_swap primitive (chunked256, chunked64, words64-kernel-style, sized-dispatch, hybrid)
- CMakeLists.txt: optional parallel libstdc++ build target (gated on g++-N availability) so a single configure produces both libc++ and libstdc++ binaries

Phase 0.1 bake-off scaffolding (the candidates from Phase 0.1: introsort, pdqsort-lite Hoare variant, Lomuto introselect, Floyd two-phase sift, ternary qsort, etc.) is not retained in the final header. Final state is: byte-pointer block-partition pdqsort hybrid, typed mirror, byte-pointer hole-sliding heap ops, byte-pointer heap-of-N partial_sort, byte-pointer Hoare-introselect nth_element.

Headline benchmarks at N=100K (M-series Mac):

vs libc++ std::* (pdqsort + block-partition):
- 9/20 wins, including all of nth_element (0.64-0.74×), sort/struct types (0.61-0.91×), make_heap/int32 (0.95×)
- Losses: sort/workhorse (1.37-1.38×), heap_sort/big structs (1.12×)

vs libstdc++ std::* (Musser introsort):
- 19/20 wins. Only heap_sort/P128 ties (1.01×). Across the board we beat libstdc++ 1.1-1.8×; Musser introsort hasn't been updated to pdqsort upstream

Daslang runtime:
- sort_struct_by_key/100K cblock path: 281 → 255 ns/op (9% faster)
- m3_topn_array/100K (top_n_by) = 38 ns/op, matches SQLite's ORDER BY ... LIMIT 10 at 37 ns/op (LINQ-vs-SQL parity restored)

Verification: ctest 29/29; tests/linq/test_linq_sorting.das 59/59; full dastest 8378 tests (8372 pass, 6 skipped, 0 failures, identical to master baseline).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
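The mask-driven partition step described in the algorithm changes can be illustrated with a small typed sketch. This is not the code from das_qsort_r.h: the function names are hypothetical, elements are plain int, both blocks are indexed forward (the real port may walk the right block backward), and a portable loop stands in for C++20 std::countr_zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>

// Index of the lowest set bit; m must be non-zero.
// Portable stand-in for std::countr_zero from C++20 <bit>.
inline unsigned lowest_set_bit(std::uint64_t m) {
    assert(m != 0);
    unsigned n = 0;
    while ((m & 1u) == 0) { m >>= 1; ++n; }
    return n;
}

// Bit i is set when left[i] belongs on the right side (left[i] >= pivot).
// The comparison result is shifted into the mask, so the loop body has no
// data-dependent branch.
inline std::uint64_t misplaced_in_left(const int* left, std::size_t n, int pivot) {
    assert(n <= 64);
    std::uint64_t mask = 0;
    for (std::size_t i = 0; i < n; ++i)
        mask |= static_cast<std::uint64_t>(left[i] >= pivot) << i;
    return mask;
}

// Bit i is set when right[i] belongs on the left side (right[i] < pivot).
inline std::uint64_t misplaced_in_right(const int* right, std::size_t n, int pivot) {
    assert(n <= 64);
    std::uint64_t mask = 0;
    for (std::size_t i = 0; i < n; ++i)
        mask |= static_cast<std::uint64_t>(right[i] < pivot) << i;
    return mask;
}

// Pair up misplaced elements from both blocks and swap them. The only branch
// is the loop condition, which depends on the masks rather than on each
// individual comparison. Returns the number of swaps performed.
inline std::size_t swap_misplaced(int* left, int* right,
                                  std::uint64_t& lmask, std::uint64_t& rmask) {
    std::size_t swaps = 0;
    while (lmask != 0 && rmask != 0) {
        std::swap(left[lowest_set_bit(lmask)], right[lowest_set_bit(rmask)]);
        lmask &= lmask - 1;  // clear the lowest set bit
        rmask &= rmask - 1;
        ++swaps;
    }
    return swaps;
}
```

A full partition would repeat this over successive 64-element blocks and carry any leftover mask bits into the next round; that bookkeeping is omitted here.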

Cache the questions that took >5 min of in-session research so future sessions can answer them in 1 ask:

PR #2707 (sort family bake-off) findings:
- byte-swap-micro-win-invisible-under-cblock-dominance
- das-qsort-r-vs-std-perf-comparison
- libcxx-stdsort-block-partition-pdqsort
- qsort-byte-swap-implementations-survey
- standalone-example-no-daslang-link
- what-daslib-operations-exist-for-partial-sort-nth-element-heap-ops-and-top-n-selection
- what-s-the-right-anti-dce-pattern-for-a-c-microbenchmark-inner-loop-so-the-optimizer-can-t-elide-it
- where-are-the-cross-compiler-bit-scan-and-popcount-helpers-in-daslang-s-c-headers

Doc-CI iteration findings:
- sphinx-w-fails-on-my-pr-branch-with-undefined-label-struct-module-x-but-master-ci-is-green-...
- what-ci-checks-must-pass-when-i-regenerate-doc-source-stdlib-via-das2rst-das

Site-deploy gotcha:
- why-does-a-new-top-level-html-page-e-g-daspkg-html-added-under-site-404-on-daslang-io-after-merging-to-master

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…-bakeoff Sort family: block-partition pdqsort + typed das_sort<T>
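The sized byte_swap dispatch mentioned in the sort-family commit message can be sketched roughly as follows. This is an illustration under assumptions, not the code from das_qsort_r.h: the names swap_fixed, swap_chunked and byte_swap_sketch are hypothetical, the 64-byte chunk size is arbitrary, and the two buffers are assumed not to overlap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Swap exactly W bytes. Because W is a compile-time constant, compilers can
// usually lower these memcpy calls to a few wide loads and stores.
template <std::size_t W>
inline void swap_fixed(unsigned char* a, unsigned char* b) {
    unsigned char tmp[W];
    std::memcpy(tmp, a, W);
    std::memcpy(a, b, W);
    std::memcpy(b, tmp, W);
}

// Fallback for widths without a dedicated case: swap in chunks of up to
// 64 bytes through a small stack buffer.
inline void swap_chunked(unsigned char* a, unsigned char* b, std::size_t w) {
    unsigned char tmp[64];
    while (w > 0) {
        const std::size_t n = w < sizeof(tmp) ? w : sizeof(tmp);
        std::memcpy(tmp, a, n);
        std::memcpy(a, b, n);
        std::memcpy(b, tmp, n);
        a += n;
        b += n;
        w -= n;
    }
}

// Dispatch on the runtime element width to a fixed-width swap where one exists.
inline void byte_swap_sketch(void* pa, void* pb, std::size_t w) {
    assert(pa != nullptr && pb != nullptr);
    if (pa == pb) return;  // nothing to do, and memcpy must not see overlap
    auto* a = static_cast<unsigned char*>(pa);
    auto* b = static_cast<unsigned char*>(pb);
    switch (w) {
        case 4:   swap_fixed<4>(a, b);   break;
        case 8:   swap_fixed<8>(a, b);   break;
        case 16:  swap_fixed<16>(a, b);  break;
        case 32:  swap_fixed<32>(a, b);  break;
        case 64:  swap_fixed<64>(a, b);  break;
        case 128: swap_fixed<128>(a, b); break;
        case 256: swap_fixed<256>(a, b); break;
        default:  swap_chunked(a, b, w); break;
    }
}
```

The switch costs one predictable branch per swap when a sort uses a single element width, which is the usual case for a qsort-style call.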

Override sphinx_rtd_theme's sidebartitle block so the orange `> daslang.io` logo links to https://daslang.io instead of pathto(_root_doc) (which is a self-link on the docs index). Mirrors upstream block at sphinx_rtd_theme/layout.html with the <a href> swapped. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ven-sort site/blog: add "Do you even sort?" post

…-2026-05-18 mouse-data: 11 new cards from sort-family + doc-CI sessions

…skuriakova-afdc9d doc: clickable daslang.io banner in sphinx sidebar

PR #2707 (sort-family) swapped daslang's qsort to block-partition pdqsort. That changed the tie-break order among equal sort keys, which broke das2rst-driven module docs in a non-obvious way:

daslib/rst.das:1882 sorts `grp.func` by function_name only: `$(a,b) => function_name(a.fn) < function_name(b.fn)`. For overloaded functions the comparator returns false both ways (equal key), so which overload comes out "first" depends on qsort's internal tie-break.

Downstream, the loop at lines 1912-1929 stamps `is_overload = (cur_name == prev_func_name)`. The first overload in iteration order gets `is_overload=false` → full :Arguments: emission with :ref: to each param type. Different overloads use different param types, so the choice of "first" decides which :ref: targets the page references.

Symptom: dasImgui CI's sphinx-build -W failed with `undefined label: 'alias-imvec4'` in doc/source/stdlib/generated/imgui_style_builtin.rst. After #2707, push_style_one(ImGuiCol; ImVec4) now wins the detailed slot, and daslib/rst.das describe_type() emits a :ref:`ImVec4 <alias-imvec4>` for any TypeDecl whose `td.alias` is non-empty (set by the C++ binding `t->alias = "ImVec4"`). The alias label is never defined (a `:ref:` to nowhere), so sphinx-build -W exits non-zero.

This was a latent bug: rst.das relied on unstable sort tie-breaking (see daslang qsort-is-not-stable lore). #2707 just exposed it.

Fix: sort by the full signature string (rst_describe_function_short) instead of just the function name. The string starts with the function name, so name-alphabetical primary order is preserved, and overloads sort deterministically by signature within each name-run.

Regenerated 77 doc/source/stdlib/handmade/function-*.rst entries: the new "first detailed" overload per name-run across math, builtin, ast, ast_boost, raster, strings, pugixml, debugapi, dashv, rtti, strings_boost, uriparser. Each stub filled by copying the closest signature-matched sibling's description; math overloads hand-checked for vector/scalar semantic drift (mad fusion claim dropped, round nearest-even claim qualified, identity 3x3 wording trimmed).

Sphinx -W --keep-going -b html builds clean (0 warnings). das2rst.das re-run is idempotent (no new stubs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
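The idea behind the fix can be shown in a few lines of C++ (the actual change is in daslib/rst.das, which is daslang). The FuncDoc struct and both helper functions are hypothetical, and the `float` overload of push_style_one is invented for illustration; only the ImVec4 signature appears in the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a documented function: its name plus a full
// signature string that begins with the name.
struct FuncDoc {
    std::string name;
    std::string signature;
};

inline bool by_signature(const FuncDoc& a, const FuncDoc& b) {
    return a.signature < b.signature;
}

// Sort by full signature. Distinct overloads never compare equal, so the
// result does not depend on how an unstable sort orders equal keys.
inline std::vector<FuncDoc> sort_for_docs(std::vector<FuncDoc> funcs) {
    std::sort(funcs.begin(), funcs.end(), by_signature);
    return funcs;
}

// The first entry of each name-run is the overload that gets the detailed
// slot; later entries in the run are marked as overloads.
inline std::vector<std::string> detailed_overloads(const std::vector<FuncDoc>& sorted) {
    assert(std::is_sorted(sorted.begin(), sorted.end(), by_signature));
    std::vector<std::string> out;
    for (std::size_t i = 0; i < sorted.size(); ++i) {
        const bool is_overload = i > 0 && sorted[i].name == sorted[i - 1].name;
        if (!is_overload) out.push_back(sorted[i].signature);
    }
    return out;
}
```

Whatever order the overloads arrive in, the same one lands in the detailed slot, so the set of `:ref:` targets on the generated page no longer changes when the sort implementation does.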

…break daslib/rst: deterministic overload sort tiebreak (unblocks dasImgui doc CI)

…ere splice

Restructures _fold(chain) into a three-tier cascade:
1. splice: fused for-loop, lambdas inlined (hot patterns)
2. fallback: fold_linq_default, an array-shape pipeline with _inplace reuse + delete
3. raw: clone_expression passthrough

All tiers preserve semantics; _fold(chain) is observationally equivalent to chain. Obviates the previously-planned Phase 2D fail-loudly contract.

Phase 1 retirement:
- _old_fold macro deleted (everywhere: macro, helpers, header refs, bench files)
- g_foldSeq dispatch table + 7 FoldSequence patterns deleted (fold_where_count, fold_where_select, fold_select_where, fold_where, fold_select, fold_order_distinct x2); splice arms cover every shape they recognized
- recursiveMacroName param dropped from fold_linq_default; hardcoded to "_fold"
- where__to_array double-underscore rename bug fixed (callName ends_with "_")

Phase 3 new splice arms (plan_order_family):
- bare arr |> order[_by]?[_descending]? → direct call (drops iterator wrapper)
- src |> order[_by]?[_descending]? |> take(K) → top_n[_by][_descending]
- src |> where_*(p)+ |> order*(key?) → fused prefilter buffer + sort_inplace
- src |> where_*(p)+ |> order*(key?) |> take(K) → fused prefilter + top_n*

Phase 3d first select+where splice (was blocked since Phase 2A):
- daslib/templates_boost.das: new replaceVariablePeeling helper that peels the typer-inserted ExprRef2Value wrapper during substitution into typed AST (mirrors qm_peel_ref2value in daslib/ast_match)
- daslib/linq_fold.das: fold_linq_cond_peel uses the new helper to splice select(proj) |> where(pred) into a fused predicate, bailing to tier 2 when has_sideeffects(proj) to avoid double-evaluation. All four terminator lanes covered: array / counter / accumulator / early-exit.

Phase 2 library additions:
- daslib/linq.das: top_n_by_descending and top_n_descending (array + iterator source variants each), which mirror top_n_by / top_n with a flipped comparator (partial_sort + reversed less for array; bounded min-heap for iterator)
- linqCalls dict registers top_n / top_n_by / top_n_descending / top_n_by_descending so flatten_linq recognizes them

Concurrent runtime fix:
- src/builtin/module_builtin_runtime_sort.cpp:84 builtin_sort_string switched from unqualified sort() (= std::sort via using namespace std) to das_sort (block-partition pdqsort from PR #2707). This is the runtime path order_by<string> takes; on Linux/libstdc++, users see the same ~1.5x speedup PR #2707 brought to typed sorts.

Benchmarks (100K rows, INTERP, m3 vs m3f, smaller better):
- order_take_desc: m3 698 → m3f 56 ns/op (12.5x, new top_n_by_descending)
- sort_take: m3 713 → m3f 56 ns/op (12.7x, top_n_by via splice)
- select_where_order_take: m3 354 → m3f 39 ns/op (9.1x, fused prefilter+top_n_by)
- select_where_count: m3 57 → m3f 5 ns/op (11.4x, Phase 3d peel)
- chained_where: m3 45 → m3f 6 ns/op (7.5x)
- bare_order_where: m3 357 → m3f 340 ns/op (1.05x, sort dominates)

Three new bench files (bare_order_where, order_take_desc, select_where_count) + m3f_old column dropped from all 29 existing files + 2 new top_n test funcs (13 subtests across array+iterator sources, including N=1, N=0, N>length, struct types, parity vs hand-rolled reference) + new plan_order_family + Phase 3d AST shape tests in test_linq_fold_ast.das.

Tests: 8393/8393 dastest; 7782 AOT, all pass. Sphinx -W clean. detect-dupe clean (siblings-by-design only).

Modeled on PR #2707 (single squashed commit, multi-area bundle, headline numbers in PR body).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
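The two top_n_descending strategies named in the library additions (partial_sort with a reversed comparator for arrays, a bounded min-heap for iterators) might look like this in C++. The real implementations are daslang code in daslib/linq.das; the function names and the int element type below are assumptions made for the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Array source: partial_sort with a reversed comparator places the K largest
// values, in descending order, at the front of the buffer.
inline std::vector<int> top_n_descending_array(std::vector<int> v, std::size_t k) {
    const std::size_t n = std::min(k, v.size());
    std::partial_sort(v.begin(), v.begin() + static_cast<std::ptrdiff_t>(n),
                      v.end(), std::greater<int>());
    v.resize(n);
    return v;
}

// Streaming source: keep at most K values in a min-heap. The heap top is the
// smallest value kept so far, so a new value only enters if it beats the top.
template <typename It>
std::vector<int> top_n_descending_stream(It first, It last, std::size_t k) {
    std::priority_queue<int, std::vector<int>, std::greater<int>> heap;  // min-heap
    if (k == 0) return {};
    for (; first != last; ++first) {
        if (heap.size() < k) {
            heap.push(*first);
        } else if (*first > heap.top()) {
            heap.pop();
            heap.push(*first);
        }
    }
    std::vector<int> out;
    while (!heap.empty()) {  // pops in ascending order
        out.push_back(heap.top());
        heap.pop();
    }
    std::reverse(out.begin(), out.end());
    assert(out.size() <= k);
    return out;
}
```

The heap variant never holds more than K elements, which is why it suits an iterator source whose length is unknown; the array variant can reorder the buffer it already owns.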

Previously `daspkg install` / `update` / `build` invoked `cmake --build` without `--parallel`, so on generators whose default is single-job (MSBuild on Windows, Make on Linux/macOS) the build ran serially. Adding `--parallel` lets CMake pick a sensible per-generator default. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ort-family linq_fold: retire _old_fold; 3-tier cascade; order-family + select+where splice

…llel-build daspkg: parallelize cmake build in build_package

borisbat and others added 14 commits May 17, 2026 21:07

d8a0bde site/blog: add Do you even sort? post

0b66d51 Merge pull request #2707 from GaijinEntertainment/bbatkin/sort-family-bakeoff
Sort family: block-partition pdqsort + typed das_sort<T>

9e2405b Merge pull request #2708 from GaijinEntertainment/bbatkin/blog-do-u-even-sort
site/blog: add "Do you even sort?" post

34660e0 Merge pull request #2709 from GaijinEntertainment/bbatkin/mouse-cards-2026-05-18
mouse-data: 11 new cards from sort-family + doc-CI sessions

5643825 Merge pull request #2710 from GaijinEntertainment/claude/gracious-proskuriakova-afdc9d
doc: clickable daslang.io banner in sphinx sidebar

ced9f51 Merge pull request #2711 from GaijinEntertainment/daslib-rst-sort-tiebreak
daslib/rst: deterministic overload sort tiebreak (unblocks dasImgui doc CI)

551b8dc Merge pull request #2712 from GaijinEntertainment/bbatkin/linq-fold-sort-family
linq_fold: retire _old_fold; 3-tier cascade; order-family + select+where splice

f2b24fb Merge pull request #2713 from GaijinEntertainment/bbatkin/daspkg-parallel-build
daspkg: parallelize cmake build in build_package

pull Bot locked and limited conversation to collaborators May 18, 2026

pull Bot added the ⤵️ pull label May 18, 2026

pull Bot merged commit f2b24fb into forksnd:master May 18, 2026

pull Bot had a problem deploying to github-pages May 18, 2026 08:58 Error

pull[bot] merged 14 commits into forksnd:master from GaijinEntertainment:master
