[pull] master from GaijinEntertainment:master#1006
Merged
Conversation
…appers Adds C++ bindings for partial_sort / nth_element / make_heap / push_heap / pop_heap, mirroring the existing __builtin_sort surface across 19 workhorse types (numeric + vector) plus generic byte-pointer paths for user struct types. Extends das_qsort_r.h with introselect (das_nth_element_r, das_partial_sort_r) and binary-heap (das_make_heap_r, das_push_heap_r, das_pop_heap_r) templates in the same inline-comparator style as the existing musl-derived smoothsort. Without these the new primitives would only work on workhorse types since std::partial_sort etc. template on iterator type, not void*+width. User-facing daslib wrappers in sort_boost.das (array form, with and without custom comparator block) plus q-prefix dispatch macros (qpartial_sort, qnth_element, qmake_heap, qpush_heap, qpop_heap) mirroring qsort. top_n / top_n_by added to daslib/linq.das with both array and iterator sources; iterator-source uses heap-during-scan via the new push_heap / pop_heap primitives for O(M log N) without full materialization. Tests-cpp/small/test_sort_family.cpp covers the das_qsort_r.h templates directly (7 TEST_CASEs, 110 assertions) including the previously-untested das_qsort_r baseline and the new templates' edge cases (empty, single, N=0, N>length, sorted, reverse-sorted, struct payload, introselect depth-bound). Foundational PR for the linq_fold splice project's BufferTopN emit mode (see ~/.claude/plans/keen-hopping-balloon.md and benchmarks/sql/LINQ.md). Daslang tests, benchmarks, doc-comments, das2rst.das grouping, and the tutorials/language/33_algorithm tutorial expansion land in follow-up commits on this branch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…expansion
Adds 4 new [test] blocks to tests/linq/test_linq_sorting.das (test_partial_sort,
test_nth_element, test_top_n_by, test_heap_ops) covering 25 subtests across
default / custom-comparator / key-fn forms, struct payload preservation,
edge cases (N=0, N=length, N>length, k out-of-range), and the bounded-heap
top_n_by usage pattern. Full suite passes 53/53.
Six benchmark files: benchmarks/sort/{_common,sort,partial_sort,nth_element,
heap_ops}.das (raw micro-benchmarks, no SQL form — Boris flagged that sort
itself had no dedicated bench previously) plus benchmarks/sql/top_n_by_bench.das
(4-way m1/m3/m3f_old/m3f comparison plus the new m3_topn_array and
m3_topn_iter columns).
doc/reflections/das2rst.das gains Partial sorting + Heap operations groups
under document_module_sort_boost, plus a Top-N selection group under
document_module_linq. Regenerated docs: no Uncategorized symbols, no stubs.
tutorials/language/33_algorithm.{das,rst} expands with 4 new sections (sort,
partial_sort/nth_element, Heap operations, top_n/top_n_by) following the
existing demo() function style. Tutorial runs end-to-end with correct output.
Lint cleanup: PERF017/PERF018 in linq.top_n_by + heap_ops.das + tutorial, plus
~11 pre-existing LINT003 var-to-let warnings in test_linq_sorting.das. All
modified files lint-clean. mcp__daslang__format_file confirms already-formatted.
Pre-existing issue flagged but not fixed in this PR: benchmarks/sql/sort_take.das
fails to compile on master due to unsafe(each(...)) inside _fold. Out of
Phase 0 scope; should be a separate fix.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a Phase 0 row to the phase status table. Phase 0 lands the foundational <algorithm> sort-family bindings (partial_sort, nth_element, make_heap, push_heap, pop_heap) plus the linq.top_n_by family. With these primitives in place, PR B (BufferTopN emit mode in linq_fold) has the C++ partial_sort primitive it needs to splice [where][select]* |> order_by(...) |> take(N) into a single fused loop at O(M log N) instead of full sort. Benchmark numbers will be captured at PR time once 100K-row runs complete. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…n bench
CI fix (the 6 failing linux/windows builds):
- src/builtin/das_qsort_r.h used uint32_t (pre-existing line 45 plus new
template additions) but never included <cstdint>. Worked on macOS because
the SDK transitively pulls it in; clang/Linux + MSVC/Win don't. Adding
the include makes the header self-contained.
Copilot review accepts (9 of 11; rejected the .rst half-open range claim):
- sort_boost.das: drop dim branch from build_dispatch_call (no dim
wrapper overloads exist for partial_sort/nth_element/heap ops, unlike
plain sort), update error message accordingly.
- sort_boost.das: nth_element doc-comment now says "O(N) average on the
kth element" instead of inaccurate "sub-O(N)".
- test_sort_family.cpp: pop_heap SUBCASE name now says descending (it
always verified descending; only the label was wrong).
- benchmarks/sort/*: tautological sanity checks replaced with effective
ones. Fixture is now ((i + 1) * 37) % 1000 so arr[0] != 0 pre-sort
(was 0, which coincided with the post-sort min and made several
checks pass even with a no-op sort). Added expected_sorted_value_at
helper and per-bench checks that verify the actual post-sort
position values.
- linq.das: top_n_by(array, ...) reserves length(arr) before pushing
every element instead of min(n, length), avoiding repeated reallocs
when n << length(arr).
Collapse top_n_by_bench into sort_take:
- top_n_by_bench's m1/m3/m3f_old/m3f columns were byte-identical to
sort_take's. Folded the two new top_n columns (m3_topn_array,
m3_topn_iter) into sort_take.das and deleted the standalone file —
one benchmark per shape, no duplicated measurements.
- Also fixed a pre-existing master compile failure in sort_take.das:
each(arr) inside _fold needs unsafe { } (the flagged-but-not-fixed
item in the original PR description).
CI fix: - das_qsort_r.h byte_swap uses memcpy without <cstring>. Mac SDK pulls it in transitively; Linux/MSVC don't. Same pattern as the cstdint fix from the previous commit. Copilot review round 2 (8 of 8 accepts): - sort_boost.das preamble + 5 q* macro doc-comments: drop dim references now that build_dispatch_call rejects dim sources. - sort_boost.das build_dispatch_call: accept macroName alongside opName so the wrong-arg-count error names the macro the user called (qpartial_sort) instead of the target wrapper (partial_sort). Update the unsupported-source error to "qpartial_sort can only operate on array or handled vector" same way. - benchmarks/sort/_common.das expected_sorted_value_at: tighten the precondition docstring — formula is exact only when n is a multiple of 1000 (10000 and 100000 are our actual call sites; the helper is not safe for arbitrary n). - linq.das + 33_algorithm.rst: top_n iterator-source comment said "at most n+1 elements ever resident" but the implementation never holds n+1 — pop_heap, overwrite last slot, push_heap keeps length at n throughout. Comment corrected to "at most n". - test_linq_sorting.das: new test_q_macros [test] block — 5 subtests covering qpartial_sort / qnth_element / qmake_heap / qpush_heap / qpop_heap dispatch on array<int>. Closes the macro-expansion coverage gap copilot flagged.
Round-2 review caught two additional N+1 references I missed: - tutorials/language/33_algorithm.das:166 (tutorial source) - benchmarks/sql/sort_take.das:16 (bench file header) Plus one self-spotted in daslib/linq.das:457 (section-comment header above the top_n block — separate from the docstring on line 482 that was fixed in 012c415). All three now correctly say "at most N elements resident" — matching the implementation, which keeps the heap length at N throughout the pop_heap → overwrite → push_heap sequence.
Copilot round 3 (2 of 2 accepts): - daslib/linq.das: drop ``buf |> reserve(n)`` from the iterator-source top_n_by overload. Iterator cardinality is unknown, so a caller passing n much larger than the iterator's actual size would force a large upfront allocation for no gain. daslang's array<T>::push has geometric growth, so the O(log n) doublings during the fill phase amortize to O(1) per push — no asymptotic change, and the foot-gun is gone. The array-source overload keeps reserve(length(arr)) because cardinality is known there. - src/builtin/module_builtin_runtime_sort.cpp:518 — comment said "strudel/sort_string is a special path"; strudel is daslang's music module, completely unrelated. Typo for __builtin_sort_string.
…-family phase 0: <algorithm> sort-family bindings + top_n_by
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)
Can you help keep this open source service alive? 💖 Please sponsor : )