Use-after-free in to_array(map(each(arr).reverse(), lambda)) iterator-fusion chain #2505

@borisbat

Summary

A latent use-after-free lives in the iterator-fusion runtime. The pattern that triggers it (in daslib/aot_cpp.das describeCppTypeEx, the original site of the test_archive AOT codegen crash):

let args <- to_array(map(each(typeDecl.dim).reverse(), @(itd) { return ",{itd}>"; }))

Replacing the chain with a plain for-loop fully eliminates the crash (200/200 iterations clean). The chain itself crashes roughly once in N runs, with N depending on heap layout: the observed repro rate climbed 1/96 → 1/51 → 1/4 across recent sessions as the heap layout shifted.

Symptom

EXCEPTION_ACCESS_VIOLATION (0xC0000005) reading a page-aligned freed address (DAS_TRACK_ALLOC freed-page sentinel — the suballocator unmaps freed pages, so a UAF becomes a hard AV instead of "happens to read garbage").

[ 0] register_fusion + 0x2f8f7   <-- fused SimNode crashes here, every time
[ 1] das::SimNode_BlockNF::eval + 0x43b1a
...
daslang stack:
  _lambda_aot_cpp_235_10`function
  _lambda_aot_cpp_58_15`function
  builtin`to_array`8900707383779332554 from daslib/aot_cpp.das:409:20
  invoke block ... aot_cpp.das:293:11
  describeCppTypeEx

The C++ stack always tops at the same register_fusion + 0x2f8f7 offset across repros, suggesting one specific fused SimNode is the culprit.

Diagnosis

  • TypeDecl itself is alive at every guard point. A verify_typedecl_gc(typeDecl) C++ binding (added in #2505 / e6ae32f82) was inserted at the top of describeCppTypeEx and immediately before the chain; gc_magic stayed 0x1ee70001 (alive) under DAS_GC_DEBUG=1 (memory-poisoning sweep mode) on every crash.
  • So the freed page is neither the TypeDecl nor, directly, its dim field. It is something the iterator chain or the fusion node creates and frees prematurely.

Suspect

One of these (or their fusion combination):

  1. each(arr).reverse() — reverse-iterator wrapper holding a back-pointer into a stack temporary that the fusion optimizer elides early.
  2. map(iter, lambda) — map iterator capturing a freed inner reference.
  3. to_array(iter) collecting strings — interpolation ",{itd}>" allocates a transient string, possibly on a context heap that to_array's fusion frees before consumption.

Repro

Plain 64-bit Windows Release build, DAS_TRACK_ALLOC=ON (default for the build folder per CMakeCache).

for i in $(seq 1 200); do
  rm -f tests/archive/_aot_generated/test_aot_archive_test_archive.das.cpp
  bin/Release/daslang.exe utils/aot/main.das -- -aot \
    "$(pwd)/tests/archive/test_archive.das" \
    "$(pwd)/tests/archive/_aot_generated/test_aot_archive_test_archive.das.cpp" \
    || break
done
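
The loop above stops at the first failure, which confirms the crash but does not measure how often it happens. A minimal sketch of a counting variant follows, for comparing rates such as 1/96 versus 1/4. The names measure_repro_rate and aot_once are hypothetical helpers, not part of the repository. Any non-zero exit is counted as a failure, so the number is only meaningful if the access violation is the sole way the command fails.

```shell
# measure_repro_rate RUNS CMD [ARGS...]
# Runs CMD RUNS times and prints how many runs exited non-zero.
# The body is a subshell ( ... ) so the counters do not leak into the caller.
measure_repro_rate() (
  runs=$1
  shift
  failures=0
  i=0
  while [ "$i" -lt "$runs" ]; do
    # Output is discarded; only the exit status matters here.
    if ! "$@" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  echo "${failures}/${runs} runs failed"
)

# aot_once: one iteration of the repro loop, with the same commands as above.
aot_once() {
  rm -f tests/archive/_aot_generated/test_aot_archive_test_archive.das.cpp
  bin/Release/daslang.exe utils/aot/main.das -- -aot \
    "$(pwd)/tests/archive/test_archive.das" \
    "$(pwd)/tests/archive/_aot_generated/test_aot_archive_test_archive.das.cpp"
}
```

Running `measure_repro_rate 200 aot_once` from the repository root then prints a single summary line for the whole batch.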

Crashes at a rate between 1/96 and 1/4 of iterations on master HEAD. Pre-fix CI repros:

Workaround

PR replacing the chain with a plain for-loop in describeCppTypeEx. The double-reverse in the original was a no-op anyway, so the for-loop is also faster and clearer.

Suggested investigation

  1. Build RelWithDebInfo locally (DAS_TRACK_ALLOC auto-on) so register_fusion + 0x2f8f7 resolves to a symbol — the fused SimNode name will identify the carrier.
  2. Bisect: drop .reverse(), then drop to_array(...), then drop each(). With 1/4 repro rate this is fast.
  3. Once isolated, audit the iterator's destructor / fusion-node lifetime for who owns the captured iterator-state buffer.

This bug is not tied to any specific test or codegen path. describeCppTypeEx was only the trigger because it runs the chain on TypeDecl::dim; every other site combining each() + reverse() + map() + to_array() (or the same fusion combo with different operators) is at risk.
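
A rough way to list other candidate sites is a text search for the same nested-call shape. The sketch below assumes the chain is written on one line in the to_array(map(each(...).reverse(), ...)) form shown in the Summary. It will miss chains split across lines, piped spellings, and other operator combinations, so an empty result does not prove a tree is clean. find_fusion_chain_sites is a hypothetical helper name.

```shell
# find_fusion_chain_sites DIR...
# Prints file:line:text for every .das line that nests
# to_array( ... map( ... each( ... ).reverse() on a single line.
find_fusion_chain_sites() {
  # /dev/null is passed as an extra file so grep always prints file names,
  # even when find hands it only one file.
  find "$@" -type f -name '*.das' -exec \
    grep -nE 'to_array\(.*map\(.*each\(.*\)\.reverse\(\)' /dev/null {} +
}
```

For example, `find_fusion_chain_sites daslib` would scan the directory that contains aot_cpp.das.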
