Skip to content

Context::runWithCatch leaves exception/stopFlags set; C++ bindings have to clean up by hand #2523

@borisbat

Description

@borisbat

Symptom

A C++ binding that uses Context::runWithCatch to call back into daslang code, observes a panic from that callback, and then does not itself rethrow (e.g. it surfaces the panic to a third-party framework like SQLite via sqlite3_result_error) leaks the panic past the binding boundary.

The outer daslang stack — still mid-execution of whatever called the binding — sees context->exception != nullptr on the next eval step and rethrows. The user sees the same panic twice: once via the binding's framework-level error path, then again as a hard EXCEPTION: print at the top level.

Why

Context::runWithCatch (src/simulate/simulate_exceptions.cpp:127) catches the dasException, restores ABI/stack state, and stores the message into exception / exceptionMessage / exceptionAt. It then returns false.

It does not clear exception, last_exception, or stopFlags. The clearing is the binding's responsibility.

The corresponding pattern that does clear (in the daslang VM itself) lives at src/simulate/simulate_exceptions.cpp:213-218, inside SimNode_TryCatch::eval:

} catch ( const dasException & ) {
    context.abiArg = aa;
    context.abiCMRES = acm;
    context.stack.pop(EP,SP);
    context.stopFlags = 0;
    context.last_exception = context.exception;
    context.exception = nullptr;
    catch_block->eval(context);
}

Same shape repeats in das_try_recover (line 285+) and the debugger variant (line 242+). All four in-VM users clear; runWithCatch doesn't.

Where this bit (real bug encountered today)

modules/dasSQLITE/src/dasSQLITE.userfn.cpp — the register_function trampoline calls back into a daslang scalar UDF via runWithCatch, then on panic surfaces the message via sqlite3_result_error. Without the manual cleanup, try_query_scalar(...) returned Err correctly and then crashed on the way out with the same panic message. The fix:

bool ok = R->context->runWithCatch([&]() {
    result = R->context->callOrFastcall(R->fn.PTR, args, &R->at);
});
if ( !ok ) {
    const char * msg = R->context->getException();
    std::string copy = msg ? msg : "<no message>";
    R->context->last_exception = R->context->exception;
    R->context->exception = nullptr;
    R->context->stopFlags = 0;
    sqlite3_result_error(sctx, copy.c_str(), -1);
    return;
}

Took ~30 minutes of head-scratching to localize because the symptom (an EXCEPTION: after a successful Err return) doesn't point at the missing reset; it looks like a second panic.

Suggested fix (pick one)

  1. Add runWithCatchAndClear helper alongside runWithCatch. Same body, but does the last_exception/exception/stopFlags reset before returning false. Bindings that explicitly want to "surface and forget" use the new name; existing callers (most of which are inside daslang machinery that wants the state visible to the surrounding try_recover lowering) keep using runWithCatch unchanged.

  2. Reset by default in runWithCatch and add an opt-out runWithCatchKeepException for the rare case. Slightly more invasive, but matches the "make the safe path the default" principle.

  3. At minimum, document the contract in simulate.h next to the declaration, with a code example showing the four-line cleanup. Worst option but cheapest.

My weak preference is option 1 — it adds a single new symbol, no breakage, and bindings calling back from C frameworks (SQLite, libuv, dasHV's WS handlers, dasGlfw key callbacks, anything with xDestroy-style lifetimes) will reach for the clearer name.

Repro

Any C++ binding that:

  1. Holds a Func,
  2. Calls it via runWithCatch from a C-level callback that runs outside the daslang stack frame that originated the call,
  3. Catches the panic and surfaces it through a non-daslang error channel without rethrowing.

Easiest concrete repro is the dasSQLITE register_function shipped on dassqlite-chunk10-operational-sqlite HEAD d0ac4e182tests/dasSQLITE/test_75_register_function_null_panic.das's test_panic_surfaces_as_error_and_connection_survives reproduces the symptom (and verifies the fix) end to end. Reverting the three-line cleanup in dasSQLITE.userfn.cpp makes that test fail with a leaked EXCEPTION: print.

Affects

Every present and future C++ binding that wraps a daslang Func and surfaces panics through a non-daslang framework. Currently only dasSQLITE's register_function hits this; future bindings (UDF aggregates / windows for dasSQLITE, custom event loops, UI callbacks, anything with C-level error channels) will hit it identically.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions