Casync clean by adespawn · Pull Request #414 · scylladb/nodejs-rs-driver

adespawn · 2026-03-13T17:04:04Z

This commit adds a custom async bridge between Rust and JavaScript using N-API.
It allows async tasks to be scheduled without tokio::spawn, which napi-rs uses
when async functions are created through its macros.

The main motivation for this change is to improve the performance of the driver.
With the existing approach we spend a lot of CPU time on synchronization
between the main thread and the Tokio threads. By reducing CPU time,
I also aim to reduce the driver's overall run time.

This approach polls all the futures on the Node.js main thread,
replacing napi-rs's built-in async task system, which polls on
Tokio worker threads.

Architecture:

- Single weak ThreadsafeFunction (TSFN) shared across all futures,
  with manual ref/unref to control Node.js event loop lifetime
- FutureRegistry (thread-local on main thread) stores in-flight futures
  paired with their napi_deferred handles
- Per-future Waker backed by Arc<WakerInner> implementing the Wake trait,
  which pushes the future id into a shared woken_ids vec and signals
  the TSFN
- Coalesced signaling via AtomicBool prevents flooding the event loop
  when multiple wakers fire simultaneously
- Single-threaded Tokio runtime drives the I/O reactor only; futures
  are polled on the main thread inside the TSFN callback with the
  Tokio runtime context entered
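
The waker and coalescing items in the list above can be sketched in plain Rust, with the TSFN call replaced by a counter. This is only an illustration under assumptions, not the code from this PR: `WakerInner` and `woken_ids` come from the description, while `Shared`, `signal_pending`, `signals_sent`, `make_waker` and `drain_woken` are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::task::{Wake, Waker};

/// State shared by every waker (hypothetical layout).
#[derive(Default)]
pub struct Shared {
    /// Ids of futures that asked to be polled again.
    pub woken_ids: Mutex<Vec<u64>>,
    /// True while a wake-up signal is already on its way to the main thread.
    pub signal_pending: AtomicBool,
    /// Stand-in for the TSFN call: counts the signals that were actually sent.
    pub signals_sent: AtomicUsize,
}

/// Per-future waker state.
pub struct WakerInner {
    pub id: u64,
    pub shared: Arc<Shared>,
}

impl Wake for WakerInner {
    fn wake(self: Arc<Self>) {
        self.wake_by_ref();
    }

    fn wake_by_ref(self: &Arc<Self>) {
        self.shared.woken_ids.lock().unwrap().push(self.id);
        // Only the waker that flips the flag from false to true sends a signal;
        // wakers that fire while a signal is pending just queue their id.
        if !self.shared.signal_pending.swap(true, Ordering::AcqRel) {
            self.shared.signals_sent.fetch_add(1, Ordering::Relaxed);
        }
    }
}

/// Builds a standard `Waker` for the future with the given id.
pub fn make_waker(id: u64, shared: &Arc<Shared>) -> Waker {
    Waker::from(Arc::new(WakerInner {
        id,
        shared: Arc::clone(shared),
    }))
}

/// Main-thread side: clear the flag first, then take the whole batch of ids.
pub fn drain_woken(shared: &Shared) -> Vec<u64> {
    shared.signal_pending.store(false, Ordering::Release);
    std::mem::take(&mut *shared.woken_ids.lock().unwrap())
}
```

Clearing the flag before taking the batch means a wake that races with the drain can at worst cause one extra signal, but it cannot be lost.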

Key design decisions:

- Polling on main thread ensures napi_env is always valid during
  ToNapiValue conversion, avoiding cross-thread napi safety issues
- Type-erased BoxFuture and SettleCallback allow heterogeneous futures
  in a single HashMap without leaking generic parameters
- Promise created via raw napi_create_promise/napi_resolve_deferred
  to bypass napi-rs's async machinery entirely
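
The type-erasure decision above can be illustrated with a small std-only sketch. The names `BoxFuture`, `SettleCallback` and `FutureRegistry` are taken from the description, but their definitions here are assumptions: the settle callback returns a `String` instead of resolving a napi deferred, and `erase`, `poll_one`, `in_flight` and `noop_waker` are hypothetical helpers.

```rust
use std::collections::HashMap;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Runs once after the future finishes (here it only renders the value).
pub type SettleCallback = Box<dyn FnOnce() -> String + Send>;
/// Every stored future has the same erased output type.
pub type BoxFuture = Pin<Box<dyn Future<Output = SettleCallback> + Send>>;

/// Wraps a typed future so that its generic output `T` is hidden in a closure.
pub fn erase<F, T>(fut: F) -> BoxFuture
where
    F: Future<Output = T> + Send + 'static,
    T: std::fmt::Debug + Send + 'static,
{
    Box::pin(async move {
        let value = fut.await;
        Box::new(move || format!("{:?}", value)) as SettleCallback
    })
}

#[derive(Default)]
pub struct FutureRegistry {
    next_id: u64,
    futures: HashMap<u64, BoxFuture>,
}

impl FutureRegistry {
    /// Stores a future and returns the id under which it can be polled.
    pub fn insert(&mut self, fut: BoxFuture) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.futures.insert(id, fut);
        id
    }

    /// Polls one future; returns the settled value once it has completed.
    pub fn poll_one(&mut self, id: u64, waker: &Waker) -> Option<String> {
        let fut = self.futures.get_mut(&id)?;
        let mut cx = Context::from_waker(waker);
        let polled = fut.as_mut().poll(&mut cx);
        match polled {
            Poll::Ready(settle) => {
                self.futures.remove(&id);
                Some(settle())
            }
            Poll::Pending => None,
        }
    }

    /// Number of futures that have not completed yet.
    pub fn in_flight(&self) -> usize {
        self.futures.len()
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// A waker that does nothing, enough for futures that never suspend.
pub fn noop_waker() -> Waker {
    Waker::from(Arc::new(NoopWake))
}
```

Because the generic parameter only appears inside `erase`, the map can hold futures with unrelated output types side by side.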

[This PR was created with heavy use of LLM tools. Since then, the code has been significantly refactored to match the existing style of this repository and to improve error handling.]

This PR aims to significantly improve the performance of the driver.

Refs: #75. With this optimisation, performance should no longer be a problem for the GA release.

Some early results:

Copilot

Copilot reviewed 9 out of 9 changed files in this pull request and generated no comments.


adespawn · 2026-03-16T14:35:28Z

The code should be ready for some early review. For this PR to be fully ready, I want to add more benchmarks.

wprzytula · 2026-03-16T14:50:18Z

Please ask your LLM to generate an ASCII chart of the implemented solution.

benchmark/logic/parametrized_select.js

wprzytula · 2026-03-16T14:52:53Z

lib/client.js

+// Initialize the direct-poll bridge once per process.
+// This sets up the Tokio reactor thread and the wake mechanism used by all
+// bridged async Rust functions (session queries, paging, etc.).
+rust.initPollBridge();


❓ When is this executed? Is this guaranteed to be executed at most once? Is this idempotent?

When is this executed

When the file is first imported

Is this guaranteed to be executed at most once

Almost always yes: https://nodejs.org/docs/latest/api/modules.html#caching

Is this idempotent

No. Any subsequent call will lead to a panic.

Is this idempotent

Do you think it's worth making it idempotent, i.e. setting an atomic flag at the beginning of its execution that prevents double initialization?

Probably yes.
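
The atomic-flag idea from this thread could look roughly like the sketch below. It is a minimal illustration, not the driver's code: `init_poll_bridge_idempotent` and `SETUP_RUNS` are hypothetical names, and the real setup (runtime, TSFN, cleanup hook) is replaced by a counter.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Set once the bridge has been claimed for initialization.
static INITIALIZED: AtomicBool = AtomicBool::new(false);
/// Counts how often the expensive setup actually ran (illustration only).
static SETUP_RUNS: AtomicUsize = AtomicUsize::new(0);

/// Returns true if this call performed the setup, false if it was a no-op.
pub fn init_poll_bridge_idempotent() -> bool {
    // Claim the flag before doing any work, so a repeated call has no side effects.
    if INITIALIZED
        .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
        .is_err()
    {
        return false;
    }
    // Placeholder for creating the runtime, the TSFN and the cleanup hook.
    SETUP_RUNS.fetch_add(1, Ordering::Relaxed);
    true
}
```

With a bare flag, a second caller on another thread may return before the first caller has finished the setup. If that matters, `std::sync::Once` blocks concurrent callers until initialization completes.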

src/lib.rs

src/errors.rs

src/tests/casync_tests.rs

src/async_bridge.rs

src/casync.rs

wprzytula · 2026-03-16T15:44:33Z

src/async_bridge.rs

+/// Submit a typed Rust future to be polled directly by the Node event loop.
+///
+/// Future can return a typed value `T` on success
+/// or an error `E` on failure. Both `T` and `E` are converted to JS values via
+/// `ToNapiValue` on the main thread when the future settles.
+///
+/// The error type `E` should produce a JS Error object from `to_napi_value` so
+/// that the rejection value is a proper error (e.g. `ConvertedError`).
+pub fn submit_future<F, T>(env: &Env, fut: F) -> ConvertedResult<JsPromise<T>>
+where
+    F: Future<Output = std::result::Result<T, ConvertedError>> + Send + 'static,
+    T: napi::bindgen_prelude::ToNapiValue + Send + 'static,
+{
+    // This is a driver error, so panic is warranted here. There is no reasonable way to recover.
+    assert!(
+        INITIALIZED.load(Ordering::Relaxed),
+        "init_poll_bridge must be called before submit_future. This is a bug in the driver."
+    );
+
+    let (deferred, promise) = create_promise(env)?;
+
+    let boxed: BoxFuture = Box::pin(async move {
+        let result = fut.await;
+        Box::new(move |env: Env, deferred| unsafe {
+            // SAFETY: This closure is only ever invoked from `poll_woken`, which runs
+            // on the Node main thread inside the TSFN callback - the only place where
+            // `env` is a valid napi_env. `deferred` is consumed exactly once here,
+            // satisfying the napi contract that each deferred is resolved or rejected
+            // exactly once. `to_napi_value` receives the same valid `env`.
+            let (js_val, resolve) = match result {
+                Ok(val) => (T::to_napi_value(env.raw(), val), true),
+                Err(err) => (ConvertedError::to_napi_value(env.raw(), err), false),
+            };
+            let status = js_val
+                // First we try to accept / reject with converted value / error.
+                .and_then(|v| {
+                    if resolve {
+                        check_status!(sys::napi_resolve_deferred(env.raw(), deferred, v))
+                    } else {
+                        check_status!(sys::napi_reject_deferred(env.raw(), deferred, v))
+                    }
+                })
+                // If this fails, or we failed to convert the value / error into a JS value,
+                // we reject with a fallback reason.
+                .or_else(|e| reject_with_reason(env, deferred, &e.reason));
+
+            if let Err(e) = status {
+                // If both fail, we assume something terrible has happened. We cannot
+                // inform JS side about the error by regular error handling, so we panic to
+                // avoid silent failures and orphaned promises.
+                panic!(
+                    "Failed to settle promise in TSFN callback. This may indicate either a bug in the driver or a severe runtime error.\nRoot cause:\n {}",
+                    e.reason
+                );
+            }
+        }) as SettleCallback
+    });
+
+    REGISTRY.with(|r| r.borrow_mut().insert(env, boxed, deferred))?;
+    Ok(JsPromise(promise, PhantomData))
+}


💭 I'm wondering if it makes sense to perform the first poll() straightaway. This could reduce latency. When executing prepared statements (the main point of our interest), the logic is as follows:

1. serialize statement's bound values,
2. calculate token,
3. configure the execution,
4. ask LBP for routing decision,
5. create a request frame,
6. send the frame via a channel to a tokio task managing the connection (router),
7. wait until the response arrives.

If I'm not mistaken, all points but the last can happen during a single poll! The rest is on the router, which is driven by the tokio runtime worker thread(s).

wait until the response arrives.

I agree that all points except this one can happen in a single poll.

I insist on this, because the latency gains can be significant.

From simple testing ([c/regular]-insert), I see:

- No gain on regular insert (which we determined is a decent approximation of average latency)
- Instruction count increase on c-inserts

While this is worth investigating, basic experiments do not show an improvement. Considering that, we can move such experiments to a follow-up PR.
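
For reference, the "poll once at submission" idea discussed in this thread can be sketched without napi or Tokio. This is an assumption-laden illustration, not code from the PR: `submit_eager`, `Submitted`, `LocalBoxFuture`, `YieldOnce` and `noop_waker` are hypothetical names.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Type-erased future that still has work left.
pub type LocalBoxFuture<T> = Pin<Box<dyn Future<Output = T>>>;

pub enum Submitted<T> {
    /// The future finished during the eager first poll.
    Ready(T),
    /// The future suspended; it must be stored and polled again when woken.
    Pending(LocalBoxFuture<T>),
}

/// Polls the future once right away instead of waiting for the first wake-up.
pub fn submit_eager<F, T>(fut: F, waker: &Waker) -> Submitted<T>
where
    F: Future<Output = T> + 'static,
    T: 'static,
{
    let mut boxed: LocalBoxFuture<T> = Box::pin(fut);
    let mut cx = Context::from_waker(waker);
    let polled = boxed.as_mut().poll(&mut cx);
    match polled {
        Poll::Ready(value) => Submitted::Ready(value),
        Poll::Pending => Submitted::Pending(boxed),
    }
}

/// Helper future: pending on the first poll, ready on the second.
#[derive(Default)]
pub struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// A waker that does nothing; a real bridge would pass the per-future waker.
pub fn noop_waker() -> Waker {
    Waker::from(Arc::new(NoopWake))
}
```

In this model, everything before the first suspension point (the steps up to sending the frame) runs inside `submit_eager`, and only the remainder needs a later wake-up.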

wprzytula · 2026-03-16T15:45:02Z

In general, looks promising!

adespawn · 2026-03-18T08:57:20Z

Rebased on main

adespawn · 2026-03-18T12:40:19Z

Addressed some comments and added a new wrapper for safety (this one fully written by hand). The changes are not yet properly split into components.

src/casync.rs

src/async_bridge.rs

wprzytula · 2026-03-18T14:57:08Z

src/async_bridge.rs

I insist on this, because the latency gains can be significant.

Copilot

Pull request overview

This PR introduces a custom Rust↔JS async bridge (via N-API) that polls Rust futures on the Node.js main thread instead of using napi-rs’s built-in async task system, aiming to reduce synchronization overhead and improve driver performance.

Changes:

- Add a new async_bridge module that manages promise creation/settlement, a future registry, and a shared TSFN-based wake mechanism.
- Refactor SessionWrapper and paging executor APIs to return a custom JsPromise<T> backed by submit_future() (instead of async fn + napi-rs async machinery).
- Add unit tests for the bridge behavior (resolve/reject paths, concurrency, waker coalescing), and initialize the bridge during lib/client.js module load.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

File	Description
test/unit/casync-tests.js	Adds unit coverage for the new casync bridge (resolve/reject/concurrency/waker).
src/tests/mod.rs	Exposes the new Rust test module for JS-accessible test helpers.
src/tests/casync_tests.rs	Implements Rust-side test entrypoints that exercise the bridge.
src/session.rs	Switches key async driver APIs to the new `submit_future()` + `JsPromise<T>` model.
src/napi_helpers.rs	Adds a non-Send/Sync deferred wrapper to prevent double-settlement.
src/lib.rs	Exposes new Rust modules (`async_bridge`, `napi_helpers`).
src/errors.rs	Adds `Display` for `ConvertedError`; removes unused async helper.
src/async_bridge.rs	Core implementation of the polling bridge, TSFN signaling, and promise settlement.
lib/client.js	Initializes the bridge once at module load time.
benchmark/logic/parametrized_select.js	Fixes benchmark insert to use the provided row count parameter.


Copilot · 2026-04-03T12:46:32Z

src/async_bridge.rs

+        let rt = tokio::runtime::Builder::new_multi_thread()
+            .worker_threads(1)
+            .enable_all()
+            .build()?;
+
+        // Create the TSFN from any C callback. This callback will be replaced in the build_callback step,
+        // but we still need to provide a C function to use the napi-rs callback builder.
+        // We could do this directly through the Node-API interface, but napi-rs simplifies the process.
+        // We also have to use a callback with a matching type to ensure everything runs correctly.
+        let noop_fn = env.create_function::<(), ()>("pollBridgeNoop", noop_callback_c_callback)?;
+
+        let tsfn = noop_fn
+            .build_threadsafe_function::<()>()
+            // We will manually ref/unref this tsfn based on whether we have outstanding futures.
+            .weak::<true>()
+            .build_callback(|ctx| {
+                let raw_env = ctx.env;
+                REGISTRY.with(|r| {
+                    r.borrow_mut().poll_woken(raw_env);
+                });
+                Ok(())
+            })?;
+
+        REGISTRY.with(|r| {
+            let mut reg = r.borrow_mut();
+            reg.tokio_rt = Some(rt);
+            reg.bridge.set_tsfn(tsfn);
+        });
+
+        // Cleanup hook — shut down the runtime when Node exits.
+        env.add_env_cleanup_hook((), |_| {
+            REGISTRY.with(|r| {
+                r.borrow_mut().shutdown();
+            });
+        })?;
+
+        if INITIALIZED.swap(true, Ordering::SeqCst) {
+            return Err(Error::from_reason(
+                "init_poll_bridge can only be called once",
+            ));
+        }
+
+        Ok(())


init_poll_bridge() performs side effects (creates a Tokio runtime, builds/sets the TSFN, and registers an env cleanup hook) before checking INITIALIZED. If it’s called a second time, it returns an error but has already overwritten the registry state and potentially leaked/duplicated resources/hooks. Consider checking (and returning) before any initialization work, and/or making the function idempotent per napi_env/thread rather than a single global AtomicBool (important for Node worker threads / multiple envs).

Suggested change

+        if INITIALIZED
+            .compare_exchange(false, true, Ordering::SeqCst, Ordering::SeqCst)
+            .is_err()
+        {
+            return Err(Error::from_reason(
+                "init_poll_bridge can only be called once",
+            ));
+        }
+
+        let result: JsResult<()> = (|| {
+            let rt = tokio::runtime::Builder::new_multi_thread()
+                .worker_threads(1)
+                .enable_all()
+                .build()?;
+
+            // Create the TSFN from any C callback. This callback will be replaced in the build_callback step,
+            // but we still need to provide a C function to use the napi-rs callback builder.
+            // We could do this directly through the Node-API interface, but napi-rs simplifies the process.
+            // We also have to use a callback with a matching type to ensure everything runs correctly.
+            let noop_fn =
+                env.create_function::<(), ()>("pollBridgeNoop", noop_callback_c_callback)?;
+
+            let tsfn = noop_fn
+                .build_threadsafe_function::<()>()
+                // We will manually ref/unref this tsfn based on whether we have outstanding futures.
+                .weak::<true>()
+                .build_callback(|ctx| {
+                    let raw_env = ctx.env;
+                    REGISTRY.with(|r| {
+                        r.borrow_mut().poll_woken(raw_env);
+                    });
+                    Ok(())
+                })?;
+
+            REGISTRY.with(|r| {
+                let mut reg = r.borrow_mut();
+                reg.tokio_rt = Some(rt);
+                reg.bridge.set_tsfn(tsfn);
+            });
+
+            // Cleanup hook: shut down the runtime when Node exits.
+            env.add_env_cleanup_hook((), |_| {
+                REGISTRY.with(|r| {
+                    r.borrow_mut().shutdown();
+                });
+            })?;
+
+            Ok(())
+        })();
+
+        if result.is_err() {
+            INITIALIZED.store(false, Ordering::SeqCst);
+        }
+
+        result
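
The other option mentioned in the comment, making initialization idempotent per thread rather than through one global flag, might be sketched as follows. This is a hypothetical illustration: `BRIDGE_READY` and `init_bridge_for_this_thread` are invented names, and the setup work is passed in as a closure so the sketch stays free of napi types.

```rust
use std::cell::Cell;

thread_local! {
    /// True once the bridge was set up on the current thread.
    static BRIDGE_READY: Cell<bool> = Cell::new(false);
}

/// Runs `setup` at most once per thread.
/// Returns Ok(true) if the setup ran now, Ok(false) if it had already run.
pub fn init_bridge_for_this_thread<E>(
    setup: impl FnOnce() -> Result<(), E>,
) -> Result<bool, E> {
    if BRIDGE_READY.with(|ready| ready.get()) {
        return Ok(false);
    }
    // The flag is thread-local, so no other caller can race with us here.
    // It is set only after success, which keeps a failed attempt retryable.
    setup()?;
    BRIDGE_READY.with(|ready| ready.set(true));
    Ok(true)
}
```

Each thread (for example a Node worker with its own environment) gets its own flag, so a second environment can initialize its own bridge without tripping over the first.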

Copilot · 2026-04-03T12:46:32Z

src/tests/casync_tests.rs

+/// This exercises the CString::new fallback in reject_with_reason — the error
+/// is produced by a type whose Display output contains '\0'. Because the normal
+/// ConvertedError::to_napi_value path uses napi-rs string APIs (not CString),
+/// the null byte only matters when that path itself fails, causing reject_with_reason
+/// to be called. We trigger that by making T::to_napi_value fail: the future
+/// succeeds (Ok variant), but the value cannot be converted, so the settle
+/// callback falls through to reject_with_reason.
+///
+/// More practically this test validates that a ConvertedError with a null byte
+/// does NOT crash the process — the promise is simply rejected with a fallback
+/// message.


The doc comment for tests_casync_reject_null_byte describes triggering the reject_with_reason fallback by making to_napi_value fail on a successful (Ok) result, but the implementation currently rejects directly via Err::<i32, ConvertedError>(NullByteError.into()). Either update the comment to match the behavior being tested, or adjust the test to actually exercise the fallback path you want to validate.

Suggested change

+/// The error is produced by a type whose Display output contains '\0', so if
+/// the rejection machinery needs a CString-based fallback reason, CString::new
+/// will fail and the code must still reject safely without crashing.
+///
+/// This test exercises the direct rejection path by returning
+/// `Err::<i32, ConvertedError>(NullByteError.into())`. More practically, it
+/// validates that a ConvertedError with a null byte does NOT crash the process:
+/// the promise is simply rejected with a fallback message.
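
The failure mode this test targets can be shown in isolation: `CString::new` returns an error for any input with an interior NUL byte, so a rejection reason needs a fallback. The sketch below is an assumption about how such a fallback might look; `reason_to_cstring` and `FALLBACK_REASON` are hypothetical and not taken from `reject_with_reason`.

```rust
use std::ffi::CString;

/// Message used when the real reason cannot be represented as a C string.
pub const FALLBACK_REASON: &str = "error message contained an interior NUL byte";

/// Converts a rejection reason to a C string without ever panicking on '\0'.
pub fn reason_to_cstring(reason: &str) -> CString {
    CString::new(reason).unwrap_or_else(|_| {
        // The fallback literal has no NUL bytes, so this conversion cannot fail.
        CString::new(FALLBACK_REASON).expect("fallback reason has no NUL bytes")
    })
}
```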

Copilot · 2026-04-03T12:46:33Z

src/napi_helpers.rs

+    // We want to block DeferredPtr from being Send or Sync,
+    // as we can use napi_deferred pointer can be used only in the main nodejs thread.


The safety comment for _not_send_sync is currently hard to read (“as we can use napi_deferred pointer can be used only…”). Please rephrase to clearly state that napi_deferred must only be used on the main Node.js thread and that DeferredPtr is intentionally !Send/!Sync to enforce that.

Suggested change

+    // `napi_deferred` must only be used on the main Node.js thread.
+    // `DeferredPtr` is intentionally !Send and !Sync to enforce that.
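
One common way to get a `!Send`/`!Sync` wrapper on stable Rust is a `PhantomData` marker over a raw pointer type. The sketch below is an assumption about the shape of such a wrapper, not the PR's `DeferredPtr`: the handle is modelled as a `usize`, and `new` and `settle` are hypothetical methods. Only the field name `_not_send_sync` comes from the review comment.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for a wrapper around a raw `napi_deferred` handle.
pub struct DeferredPtr {
    raw: usize,
    // Raw pointers are neither Send nor Sync, so this zero-sized marker
    // makes the whole struct !Send and !Sync.
    _not_send_sync: PhantomData<*mut ()>,
}

impl DeferredPtr {
    pub fn new(raw: usize) -> Self {
        DeferredPtr {
            raw,
            _not_send_sync: PhantomData,
        }
    }

    /// Consumes the wrapper, so the same handle cannot be settled twice.
    pub fn settle(self) -> usize {
        self.raw
    }
}
```

Taking `self` by value in `settle` turns a double settlement into a compile error (use after move) instead of a runtime bug.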

adespawn force-pushed the casync-clean branch from 98b4967 to 11b3b84 Compare March 16, 2026 14:26

adespawn requested a review from Copilot March 16, 2026 14:26

Copilot AI reviewed Mar 16, 2026

Copilot started reviewing on behalf of adespawn March 16, 2026 14:38

wprzytula requested changes Mar 16, 2026

adespawn force-pushed the casync-clean branch from 11b3b84 to 1b2b1fa Compare March 18, 2026 08:57

adespawn force-pushed the casync-clean branch from 8034511 to 9bf0cdd Compare March 18, 2026 12:40

wprzytula reviewed Mar 18, 2026

wprzytula requested changes Mar 18, 2026

Add display to converted error

2ae44ff

adespawn force-pushed the casync-clean branch from 9bf0cdd to f838504 Compare March 27, 2026 15:48

adespawn added 3 commits March 27, 2026 16:51

Fix parametrized select

fc643f3

There was a bug that led to incorrect assertion in the benchmark

Wrapper over DeferredPtr

ee2138e

adespawn force-pushed the casync-clean branch from f838504 to ee2138e Compare March 27, 2026 15:51

adespawn mentioned this pull request Apr 2, 2026

Optimize async execution #428

Open

adespawn requested a review from Copilot April 3, 2026 12:37

Copilot started reviewing on behalf of adespawn April 3, 2026 12:38

Copilot AI reviewed Apr 3, 2026

3 participants
