RUST: Casper tests fail with stack overflow due to insufficient default thread stack size #305

@spreston8

Summary

Multiple tests in casper/tests/ fail with a stack overflow error when run with cargo test. The default Rust thread stack size (2MB) is insufficient for the deep async recursion used by the Rholang interpreter.

Reproduction

cd casper
cargo test --test mod multi_parent_casper_should_be_able_to_create_a_chain_of_blocks_from_different_deploys

Expected: Test passes
Actual:

thread 'add_block::multi_parent_casper_add_block_spec::multi_parent_casper_should_be_able_to_create_a_chain_of_blocks_from_different_deploys' has overflowed its stack
fatal runtime error: stack overflow, aborting

Workaround

Setting a larger stack size via environment variable allows tests to pass:

RUST_MIN_STACK=8388608 cargo test --test mod

Root Cause

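The tests use #[tokio::test], which by default builds a current-thread runtime and polls the test future on the thread spawned by the test harness, so each test runs with the default Rust thread stack size of 2MB. However, the Rholang interpreter uses deep async recursion via Box::pin(self.*_inner(...)) patterns. Boxing moves each future's state to the heap, but every nested .await still adds poll frames to the thread's stack, so deep evaluation requires significantly more stack space.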

Stack Size Requirements

Stack Size        Result
2MB (default)     Fails
4MB               Fails
4.5MB             Passes
8MB               All 304 tests pass

Affected Code

Tests

  • 285 tests using #[tokio::test] across 64 files in casper/tests/
  • First failing test encountered: casper/tests/add_block/multi_parent_casper_add_block_spec.rs:68

Deep Recursion Source

The recursive async calls are in rholang/src/rust/interpreter/reduce.rs:

// Lines 90-96
pub fn eval<'a>(
    &'a self,
    par: Par,
    env: &'a Env<Par>,
    rand: Blake2b512Random,
) -> Pin<Box<dyn std::future::Future<Output = Result<(), InterpreterError>> + std::marker::Send + 'a>> {
    Box::pin(self.eval_inner(par, env, rand))
}

Similar patterns at lines 211-213 (produce), 268-270 (consume), and 521-523 (dispatch).
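
To make the stack behaviour concrete, here is a minimal stand-alone sketch of the same wrapper/inner pairing. The names count_down, count_down_inner, and the tiny block_on executor are hypothetical and exist only for this illustration; they are not part of reduce.rs or tokio. Each level's future lives on the heap, but awaiting it nests one poll call inside another, so stack use still grows with recursion depth.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Waker that does nothing; enough for futures that never actually wait.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Hypothetical minimal executor: polls the future on the calling thread,
// so all nested poll frames land on that thread's stack.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => std::thread::yield_now(),
        }
    }
}

// Wrapper in the style of `eval`: boxes the inner future so the
// recursive future type has a known size.
fn count_down(n: u64) -> Pin<Box<dyn Future<Output = u64> + Send>> {
    Box::pin(count_down_inner(n))
}

// Inner body in the style of `eval_inner`: awaiting the boxed child
// polls it from inside this future's own poll call.
async fn count_down_inner(n: u64) -> u64 {
    if n == 0 {
        0
    } else {
        1 + count_down(n - 1).await
    }
}
```

With this sketch, block_on(count_down(1_000)) returns 1000. Raising the depth far enough on a 2MB thread would be expected to end in the same kind of stack overflow reported above, though the exact threshold depends on the build profile and frame sizes.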

Existing Workaround Pattern

Some tests already work around this issue using a manual thread with larger stack:

// From casper/tests/genesis/contracts/pos_spec.rs
#[test]
fn pos_spec() {
    std::thread::Builder::new()
        .stack_size(16 * 1024 * 1024)
        .spawn(|| {
            tokio::runtime::Runtime::new().unwrap().block_on(async {
                // ... test body ...
            })
        })
        .unwrap()
        .join()
        .unwrap();
}
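
The thread-builder boilerplate above could be factored into a small helper so that each test only supplies a closure. The following is a sketch under assumptions: run_with_stack and depth are hypothetical names that do not exist in the repository, and the example uses plain recursion instead of a tokio runtime so that it stays self-contained.

```rust
use std::thread;

/// Hypothetical helper: runs `f` on a new thread with `stack_bytes` of
/// stack and returns its result. A panic inside `f` is re-raised on the
/// calling thread so the test harness still reports the failure.
pub fn run_with_stack<T, F>(stack_bytes: usize, f: F) -> T
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let handle = thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(f)
        .expect("failed to spawn large-stack thread");
    match handle.join() {
        Ok(value) => value,
        Err(payload) => std::panic::resume_unwind(payload),
    }
}

/// Stand-in for a deeply recursive workload (illustration only).
fn depth(n: u64) -> u64 {
    if n == 0 {
        0
    } else {
        1 + depth(n - 1)
    }
}
```

A call such as run_with_stack(16 * 1024 * 1024, || depth(50_000)) runs the recursion on the larger stack. In a real test the closure would instead build the tokio runtime and call block_on, as pos_spec does.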

Potential Solutions

1. CI/Local Environment Variable (Quick Fix)

Set RUST_MIN_STACK=8388608 (8MB) in the CI configuration and document it for local development.

2. Custom Test Macro

Create a macro that wraps tests with the larger stack size pattern:

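// Declares a #[test] function that runs an async body on a dedicated
// thread with an 8MB stack. $body must be a future, e.g. an async block.
macro_rules! async_test {
    ($name:ident, $body:expr) => {
        #[test]
        fn $name() {
            std::thread::Builder::new()
                .stack_size(8 * 1024 * 1024)
                .spawn(|| {
                    // block_on polls $body on this large-stack thread.
                    tokio::runtime::Runtime::new().unwrap().block_on($body)
                })
                .unwrap()
                .join()
                .unwrap(); // re-raises a panic from the test body
        }
    };
}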

3. Cargo Configuration

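Create .cargo/config.toml with rustflags to increase stack size (platform-specific). Note that linker-level stack settings generally apply to the main thread only, while the test harness runs each test on a spawned thread, so this option may not help on its own. Setting RUST_MIN_STACK through the [env] table of .cargo/config.toml may be a more direct alternative, but this has not been verified here.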

4. Refactor Interpreter (Long-term)

Convert the recursive async evaluation to an iterative one that uses an explicit stack data structure. This is a significant change but would eliminate the stack size dependency.
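
As a rough illustration of the shape such a refactor could take, the sketch below evaluates a toy term tree with a heap-allocated work list instead of recursion. Term, eval_iterative, and nested are hypothetical and far simpler than the real Par and the interpreter in reduce.rs, which is async and has side effects (produce, consume), so this shows only the general idea.

```rust
/// Hypothetical toy term type; NOT the real Rholang `Par`.
enum Term {
    Leaf(u64),
    Par(Vec<Term>),
}

/// Sums all leaves using an explicit work list on the heap, so the
/// thread's stack depth stays constant regardless of term nesting.
/// Taking `root` by value also means nodes are freed one at a time as
/// they are popped, not through a deeply nested recursive drop.
fn eval_iterative(root: Term) -> u64 {
    let mut pending = vec![root];
    let mut total = 0u64;
    while let Some(term) = pending.pop() {
        match term {
            Term::Leaf(value) => total += value,
            // Defer the children instead of recursing into them.
            Term::Par(children) => pending.extend(children),
        }
    }
    total
}

/// Builds a term nested `depth` levels deep, one extra leaf per level.
fn nested(depth: usize) -> Term {
    let mut term = Term::Leaf(1);
    for _ in 0..depth {
        term = Term::Par(vec![term, Term::Leaf(1)]);
    }
    term
}
```

Here eval_iterative(nested(200_000)) completes on a default-sized thread stack, because the nesting depth is tracked in the pending vector and not in call frames.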

Environment

  • OS: Linux (WSL2)
  • Rust: (check with rustc --version)
  • Default thread stack: 2MB
  • Required stack: ~4.5MB minimum, 8MB recommended

Related Files

  • casper/tests/add_block/multi_parent_casper_add_block_spec.rs
  • rholang/src/rust/interpreter/reduce.rs
  • casper/tests/genesis/contracts/pos_spec.rs (existing workaround example)
  • casper/tests/genesis/contracts/tree_hash_map_spec.rs (existing workaround example)

Labels: bug