Open
Labels
bug: Something isn't working
Description
Summary
The test test_propose.py::test_find_block_by_deploy_id fails with a stack overflow error:
INFO peers:rnode.py:392 l.bootstrap: thread 'tokio-runtime-worker' (14) has overflowed its stack
INFO peers:rnode.py:392 l.bootstrap: fatal runtime error: stack overflow, aborting
This behaviour is observed only with two Rholang files: longslow.rho and shortslow.rho.
Expected: test passes
Actual: test fails due to stack overflow
Steps to reproduce
- Modify test_propose.py with the following patch:
@@ -67,15 +67,16 @@ def test_find_block_by_deploy_id(command_line_options: CommandLineOptions, docke
Note: Excludes files that are known to cause stack overflow or other issues
when running in batch mode with limited resources (e.g., longslow.rho, shortslow.rho).
"""
with start_node(command_line_options, docker_client, random_generator) as bootstrap:
- relative_paths = bootstrap.shell_out('sh', '-c', 'ls /opt/docker/examples/*.rho').splitlines()
+ relative_paths = bootstrap.shell_out('sh', '-c', 'ls /opt/docker/examples/longslow.rho').splitlines()
# Filter out problematic files that cause stack overflow in batch mode
# longslow.rho and shortslow.rho cause deep recursion leading to stack overflow when resources are constrained
- excluded_files = {'longslow.rho', 'shortslow.rho'}
+ # excluded_files = {'longslow.rho', 'shortslow.rho'}
+ excluded_files = {}
filtered_paths = [p for p in relative_paths if os.path.basename(p) not in excluded_files]
if not filtered_paths:
# Fallback to all paths if filtering removes everything
filtered_paths = relative_paths
- Run the following command from the integration_tests folder:
_SKIP_CHECK_CODE=1 pipenv run -v ./run_tests test/test_propose.py::test_find_block_by_deploy_id
Investigation
Initial investigation localized the stack overflow to the call at rho_runtime.rs:285:
let res = i.inj_attempt(reducer, term, initial_phlo, normalizer_env, rand).await;
It seems that recursive calls in the underlying logic trigger the stack overflow in the tokio runtime.
Additionally, the following workarounds were tried:
- Executing that call through tokio's spawn_blocking: did not help.
- Executing the task in a std::thread with a manually configured stack size of up to 64 MB: did not help.
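For reference, the second workaround can be sketched roughly as below. This is a minimal illustration and not the code from the repository: `recursive_depth` and `run_with_stack` are hypothetical names, and the recursive function only stands in for the real evaluation. The only library call used is `std::thread::Builder::stack_size`.

```rust
use std::thread;

/// Hypothetical stand-in for a deeply recursive evaluation step.
/// Each call adds one stack frame until `n` reaches zero.
fn recursive_depth(n: u64) -> u64 {
    if n == 0 {
        0
    } else {
        1 + recursive_depth(n - 1)
    }
}

/// Runs `f` on a dedicated OS thread whose stack is `stack_bytes` large
/// and returns its result once the thread has finished.
fn run_with_stack<T, F>(stack_bytes: usize, f: F) -> std::io::Result<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let handle = thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(f)?;
    // `join` returns Err only if the closure panicked on the worker thread.
    Ok(handle.join().expect("worker thread panicked"))
}
```

A call such as `run_with_stack(64 * 1024 * 1024, || recursive_depth(10_000))` mirrors the 64 MB setting mentioned above. A larger stack only raises the depth limit; recursion whose depth grows with the input can still exceed it.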
Solution Proposal
Rewrite the Rholang contract execution logic to avoid deep recursive calls.
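One common way to remove deep recursion is to replace it with a loop over an explicit, heap-allocated work stack. The sketch below illustrates the idea on a toy tree. `Term`, `count_nil`, and `deep_chain` are hypothetical and far simpler than the real Rholang structures and reducer, so this shows the technique only and is not a proposed patch.

```rust
/// Hypothetical, heavily simplified term tree (not the real Rholang AST).
enum Term {
    Nil,
    Par(Box<Term>, Box<Term>),
}

/// Counts `Nil` leaves without recursion: pending nodes live in a `Vec`
/// on the heap, so the call stack does not grow with the tree depth.
fn count_nil(root: &Term) -> usize {
    let mut pending: Vec<&Term> = vec![root];
    let mut count = 0;
    while let Some(term) = pending.pop() {
        match term {
            Term::Nil => count += 1,
            Term::Par(left, right) => {
                pending.push(&**left);
                pending.push(&**right);
            }
        }
    }
    count
}

/// Builds a left-leaning chain with `depth` nested `Par` nodes,
/// using a loop so that construction does not recurse either.
fn deep_chain(depth: usize) -> Term {
    let mut term = Term::Nil;
    for _ in 0..depth {
        term = Term::Par(Box::new(term), Box::new(Term::Nil));
    }
    term
}
```

For example, `count_nil(&deep_chain(1_000))` visits every node with constant call-stack depth. Note that the compiler-generated drop of a deeply nested `Box` tree is itself recursive, so a full rewrite would also need to consider how such structures are destroyed.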