fix: catch xmss_setup_prover panic and return error code instead of aborting (#722) #724
Merged
ch4r10t33r merged 2 commits into main on Apr 13, 2026
added 2 commits
April 13, 2026 21:15
…borting (#722)

When the rec_aggregation prover bytecode file is missing (ENOENT), init_aggregation_bytecode() panics inside xmss_setup_prover(). A Rust panic propagating through an extern "C" boundary is UB and causes the process to abort via panic_cannot_unwind.

Root cause: rec_aggregation/compilation.rs:52 calls .unwrap() on a file-open; the aggregator node panics and aborts during the first maybeAggregateOnInterval tick on a multi-subnet devnet.

Fix (two layers):

rust/multisig-glue/src/lib.rs
- Remove static Once guards (replaced by OnceLock<bool>)
- Wrap init body in std::panic::catch_unwind; cache bool result
- Change xmss_setup_prover / xmss_setup_verifier return type to i32 (0 = success, -1 = failure). Subsequent calls return the cached result.

pkgs/xmss/src/aggregation.zig
- Update extern fn declarations to return c_int
- setupProver() / setupVerifier() now return !void and surface error.ProverSetupFailed / error.VerifierSetupFailed
- All three callsites changed to try setupProver() / try setupVerifier()
- aggregateSignatures() already returned !void so the error propagates up through computeAggregatedSignatures -> aggregateUnlocked -> aggregate -> maybeAggregateOnInterval, which already catches the error and logs a warning rather than crashing

Closes #722
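The Rust-side layer described in this commit message can be sketched roughly as follows. This is a minimal illustration of the pattern, not the code in rust/multisig-glue/src/lib.rs: demo_setup_prover, setup_once, init_prover, PROVER_READY and the bytecode path are all hypothetical names, and the real init does more than open a file.

```rust
use std::panic;
use std::sync::OnceLock;

// Cached outcome of the one-time setup: true = success, false = failure.
// (Hypothetical name; stands in for the OnceLock<bool> mentioned above.)
static PROVER_READY: OnceLock<bool> = OnceLock::new();

// Hypothetical stand-in for the real init: unwraps a file-open and
// therefore panics when the file is missing, like the reported bug.
fn init_prover(path: &str) {
    std::fs::File::open(path).unwrap();
}

// Runs `init` at most once per cell, turns a panic into `false`,
// and caches the outcome so later calls return the same status.
fn setup_once(cell: &OnceLock<bool>, init: impl FnOnce() + panic::UnwindSafe) -> i32 {
    let ok = *cell.get_or_init(|| panic::catch_unwind(init).is_ok());
    if ok { 0 } else { -1 }
}

// C-ABI entry point: the panic is caught inside, so nothing unwinds
// across the extern "C" boundary. 0 = success, -1 = failure.
pub extern "C" fn demo_setup_prover() -> i32 {
    // Assumed path that does not exist, to exercise the failure branch.
    setup_once(&PROVER_READY, || init_prover("/nonexistent/prover.bytecode"))
}
```

Because the boolean is cached, a failed setup is not retried in this sketch; whether the real glue should retry once the file appears is a separate design question that the commit message does not address.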
Closes #722
Root cause
init_aggregation_bytecode() in rec_aggregation/compilation.rs:52 calls .unwrap() on a file-open. When the prover bytecode file is missing (ENOENT, e.g. on a fresh aggregator node), it panics inside xmss_setup_prover(). A Rust panic propagating through an extern "C" boundary is UB and triggers panic_cannot_unwind, aborting the process.

This fires on every maybeAggregateOnInterval tick, so the aggregator crashes repeatedly during devnet4 multi-subnet testing.

Fix

rust/multisig-glue/src/lib.rs
- Remove static Once guards (replaced by OnceLock<bool>)
- Wrap init body in std::panic::catch_unwind; cache the bool result
- Change xmss_setup_prover / xmss_setup_verifier return type from void → i32 (0 = success, -1 = failure)

pkgs/xmss/src/aggregation.zig
- Update extern fn declarations to c_int return type
- setupProver() / setupVerifier() now return !void and surface error.ProverSetupFailed / error.VerifierSetupFailed
- All three callsites changed to try setupProver() / try setupVerifier()
- aggregateSignatures() already returns !void, so the error propagates through computeAggregatedSignatures → aggregateUnlocked → aggregate → maybeAggregateOnInterval, which already catches and logs a warning, so there is no crash, just a skip

Behaviour after fix
Aggregation is skipped for that interval; the node keeps running.
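The caller-side behaviour can be illustrated with a small Rust analogue of the Zig flow. The actual caller lives in pkgs/xmss/src/aggregation.zig and is not reproduced here; SetupError, check_setup, aggregate and on_interval_tick are hypothetical names, and treating every nonzero status as a failure is an assumption (the PR only defines 0 and -1).

```rust
#[derive(Debug, PartialEq)]
enum SetupError {
    ProverSetupFailed,
}

// Map the C-style status code to a Result.
// Assumption: any nonzero value counts as failure.
fn check_setup(status: i32) -> Result<(), SetupError> {
    if status == 0 {
        Ok(())
    } else {
        Err(SetupError::ProverSetupFailed)
    }
}

// Stand-in for the aggregation path: setup runs first and its error
// propagates to the caller with `?`, much like `try` in Zig.
fn aggregate(setup: impl Fn() -> i32) -> Result<u32, SetupError> {
    check_setup(setup())?;
    Ok(1) // placeholder for the real aggregation result
}

// One interval tick: on error, log a warning and skip instead of crashing.
// Returns whether aggregation ran during this tick.
fn on_interval_tick(setup: impl Fn() -> i32) -> bool {
    match aggregate(setup) {
        Ok(_) => true,
        Err(e) => {
            eprintln!("warning: aggregation skipped: {e:?}");
            false
        }
    }
}
```

Calling on_interval_tick(|| -1) logs the warning and returns false, while on_interval_tick(|| 0) returns true, so a failed setup costs one skipped interval rather than the whole process.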
Testing
zig build passes ✅