fix: catch xmss_setup_prover panic and return error code instead of aborting (#722) #724

Merged
ch4r10t33r merged 2 commits into main from fix/xmss-setup-prover-panic-722
Apr 13, 2026

@zclawz zclawz commented Apr 13, 2026

Closes #722

Root cause

init_aggregation_bytecode() in rec_aggregation/compilation.rs:52 calls .unwrap() on a file-open. When the prover bytecode file is missing (ENOENT, e.g. on a fresh aggregator node), that call panics inside xmss_setup_prover(). A Rust panic is not allowed to unwind across an extern "C" boundary, so the runtime triggers panic_cannot_unwind and aborts the process.

This fires on every maybeAggregateOnInterval tick, so the aggregator crashes repeatedly during devnet4 multi-subnet testing.
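To make the failure mode concrete, here is a minimal sketch of the file-open step. The real code in rec_aggregation/compilation.rs is not shown in this PR, so `open_bytecode`, `describe`, and the file path are hypothetical names used only for illustration. The sketch returns the I/O error to the caller, which is the point where `.unwrap()` would instead panic.

```rust
use std::fs::File;
use std::io::{self, ErrorKind};
use std::path::Path;

/// Hypothetical stand-in for the bytecode loader: it hands the I/O error
/// back to the caller instead of calling `.unwrap()` on the file-open.
fn open_bytecode(path: &Path) -> io::Result<File> {
    File::open(path)
}

/// Classify the outcome so a caller can decide whether to skip the work.
/// A missing file surfaces as `ErrorKind::NotFound` (ENOENT on Unix).
fn describe(path: &Path) -> &'static str {
    match open_bytecode(path) {
        Ok(_) => "found",
        Err(e) if e.kind() == ErrorKind::NotFound => "missing (ENOENT)",
        Err(_) => "other I/O error",
    }
}
```

Calling `open_bytecode(path).unwrap()` on a missing file would panic at this step, which is the panic this PR contains at the FFI boundary.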

Fix

rust/multisig-glue/src/lib.rs

  • Remove static Once guards (replaced by OnceLock<bool>)
  • Wrap init body in std::panic::catch_unwind; cache the bool result
  • Change xmss_setup_prover / xmss_setup_verifier return type from void to i32 (0 = success, -1 = failure)
  • Subsequent calls return the cached result with no work
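The pattern described in this list can be sketched roughly as follows. This is not the actual multisig-glue code: `setup_once`, `init_prover_or_panic`, `PROVER_READY`, and `setup_prover_sketch` are hypothetical names, and the init body is replaced by a function that always panics to mimic the missing bytecode file. A real export would also need a `no_mangle` attribute, omitted here.

```rust
use std::panic;
use std::sync::OnceLock;

// Caches whether setup succeeded, so the init body runs at most once.
static PROVER_READY: OnceLock<bool> = OnceLock::new();

/// Hypothetical stand-in for the real init body; it panics the way a
/// missing bytecode file would.
fn init_prover_or_panic() {
    panic!("prover bytecode file not found");
}

/// Run `init` at most once per cell, catching any panic, and map the
/// cached outcome to a C-style status code: 0 = success, -1 = failure.
fn setup_once(cell: &OnceLock<bool>, init: fn()) -> i32 {
    let ok = *cell.get_or_init(|| panic::catch_unwind(init).is_ok());
    if ok { 0 } else { -1 }
}

/// Sketch of the exported entry point: the panic is caught before it can
/// reach the extern "C" boundary, so the caller only sees a status code.
pub extern "C" fn setup_prover_sketch() -> i32 {
    setup_once(&PROVER_READY, init_prover_or_panic)
}
```

Because the bool is cached, a failed setup is not retried on later calls. That matches the "subsequent calls return the cached result" behaviour above, and it also means a bytecode file that appears later is not picked up without a restart in this sketch.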

pkgs/xmss/src/aggregation.zig

  • Update extern fn declarations to c_int return type
  • setupProver() / setupVerifier() now return !void and surface error.ProverSetupFailed / error.VerifierSetupFailed
  • All three callsites changed to try setupProver() / try setupVerifier()
  • aggregateSignatures() already returns !void, so the error propagates through computeAggregatedSignatures -> aggregateUnlocked -> aggregate -> maybeAggregateOnInterval, which already catches it and logs a warning, so the node skips the interval instead of crashing
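The caller-side control flow can be modelled as below. The real change is in Zig; this sketch uses Rust only to stay in one language with the glue code, so every type and function here is a hypothetical analogue (`?` plays the role of Zig's `try`, and `Result` the role of `!void`). The intermediate layers of the call chain are collapsed into a single `aggregate` function.

```rust
/// Rust analogue of Zig's error.ProverSetupFailed (the variant name comes
/// from the PR; the enum itself is hypothetical).
#[derive(Debug, PartialEq)]
enum AggregationError {
    ProverSetupFailed,
}

/// Map the C-style status code (0 = success) to a Result, as the Zig
/// wrapper does when it returns !void.
fn setup_prover(status: i32) -> Result<(), AggregationError> {
    if status == 0 {
        Ok(())
    } else {
        Err(AggregationError::ProverSetupFailed)
    }
}

/// Inner layer: propagates the failure upward with `?`.
fn aggregate(status: i32) -> Result<(), AggregationError> {
    setup_prover(status)?;
    // ... aggregation work would happen here ...
    Ok(())
}

/// Outermost layer: catches the error and produces a warning line instead
/// of crashing; `None` means aggregation ran for this interval.
fn maybe_aggregate_on_interval(slot: u64, status: i32) -> Option<String> {
    match aggregate(status) {
        Ok(()) => None,
        Err(e) => Some(format!(
            "[warn] failed to aggregate attestation signatures for slot={slot}: {e:?}"
        )),
    }
}
```

The status code is passed in as a parameter so the sketch stays self-contained; in the real code it would come from the extern call.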

Behaviour after fix

[warn] failed to aggregate attestation signatures for slot=64: error.ProverSetupFailed

Aggregation is skipped for that interval; the node keeps running.

Testing

  • zig build passes ✅

zclawz added 2 commits April 13, 2026 21:15
…borting (#722)

When the rec_aggregation prover bytecode file is missing (ENOENT),
init_aggregation_bytecode() panics inside xmss_setup_prover(). A Rust
panic propagating through an extern "C" boundary is UB and causes the
process to abort via panic_cannot_unwind.

Root cause: rec_aggregation/compilation.rs:52 calls .unwrap() on a
file-open; the aggregator node panics and aborts during the first
maybeAggregateOnInterval tick on a multi-subnet devnet.

Fix (two layers):

rust/multisig-glue/src/lib.rs
- Remove static Once guards (replaced by OnceLock<bool>)
- Wrap init body in std::panic::catch_unwind; cache bool result
- Change xmss_setup_prover / xmss_setup_verifier return type to i32
  (0 = success, -1 = failure). Subsequent calls return the cached result.

pkgs/xmss/src/aggregation.zig
- Update extern fn declarations to return c_int
- setupProver() / setupVerifier() now return !void and surface
  error.ProverSetupFailed / error.VerifierSetupFailed
- All three callsites changed to try setupProver() / try setupVerifier()
- aggregateSignatures() already returned !void so the error propagates
  up through computeAggregatedSignatures -> aggregateUnlocked -> aggregate
  -> maybeAggregateOnInterval, which already catches the error and logs
  a warning rather than crashing

Closes #722

@ch4r10t33r ch4r10t33r left a comment

Looks good.

@ch4r10t33r ch4r10t33r merged commit 1a8a166 into main Apr 13, 2026
12 checks passed
@ch4r10t33r ch4r10t33r deleted the fix/xmss-setup-prover-panic-722 branch April 13, 2026 21:57

Linked issue: panic: xmss_setup_prover fails with NotFound in aggregator on multi-subnet devnet