fix(xmss): prevent FFI keypair leaks on partial load and unclean shutdown #194

Merged
dimka90 merged 1 commit into main from fix/ffi-keypair-lifecycle
Apr 2, 2026

Collaborator

@dimka90 dimka90 commented Apr 1, 2026

Summary

  • Clean up previously loaded XMSS keypairs when loadValidatorKeys() fails mid-loop, preventing Rust memory leaks on partial load (mirrors zeam's errdefer keypair.deinit() pattern, cli/src/node.zig:433)
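  • Add defer n.Close() in main.go so keypairs are freed on normal return and when a panic unwinds main. Note that os.Exit skips deferred calls, so any error path that calls os.Exit must close the node first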
  • Free XMSS keypairs in Node.Close() before closing DB and P2P services
  • Add runtime.KeepAlive guards after all CGo calls that pass Go pointers to Rust:
    • leansig.go: RestoreKeypair, Sign, Verify, VerifyWithKeypair
    • leanmultisig.go: Aggregate, VerifyAggregated
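
The partial-load cleanup in the first bullet can be sketched as follows. This is a minimal illustration, not the code in this PR: `Keypair`, `Free`, `loadAll`, and `loaderFailingAt` are hypothetical names, and `Free` only flips a flag where the real code would hand the handle back to Rust.

```go
package main

import (
	"errors"
	"fmt"
)

// Keypair is a hypothetical stand-in for a Rust-allocated XMSS keypair handle.
type Keypair struct {
	id    int
	freed bool
}

// Free marks the handle as released; real code would call into Rust here.
func (k *Keypair) Free() { k.freed = true }

// loadAll loads n keypairs. If any load fails, every keypair loaded so far
// is freed before the error is returned, so no partial state leaks.
func loadAll(n int, load func(i int) (*Keypair, error)) (keys []*Keypair, err error) {
	defer func() {
		if err != nil {
			for _, k := range keys {
				k.Free()
			}
			keys = nil
		}
	}()
	for i := 0; i < n; i++ {
		k, loadErr := load(i)
		if loadErr != nil {
			return keys, fmt.Errorf("load validator %d: %w", i, loadErr)
		}
		keys = append(keys, k)
	}
	return keys, nil
}

// loaderFailingAt returns a loader that fails at index bad (never, if bad < 0)
// and records every keypair it hands out in *seen.
func loaderFailingAt(bad int, seen *[]*Keypair) func(int) (*Keypair, error) {
	return func(i int) (*Keypair, error) {
		if i == bad {
			return nil, errors.New("corrupt key file")
		}
		k := &Keypair{id: i}
		*seen = append(*seen, k)
		return k, nil
	}
}

func main() {
	var seen []*Keypair
	_, err := loadAll(5, loaderFailingAt(3, &seen))
	fmt.Println("error:", err)
	for _, k := range seen {
		fmt.Printf("keypair %d freed: %v\n", k.id, k.freed)
	}
}
```

The deferred closure with a named error result is the closest Go analogue to Zig's errdefer: cleanup runs only when the function returns a non-nil error.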

Context

Rust-allocated XMSS keypairs (loaded via CGo FFI) were leaked in two scenarios:

  1. If loading validator 3 out of 5 failed, keypairs 0-2 were never freed — the error return path didn't clean up partial state
  2. If the process exited via os.Exit(1) (error path in main.go) or panic, Close() was never called and keypairs were never returned to Rust
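
One way to make the second scenario safe is sketched below. Deferred calls run on normal return and while a panic unwinds, but os.Exit terminates the process without running them, so the sketch keeps the defer in a helper and calls os.Exit only after that helper has returned. `Node`, `run`, and the `mode` argument are hypothetical and exist only for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Node is a hypothetical stand-in for the real node type; closed records
// whether Close ran.
type Node struct{ closed bool }

// Close would free the XMSS keypairs and stop services in the real node.
func (n *Node) Close() { n.closed = true }

// run owns the node's lifetime. Because it returns instead of calling
// os.Exit, the deferred Close runs on success, on error, and on panic.
func run(n *Node, mode string) error {
	defer n.Close()
	switch mode {
	case "error":
		return errors.New("startup failed")
	case "panic":
		panic("unexpected state")
	}
	return nil
}

func main() {
	mode := "ok"
	if len(os.Args) > 1 {
		mode = os.Args[1]
	}
	n := &Node{}
	if err := run(n, mode); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1) // safe here: run has returned, so Close already ran
	}
	fmt.Println("closed:", n.closed)
}
```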

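The runtime.KeepAlive guards are defensive hardening: without them, the Go GC could in theory collect slices passed to C while the Rust FFI call is still executing. This is unlikely in practice because cgo keeps pointer arguments alive for the duration of the call, but an explicit runtime.KeepAlive after each call is a common defensive practice that makes the lifetime requirement visible and survives refactors that pass the pointer indirectly.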
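
The shape of such a guard is sketched below in pure Go so it runs without a C toolchain. `foreignSum` is a hypothetical stand-in for a cgo call into Rust (real code would call a `C.`-prefixed function), and `sumBytes` is an illustrative wrapper, not one of the functions touched by this PR.

```go
package main

import (
	"fmt"
	"runtime"
	"unsafe"
)

// foreignSum stands in for a C or Rust function reached through cgo
// (hypothetical). It only sees a raw pointer and a length.
func foreignSum(p unsafe.Pointer, n int) uint32 {
	var total uint32
	for _, b := range unsafe.Slice((*byte)(p), n) {
		total += uint32(b)
	}
	return total
}

// sumBytes passes the slice's backing array to the foreign function and
// then marks the slice as live until after the call has returned.
func sumBytes(data []byte) uint32 {
	if len(data) == 0 {
		return 0 // no backing array to point at
	}
	total := foreignSum(unsafe.Pointer(&data[0]), len(data))
	// Without this, data has no further uses after the pointer is taken,
	// so the compiler may treat it as dead before the call finishes.
	runtime.KeepAlive(data)
	return total
}

func main() {
	fmt.Println(sumBytes([]byte{1, 2, 3})) // prints 6
}
```

The guard goes after the foreign call, not before it: runtime.KeepAlive only guarantees the value is reachable up to the point where it is called.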

Test plan

  • go build ./... compiles cleanly
  • go vet ./... passes
  • go test ./node/... -count=1 — node tests pass
  • go test ./xmss/leanmultisig/... -count=1 — multisig tests pass
  • go test -race ./node/... ./xmss/leanmultisig/... — no races
  • Local 3-node devnet — nodes start, sign attestations, and shut down cleanly
  • Kill node with SIGINT — verify no Rust memory warnings in logs
  • Intentionally break a validator key file — verify node exits cleanly without leaking other loaded keys

Closes #191: FFI keypairs leaked on partial load failure and missing runtime.KeepAlive guards

@dimka90 dimka90 requested a review from devylongs April 1, 2026 21:26
@dimka90 dimka90 requested review from morelucks and shaaibu7 April 2, 2026 09:13
…down

Rust-allocated XMSS keypairs were leaked when loadValidatorKeys() failed
mid-loop (previously loaded keys never freed) and when the process exited
via os.Exit or panic (Close() never called).

Changes:
- Clean up previously loaded keypairs on error in loadValidatorKeys(),
  matching zeam's errdefer keypair.deinit() pattern (cli/src/node.zig:433)
- Add defer n.Close() in main.go so keypairs are freed on all exit paths
- Free XMSS keypairs in Node.Close() before closing DB and P2P
- Add runtime.KeepAlive guards after all CGo calls that pass Go pointers
  to Rust in leansig.go (RestoreKeypair, Sign, Verify, VerifyWithKeypair)
  and leanmultisig.go (Aggregate, VerifyAggregated) to prevent GC from
  collecting slices during FFI execution
@dimka90 dimka90 force-pushed the fix/ffi-keypair-lifecycle branch from f52e22c to fa69ccd Compare April 2, 2026 09:45
@dimka90 dimka90 merged commit 054cef2 into main Apr 2, 2026
4 checks passed
