162 changes: 162 additions & 0 deletions fc-crashes.md
@@ -0,0 +1,162 @@
# Force-close fuzzer LDK crashes

Minimized crash sequences found by the `chanmon_consistency` fuzzer with
force-close support. Every crash is a failed `debug_assert!` or a
`panic!` inside LDK, not in the fuzzer harness. Byte 0 encodes monitor
styles (bits 0-2) and channel type (bits 3-4: 0=Legacy, 1=KeyedAnchors).
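
The byte 0 layout can be sketched as a small decoder. This is a minimal
illustration, not harness code: `FuzzConfig` and `decode_config_byte` are
hypothetical names, and the mapping of bit i to node i (A, B, C) is an
assumption inferred from the per-crash descriptions in this file.

```rust
/// Channel type selected by bits 3-4 of byte 0.
#[derive(Debug, PartialEq)]
enum ChannelType {
    Legacy,
    KeyedAnchors,
    /// Values these notes do not describe.
    Other(u8),
}

/// Decoded view of byte 0 (hypothetical struct, not a harness type).
#[derive(Debug, PartialEq)]
struct FuzzConfig {
    /// One flag per node (A, B, C). Assumes bit i selects async monitors for node i.
    async_monitors: [bool; 3],
    channel_type: ChannelType,
}

fn decode_config_byte(byte: u8) -> FuzzConfig {
    // Bits 0-2: monitor style, one bit per node (assumed).
    let async_monitors = [byte & 0b001 != 0, byte & 0b010 != 0, byte & 0b100 != 0];
    // Bits 3-4: channel type.
    let channel_type = match (byte >> 3) & 0b11 {
        0 => ChannelType::Legacy,
        1 => ChannelType::KeyedAnchors,
        other => ChannelType::Other(other),
    };
    FuzzConfig { async_monitors, channel_type }
}
```

Bits 5-7 are ignored here because the notes do not assign them a meaning.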

## 1. channelmonitor.rs:2727 - HTLC input not found in transaction

```
debug_assert!(htlc_input_idx_opt.is_some());
```

When resolving an HTLC spend, the monitor searches the spending
transaction's inputs for the HTLC outpoint but does not find it. In
builds where `debug_assert!` is compiled out (release mode), the code
falls back to index 0, which would produce incorrect tracking.
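
The lookup-with-fallback pattern described above can be sketched as
follows. This is a simplified illustration, not the code in
`channelmonitor.rs`: `OutPoint`, `find_htlc_input`, and
`htlc_input_idx_with_fallback` are hypothetical stand-ins.

```rust
/// Simplified stand-in for a transaction outpoint (txid, output index).
#[derive(Clone, Copy, Debug, PartialEq)]
struct OutPoint {
    txid: [u8; 32],
    vout: u32,
}

/// Returns the index of the input that spends `htlc_outpoint`, if any.
fn find_htlc_input(inputs: &[OutPoint], htlc_outpoint: &OutPoint) -> Option<usize> {
    inputs.iter().position(|prev| prev == htlc_outpoint)
}

/// Mirrors the fallback described above: with debug assertions disabled,
/// a missing input silently becomes index 0.
fn htlc_input_idx_with_fallback(inputs: &[OutPoint], htlc_outpoint: &OutPoint) -> usize {
    let idx_opt = find_htlc_input(inputs, htlc_outpoint);
    debug_assert!(idx_opt.is_some());
    idx_opt.unwrap_or(0)
}
```

Returning the `Option` to the caller, as `find_htlc_input` does, keeps the
"not found" case visible instead of attributing the spend to input 0.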

Minimized (17 bytes):
```
0x40 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xff 0xdc 0xde 0xff
```

Byte 0 = 0x40: Legacy channels, no async monitors. The sequence is
mostly 0xff (settlement) repeated, with height advances (0xdc, 0xde)
near the end. This suggests the crash happens during settlement when
processing on-chain HTLC spends after repeated settlement attempts.

## 2. onchaintx.rs:913 - Duplicate claim ID in pending requests

```
debug_assert!(self.pending_claim_requests.get(&claim_id).is_none());
```

The OnchainTxHandler registers a claim event with a claim_id that
already exists in the pending_claim_requests map.
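
A minimal sketch of the invariant behind this assertion, assuming the map
is keyed by claim ID and that a registration must never overwrite an
existing entry. `ClaimId` and `register_claim` here are simplified,
hypothetical stand-ins, not LDK's types.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a claim identifier.
type ClaimId = [u8; 32];

/// Registers a claim without overwriting an existing entry.
/// Returns `false` when `claim_id` is already pending.
fn register_claim(
    pending: &mut HashMap<ClaimId, String>,
    claim_id: ClaimId,
    description: &str,
) -> bool {
    if pending.contains_key(&claim_id) {
        // This is the situation the debug assertion above rejects.
        return false;
    }
    pending.insert(claim_id, description.to_string());
    true
}
```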

Minimized (10 bytes):
```
0x08 0xd2 0x70 0x70 0x71 0x70 0x10 0x19 0xde 0xff
```

Byte 0 = 0x08: KeyedAnchors channels, no async monitors.
- 0xd2: B force-closes the A-B channel
- 0x70/0x71: disconnect peers (crashes 4 and 6 describe these as the
  A-B and B-C disconnect opcodes)
- 0x10, 0x19: process messages on nodes A and B
- 0xde: advance chain 200 blocks
- 0xff: settle

B force-closes, peers are disconnected, messages are processed, then
the height advances and settlement triggers the duplicate claim.

## 3. onchaintx.rs:1025 - Inconsistent internal maps

```
panic!("Inconsistencies between pending_claim_requests map and claimable_outpoints map");
```

The OnchainTxHandler detects that its `pending_claim_requests` and
`claimable_outpoints` maps are out of sync.
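
One way to picture "in sync" is a two-way check between the maps. The
sketch below assumes one map goes from claim ID to the outpoints it
covers and the other from outpoint back to claim ID. The value types are
simplified and the function name `maps_consistent` is hypothetical.

```rust
use std::collections::HashMap;

// Simplified stand-ins; the real types carry more data.
type ClaimId = u64;
type OutPoint = (u64, u32);

/// Returns true when both maps describe the same set of claims.
fn maps_consistent(
    pending_claim_requests: &HashMap<ClaimId, Vec<OutPoint>>,
    claimable_outpoints: &HashMap<OutPoint, ClaimId>,
) -> bool {
    // Every tracked outpoint must point at a pending claim that lists it.
    let forward = claimable_outpoints.iter().all(|(outpoint, claim_id)| {
        pending_claim_requests
            .get(claim_id)
            .map_or(false, |outpoints| outpoints.contains(outpoint))
    });
    // Every outpoint listed by a pending claim must be tracked under that claim.
    let backward = pending_claim_requests.iter().all(|(claim_id, outpoints)| {
        outpoints
            .iter()
            .all(|outpoint| claimable_outpoints.get(outpoint) == Some(claim_id))
    });
    forward && backward
}
```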

Minimized (14 bytes):
```
0x00 0x3c 0x11 0x19 0xd0 0xde 0xff 0xff 0x19 0x21 0x19 0xde 0x26 0xff
```

Byte 0 = 0x00: Legacy channels, all monitors completed.
- 0x3c: send hop payment A->B->C (1M msat)
- 0x11, 0x19: process messages to commit HTLC on A-B
- 0xd0: A force-closes A-B
- 0xde: advance 200 blocks
- 0xff: settle (first round)
- 0xff: settle again (second round, processes more messages)
- 0x19, 0x21, 0x19: continue processing B and C messages
- 0xde: advance 200 more blocks
- 0x26: process events on node C
- 0xff: settle (third round)

A hop payment is partially committed, then A force-closes. Multiple
settlement rounds with continued message processing in between trigger
the internal map inconsistency.

## 4. test_channel_signer.rs:395 - Signing revoked commitment

```
panic!("can only sign the next two unrevoked commitment numbers, revoked={} vs requested={}")
```

The test channel signer is asked to sign an HTLC transaction for a
commitment number that has already been revoked.

Minimized (18 bytes):
```
0x22 0x71 0x71 0x71 0x71 0x71 0x71 0x71 0xff 0xff 0xff 0xff 0xff 0xff 0xde 0xde 0xb5 0xff
```

Byte 0 = 0x22: Legacy channels, async monitors on node B.
- 0x71: disconnect B-C peers (repeated, only first effective)
- 0xff: settle (repeated 6 times)
- 0xde 0xde: advance 400 blocks
- 0xb5: restart node B with alternate monitor state
- 0xff: settle

Async monitors on B with peer disconnection, repeated settlements,
height advances, and a node restart with a different monitor state.
The stale monitor combined with the restart puts B's signer in a state
where it's asked to sign for an already-revoked commitment.

## 5. channelmanager.rs:9836 - Payment blocker not found

```
debug_assert!(found_blocker);
```

During payment processing, the ChannelManager expects to find a
specific blocker entry for an in-flight payment but it's missing.

Minimized (13 bytes):
```
0x00 0x3c 0x11 0x19 0x11 0x1f 0x19 0x21 0x19 0x27 0x27 0xde 0xff
```

Byte 0 = 0x00: Legacy channels, all monitors completed.
- 0x3c: send hop A->B->C (1M msat)
- 0x11, 0x19, 0x11: commit HTLC on A-B
- 0x1f: B processes events (forwards HTLC to C)
- 0x19, 0x21, 0x19: commit HTLC on B-C
- 0x27, 0x27: C processes events (claims payment)
- 0xde: advance 200 blocks
- 0xff: settle

A straightforward A->B->C hop payment that completes normally (C
claims), followed by a height advance and settlement. No force-close
in this sequence, so the height advance before settlement may cause
HTLC timeout processing that conflicts with the claim path.

## 6. channelmanager.rs:19484 - Monitor update ID ordering violation

```
debug_assert!(update.update_id >= pending_update.update_id);
```

A ChannelMonitorUpdate has an update_id that is less than a pending
update's id, violating the expected monotonic ordering.
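
The ordering expectation can be stated as a small predicate. This is a
sketch under the assumption that a new update is compared against every
update still pending. `respects_update_ordering` is a hypothetical name,
and IDs are reduced to plain `u64` values.

```rust
/// Returns true when `new_update_id` is not lower than any pending update ID.
fn respects_update_ordering(pending_update_ids: &[u64], new_update_id: u64) -> bool {
    pending_update_ids
        .iter()
        .all(|pending_id| new_update_id >= *pending_id)
}
```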

Minimized (10 bytes):
```
0x84 0x70 0x11 0x19 0x11 0x1f 0xd0 0x11 0x1f 0xba
```

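Byte 0 = 0x84: Legacy channels (bits 3-4 = 0), with bits 2 and 7 set.
Bit 2 falls in the monitor-style field (bits 0-2), so by analogy with
crash 4 (bit 1 = node B) it would select async monitors on node C; bit 7
is outside the fields described at the top of this file.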
- 0x70: disconnect A-B peers
- 0x11, 0x19, 0x11: process messages (likely reestablish after setup)
- 0x1f: process B events
- 0xd0: A force-closes A-B channel
- 0x11: process A messages
- 0x1f: process B events
- 0xba: restart node B with alternate monitor state

A force-close followed by continued message/event processing and a
node B restart triggers a monitor update with an out-of-order ID.
1 change: 1 addition & 0 deletions fuzz/.gitignore
@@ -2,3 +2,4 @@ hfuzz_target
target
hfuzz_workspace
corpus
artifacts
107 changes: 107 additions & 0 deletions fuzz/FC-INFO.md
@@ -0,0 +1,107 @@
# Force-Close Fuzzing Notes

This file records the current contract for `chanmon_consistency` force-close
coverage. It is intentionally short. Keep branch history and one-off debugging
notes elsewhere.

## Goal

Force-close fuzzing here should:

- exercise realistic off-chain to on-chain transitions
- keep force-close from changing the eventual outcome of claimed payments
- only allow claimed-payment sender failures when force-close dust touched a
used payment path
- allow unclaimed HTLCs to resolve by CLTV timeout
- drive the harness far enough that it observes real terminal outcomes
- avoid manufacturing timeout wins by starving message delivery or claim
propagation

## Hard-Mode Invariant

The current hard mode is:

- once the harness calls `claim_funds`, that HTLC must eventually produce
`PaymentClaimed` at the receiver
- after that claim, the sender must eventually produce a terminal outcome,
`PaymentSent` or `PaymentFailed`
- if the sender produces `PaymentFailed` for a claimed payment, some used
force-close path for that payment must have been dust-trimmed
- force-close dust on a used path is not, by itself, enough to require
`PaymentFailed`; the payment may still end in `PaymentSent`
- if no used force-close path for the claimed payment was dust-trimmed, the
sender must eventually produce `PaymentSent`
- going on-chain does not create any broader exception than that dust case
- unclaimed HTLCs may still fail by CLTV expiry
- CSV waits on force-close outputs are normal and expected; they are not
payment outcome changes
- a payment disappearing from `list_recent_payments()` is not enough; the
harness must observe or drive the terminal outcome directly

In this mode, the following are harness failures:

- `HTLCHandlingFailed::Receive` after we already chose to claim the HTLC
- a receiver-side claim without the receiver later getting `PaymentClaimed`
- a claimed HTLC without any sender-side terminal event
- a claimed HTLC getting `PaymentFailed` without any dust-trimmed used
force-close path
- a claimed HTLC that should fulfill resolving by CLTV timeout instead
- cleanup stopping while live balances or other pending work still show that
more progress is possible
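
The per-payment part of the invariant can be sketched as a check over the
final state of one claimed payment. All names below are hypothetical
bookkeeping types, not harness or LDK types; the sketch only encodes the
rules listed above.

```rust
/// Terminal event seen at the sender (`PaymentSent` or `PaymentFailed`).
#[derive(Clone, Copy, Debug, PartialEq)]
enum SenderOutcome {
    Sent,
    Failed,
}

#[derive(Debug, PartialEq)]
enum Violation {
    MissingPaymentClaimed,
    MissingSenderTerminalEvent,
    FailedWithoutDustTrimmedPath,
}

/// Final bookkeeping for one payment on which `claim_funds` was called.
struct ClaimedPayment {
    receiver_saw_payment_claimed: bool,
    sender_outcome: Option<SenderOutcome>,
    /// True if some used force-close path for this payment was dust-trimmed.
    dust_trimmed_used_path: bool,
}

fn check_claimed_payment(payment: &ClaimedPayment) -> Result<(), Violation> {
    if !payment.receiver_saw_payment_claimed {
        return Err(Violation::MissingPaymentClaimed);
    }
    match payment.sender_outcome {
        None => Err(Violation::MissingSenderTerminalEvent),
        // Failure is only acceptable when dust touched a used path.
        Some(SenderOutcome::Failed) if !payment.dust_trimmed_used_path => {
            Err(Violation::FailedWithoutDustTrimmedPath)
        }
        // `Sent` is always fine, including when a used path was dust-trimmed.
        Some(_) => Ok(()),
    }
}
```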

## Timeouts

Do not conflate CSV and CLTV:

- CSV is normal force-close settlement latency
- CLTV expiry changes the HTLC outcome

The harness should keep driving through CSV waits. It should stop only to
keep claimed HTLCs that should still fulfill from resolving by CLTV expiry.
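
A minimal sketch of that stop decision, assuming the harness tracks each
HTLC's CLTV expiry and whether the sender has reached a terminal event.
`TrackedHtlc` and `may_connect_next_block` are hypothetical, and treating
"next height >= expiry" as the boundary is an assumption rather than the
harness's exact rule.

```rust
/// Hypothetical per-HTLC bookkeeping kept by the harness.
struct TrackedHtlc {
    cltv_expiry: u32,
    claimed: bool,
    sender_terminal_seen: bool,
}

/// Returns false only when connecting the next block would cross the CLTV
/// boundary of a claimed HTLC that has not yet reached a sender terminal event.
fn may_connect_next_block(current_height: u32, htlcs: &[TrackedHtlc]) -> bool {
    let next_height = current_height + 1;
    htlcs.iter().all(|htlc| {
        let protected = htlc.claimed && !htlc.sender_terminal_seen;
        !(protected && next_height >= htlc.cltv_expiry)
    })
}
```

CSV delays do not appear in this check at all, which matches the rule that
the harness keeps driving through them.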

## Harness Rules

The main rules for preserving the invariant are:

- advance large height jumps one block at a time, with bounded draining before
and after each block
- process queued messages and events before confirming newly broadcast
transactions, so preimages can propagate before timeout paths win
- keep sender-side payment bookkeeping independent of
`list_recent_payments()`
- track which channels each payment actually used, and when force-closing,
snapshot which used payment paths become dust-blocked on the closer's
commitment
- keep driving while `ClaimableOnChannelClose`, HTLC-related claimable balances,
queued messages, pending monitor updates, or pending broadcasts still show
unresolved work
- only stop before a CLTV boundary when crossing it would let a claimed HTLC
that has not yet reached a sender terminal event expire instead
- do not hide pending-payment state behind unrelated auto-driving before an
explicit force-close opcode; a bounded pre-close drain is acceptable when it
is only making already-queued work visible
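
The first rule, advancing one block at a time with bounded draining on
both sides, can be sketched as a loop over two hooks. The trait
`HarnessHooks`, its methods, and `RecordingHarness` are hypothetical and
exist only to make the ordering visible; they are not the harness's API.

```rust
/// Hypothetical hooks standing in for the real harness operations.
trait HarnessHooks {
    /// Process queued messages and events, at most `max_rounds` rounds.
    fn drain_pending_work(&mut self, max_rounds: usize);
    /// Connect exactly one block.
    fn connect_block(&mut self);
}

/// Advances `blocks` blocks one at a time, draining before and after each.
fn advance_height<H: HarnessHooks>(harness: &mut H, blocks: u32, max_drain_rounds: usize) {
    for _ in 0..blocks {
        harness.drain_pending_work(max_drain_rounds);
        harness.connect_block();
        harness.drain_pending_work(max_drain_rounds);
    }
}

/// Records the order of calls, for illustration only.
#[derive(Default)]
struct RecordingHarness {
    height: u32,
    log: Vec<&'static str>,
}

impl HarnessHooks for RecordingHarness {
    fn drain_pending_work(&mut self, _max_rounds: usize) {
        self.log.push("drain");
    }
    fn connect_block(&mut self) {
        self.height += 1;
        self.log.push("block");
    }
}
```

Draining around every block, rather than once around the whole jump, gives
preimages a chance to propagate before a timeout path can confirm.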

## Review Checklist

When changing this harness, verify:

- claimed HTLCs still require `PaymentClaimed`
- claimed HTLCs still require a sender-side terminal event
- claimed HTLCs only allow `PaymentFailed` when some used force-close path was
dust-trimmed
- claimed HTLCs without dust-trimmed used force-close paths still require
`PaymentSent`
- unclaimed HTLCs may still time out on-chain
- force-close opcodes still act on the currently pending state
- large synthetic height jumps do not become blind timeout buttons again
- sender-side obligations are not reconciled away through local caches

## Verification

The standard check is:

```bash
~/repo/rl-tools/run_fuzz_runner.sh --timeout-secs 20
```

Re-run the full corpus after any meaningful force-close harness change.