Conversation

TheBlueMatt
Collaborator

While walking nodes in our Dijkstra's pathfinding, we may find a
channel which is amount-limited to less than the amount we're
currently trying to send. This is fine, and when we encounter such
a channel we simply limit the amount we'd send along this path if
we pick that channel.

When we encounter such a path, we keep summing the cost across hops
as we go, keeping whatever scores we assigned to channels between
the amount-limited one and the recipient, but using the new limited
amount for any channels we look at later as we walk towards the
sender.

This leads to somewhat inconsistent scores, especially as our
scorer assigns a large portion of its penalties proportionally to
the amount, and a portion of network fees is amount-proportional
as well. Thus, we end up with a somewhat higher score than we
"should" for this path, as later hops use a high proportional cost.
We accepted this as a simple way to bias against small-value paths
and many MPP parts.

Sadly, in practice it appears our bias is not strong enough, as
several users have reported that we often attempt far too many MPP
parts. In practice, if we encounter a channel with a small limit
early in the Dijkstra's pass (towards the end of the path), we may
prefer it over many other paths as we start assigning very low
costs early on before we've accumulated much cost from larger
channels.

Here, we swap the `cost` Dijkstra's score for `cost / path amount`.
This should bias much more strongly against many MPP parts by
preferring larger paths proportionally to their amount.

This aligns somewhat better with our goal - if we have to pick
multiple paths, we should be searching for paths that optimize
fee-per-sat-sent, not strictly the fee paid.

However, it might bias us against smaller paths somewhat more
strongly than we want - because we're still using the fees/scores
calculated with the sought amount for hops processed already, but
are now dividing by a smaller sent amount when walking further
hops, we will bias "incorrectly" (and fairly strongly) against
smaller parts.

Still, because of the complaints on pathfinding performance due to
too many MPP paths, it seems like a worthwhile tradeoff, as
ultimately MPP splitting is always the domain of heuristics anyway.
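As a hedged illustration of the change (a sketch, not the actual LDK code; the struct and function names here are made up), the effect of switching the heap score from raw cost to cost-per-amount-sent can be seen by comparing two candidate paths:

```rust
// Hypothetical sketch: compare two candidate paths under the old and new
// scores. Not the actual LDK implementation; names are illustrative.
struct CandidatePath {
    fee_cost_msat: u64, // total fees plus scorer penalties, in msat
    value_msat: u64,    // amount this path can actually carry
}

// Old score: absolute cost. A tiny, cheap path always looks best.
fn old_score(p: &CandidatePath) -> u64 {
    p.fee_cost_msat
}

// New score: cost per msat sent, computed in fixed point by shifting the
// cost into the top half of a u128 before dividing (avoids precision loss).
fn new_score(p: &CandidatePath) -> u128 {
    if p.fee_cost_msat == u64::MAX || p.value_msat == 0 {
        u128::from(u64::MAX)
    } else {
        ((p.fee_cost_msat as u128) << 64) / p.value_msat as u128
    }
}

fn main() {
    let small = CandidatePath { fee_cost_msat: 10, value_msat: 1_000 };
    let large = CandidatePath { fee_cost_msat: 40, value_msat: 100_000 };
    // Old score prefers the small path (10 < 40), encouraging many MPP parts.
    assert!(old_score(&small) < old_score(&large));
    // New score prefers the large path: 10/1_000 > 40/100_000 per msat sent.
    assert!(new_score(&small) > new_score(&large));
    println!("ok");
}
```

The small path is cheaper in absolute terms but four times more expensive per msat sent, so the new score ranks it worse.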

I'm somewhat optimistically labeling this "backport 0.1", since we've been doing rather large pathfinding changes in backports anyway, and this directly addresses a major user complaint (with a rather small patch), so it seems worth backporting. However, it does come with some potential tradeoffs, so I'm open to discussion here.

`RouteGraphNode` is the main heap entry in our Dijkstra's next-best
heap. Thus, because it's rather constantly being sorted, we care a
good bit about its size, as fitting more of them on a cache line can
provide some additional speed.

In 43d250d, we switched from
tracking nodes during pathfinding by their `NodeId` to a "counter"
which allows us to avoid `HashMap` lookups for much of the
pathfinding process.

Because the `dist` lookup is now quite cheap (it's just a `Vec`
index), there's no reason to track `NodeId`s in the heap entries.
Instead, we simply fetch the `NodeId` of the node via the `dist` map
by examining its `candidate`'s pointer to its source `NodeId`.

This allows us to remove a `NodeId` in `RouteGraphNode`, moving it
from 64 to 32 bytes. This allows us to expand the `score` field
size in a coming commit without expanding `RouteGraphNode`'s size.

While we were doing the `dist` lookup in
`add_entries_to_cheapest_to_target_node` anyway, the `NodeId`
lookup via the `candidate` may not be free. Still, avoiding
expanding `RouteGraphNode` above 128 bytes in a few commits is a
nice win.
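As a hedged sketch of the size win (these are hypothetical layouts, not LDK's actual struct definitions), dropping a 33-byte `NodeId`-style field in favor of a small counter lets the remaining fields pack into far fewer bytes:

```rust
// Hypothetical layouts illustrating why removing a 33-byte node identifier
// shrinks the heap entry; these are not LDK's actual struct definitions.
struct WithNodeId {
    node_id: [u8; 33], // compressed public key, as a NodeId holds
    score: u64,
    value_contribution_msat: u64,
    total_cltv_delta: u32,
    path_length: u8,
}

struct WithCounter {
    node_counter: u32, // index into the `dist` Vec, replacing the NodeId
    score: u64,
    value_contribution_msat: u64,
    total_cltv_delta: u32,
    path_length: u8,
}

fn main() {
    use std::mem::size_of;
    // The counter-based entry is much smaller, so more entries fit per
    // cache line while the heap is repeatedly re-sorted.
    assert!(size_of::<WithCounter>() < size_of::<WithNodeId>());
    assert!(size_of::<WithCounter>() <= 32);
    println!("{} -> {}", size_of::<WithNodeId>(), size_of::<WithCounter>());
}
```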
We track the total CLTV from the recipient to the current hop in
`RouteGraphNode` so that we can limit its total during pathfinding.
While it's nice to use a `u32` for that to match existing CLTV
types, even a total CLTV limit of 64K blocks (455 days) is
somewhat absurdly generous, so here we swap the `total_cltv_delta`
to a `u16`.

This keeps `RouteGraphNode` to 32 bytes in a coming commit as we
expand `score`.
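For the arithmetic behind the 455-day figure: at an average of one block every ten minutes, a `u16`'s maximum of 65,535 blocks works out to roughly 455 days:

```rust
fn main() {
    let max_cltv_blocks: u32 = u16::MAX as u32; // 65_535 blocks
    let minutes = max_cltv_blocks * 10;         // ~10 minutes per block
    let days = minutes / (60 * 24);
    assert_eq!(days, 455);
    println!("{days} days");
}
```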
@ldk-reviews-bot

ldk-reviews-bot commented Jun 25, 2025

👋 Thanks for assigning @tnull as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

@ldk-reviews-bot

🔔 1st Reminder

Hey @tnull @valentinewallace! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.


@ldk-reviews-bot

🔔 2nd Reminder

Hey @tnull @valentinewallace! This PR has been waiting for your review.
Please take a look when you have a chance. If you're unable to review, please let us know so we can find another reviewer.


Contributor

@valentinewallace valentinewallace left a comment

Concept ACK, can we add a test?

Contributor

@tnull tnull left a comment

Hmm, re:

Here, we swap the `cost` Dijkstra's score for `cost / path amount`.
This should bias much stronger against many MPP parts by preferring
larger paths proportionally to their amount.

I vaguely recall some discussion from some years back, but even after some digging I was unable to find the corresponding PR to reestablish context, unfortunately.

@TheBlueMatt
Collaborator Author

I vaguely recall some discussion from some years back, but even after some digging I was unable to find the corresponding PR to reestablish context, unfortunately.

Same, I know we've discussed it before...

@TheBlueMatt
Collaborator Author

Apologies for the delay, pushed a test.

Contributor

@tnull tnull left a comment

Fixups LGTM, would be ready for squashin' from my side. Verified that the new test would fail on BASE.

@TheBlueMatt TheBlueMatt force-pushed the 2025-06-routing-score-proportional branch from dea5750 to ac95262 Compare July 8, 2025 20:46
@TheBlueMatt
Collaborator Author

Squashed without further changes.

Contributor

@valentinewallace valentinewallace left a comment

LGTM pretty much

```rust
if fee_cost == u64::MAX || value_msat == 0 {
	u64::MAX.into()
} else {
	((fee_cost as u128) << 64) / value_msat as u128
}
```
Contributor

Can we extract the 64 const here and below and document why it was chosen? Otherwise it's not clear here unless you see the comment in the middle of the big macro below...

Collaborator Author

Not sure that a constant of "the bit length of a u64" is more useful than a comment, so I added the comment here as well :).

@TheBlueMatt
Collaborator Author

Squash-pushed an additional comment:

```diff
$ git diff-tree -U1 ac95262dd d08fbe782
diff --git a/lightning/src/routing/router.rs b/lightning/src/routing/router.rs
index 43a4c7e8f2..5ad5c3e778 100644
--- a/lightning/src/routing/router.rs
+++ b/lightning/src/routing/router.rs
@@ -2133,2 +2133,4 @@ impl<'a> PaymentPath<'a> {
 		} else {
+			// In order to avoid integer division precision loss, we simply shift the costs up to
+			// the top half of a u128 and divide by the value (which is, at max, just under a u64).
 			((fee_cost as u128) << 64) / value_msat as u128
```
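As a hedged aside on why the shift matters (a standalone sketch, not the LDK code itself): plain integer division of a small cost by a large value rounds everything below 1 down to zero, collapsing distinct per-msat costs, while shifting the cost into the top half of a `u128` first preserves the fractional part:

```rust
fn main() {
    let fee_cost: u64 = 3;
    let value_msat: u64 = 1_000;
    // Plain integer division truncates any per-msat cost below 1 to zero.
    assert_eq!(fee_cost / value_msat, 0);
    // Fixed point: shift the cost into the top 64 bits of a u128, then divide.
    let fixed = ((fee_cost as u128) << 64) / value_msat as u128;
    assert!(fixed > 0);
    // Two costs that both truncate to 0 in plain division still compare
    // correctly in fixed point (5/1_000 per msat > 3/1_000 per msat).
    let fixed_other = ((5u64 as u128) << 64) / 1_000u128;
    assert!(fixed_other > fixed);
    println!("ok");
}
```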

@TheBlueMatt TheBlueMatt added the weekly goal Someone wants to land this this week label Jul 9, 2025
@valentinewallace valentinewallace merged commit 79fc513 into lightningdevkit:main Jul 9, 2025
27 of 28 checks passed
@TheBlueMatt TheBlueMatt self-assigned this Jul 10, 2025
@TheBlueMatt TheBlueMatt mentioned this pull request Jul 15, 2025
@TheBlueMatt
Collaborator Author

Backported in #3932

TheBlueMatt added a commit to TheBlueMatt/rust-lightning that referenced this pull request Jul 24, 2025
v0.1.5 - Jul 16, 2025 - "Async Path Reduction"

Performance Improvements
========================

 * `NetworkGraph`'s expensive internal consistency checks have now been
   disabled in debug builds in addition to release builds (lightningdevkit#3687).

Bug Fixes
=========

 * Pathfinding which results in a multi-path payment is now substantially
   smarter, using fewer paths and better optimizing fees and successes (lightningdevkit#3890).
 * A counterparty delaying claiming multiple HTLCs with different expiries can
   no longer cause our `ChannelMonitor` to continuously rebroadcast invalid
   transactions or RBF bump attempts (lightningdevkit#3923).
 * Reorgs can no longer cause us to fail to claim HTLCs after a counterparty
   delayed claiming multiple HTLCs with different expiries (lightningdevkit#3923).
 * Force-closing a channel while it is blocked on another channel's async
   `ChannelMonitorUpdate` can no longer lead to a panic (lightningdevkit#3858).
 * `ChannelMonitorUpdate`s can no longer be released to storage too early when
   doing async updates or on restart. This only impacts async
   `ChannelMonitorUpdate` persistence and can lead to loss of funds only in rare
   cases with `ChannelMonitorUpdate` persistence order inversions (lightningdevkit#3907).

Security
========

0.1.5 fixes a vulnerability which could allow a peer to overdraw their reserve
value, potentially cutting into commitment transaction fees on channels with a
low reserve.
 * Due to a bug in checking whether an HTLC is dust during acceptance, near-dust
   HTLCs were not counted towards the commitment transaction fee, but did
   eventually contribute to it when we built a commitment transaction. This can
   be used by a counterparty to overdraw their reserve value, or, for channels
   with a low reserve value, cut into the commitment transaction fee (lightningdevkit#3933).
Successfully merging this pull request may close these issues.

MPP routing appears to create significantly more shards than is required