lightning-liquidity: Add serialization logic, persist service state #4059
Conversation
Force-pushed 124211d to 26f3ce3.
Force-pushed a98dff6 to d630c4e.
Codecov Report ❌. Coverage diff against main (#4059):

                 main    #4059     +/-
==========================================
- Coverage     88.60%   88.54%   -0.07%
==========================================
  Files           176      178       +2
  Lines        132126   133672    +1546
  Branches     132126   133672    +1546
==========================================
+ Hits         117072   118355    +1283
- Misses        12380    12590     +210
- Partials       2674     2727      +53
Force-pushed d630c4e to 70118e7.
Force-pushed 70118e7 to dd43edc.
This all LGTM. I have a small concern, and maybe I'm being a little paranoid, but read_lsps2_service_peer_states and read_lsps5_service_peer_states pull every entry from the KVStore into memory with no limit. That could lead to unbounded state, exhausting memory and crashing the node. Maybe we can add a limit on how many entries we load into memory to protect against this DoS? Not sure how realistic this is, though: an attacker would need access to, or share, the same storage as the victim, and could then dump effectively infinite data onto disk. In that scenario the victim would probably be vulnerable to other attacks too, but still.
Reading state from disk (currently) happens on startup only, so crashing wouldn't be the worst thing; we would simply fail to start up properly. Some even argue that we need to panic if we hit any IO errors at this point to escalate to an operator. We could add some safeguard/upper bound, but I'm honestly not sure what it would protect against.
Heh, well, if we assume the attacker has write access to our `KVStore`…
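To make the suggested safeguard concrete, here is a minimal sketch, assuming a hypothetical cap constant and helper; this is not code from the PR, just an illustration of bounding how many persisted per-peer entries we are willing to load at startup.

```rust
use std::io;

// Hypothetical cap on how many per-peer entries we deserialize at startup.
const MAX_PERSISTED_PEER_ENTRIES: usize = 100_000;

// `keys` stands in for the list of per-peer keys returned by the `KVStore`.
fn check_peer_entry_count(keys: &[String]) -> io::Result<()> {
	if keys.len() > MAX_PERSISTED_PEER_ENTRIES {
		// Refuse to load an implausible number of entries rather than risk
		// exhausting memory before the node is even up.
		return Err(io::Error::new(
			io::ErrorKind::InvalidData,
			"too many persisted peer state entries",
		));
	}
	Ok(())
}

fn main() {
	let keys = vec!["02deadbeef".to_string()];
	assert!(check_peer_entry_count(&keys).is_ok());
}
```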
Force-pushed dd43edc to f73146b.
	pub token: Option<String>,
}

impl_writeable_tlv_based!(LSPS2GetInfoRequest, {
Do we really want to have two ways to serialize all these types? Wouldn't it make more sense to just use the serde serialization we already have and wrap that so that it can't all be misused?
Yes, I think I'd be in favor of using TLV serialization for our own persistence.
Note that the compat guarantees of LSPS0/the JSON/serde format might not exactly match what we require in LDK, and our Rust representation might also diverge from the pure JSON impl. On top of that JSON is of course much less efficient.
Hmm, is there some easy way to avoid exposing that in the public API, then? Maybe a wrapper struct or extension trait for serialization somehow? Seems like kinda a footgun for users, I think?
> Hmm, is there some easy way to avoid exposing that in the public API, then? Maybe a wrapper struct or extension trait for serialization somehow? Seems like kinda a footgun for users, I think?

Not quite sure I understand the footgun? You mean because these types then have `Writeable` as well as `Serialize` implementations on them and users might wrongly pick `Writeable` when they use the types independently from/outside of `lightning-liquidity`?
Sure, for example. Someone who uses serde presumably has some wrapper that serde-writes `Writeable` structs, and suddenly their code could read/compile totally fine and be reading the wrong kind of thing. If they have some less-used codepaths (eg writing `Event`s before they process them and then removing them again after) they might not notice immediately.
> Sure, for example. Someone who uses serde presumably has some wrapper that serde-writes `Writeable` structs, and suddenly their code could read/compile totally fine and be reading the wrong kind of thing.

I'm confused - `Writeable` is an LDK concept not connected to serde? Do you mean `Serialize`? But that also has a completely separate API? So how would they trip up? You mean they'd confuse `Writeable` and `Serialize`?
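For context, a minimal sketch of the TLV-based persistence being debated above, using LDK's `impl_writeable_tlv_based!` macro on a struct shaped like the excerpt; the TLV type number and the check in `main` are illustrative, not taken from the PR.

```rust
use lightning::impl_writeable_tlv_based;

// Simplified stand-in mirroring the excerpt above.
pub struct LSPS2GetInfoRequest {
	pub token: Option<String>,
}

// Implements LDK's TLV-based `Writeable`/`Readable` for the struct; the TLV
// type number `0` is illustrative and not necessarily what the PR uses.
impl_writeable_tlv_based!(LSPS2GetInfoRequest, {
	(0, token, option),
});

fn main() {
	use lightning::util::ser::Writeable;
	let req = LSPS2GetInfoRequest { token: Some("abc".to_string()) };
	// `encode` comes from the `Writeable` trait and returns the TLV bytes.
	let bytes = req.encode();
	assert!(!bytes.is_empty());
}
```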
) -> Pin<Box<dyn Future<Output = Result<(), lightning::io::Error>> + Send>> {
	let outer_state_lock = self.per_peer_state.read().unwrap();
	let mut futures = Vec::new();
	for (counterparty_node_id, peer_state) in outer_state_lock.iter() {
Huh? Why would we ever want to do a single huge persist pass and write every peer's state at once? Shouldn't we be doing this iteratively? Same applies in the LSPS2 service.
Yes, only persisting what's needed/changed will be part of the next PR as it ties into how we wake the BP to drive persistence (cf. the "Avoid re-persisting peer states if no changes happened (`needs_persist` flag everywhere)" bullet over at #4058 (comment)).
I'm confused why we're adding this method, then? If it's going to be removed in the next PR in the series, we should just not add it in the first place.
No, it's not gonna be removed, but extended: `PeerState` (here as well as in LSPS2) will gain a dirty/`needs_persist` flag and we'd simply skip persisting any entries that haven't been changed since the last persistence round.
That seems like a weird design: if we need to persist something immediately while it's being operated on, we have the node in question, so why walk a whole peer list? Can you put up the followup code so we can see how it's going to be used? Given this PR is mostly boilerplate I honestly wouldn't mind it being a bit bigger, as long as the code isn't too crazy.
> That seems like a weird design: if we need to persist something immediately while it's being operated on, we have the node in question, so why walk a whole peer list?

Yes, this is why `persist_peer_state` is a separate method - for inline persistence where we already hold the lock to the peer state we'd just call that. For the general/eventual persistence, the background processor task calls `LiquidityManager::persist`, which calls through to the respective `LSPS*ServiceHandler::persist` methods, which then only persist the entries marked dirty since the last persistence round.

> Can you put up the followup code so we can see how it's going to be used? Given this PR is mostly boilerplate I honestly wouldn't mind it being a bit bigger, as long as the code isn't too crazy.

Sure, will do as soon as it's ready and in a coherent state, although I had hoped to land this PR this week.
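A rough sketch of the split described above, with hypothetical types and names rather than the follow-up PR's code: a per-peer `needs_persist` flag, an inline single-peer persist call, and a periodic pass that skips clean entries.

```rust
use std::collections::HashMap;

struct PeerState {
	needs_persist: bool,
	serialized: Vec<u8>, // stand-in for the actual protocol state
}

// Inline path: persist exactly one peer we are already operating on.
fn persist_peer_state(store: &mut HashMap<String, Vec<u8>>, node_id: &str, state: &mut PeerState) {
	store.insert(node_id.to_string(), state.serialized.clone());
	state.needs_persist = false;
}

// BP-driven path: walk the map but skip anything that hasn't changed.
fn persist_dirty(
	store: &mut HashMap<String, Vec<u8>>, peers: &mut HashMap<String, PeerState>,
) -> usize {
	let mut persisted = 0;
	for (node_id, state) in peers.iter_mut() {
		if !state.needs_persist {
			continue;
		}
		persist_peer_state(store, node_id, state);
		persisted += 1;
	}
	persisted
}

fn main() {
	let mut store = HashMap::new();
	let mut peers = HashMap::from([
		("02aa".to_string(), PeerState { needs_persist: true, serialized: vec![1] }),
		("03bb".to_string(), PeerState { needs_persist: false, serialized: vec![2] }),
	]);
	assert_eq!(persist_dirty(&mut store, &mut peers), 1);
	assert!(store.contains_key("02aa") && !store.contains_key("03bb"));
}
```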
Force-pushed f73146b to 2971982.
Rebased to address a minor conflict.
Responded to the outstanding comments; not quite sure I fully get all the rationale here.
Please let me know if/when I can squash fixups.
Feel free.
.. this is likely only temporarily necessary, as we can drop our own `dummy_waker` implementation once we bump MSRV.
We add a simple `persist` call to `LSPS2ServiceHandler` that sequentially persists all the peer states, each under a key that encodes its node id.
We add a simple `persist` call to `LSPS5ServiceHandler` that sequentially persists all the peer states, each under a key that encodes its node id.
We add a simple `persist` call to `EventQueue` that persists it under an `event_queue` key.
Force-pushed 5fb2ae3 to 4e4404d.
Squashed.
> git diff-tree -U2 5fb2ae31d 4e4404d2f
.. as we currently prune the pending request state on peer disconnection anyway, so even if peers reconnected the service couldn't use the events anymore.
We read any previously-persisted state upon construction of `LiquidityManager`.
We read any previously-persisted state upon construction of `LiquidityManager`.
We read any previously-persisted state upon construction of `LiquidityManager`.
We let the background processor task regularly call `LiquidityManager::persist`. We also change the semantics of the `Future` for waking the background processor so that it is also used when we need to re-persist (which we'll do in the next commit).
.. we only persist the event queue if necessary and wake the BP to do so when something changes.
.. we only persist the service handler if necessary.
.. to allow access in a non-async context
.. and wrap them accordingly for the `LSPS2ServiceHandlerSync` variant.
.. we only persist the service handler if necessary.
We add a simple test that runs the LSPS2 flow, persists, and ensures we recover the service state after reinitializing from our `KVStore`.
We add a simple test that runs the LSPS5 flow, persists, and ensures we recover the service state after reinitializing from our `KVStore`.
Previously, we'd persist peer states to the `KVStore`, but, while we pruned them eventually from our in-memory state, we wouldn't remove them from the `KVStore`. Here, we change this and regularly prune and delete peer state entries from the `KVStore`. Note we still prune the state-internal data on peer disconnection, but leave removal to our (BP-driven) async `persist` calls.
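A rough, hypothetical sketch of the prune-and-delete behavior the last commit message describes: during a persist pass, any stored peer key that no longer exists in the in-memory state is removed from storage. All names here are illustrative, not the crate's API.

```rust
use std::collections::{HashMap, HashSet};

// Illustrative stand-ins: `stored_keys` is what we previously wrote to the
// `KVStore`, `in_memory` is the current per-peer state.
fn prune_stale_entries(
	stored_keys: &mut HashSet<String>, in_memory: &HashMap<String, Vec<u8>>,
) -> Vec<String> {
	let stale: Vec<String> =
		stored_keys.iter().filter(|k| !in_memory.contains_key(*k)).cloned().collect();
	for key in &stale {
		// In the real handler this would also issue a remove call to the store.
		stored_keys.remove(key);
	}
	stale
}

fn main() {
	let mut stored: HashSet<String> =
		["02aa".to_string(), "03bb".to_string()].into_iter().collect();
	let in_memory: HashMap<String, Vec<u8>> = HashMap::from([("02aa".to_string(), vec![1])]);
	let removed = prune_stale_entries(&mut stored, &in_memory);
	assert_eq!(removed, vec!["03bb".to_string()]);
	assert!(!stored.contains("03bb"));
}
```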
Force-pushed 4e4404d to 80bc554.
Now pushed two more fixups:
> git diff-tree -U2 4e4404d2f 80bc554de
diff --git a/lightning-liquidity/src/events/event_queue.rs b/lightning-liquidity/src/events/event_queue.rs
index 7636a815d..d6c6991d4 100644
--- a/lightning-liquidity/src/events/event_queue.rs
+++ b/lightning-liquidity/src/events/event_queue.rs
@@ -1,4 +1,5 @@
use super::LiquidityEvent;
+use crate::lsps2::event::LSPS2ServiceEvent;
use crate::persist::{
LIQUIDITY_MANAGER_EVENT_QUEUE_PERSISTENCE_KEY,
@@ -330,4 +331,11 @@ impl Writeable for EventQueueSerWrapper<'_> {
LiquidityEvent::LSPS2Service(event) => {
4u8.write(writer)?;
+ if matches!(event, LSPS2ServiceEvent::GetInfo { .. })
+ || matches!(event, LSPS2ServiceEvent::BuyRequest { .. })
+ {
+ // Skip persisting GetInfoRequest and BuyRequest events as we prune the pending
+ // request state currently anyways.
+ continue;
+ }
event.write(writer)?;
},
diff --git a/lightning-liquidity/src/lsps0/event.rs b/lightning-liquidity/src/lsps0/event.rs
index 97a3a9500..4141b51df 100644
--- a/lightning-liquidity/src/lsps0/event.rs
+++ b/lightning-liquidity/src/lsps0/event.rs
@@ -15,4 +15,6 @@ use bitcoin::secp256k1::PublicKey;
/// An event which an bLIP-50 / LSPS0 client may want to take some action in response to.
+///
+/// **Note: ** This event will *not* be persisted across restarts.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum LSPS0ClientEvent {
diff --git a/lightning-liquidity/src/lsps1/event.rs b/lightning-liquidity/src/lsps1/event.rs
index 508a5a42a..9443ad226 100644
--- a/lightning-liquidity/src/lsps1/event.rs
+++ b/lightning-liquidity/src/lsps1/event.rs
@@ -26,4 +26,6 @@ pub enum LSPS1ClientEvent {
/// [`LSPS1ClientHandler::create_order`] to place an order.
///
+ /// **Note: ** This event will *not* be persisted across restarts.
+ ///
/// [`LSPS1ClientHandler::request_supported_options`]: crate::lsps1::client::LSPS1ClientHandler::request_supported_options
/// [`LSPS1ClientHandler::create_order`]: crate::lsps1::client::LSPS1ClientHandler::create_order
@@ -44,4 +46,6 @@ pub enum LSPS1ClientEvent {
/// failed as the LSP returned an error response.
///
+ /// **Note: ** This event will *not* be persisted across restarts.
+ ///
/// [`LSPS1ClientHandler::request_supported_options`]: crate::lsps1::client::LSPS1ClientHandler::request_supported_options
SupportedOptionsRequestFailed {
@@ -67,4 +71,6 @@ pub enum LSPS1ClientEvent {
/// to get information from LSP about progress of the order.
///
+ /// **Note: ** This event will *not* be persisted across restarts.
+ ///
/// [`LSPS1ClientHandler::check_order_status`]: crate::lsps1::client::LSPS1ClientHandler::check_order_status
OrderCreated {
@@ -91,4 +97,6 @@ pub enum LSPS1ClientEvent {
/// Will be emitted in response to calling [`LSPS1ClientHandler::check_order_status`].
///
+ /// **Note: ** This event will *not* be persisted across restarts.
+ ///
/// [`LSPS1ClientHandler::check_order_status`]: crate::lsps1::client::LSPS1ClientHandler::check_order_status
OrderStatus {
@@ -114,4 +122,6 @@ pub enum LSPS1ClientEvent {
/// failed as the LSP returned an error response.
///
+ /// **Note: ** This event will *not* be persisted across restarts.
+ ///
/// [`LSPS1ClientHandler::create_order`]: crate::lsps1::client::LSPS1ClientHandler::create_order
/// [`LSPS1ClientHandler::check_order_status`]: crate::lsps1::client::LSPS1ClientHandler::check_order_status
@@ -143,4 +153,6 @@ pub enum LSPS1ServiceEvent {
/// payment and order id for this order for the client.
///
+ /// **Note: ** This event will *not* be persisted across restarts.
+ ///
/// [`LSPS1ServiceHandler::send_payment_details`]: crate::lsps1::service::LSPS1ServiceHandler::send_payment_details
RequestForPaymentDetails {
@@ -161,4 +173,6 @@ pub enum LSPS1ServiceEvent {
/// regarding the status of the payment and order.
///
+ /// **Note: ** This event will *not* be persisted across restarts.
+ ///
/// [`LSPS1ServiceHandler::update_order_status`]: crate::lsps1::service::LSPS1ServiceHandler::update_order_status
CheckPaymentConfirmation {
diff --git a/lightning-liquidity/src/lsps2/event.rs b/lightning-liquidity/src/lsps2/event.rs
index a5c03af9a..29cc577f2 100644
--- a/lightning-liquidity/src/lsps2/event.rs
+++ b/lightning-liquidity/src/lsps2/event.rs
@@ -27,4 +27,6 @@ pub enum LSPS2ClientEvent {
/// you want to use if you wish to proceed opening a channel.
///
+ /// **Note: ** This event will *not* be persisted across restarts.
+ ///
/// [`LSPS2ClientHandler::select_opening_params`]: crate::lsps2::client::LSPS2ClientHandler::select_opening_params
OpeningParametersReady {
@@ -47,4 +49,6 @@ pub enum LSPS2ClientEvent {
/// When the invoice is paid, the LSP will open a channel with the previously agreed upon
/// parameters to you.
+ ///
+ /// **Note: ** This event will *not* be persisted across restarts.
InvoiceParametersReady {
/// The identifier of the issued bLIP-52 / LSPS2 `buy` request, as returned by
@@ -67,4 +71,6 @@ pub enum LSPS2ClientEvent {
/// failed as the LSP returned an error response.
///
+ /// **Note: ** This event will *not* be persisted across restarts.
+ ///
/// [`LSPS2ClientHandler::request_opening_params`]: crate::lsps2::client::LSPS2ClientHandler::request_opening_params
GetInfoFailed {
@@ -84,4 +90,6 @@ pub enum LSPS2ClientEvent {
/// failed as the LSP returned an error response.
///
+ /// **Note: ** This event will *not* be persisted across restarts.
+ ///
/// [`LSPS2ClientHandler::select_opening_params`]: crate::lsps2::client::LSPS2ClientHandler::select_opening_params
BuyRequestFailed {
@@ -111,4 +119,6 @@ pub enum LSPS2ServiceEvent {
/// `[LSPS2ServiceHandler::invalid_token_provided`] to error the request.
///
+ /// **Note: ** This event will *not* be persisted across restarts.
+ ///
/// [`LSPS2ServiceHandler::opening_fee_params_generated`]: crate::lsps2::service::LSPS2ServiceHandler::opening_fee_params_generated
/// [`LSPS2ServiceHandler::invalid_token_provided`]: crate::lsps2::service::LSPS2ServiceHandler::invalid_token_provided
@@ -133,4 +143,6 @@ pub enum LSPS2ServiceEvent {
/// [`LSPS2ServiceHandler::invoice_parameters_generated`].
///
+ /// **Note: ** This event will *not* be persisted across restarts.
+ ///
/// [`ChannelManager::get_intercept_scid`]: lightning::ln::channelmanager::ChannelManager::get_intercept_scid
///
diff --git a/lightning-liquidity/src/lsps5/event.rs b/lightning-liquidity/src/lsps5/event.rs
index ddd4dfd52..a9c105225 100644
--- a/lightning-liquidity/src/lsps5/event.rs
+++ b/lightning-liquidity/src/lsps5/event.rs
@@ -40,4 +40,6 @@ pub enum LSPS5ServiceEvent {
/// [`validate`], which guards against replay attacks and tampering.
///
+ /// **Note: ** This event will be persisted across restarts.
+ ///
/// [`validate`]: super::validator::LSPS5Validator::validate
/// [`url`]: super::msgs::LSPS5WebhookUrl
@@ -95,4 +97,6 @@ pub enum LSPS5ClientEvent {
/// to notify the client about this registration.
///
+ /// **Note: ** This event will *not* be persisted across restarts.
+ ///
/// [`lsps5.set_webhook`]: super::msgs::LSPS5Request::SetWebhook
/// [`SendWebhookNotification`]: super::event::LSPS5ServiceEvent::SendWebhookNotification
@@ -130,4 +134,6 @@ pub enum LSPS5ClientEvent {
/// registering a new one.
///
+ /// **Note: ** This event will *not* be persisted across restarts.
+ ///
/// [`lsps5.set_webhook`]: super::msgs::LSPS5Request::SetWebhook
/// [`app_name`]: super::msgs::LSPS5AppName
@@ -183,4 +189,6 @@ pub enum LSPS5ClientEvent {
/// registration if desired.
///
+ /// **Note: ** This event will *not* be persisted across restarts.
+ ///
/// [`lsps5.remove_webhook`]: super::msgs::LSPS5Request::RemoveWebhook
WebhookRemoved {
@@ -204,4 +212,6 @@ pub enum LSPS5ClientEvent {
/// the given [`app_name`] was not found in the LSP's registration database.
///
+ /// **Note: ** This event will *not* be persisted across restarts.
+ ///
/// [`lsps5.remove_webhook`]: super::msgs::LSPS5Request::RemoveWebhook
/// [`AppNameNotFound`]: super::msgs::LSPS5ProtocolError::AppNameNotFound
0u8 => {
	// LSPS0ClientEvents are not persisted.
	continue;
},
1u8 => {
	// LSPS1ClientEvents are not persisted.
	continue;
},
2u8 => {
	// LSPS1ServiceEvents are not persisted.
	continue;
},
3u8 => {
	// LSPS2ClientEvents are not persisted.
	continue;
},
Meh, why bother writing a byte to indicate we have something that we aren't persisting? We do that in the rust-lightning `Event` crap, but it'd be nice to avoid. It does mean calculating the number of events to write rather than just looking at the vec len, but it seems worth it.
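A small sketch of the alternative suggested here: count the events that will actually be serialized and write that count up front, instead of emitting a discriminant byte for variants we skip. The event type and framing below are illustrative, not the PR's actual encoding.

```rust
use std::io::Write;

// Illustrative event type; only `Persistable` variants survive a restart.
enum QueuedEvent {
	Persistable(Vec<u8>),
	Ephemeral,
}

fn write_event_queue<W: Write>(events: &[QueuedEvent], writer: &mut W) -> std::io::Result<()> {
	// Count the events we will actually write, rather than using `events.len()`.
	let count = events.iter().filter(|e| matches!(e, QueuedEvent::Persistable(_))).count() as u64;
	writer.write_all(&count.to_be_bytes())?;
	for event in events {
		if let QueuedEvent::Persistable(bytes) = event {
			writer.write_all(&(bytes.len() as u16).to_be_bytes())?;
			writer.write_all(bytes)?;
		}
	}
	Ok(())
}

fn main() {
	let events = vec![QueuedEvent::Ephemeral, QueuedEvent::Persistable(vec![0xaa, 0xbb])];
	let mut buf = Vec::new();
	write_event_queue(&events, &mut buf).unwrap();
	// 8-byte count (1) + 2-byte length + 2 payload bytes.
	assert_eq!(buf.len(), 12);
}
```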
},
/// You should open a channel using [`ChannelManager::create_channel`].
///
/// **Note: ** As this event is persisted and might get replayed after restart, you'll need to
I'm not really sure how to do this as described? I guess an implementation would need to persist the SCID <-> user_channel_id correspondence and then check that store before opening a new channel? ISTM we either need to require users pass a unique `user_channel_id` to `invoice_parameters_generated` and then tell users to make this idempotent based on `user_channel_id`, or remove the `user_channel_id` stuff and then suggest users use `user_channel_id` ('s upper/lower 64 bits) to make the opens idempotent.
Obviously my preference is the second, also because it allows use of a non-LDK service (or their own user_channel_id management) if the user wants to go the "store SCID <-> channel info" route.
> I guess an implementation would need to persist the SCID <-> user_channel_id correspondence and then check that store before opening a new channel?

That mapping should be present via `ChannelDetails`, no?

> ISTM we either need to require users pass a unique user_channel_id to invoice_parameters_generated and then tell users to make this idempotent based on user_channel_id

Yes, happy to extend the docs to mention that `user_channel_id` has to be unique.
> That mapping should be present via ChannelDetails, no?

Sorry, intercept SCID.

> Yes, happy to extend the docs to mention that user_channel_id has to be unique.

It still feels really weird to me that we're relying on an LDK-ism like `user_channel_id` in the API here, let alone one that is supposed to be for users to do anything they want with :/.
(Though admittedly my suggested alternative approach also relies on `user_channel_id` to store the intercept SCID, though at least it's optional.)
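Purely as an illustration of the second option floated above (not a settled API), one could pack the intercept SCID into half of the 128-bit `user_channel_id` so that, after an event replay on restart, the implementation can tell whether it already initiated the corresponding channel open.

```rust
// Hypothetical helpers: encode the intercept SCID in the lower 64 bits and an
// application-chosen discriminant in the upper 64 bits of `user_channel_id`.
fn encode_user_channel_id(intercept_scid: u64, discriminant: u64) -> u128 {
	((discriminant as u128) << 64) | intercept_scid as u128
}

fn intercept_scid_from(user_channel_id: u128) -> u64 {
	user_channel_id as u64
}

fn main() {
	let ucid = encode_user_channel_id(0x1234_5678, 42);
	// On restart, a replayed event's intercept SCID can be checked against the
	// SCIDs recovered from existing channels' `user_channel_id`s.
	assert_eq!(intercept_scid_from(ucid), 0x1234_5678);
	assert_eq!((ucid >> 64) as u64, 42);
}
```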
This is the second PR in a series of PRs adding persistence to `lightning-liquidity` (see #4058). As this is already >1000 LoC, I now decided to put this up as an intermediary step instead of adding everything in one go.

In this PR we add the serialization logic for the LSPS2 and LSPS5 service handlers as well as for the event queue. We also have `LiquidityManager` take a `KVStore` towards which it persists the respective peer states keyed by the counterparty's node id. `LiquidityManager::new` now also deserializes any previously-persisted state from that given `KVStore`.

We then have `BackgroundProcessor` drive persistence, skip persistence for unchanged LSPS2/LSPS5 `PeerState`s, and use `async` inline persistence for `LSPS2ServiceHandler` where needed.

This also adds a bunch of boilerplate to account for both `KVStore` and `KVStoreSync` variants, following the approach we previously took with `OutputSweeper` etc.

cc @martinsaposnic
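For readers skimming the series, a heavily simplified, hypothetical sketch of the flow described above; none of these types or method signatures are the crate's real API. A background-processor-like task would periodically call `persist`, which writes each peer's serialized state to the key-value store under a key derived from its node id.

```rust
use std::collections::HashMap;
use std::io;

// Hypothetical stand-in for a synchronous key-value store interface.
trait KeyValueStore {
	fn write(&mut self, key: &str, value: &[u8]) -> io::Result<()>;
}

struct InMemoryStore(HashMap<String, Vec<u8>>);

impl KeyValueStore for InMemoryStore {
	fn write(&mut self, key: &str, value: &[u8]) -> io::Result<()> {
		self.0.insert(key.to_string(), value.to_vec());
		Ok(())
	}
}

// Hypothetical, heavily simplified liquidity manager: per-peer state keyed by
// the counterparty's node id (as a hex string here).
struct LiquidityManagerSketch {
	peer_states: HashMap<String, Vec<u8>>,
}

impl LiquidityManagerSketch {
	// Called periodically by a background-processor-like task.
	fn persist<S: KeyValueStore>(&self, store: &mut S) -> io::Result<()> {
		for (node_id, state) in &self.peer_states {
			store.write(node_id, state)?;
		}
		Ok(())
	}
}

fn main() {
	let mut store = InMemoryStore(HashMap::new());
	let manager = LiquidityManagerSketch {
		peer_states: HashMap::from([("02ab".to_string(), vec![1, 2, 3])]),
	};
	manager.persist(&mut store).unwrap();
	assert!(store.0.contains_key("02ab"));
}
```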