#4059 followups #4118
Conversation
Codecov Report

@@            Coverage Diff             @@
##             main    #4118      +/-   ##
==========================================
+ Coverage   88.39%   88.53%   +0.13%
==========================================
  Files         178      179       +1
  Lines      134179   134413     +234
==========================================
+ Hits       118610   119002     +392
+ Misses      12895    12653     -242
- Partials     2674     2758      +84
.await?;
let mut future_opt = None;
{
	let per_peer_state = self.per_peer_state.read().unwrap();
Hmm, if we want to check again here, why bother with the two need_remove/need_persist lists in the first place? Wouldn't it be easier to keep the two steps separate, i.e., first just iterate all peers and repersist what needs persisting, then retain and remove any prunable peers in one go? Seems like keeping the two steps separate might be simpler?
Yea, this did end up a bit messier than I was hoping. One thing I still want to avoid, though, is holding the top-level write lock while persisting. It's not clear to me how to accomplish that without two passes (unless we want to do a loop-until-we-loop-and-don't-find-peers-to-remove, which I guess we could, but it feels a bit wasteful). Do you have any suggestions there?
LSPS5 doesn't have inner peer locks, so we ultimately take the write lock inside the second loop (and into a persist!) as well. There the three-pass thing is definitely a weaker argument, but I'm still not sure a loop-until-we-loop-and-don't-find-peers-to-remove is better.
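For illustration, a minimal sketch of the multi-pass flow under discussion, with assumed names (`PeerState`, `dirty`, `prunable`, `remove_peer_state`) and synchronous, infallible I/O standing in for the real async kv_store calls; a `u64` stands in for the counterparty `PublicKey`:

```rust
use std::collections::HashMap;
use std::sync::RwLock;

struct PeerState {
	dirty: bool,
	prunable: bool,
}

struct Service {
	per_peer_state: RwLock<HashMap<u64, PeerState>>,
}

impl Service {
	fn persist(&self) {
		// Pass 1: under the read lock, note which peers need persisting and
		// which look prunable. No I/O happens while this lock is held.
		let (need_persist, need_remove): (Vec<u64>, Vec<u64>) = {
			let peers = self.per_peer_state.read().unwrap();
			let persist = peers
				.iter()
				.filter(|(_, s)| s.dirty && !s.prunable)
				.map(|(id, _)| *id)
				.collect();
			let remove =
				peers.iter().filter(|(_, s)| s.prunable).map(|(id, _)| *id).collect();
			(persist, remove)
		};

		// Pass 2: persist dirty peers without holding the top-level lock.
		for id in need_persist {
			self.persist_peer_state(id);
		}

		// Pass 3: re-check each candidate under the write lock, since the
		// peer may have been re-added (and dirtied) since pass 1.
		for id in need_remove {
			let still_prunable = {
				let mut peers = self.per_peer_state.write().unwrap();
				if peers.get(&id).map_or(false, |s| s.prunable) {
					peers.remove(&id);
					true
				} else {
					false
				}
			};
			if still_prunable {
				self.remove_peer_state(id);
			} else {
				// The peer came back in the meantime; persist its new state.
				self.persist_peer_state(id);
			}
		}
	}

	fn persist_peer_state(&self, _id: u64) { /* kv_store.write(...) elided */ }
	fn remove_peer_state(&self, _id: u64) { /* kv_store.remove(...) elided */ }
}
```

The point of this shape is that no I/O runs while the top-level lock is held, at the cost of re-checking prunability in the final pass.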
} else {
	self.persist_peer_state(counterparty_node_id).await?;
if self.persistence_in_flight.fetch_sub(1, Ordering::AcqRel) != 1 {
Use fetch_update to avoid the race here?
Hmm? Not sure which race you're referring to. I copied this logic from PeerManager.
Isn't there a potential race between fetch_sub and the store below? Although granted I don't believe it would have much consequence, doing both in one operation just seemed cleaner.
Hmmm, I actually think it's correct. If we do a fetch_sub and the number we get back is greater than 1, we still "own" the "mutex" - no other persist should be in the loop. We can thus write one to reset the counter such that we won't loop too many times. fetch_update isn't atomic; it's actually a compare_and_swap loop, which I think doesn't quite work.
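For context, a minimal sketch of the counter-as-mutex pattern being debated here (assumed struct and method names; the real code is async and fallible):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

struct Persister {
	persistence_in_flight: AtomicUsize,
}

impl Persister {
	fn persist(&self) {
		// If another call is already inside the loop below, just bump the
		// counter: the running call will notice it and re-run from the top.
		if self.persistence_in_flight.fetch_add(1, Ordering::AcqRel) > 0 {
			return;
		}
		loop {
			// ... persist all dirty state here ...

			// If the counter is still 1, nobody else asked for a persist
			// while we ran, so we're done. Otherwise we still "own" the
			// loop; reset the counter to 1 so we go around exactly once
			// more rather than once per queued caller.
			if self.persistence_in_flight.fetch_sub(1, Ordering::AcqRel) == 1 {
				break;
			}
			self.persistence_in_flight.store(1, Ordering::Release);
		}
	}
}
```

The race in question sits between the fetch_sub and the store: a concurrent caller can bump the counter in that window and have its increment clobbered by the store. But that can only happen when the running call is about to loop again anyway, so the concurrent caller's updates still get persisted; the cost is at most one redundant loop iteration, matching the reasoning above.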
Force-pushed from 6793e7b to 1cb7cc8.
While it would be correct to use `(their_network_key, intercept_scid)` as the idempotency key when handling `LSPS2ServiceEvent::OpenChannel`, we don't actually expect anyone to do so as it would require separate storage to track the `intercept_scid` -> opened channel mappings. Thus, we update the documentation to note that the correct idempotency key is `(their_network_key, user_channel_id)`.
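To make the documented contract concrete, here's a hedged sketch of deduplicating `OpenChannel` events keyed on `(their_network_key, user_channel_id)`; `PeerKey`, the handler type, and `open_channel` are illustrative assumptions, not lightning-liquidity APIs:

```rust
use std::collections::HashSet;

// Stand-in for the counterparty's serialized `PublicKey`.
type PeerKey = [u8; 33];

struct OpenChannelHandler {
	// (their_network_key, user_channel_id) pairs we have already acted on.
	seen: HashSet<(PeerKey, u128)>,
}

impl OpenChannelHandler {
	// Called when an `LSPS2ServiceEvent::OpenChannel` is received. Events
	// can be replayed (e.g. after a restart), so handling must be idempotent.
	fn handle_open_channel(&mut self, their_network_key: PeerKey, user_channel_id: u128) {
		if !self.seen.insert((their_network_key, user_channel_id)) {
			// Duplicate delivery: this channel open was already initiated.
			return;
		}
		self.open_channel(their_network_key, user_channel_id);
	}

	fn open_channel(&self, _peer: PeerKey, _user_channel_id: u128) {
		// ... hand off to the node's channel-opening logic ...
	}
}
```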
Force-pushed from 1cb7cc8 to 3b7f95c.
Rebased and addressed the issue @tnull identified.
Force-pushed from 32959fb to b099dd6.
Squashed without further changes.
One comment, otherwise LGTM
If we note that a peer should be removed in LSPS2/5 handling, we need to make sure that the peer wasn't re-added between dropping its state in memory and going to remove its state from disk. If it was, we need to overwrite the current on-disk state instead.
There are various race conditions between `persist` calls on the LSPS2 and LSPS5 services. Thus, we simply ensure that no two `persist` calls can be in-flight at the same time, ensuring that any new state updates between them will ultimately be persisted by re-persisting from the top in the originally-running `persist` call if another is started before it finishes.
Force-pushed from b099dd6 to a9ddf3f.
Made the removal check in LSPS2 consistent, and made the counterparty node id nomenclature consistent:

$ git diff-tree -U2 b099dd6976 a9ddf3f1c9
diff --git a/lightning-liquidity/src/lsps5/service.rs b/lightning-liquidity/src/lsps5/service.rs
index ca04ccf179..1111c682fb 100644
--- a/lightning-liquidity/src/lsps5/service.rs
+++ b/lightning-liquidity/src/lsps5/service.rs
@@ -275,10 +275,10 @@ where
}
- for counterparty_node_id in need_persist.into_iter() {
- debug_assert!(!need_remove.contains(&counterparty_node_id));
- self.persist_peer_state(counterparty_node_id).await?;
+ for client_id in need_persist.into_iter() {
+ debug_assert!(!need_remove.contains(&client_id));
+ self.persist_peer_state(client_id).await?;
}
- for counterparty_node_id in need_remove {
+ for client_id in need_remove {
let mut future_opt = None;
{
@@ -289,9 +289,9 @@ where
// itself.
let mut per_peer_state = self.per_peer_state.write().unwrap();
- if let Entry::Occupied(mut entry) = per_peer_state.entry(counterparty_node_id) {
+ if let Entry::Occupied(mut entry) = per_peer_state.entry(client_id) {
let state = entry.get_mut();
- if state.is_prunable() {
+ if state.is_prunable() && !self.client_has_open_channel(&client_id) {
entry.remove();
- let key = counterparty_node_id.to_string();
+ let key = client_id.to_string();
future_opt = Some(self.kv_store.remove(
LIQUIDITY_MANAGER_PERSISTENCE_PRIMARY_NAMESPACE,
@@ -312,5 +312,5 @@ where
future.await?;
} else {
- self.persist_peer_state(counterparty_node_id).await?;
+ self.persist_peer_state(client_id).await?;
}
}
Also double-checked that this builds and tests on LDK Node.
Mostly some issues in persistence, but also one doc fix.