Force-pushed 2ecf9d3 → 5a23188
Force-pushed 247919a → 281946f
average-gary left a comment:
Two concurrency concerns flagged below.
```rust
let Some(downstream) = channel_manager_data.downstream.get(&downstream_id) else {
    return Err(PoolError::disconnect(PoolErrorKind::DownstreamNotFound(downstream_id), downstream_id));
};
let Some(downstream) = self.downstream.get(&downstream_id) else {
```
DashMap guard held across `.await`

Unlike every other handler in this file, `handle_update_channel` doesn't use the closure pattern to scope DashMap guards. The downstream `Ref` acquired here lives through the `for message in messages { message.forward(...).await; }` loop at the bottom, blocking the entire shard for the duration of the async send.

Wrap the body in a closure like the other handlers do:

```rust
let process_update_channel = || {
    let Some(downstream) = self.downstream.get(&downstream_id) else { ... };
    // ... build messages ...
    Ok(messages)
};
let messages = process_update_channel()?;
```
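To illustrate the scoping this pattern buys, here is a minimal, self-contained sketch using a plain `std::sync::Mutex` in place of the DashMap shard; `build_messages` and the `u32` payload are illustrative stand-ins, not the real pool types:

```rust
use std::sync::Mutex;

// Hypothetical stand-in for the handler body: build the outgoing
// messages while the guard is held, then release it before any
// slow or async work runs.
fn build_messages(state: &Mutex<Vec<u32>>) -> Vec<u32> {
    // The guard lives only inside this block and is dropped at the `}`.
    let messages = {
        let guard = state.lock().unwrap();
        guard.iter().map(|v| v * 2).collect::<Vec<_>>()
    };
    // The lock is free again here; an `.await` at this point would not
    // hold the guard across the suspension.
    assert!(state.try_lock().is_ok());
    messages
}

fn main() {
    let state = Mutex::new(vec![1, 2, 3]);
    println!("{:?}", build_messages(&state)); // prints [2, 4, 6]
}
```

The same shape works with a closure instead of a bare block; the point is that the guard's scope ends before the send loop starts.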
The downstream object gets dropped as soon as it stops being used, and we are calling `.await` at the very end; their scopes don't intersect.
```rust
let vardiff_key = vardiff.key().clone();
let vardiff_state = vardiff.value_mut();
let downstream_id = &vardiff_key.downstream_id;
let channel_id = &vardiff_key.channel_id;
```
Deadlock risk: inverted lock ordering with submit handlers

This loop holds three nested DashMap guards simultaneously: `self.vardiff` (`iter_mut`) → `self.downstream` (`get_mut`) → `downstream.standard_channels` (`get_mut`).

The submit handlers acquire these in the opposite order: `self.downstream` → `standard_channels` → `self.vardiff`.

Under shard collision this is a classic lock-ordering deadlock. Consider collecting the keys first, to avoid holding the vardiff iteration guard while acquiring the others:

```rust
let keys: Vec<_> = self.vardiff.iter().map(|r| r.key().clone()).collect();
for key in keys {
    let Some(mut vardiff) = self.vardiff.get_mut(&key) else { continue };
    // ...
    drop(vardiff); // or scope it tightly
}
```
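As a hedged sketch of the collect-keys-first fix, here is a runnable version using `std` types in place of DashMap; the `u32`/`u64` maps, `retarget` name, and string downstream are stand-ins, not the real vardiff types:

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Snapshot the vardiff keys first, so the iteration guard is never held
// while the downstream lock is taken. This keeps one consistent order:
// vardiff, then downstream, with no overlap between the two.
fn retarget(
    vardiff: &Mutex<HashMap<u32, u64>>,
    downstream: &Mutex<HashMap<u32, String>>,
) -> Vec<String> {
    let keys: Vec<u32> = vardiff.lock().unwrap().keys().copied().collect();
    let mut updates = Vec::new();
    for key in keys {
        // Re-acquire per key; the entry may have been removed meanwhile.
        let new_target = {
            let mut v = vardiff.lock().unwrap();
            let Some(state) = v.get_mut(&key) else { continue };
            *state += 1;
            *state
        }; // vardiff lock dropped here, before touching downstream
        if let Some(name) = downstream.lock().unwrap().get(&key) {
            updates.push(format!("{name}: target {new_target}"));
        }
    }
    updates
}

fn main() {
    let vardiff = Mutex::new(HashMap::from([(1u32, 10u64)]));
    let downstream = Mutex::new(HashMap::from([(1u32, String::from("miner-a"))]));
    println!("{:?}", retarget(&vardiff, &downstream)); // ["miner-a: target 11"]
}
```

The re-acquire per key also handles entries disappearing between the snapshot and the update, which the original iterator-based loop did not have to think about.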
TBH, I don't understand this. ;)
```rust
let messages = self.channel_manager_data.super_safe_lock(|channel_manager_data| {
    let Some(downstream) = channel_manager_data.downstream.get_mut(&downstream_id) else {
        return Err(PoolError::disconnect(PoolErrorKind::DownstreamIdNotFound, downstream_id));
let process_open_standard_mining_channel = || {
```
Why did you put this here instead of at the `let messages =` line?
The reason we require a closure here is that the block contains return statements. Without the closure, those returns would exit the entire handler method instead of just the block.
That said, I am not a big fan of this pattern anymore. It originally existed to work with the nested locking pattern we had before. Since that is no longer the case, we don’t really need this structure anymore. The code can likely be simplified to something much leaner and easier to reason about.
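A minimal sketch of the point about `return` statements, with a hypothetical `handler` rather than the real pool code: inside a plain block, `return` would exit the whole method, while inside the closure it only exits the closure:

```rust
fn handler(input: Option<u32>) -> Result<u32, String> {
    // Inside the closure, `return` only exits the closure itself.
    let process = || -> Result<u32, String> {
        let Some(v) = input else {
            return Err("not found".into());
        };
        Ok(v * 2)
    };
    let value = process()?;
    // The handler can keep doing work here on the happy path.
    Ok(value + 1)
}

fn main() {
    println!("{:?}", handler(Some(3))); // Ok(7)
    println!("{:?}", handler(None));    // Err("not found")
}
```

Without nested locking, the same effect can often be had by extracting the block into a private method, which tends to read more cleanly than an immediately-invoked closure.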
I would simplify this block though, and I would do the same in all the other places where you introduced this closure.
For sure, I am currently doing that. Will push the changes soon.
Before merging this, I would squash them into the previous commits accordingly.
For sure, they were there for ease of review.
Commits are squashed now.
Resolved (outdated) review threads:
- pool-apps/pool/src/lib/channel_manager/mining_message_handler.rs (3 threads)
- pool-apps/pool/src/lib/channel_manager/template_distribution_message_handler.rs (1 thread)
Force-pushed 2645ba9 → cc66092
```rust
if downstream.requires_custom_work.load(Ordering::SeqCst) {
    error!("OpenStandardMiningChannel: Standard Channels are not supported for this connection");
    let open_standard_mining_channel_error = OpenMiningChannelError {
let send_error = |error_code: &'static str| async {
```
Shouldn't this become a function in `utils.rs`, which can be called from every place where we need it?
I see it's currently defined and repeated multiple times.
If we look at the implementation of all such closures, we can see that each points to a very specific error message tied to its method.
That's not completely true, because we have some cases where the error message is exactly the same. For example, we have two identical closures for the `OpenMiningChannelError`.

Since this seems like something that can be used for different error messages, why can't it be a function in `utils.rs`, where you also pass the error message you want, and it does the job (probably matching on the error message which is passed)?
Passing the entire error message to a helper method somewhat defeats the purpose of the closure in the first place. The closure was introduced to avoid constructing the error message repeatedly and to eliminate boilerplate across multiple call sites when the only variation is the error code.
With one helper method you put the logic that lives inside the different closures in a single place, but you can use it for different error messages, calling it from different contexts to send a specific error message with a specific error code.

Example:

```rust
forward_error_message_to_channel_manager(error_message_type, error_code)
```

and then:

```rust
forward_error_message_to_channel_manager(OPEN_MINING_CHANNEL_ERROR_MESSAGE_TYPE, "standard-channels-not-supported-for-custom-work")
```

or

```rust
forward_error_message_to_channel_manager(SET_CUSTOM_MINING_JOB_ERROR_MESSAGE_TYPE, "pool-payout-script-missing")
```
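A hedged sketch of what such a `utils.rs` helper could look like; the constants, the match arms, and the string return are assumptions based on this thread, not the actual codebase, and the real version would construct and forward an SV2 error frame rather than return a `String`:

```rust
// Hypothetical message-type markers (the real code would use the SV2
// message types, not plain strings).
const OPEN_MINING_CHANNEL_ERROR_MESSAGE_TYPE: &str = "OpenMiningChannelError";
const SET_CUSTOM_MINING_JOB_ERROR_MESSAGE_TYPE: &str = "SetCustomMiningJobError";

// One place for the construction logic, parameterised by message type
// and error code, so call sites only differ in their arguments.
fn forward_error_message_to_channel_manager(
    error_message_type: &'static str,
    error_code: &'static str,
) -> String {
    match error_message_type {
        OPEN_MINING_CHANNEL_ERROR_MESSAGE_TYPE
        | SET_CUSTOM_MINING_JOB_ERROR_MESSAGE_TYPE => {
            format!("{error_message_type}: {error_code}")
        }
        other => format!("unknown error message type {other}: {error_code}"),
    }
}

fn main() {
    println!(
        "{}",
        forward_error_message_to_channel_manager(
            OPEN_MINING_CHANNEL_ERROR_MESSAGE_TYPE,
            "standard-channels-not-supported-for-custom-work",
        )
    );
}
```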
unless I'm missing context, which is a very real possibility
I don't think that needs an issue to track it; it's just a helper closure to remove repetitive message construction during method execution.
there seem to be two topics of discussion here:

1. `RouteTo-`
2. closure

I'm just pointing out that (IIUC) @GitGab19 said we need an issue to keep track of 2 (replying to your ping); #330 was presented as the answer, while its scope only covers 1.

I'm fine if we decide to move forward without addressing the concerns raised about closure convolution; I'm just trying to make sure we're all on the same page and not masquerading one issue as another.

Anyway, I'm hitting the road in a bit, so I won't be able to do a deep dive on this PR today; I'll leave it to you guys to figure out.
I am removing the closure.
Should be good now, updated the commits. IT passes. :)
Force-pushed cc66092 → 49b0f44
Force-pushed 49b0f44 → 5a88034
Force-pushed 5a88034 → 6f19477
closes: #205