Conversation
7958eb3 to
af042d1
Compare
|
this PR is introducing
relevant statistics on the repo:
these numbers indicate it could potentially remain well maintained for mid-long term, but I'll try tagging @Amanieu here to hear his plans for it |
| @@ -0,0 +1,20 @@ | |||
| use parking_lot::RwLock as InnerRwLock; | |||
There was a problem hiding this comment.
parking_lot has interesting potential
but we should evaluate those trade-offs later
the proposal here is to add a variant to custom_mutex.rs which uses std::sync::Mutex, so we should stick with the std::sync::RwLock to reduce review complexity here
6dda40e to
1eb2abe
Compare
|
On the apps side, we are moving from a single lock guarding a wide structure with multiple embedded types to a micro-locking model, where each type owns its own lock. During the DashMap migration and the trimming of During last week’s Vintuem call, we also discussed that introducing Implementation nuances As we move toward micro-locks per structure, lock ergonomics become more challenging. Relying solely on closure-based “safe” APIs quickly leads to deeply nested callbacks, which hurts readability and maintainability: a.safe_read(|x| {
b.safe_write(|x| {
c.safe_read(|x| {
})
})
})To avoid this, the custom implementation provides both closure-based safe read/write APIs and explicit read/write guard APIs. These guards are Poisoning semantics One important distinction to highlight is lock poisoning, where the standard library and
This difference directly impacts the |
93107f9 to
740bca8
Compare
|
Out of curiosity, did you guys consider using https://crates.io/crates/arc-swap? |
stratum-apps/src/custom_rw_lock.rs
Outdated
| impl<T: ?Sized> Drop for ReadGuard<'_, T> { | ||
| fn drop(&mut self) { | ||
| if !self.released.load(Ordering::Acquire) { | ||
| panic!( | ||
| "ReadGuard dropped without explicit release(); \ | ||
| this is a bug. Call release() to acknowledge lock lifetime." | ||
| ); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
consider the following scenario.
If this is used across the apps and a panic occurs during message processing before released = true, we trigger a second panic: one from the failing downstream operation, and another from Drop.
Like I did below:
#[test]
fn test_poisoning() {
let lock = RwLock::new(0);
let mut guard = lock.write().unwrap();
*guard = 1;
*guard = 2;
panic!("Simulated panic while holding write lock");
}Because the second panic happens inside Drop, the runtime aborts:
thread 'custom_rw_lock::tests::test_poisoning' panicked at src/custom_rw_lock.rs:250:9:
Simulated panic while holding write lock
thread 'custom_rw_lock::tests::test_poisoning' panicked at src/custom_rw_lock.rs:152:13:
WriteGuard dropped without explicit release(); this is a bug. Call release() to acknowledge lock lifetime.
stack backtrace:
[trimmed]
thread 'custom_rw_lock::tests::test_poisoning' panicked at library/core/src/panicking.rs:226:5:
panic in a destructor during cleanup
thread caused non-unwinding panic. aborting.
Instead of just dropping the faulty client, the entire process aborts.
There was a problem hiding this comment.
A panic is always fatal, that’s the rule of thumb so in that sense it’s acceptable. Given how much effort we’ve put into improving error handling, we shouldn’t be hitting panics in normal operation anyway.
That said, this does highlight a design concern. With the use cases we have, the current locking pattern doesn’t fully mimic the control flow flexibility we get with closures. In a closure, we can return early and rely on scope-based cleanup. In these read/write cases, we don’t have that same safety.
For example:
let xyz = channel_manager.read().unwrap();
...
f()?; // if this returns early, we never explicitly release the lock
...
xyz.release();If an early return happens, we lose the opportunity to release the lock, which could lead to a panic or a poisoned state.
I need to think more about how to structure this so we don’t end up shooting ourselves in the foot.
There was a problem hiding this comment.
I have just removed the explicit lifetime context altogether, its not possible in rust to gain closure type linear typing during runtime,
4995b82 to
66d6d43
Compare
|
Keeping this pr simple closure based only. |
ACK, did you removed the unit tests by accident? |
closes: #255