
pool_sv2: Stale client entries in /api/v1/clients after TCP disconnection #319

@plebhash

Description

The /api/v1/clients endpoint reports clients that no longer have an active TCP connection to the pool. This produces a mismatch between what the monitoring API reports and what is actually connected.

Reproduction Evidence

Running the following commands on the pool server shows a clear discrepancy:

$ ss -tn | grep 3333
ESTAB 0 0 <POOL_IP>:3333  <CLIENT_IP_1>:60497
ESTAB 0 0 <POOL_IP>:3333  <CLIENT_IP_1>:54639
ESTAB 0 0 <POOL_IP>:3333  <CLIENT_IP_2>:56512
ESTAB 0 0 <POOL_IP>:3333  <CLIENT_IP_1>:58306

Four active TCP connections, yet the monitoring API reports five clients, two of which have zero channels and a `total_hashrate` of `-0`:

$ curl -s http://0.0.0.0:9090/api/v1/clients | jq
{
  "offset": 0, "limit": 25, "total": 5,
  "items": [
    { "client_id": 174, "extended_channels_count": 1, "standard_channels_count": 0, "total_hashrate": 626961400000 },
    { "client_id": 2,   "extended_channels_count": 0, "standard_channels_count": 0, "total_hashrate": -0 },
    { "client_id": 175, "extended_channels_count": 0, "standard_channels_count": 1, "total_hashrate": 944576860000 },
    { "client_id": 146, "extended_channels_count": 1, "standard_channels_count": 0, "total_hashrate": 5661616000000 },
    { "client_id": 7,   "extended_channels_count": 0, "standard_channels_count": 0, "total_hashrate": -0 }
  ]
}

Client IDs 2 and 7 have no active TCP connections, no channels, and zero hashrate — yet they persist in the API response.

Hypothesis (needs deeper analysis)

The pool has a remove_downstream() function and a DownstreamShutdown state handler that should clean up disconnected clients:

```rust
pub fn remove_downstream(
    &self,
    downstream_id: DownstreamId,
) -> PoolResult<(), error::ChannelManager> {
    self.channel_manager_data.super_safe_lock(|cm_data| {
        cm_data.downstream.remove(&downstream_id);
        cm_data
            .vardiff
            .retain(|key, _| key.downstream_id != downstream_id);
    });
    Ok(())
}
```
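The cleanup logic itself looks correct. A minimal sketch (using hypothetical, simplified stand-in types, not the real pool_sv2 structs) shows that once `remove_downstream` is actually invoked, both maps are purged; the open question is whether it is invoked at all on abrupt disconnects:

```rust
use std::collections::HashMap;

// Simplified stand-in types to illustrate the cleanup contract:
// removing a downstream must drop both its `downstream` entry and
// every `vardiff` key that references it.
type DownstreamId = u32;

#[derive(Hash, PartialEq, Eq)]
struct VardiffKey {
    downstream_id: DownstreamId,
    channel_id: u32,
}

struct ChannelManagerData {
    downstream: HashMap<DownstreamId, String>, // payload simplified to a label
    vardiff: HashMap<VardiffKey, u64>,
}

fn remove_downstream(cm_data: &mut ChannelManagerData, downstream_id: DownstreamId) {
    cm_data.downstream.remove(&downstream_id);
    cm_data
        .vardiff
        .retain(|key, _| key.downstream_id != downstream_id);
}

fn main() {
    let mut cm = ChannelManagerData {
        downstream: HashMap::new(),
        vardiff: HashMap::new(),
    };
    cm.downstream.insert(2, "client 2".into());
    cm.downstream.insert(174, "client 174".into());
    cm.vardiff.insert(VardiffKey { downstream_id: 2, channel_id: 1 }, 1_000);

    remove_downstream(&mut cm, 2);

    assert!(!cm.downstream.contains_key(&2)); // entry gone after cleanup
    assert!(cm.vardiff.is_empty()); // vardiff keys for that downstream gone
    assert!(cm.downstream.contains_key(&174)); // unrelated client untouched
    println!("ok");
}
```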

```rust
message = status_receiver.recv() => {
    if let Ok(status) = message {
        match status.state {
            State::DownstreamShutdown { downstream_id, .. } => {
                warn!("Downstream {downstream_id:?} disconnected — cleaning up channel manager.");
                // Remove downstream from channel manager to prevent memory leak
                if let Err(e) = channel_manager_for_cleanup.remove_downstream(downstream_id) {
                    error!("Failed to remove downstream {downstream_id:?}: {e:?}");
                    cancellation_token.cancel();
                    break;
                }
            }
```

The monitoring API's get_sv2_clients() reads directly from the downstream HashMap:

```rust
fn get_sv2_clients(&self) -> Vec<Sv2ClientInfo> {
    // Clone Downstream references and release lock immediately to avoid contention
    // with template distribution and message handling
    let downstream_refs: Vec<Downstream> = self
        .channel_manager_data
        .safe_lock(|data| data.downstream.values().cloned().collect())
        .unwrap_or_default();
    downstream_refs
        .iter()
        .filter_map(downstream_to_sv2_client_info)
        .collect()
}
```
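As an interim mitigation, independent of the root-cause fix, the monitoring layer could defensively drop entries with zero channels before serializing the response. A minimal sketch with simplified, hypothetical field names; note this only hides stale entries from `/api/v1/clients` and does not fix the underlying HashMap leak:

```rust
// Hypothetical, simplified version of the API's client-info struct.
#[derive(Debug, Clone, PartialEq)]
struct Sv2ClientInfo {
    client_id: u32,
    extended_channels_count: u32,
    standard_channels_count: u32,
    total_hashrate: f64,
}

// Drop clients with no open channels of either kind. This masks the
// symptom (zombie API entries) but the stale HashMap entry remains.
fn filter_stale(clients: Vec<Sv2ClientInfo>) -> Vec<Sv2ClientInfo> {
    clients
        .into_iter()
        .filter(|c| c.extended_channels_count + c.standard_channels_count > 0)
        .collect()
}

fn main() {
    let clients = vec![
        Sv2ClientInfo { client_id: 174, extended_channels_count: 1, standard_channels_count: 0, total_hashrate: 6.269614e11 },
        Sv2ClientInfo { client_id: 2, extended_channels_count: 0, standard_channels_count: 0, total_hashrate: -0.0 },
    ];
    let live = filter_stale(clients);
    assert_eq!(live.len(), 1); // zombie client_id 2 filtered out
    assert_eq!(live[0].client_id, 174);
    println!("ok");
}
```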

One possible hypothesis: `State::DownstreamShutdown` is only emitted on graceful disconnections, so abrupt TCP disconnections (RST, network timeout, etc.) never trigger the cleanup path, leaving stale entries in the downstream HashMap. This needs deeper analysis to confirm; there may be other explanations.
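If that hypothesis holds, one fix direction would be to classify every socket read outcome so that graceful EOF and abrupt errors funnel into the same shutdown path that emits `DownstreamShutdown`. A hedged sketch; the names here are hypothetical, not the pool_sv2 API:

```rust
use std::io;

#[derive(Debug, PartialEq)]
enum ReadOutcome {
    Data(usize),
    Disconnect, // must trigger the DownstreamShutdown cleanup path
}

// Both Ok(0) (peer sent FIN) and any read error (RST -> ConnectionReset,
// timeout, broken pipe, ...) are treated as a disconnect, so abrupt
// teardowns cannot bypass cleanup.
fn classify_read(result: io::Result<usize>) -> ReadOutcome {
    match result {
        Ok(0) => ReadOutcome::Disconnect, // graceful close
        Ok(n) => ReadOutcome::Data(n),
        Err(_) => ReadOutcome::Disconnect, // abrupt close
    }
}

fn main() {
    assert_eq!(classify_read(Ok(0)), ReadOutcome::Disconnect);
    assert_eq!(classify_read(Ok(42)), ReadOutcome::Data(42));
    let rst = io::Error::from(io::ErrorKind::ConnectionReset);
    assert_eq!(classify_read(Err(rst)), ReadOutcome::Disconnect);
    println!("ok");
}
```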

Impact

• Monitoring API reports inflated client counts
• Zombie entries with total_hashrate: -0 suggest uninitialized/stale state
• Makes it harder to diagnose real connectivity issues

Environment

• pool_sv2 running on Linux
• Observed with multiple miner clients connecting simultaneously

Labels: bug, pool