Describe the bug
After switching from reqwest to bitreq in #82 (0.10.0), this crate's retry
loop in src/client/mod.rs classifies IoError/AddressNotFound/
RustlsCreateConnection as recoverable and retries up to max_retries times,
but the retries are functionally dead: each retry calls
self.http_client.send_async(request) on the same BitreqClient, which
caches a dead Arc<AsyncConnection> and returns it again. All retries fail on
the same dead socket, MaxRetriesExceeded bubbles up to the caller, and every
subsequent RPC call on that Client also fails indefinitely. Only a process
restart recovers.
The root cause is in bitreq: its connection pool never evicts on error. See
rust-bitcoin/corepc#562. But this crate's retry loop is
effectively load-bearing for users, so it's worth addressing here too.
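To make the failure mode concrete, here is a minimal toy model of a single-slot connection pool, with a flag that switches between "never evict on error" and "evict on error". Conn, Pool, and server_closes_idle_socket are hypothetical names invented for this sketch; none of this is bitreq's actual code or API.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a pooled connection (not bitreq's AsyncConnection).
struct Conn {
    id: u32,
    alive: AtomicBool,
}

/// Toy single-slot pool. `evict_on_error` toggles the behaviour under discussion.
struct Pool {
    cached: Option<Arc<Conn>>,
    next_id: u32,
    evict_on_error: bool,
}

impl Pool {
    fn new(evict_on_error: bool) -> Self {
        Pool { cached: None, next_id: 0, evict_on_error }
    }

    /// Sends one request; returns the id of the connection that served it.
    fn send(&mut self) -> Result<u32, String> {
        // Open a new connection only when the slot is empty.
        if self.cached.is_none() {
            let conn = Conn { id: self.next_id, alive: AtomicBool::new(true) };
            self.cached = Some(Arc::new(conn));
            self.next_id += 1;
        }
        let conn = Arc::clone(self.cached.as_ref().expect("slot was just filled"));
        if conn.alive.load(Ordering::SeqCst) {
            return Ok(conn.id);
        }
        // Without eviction the dead connection stays cached and is handed out again.
        if self.evict_on_error {
            self.cached = None;
        }
        Err(format!("io error on connection {}", conn.id))
    }

    /// Simulates the server closing the idle socket behind the pool's back.
    fn server_closes_idle_socket(&self) {
        if let Some(conn) = &self.cached {
            conn.alive.store(false, Ordering::SeqCst);
        }
    }
}
```

With Pool::new(false), every send after the server-side close fails on the same cached connection. With Pool::new(true), the first send after the close fails and the next one opens a fresh connection.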
Steps to reproduce
Against a stock bitcoind with the default rpcservertimeout=30:
- Client starts, opens a connection, issues an RPC, pool caches the socket.
- No RPC traffic for 30+ seconds.
- bitcoind closes the idle socket server-side.
- Next RPC call: transport error (dead socket). Retries hit the same cached
  dead socket. MaxRetriesExceeded returned.
- Every subsequent call: same failure, forever.
In our logs this looks like:
    WARN Error calling bitcoin client err=IoError(…)
    WARN connection error, retrying... err=IoError(…)
    WARN Error calling bitcoin client err=IoError(…)
    WARN connection error, retrying... err=IoError(…)
    ...
bitcoin-cli from the same host continues to work throughout, because each
bitcoin-cli invocation is a fresh process with a fresh connection. This
isolates the bug to the pooled HTTP client.
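The same contrast can be shown without bitcoind, using only the Rust standard library. This is a rough local simulation, not the actual reproduction: the toy server answers one request per connection and then closes it, which stands in for the rpcservertimeout idle close. The names exchange and reproduce are made up for this sketch.

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

/// One request/response round trip on an existing stream.
fn exchange(stream: &mut TcpStream) -> std::io::Result<[u8; 4]> {
    stream.write_all(b"ping")?;
    let mut reply = [0u8; 4];
    // read_exact fails with UnexpectedEof if the peer has closed the socket.
    stream.read_exact(&mut reply)?;
    Ok(reply)
}

/// Returns (first call ok, reuse of the closed socket failed, fresh connection ok).
fn reproduce() -> std::io::Result<(bool, bool, bool)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    // Toy server: serve exactly one request on each of two connections,
    // then close the connection by dropping it.
    let server = thread::spawn(move || {
        for _ in 0..2 {
            let (mut conn, _) = listener.accept().unwrap();
            let mut buf = [0u8; 4];
            conn.read_exact(&mut buf).unwrap();
            conn.write_all(b"pong").unwrap();
        }
    });

    // The "pooled" connection: works once.
    let mut cached = TcpStream::connect(addr)?;
    let first_ok = exchange(&mut cached).is_ok();

    // Block until the server-side close is visible (read returns 0 bytes or an error).
    let mut probe = [0u8; 1];
    let _ = cached.read(&mut probe);

    // Reusing the cached socket fails, however often it is retried.
    let reuse_failed = exchange(&mut cached).is_err();

    // A fresh connection, like a new bitcoin-cli process, works.
    let mut fresh = TcpStream::connect(addr)?;
    let fresh_ok = exchange(&mut fresh).is_ok();

    server.join().expect("server thread panicked");
    Ok((first_ok, reuse_failed, fresh_ok))
}
```

Calling reproduce() is expected to report success for the first call, failure for the reused socket, and success for the fresh connection, mirroring the difference between the pooled client and bitcoin-cli.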
Offending code
src/client/mod.rs:197-244:
    let response = self.http_client.send_async(request).await;
    match response {
        Ok(resp) => {
            // …parse, handle status, return…
        }
        Err(err) => {
            warn!(err = %err, "Error calling bitcoin client");
            // Classify bitreq errors for retry logic
            let should_retry = Self::is_error_recoverable(&err);
            if !should_retry {
                return Err(err.into());
            }
        }
    }
    retries += 1;
    if retries >= self.max_retries {
        return Err(ClientError::MaxRetriesExceeded(self.max_retries));
    }
    sleep(Duration::from_millis(self.retry_interval)).await;
There is nothing between the error and the next iteration that would force
self.http_client to discard the pooled connection. The retry reuses the same
BitreqClient, which returns the same dead Arc<AsyncConnection> from its
cache.
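One possible shape for a crate-side workaround is to replace the HTTP client before retrying a recoverable error, so that the retry cannot see the old cache. The sketch below is synchronous and uses only hypothetical stand-ins (Transport, FakeTransport, TransportError, make_client, and the field types are all assumptions, not this crate's or bitreq's API). It also assumes that a newly constructed client starts with an empty connection cache, which would need to be verified against bitreq.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical transport errors: `Io` is recoverable, `Fatal` is not.
#[derive(Debug, PartialEq)]
enum TransportError {
    Io,
    Fatal(String),
}

#[derive(Debug, PartialEq)]
enum ClientError {
    MaxRetriesExceeded(u32),
    Transport(TransportError),
}

/// Hypothetical stand-in for the HTTP client used by the retry loop.
trait Transport {
    fn send(&mut self, request: &str) -> Result<String, TransportError>;
}

/// Test double: a transport whose only connection is either alive or dead.
struct FakeTransport {
    alive: bool,
}

impl Transport for FakeTransport {
    fn send(&mut self, request: &str) -> Result<String, TransportError> {
        if self.alive {
            Ok(format!("ok: {}", request))
        } else {
            Err(TransportError::Io)
        }
    }
}

struct Client<T, F> {
    http_client: T,
    make_client: F, // hypothetical factory that builds a fresh transport
    max_retries: u32,
    retry_interval: u64, // milliseconds
}

impl<T: Transport, F: Fn() -> T> Client<T, F> {
    fn call(&mut self, request: &str) -> Result<String, ClientError> {
        let mut retries = 0;
        loop {
            match self.http_client.send(request) {
                Ok(resp) => return Ok(resp),
                Err(TransportError::Io) => {
                    // Key difference from the excerpt above: drop the old
                    // client (and whatever it cached) before the next attempt.
                    self.http_client = (self.make_client)();
                }
                Err(err) => return Err(ClientError::Transport(err)),
            }
            retries += 1;
            if retries >= self.max_retries {
                return Err(ClientError::MaxRetriesExceeded(self.max_retries));
            }
            sleep(Duration::from_millis(self.retry_interval));
        }
    }
}
```

In this sketch a client that starts with a dead transport recovers on the first retry as long as the factory returns a working one. If the factory keeps returning dead transports, the loop still ends with MaxRetriesExceeded, so the existing error contract is unchanged.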
Platform(s)
Linux (x86)