Add a health check for connection pools, redis being connected #256
Comments
The error in question seems to be a startup error in Go's built-in crypto/x509 package, at least in Go 1.13.7. On Unix/Linux, a failure to load the system roots is cached for the lifetime of the process; this code path is not used on Windows. During a networking outage, this could happen if the uniqush-push process was restarted (automatically?) while no file descriptors were available: the lack of file descriptors probably caused the singleton system certificate pool to fail to load. Checking for a failure to establish connections to APNs would catch that. Alternatively, verifying all APNs client certificates in a health check might be more reliable.
// src/crypto/x509/verify.go
// Use Windows's own verification and chain building.
if opts.Roots == nil && runtime.GOOS == "windows" {
return c.systemVerify(&opts)
}
if opts.Roots == nil {
opts.Roots = systemRootsPool()
if opts.Roots == nil {
return nil, SystemRootsError{systemRootsErr}
}
}
// src/crypto/x509/root.go
var (
once sync.Once
systemRoots *CertPool
systemRootsErr error
)
func systemRootsPool() *CertPool {
once.Do(initSystemRoots)
return systemRoots
}
func initSystemRoots() {
systemRoots, systemRootsErr = loadSystemRoots()
if systemRootsErr != nil {
systemRoots = nil
}
}
// src/crypto/x509/verify.go
func (se SystemRootsError) Error() string {
msg := "x509: failed to load system roots and no roots provided"
if se.Err != nil {
return msg + "; " + se.Err.Error()
}
return msg
}
https://golang.org/pkg/crypto/tls/#Certificate contains an https://golang.org/pkg/crypto/x509/#Certificate, so https://golang.org/pkg/crypto/x509/#Certificate.Verify could perhaps be used to check whether the certificate is still valid?
https://golang.org/pkg/crypto/x509/#InvalidReason lists many possible failure reasons, such as expiry.
A straightforward fix for this failure mode would be to refuse to start up on non-Windows OSes if loadSystemRoots failed.
The API documented at http://uniqush.org/documentation/usage.html does not include a health check. The closest endpoints are /version and /psps, but those don't establish that the connections are active.
For example, in extraordinary networking outages (outgoing traffic is unacknowledged, and no error or response is received), the available file handles for a process can be used up, and calls to tls.Dial would fail for the APNs binary protocol.
The affected code:
Possible checks:
It'd also be useful to know which component leaked connections, if any (APNs, GCM, HTTP clients to uniqush (probably not), or Redis). Sadly, I didn't record this, but it may be possible to reproduce artificially (e.g. by pointing all host names and DNS servers at networking black holes).