Get rid of "Pod was terminated in response to imminent node shutdown." Pods forever.
In Kubernetes, Pods can remain in a broken state for a long time if graceful node shutdown is enabled. Each such Pod causes kube-prometheus-stack to fire an alert like this one:
[FIRING:1] Pod has been in a non-ready state for more than 15 minutes.
Severity: warning
Description: Pod default/some-container-7fb4c4fbc5-gbjwm has been in a non-ready state for longer than 15 minutes.
Details:
• alertname: KubePodNotReady
• namespace: default
• pod: some-container-7fb4c4fbc5-gbjwm
• prometheus: observability/kube-prometheus-stack-prometheus
• severity: warning
Most of the solutions on the internet describe an uncontrolled deletion of all Pods in Error or Terminated state. I consider this a bad idea, because you will no longer see whether Pods with real errors exist in your system.
These manifests provide a Kubernetes CronJob that periodically deletes all Pods matching the given criteria.
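To illustrate what a selective cleanup can look like, here is a minimal shell sketch. It is not taken from the manifests in this repository. It assumes that affected Pods are in the Failed phase and carry the status message quoted in the title; the function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: selective cleanup of Pods left behind by a node shutdown.
# Assumption: affected Pods are in phase "Failed" and their status message
# contains the text below (taken from the title of this README).
SHUTDOWN_MESSAGE='Pod was terminated in response to imminent node shutdown'

# Print "namespace<TAB>name<TAB>status message" for every Pod in the Failed phase.
list_failed_pods() {
  kubectl get pods --all-namespaces \
    --field-selector=status.phase=Failed \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{"\t"}{.metadata.name}{"\t"}{.status.message}{"\n"}{end}'
}

# Read tab-separated lines on stdin and print "namespace name" only for Pods
# whose message matches. Failed Pods with any other message are left alone.
select_shutdown_pods() {
  awk -F '\t' -v msg="$SHUTDOWN_MESSAGE" 'index($3, msg) > 0 { print $1, $2 }'
}

# Delete only the selected Pods, one at a time.
delete_shutdown_pods() {
  list_failed_pods | select_shutdown_pods | while read -r ns name; do
    kubectl delete pod --namespace "$ns" "$name"
  done
}
```

Running `list_failed_pods | select_shutdown_pods` first shows which Pods would be removed; `delete_shutdown_pods` then deletes them. Pods that failed for any other reason stay visible, which is the point of filtering on the message rather than on the phase alone.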
You can apply the manifests manually:
kubectl apply -f https://raw.githubusercontent.com/tyriis/i-see-dead-pods/main/manifests/service-account.yaml
kubectl apply -f https://raw.githubusercontent.com/tyriis/i-see-dead-pods/main/manifests/rbac.yaml
kubectl apply -f https://raw.githubusercontent.com/tyriis/i-see-dead-pods/main/manifests/cronjob.yaml
or with kustomize
or with flux, see helmrelease.