Pod restarts without clear reason (timeout when doing GET to configmap) #3126
@gals-ma, can you provide more info on this error? Before it occurred, were there any upgrades, deletions, or anything else? Can you provide more logs from before the error lines, so we can better understand the situation? You can also send the logs to k8s-alb-controller-triage AT amazon.com
@oliviassss Nothing specific happened at the same time; it happens to us from time to time. In addition, is there any reason to have 2 replicas of the lb-controller? Isn't it a problem with quorum when electing a leader?
Can you also please share more information about this error?
@gals-ma, the two replicas are in active-standby mode. The issue is not with the controller itself; the API server is not responding to the controller's request. It could be due either to network connectivity issues between the controller and the API server, or to security group (SG) permissions preventing access. Does your controller recover eventually?
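As an aside, one way to check which replica currently holds the leader lock, and when it last renewed, is a minimal client-go sketch like the one below. It assumes a ConfigMap-based lock named `aws-load-balancer-controller-leader` in `kube-system`; both the lock kind and name can vary by controller version and install.

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig.
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Assumed lock: a ConfigMap named "aws-load-balancer-controller-leader"
	// in kube-system (newer releases may use a Lease object instead).
	cm, err := client.CoreV1().ConfigMaps("kube-system").
		Get(context.TODO(), "aws-load-balancer-controller-leader", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}

	// ConfigMap-based locks record the holder identity and renew time
	// in this well-known annotation.
	fmt.Println(cm.Annotations["control-plane.alpha.kubernetes.io/leader"])
}
```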
@kishorj @oliviassss After talking with AWS, it turned out the issue was actually due to etcd being defragmented, and the load-balancer-controller timing out when reaching the etcd server. So my questions are:
Thanks again.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
I also encountered this issue. The pod gets restarted by k8s, but I'm not sure what is affected.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
Every k8s controller that uses leader election relies on the apiserver to elect a leader and renew its lease, and the apiserver uses etcd as its backing store.
The ALB controller uses controller-runtime, which supports setting the lease duration and retry period. It's expected to see a restart when the leader loses its lease. Related discussion: kubernetes-sigs/controller-runtime#1774 (comment)
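For reference, here is a minimal sketch of what tuning those knobs looks like in a controller-runtime manager. The durations and election ID are illustrative assumptions, not the ALB controller's actual settings; controller-runtime's own defaults are 15s/10s/2s.

```go
package main

import (
	"os"
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
)

func main() {
	// Longer lease/renew windows tolerate brief apiserver or etcd slowdowns
	// (e.g. during defragmentation) at the cost of slower failover.
	leaseDuration := 60 * time.Second // controller-runtime default: 15s
	renewDeadline := 40 * time.Second // controller-runtime default: 10s
	retryPeriod := 5 * time.Second    // controller-runtime default: 2s

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		LeaderElection:          true,
		LeaderElectionID:        "example-controller-leader", // placeholder ID
		LeaderElectionNamespace: "kube-system",
		LeaseDuration:           &leaseDuration,
		RenewDeadline:           &renewDeadline,
		RetryPeriod:             &retryPeriod,
	})
	if err != nil {
		os.Exit(1)
	}

	// Register controllers here, then start. Losing the lease after this
	// point causes the manager to exit, which is the restart seen in this issue.
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		os.Exit(1)
	}
}
```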
/remove-lifecycle stale
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
I'm also facing this issue in the ALB controller here is the logs I found before restart. (Thanks to pod-restart-info-collector)
Could someone help in this? |
Same here! Any update on this? My log, BTW:
@oliviassss Can you help us with this issue? If there is anything we have to configure in AWS EKS, please suggest it. Thanks in advance!
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Any updates?
@hong539, I've tried several approaches, but it's still unclear to me what caused the pod to restart. Please consider reopening the ticket, as the issue is still unresolved.
Describe the bug
The aws-lb-controller restarts unexpectedly (this has happened multiple times already) when doing a GET to a ConfigMap.
Steps to reproduce
Unknown
Expected outcome
A retry mechanism on transient API-server timeouts, rather than a pod restart (see the sketch below).
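A minimal sketch of the kind of retry being asked for, using client-go's `retry.OnError` helper. `getConfigMapWithRetry` and the backoff values are hypothetical illustrations, not the controller's actual code:

```go
package example

import (
	"context"
	"time"

	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/util/retry"
)

// getConfigMapWithRetry (hypothetical) retries a ConfigMap GET on transient
// errors with exponential backoff instead of failing on the first timeout.
func getConfigMapWithRetry(client kubernetes.Interface, ns, name string) error {
	backoff := wait.Backoff{Steps: 5, Duration: 200 * time.Millisecond, Factor: 2.0, Jitter: 0.1}
	return retry.OnError(backoff,
		// Retry only errors that look transient: timeouts and throttling.
		func(err error) bool {
			return apierrors.IsTimeout(err) || apierrors.IsServerTimeout(err) || apierrors.IsTooManyRequests(err)
		},
		func() error {
			_, err := client.CoreV1().ConfigMaps(ns).Get(context.TODO(), name, metav1.GetOptions{})
			return err
		})
}
```

Note that this only covers ordinary reads; leader-election lease renewal has its own timing knobs, discussed in the comments above.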
Environment
Additional Context:
Pod logs before the restart: