What happened:
CronJob pods stop making progress while still reported as Running, and the kubelet fails to terminate their containers. The kubelet logged the following:
Nov 20 06:50:36 ip-10-130-21-94.ap-northeast-2.compute.internal kubelet[3828]: E1120 06:50:36.124178 3828 remote_runtime.go:366] "StopContainer from runtime service failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" containerID="60d9681fc1f51cca1dd96c6694b145587c0a0ebd1faea39eed9eb209634bba9e"
Nov 20 06:50:36 ip-10-130-21-94.ap-northeast-2.compute.internal kubelet[3828]: E1120 06:50:36.124233 3828 kuberuntime_container.go:784] "Container termination failed with gracePeriod" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" pod="hbt/hbt-cronjob-x-28867922-nh7sp" podUID="1db2521d-ce5b-4cf9-94c8-b6cdf7450e4c" containerName="x-test" containerID="containerd://60d9681fc1f51cca1dd96c6694b145587c0a0ebd1faea39eed9eb209634bba9e" gracePeriod=30
Nov 20 06:50:36 ip-10-130-21-94.ap-northeast-2.compute.internal kubelet[3828]: E1120 06:50:36.124253 3828 kuberuntime_container.go:822] "Kill container failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" pod="hbt/hbt-cronjob-x-28867922-nh7sp" podUID="1db2521d-ce5b-4cf9-94c8-b6cdf7450e4c" containerName="x-test" containerID={"Type":"containerd","ID":"60d9681fc1f51cca1dd96c6694b145587c0a0ebd1faea39eed9eb209634bba9e"}
Nov 20 06:51:22 ip-10-130-21-94.ap-northeast-2.compute.internal kubelet[3828]: E1120 06:51:22.265655 3828 remote_runtime.go:366] "StopContainer from runtime service failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" containerID="bf0bf19f3e0dde9cbb488a5a0badc5815c95cac630aa4050e664339e1e1be263"
Nov 20 06:51:22 ip-10-130-21-94.ap-northeast-2.compute.internal kubelet[3828]: E1120 06:51:22.265720 3828 kuberuntime_container.go:784] "Container termination failed with gracePeriod" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" pod="hbt/hbt-cronjob-x-28867982-pnbpf" podUID="78c53ed8-d7bf-4f1a-a0b6-4a22cbb0623b" containerName="x-test" containerID="containerd://bf0bf19f3e0dde9cbb488a5a0badc5815c95cac630aa4050e664339e1e1be263" gracePeriod=30
Nov 20 06:51:22 ip-10-130-21-94.ap-northeast-2.compute.internal kubelet[3828]: E1120 06:51:22.265745 3828 kuberuntime_container.go:822] "Kill container failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" pod="hbt/hbt-cronjob-x-28867982-pnbpf" podUID="78c53ed8-d7bf-4f1a-a0b6-4a22cbb0623b" containerName="x-test" containerID={"Type":"containerd","ID":"bf0bf19f3e0dde9cbb488a5a0badc5815c95cac630aa4050e664339e1e1be263"}
Nov 20 06:51:22 ip-10-130-21-94.ap-northeast-2.compute.internal kubelet[3828]: E1120 06:51:22.891077 3828 remote_runtime.go:222] "StopPodSandbox from runtime service failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" podSandboxID="fa8984ad470a4a35388386e60c3e4dc250761b61f25bc3a0ddced413e677f264"
Nov 20 06:51:22 ip-10-130-21-94.ap-northeast-2.compute.internal kubelet[3828]: E1120 06:51:22.891134 3828 kuberuntime_manager.go:1389] "Failed to stop sandbox" podSandboxID={"Type":"containerd","ID":"fa8984ad470a4a35388386e60c3e4dc250761b61f25bc3a0ddced413e677f264"}
Nov 20 06:51:22 ip-10-130-21-94.ap-northeast-2.compute.internal kubelet[3828]: E1120 06:51:22.891189 3828 kubelet.go:2058] [failed to "KillContainer" for "node" with KillContainerError: "rpc error: code = DeadlineExceeded desc = context deadline exceeded", failed to "KillPodSandbox" for "16bdee7d-159d-4344-b70c-d6cdd133520d" with KillPodSandboxError: "rpc error: code = DeadlineExceeded desc = context deadline exceeded"]
Nov 20 06:51:22 ip-10-130-21-94.ap-northeast-2.compute.internal kubelet[3828]: E1120 06:51:22.891202 3828 pod_workers.go:1298] "Error syncing pod, skipping" err="[failed to \"KillContainer\" for \"node\" with KillContainerError: \"rpc error: code = DeadlineExceeded desc = context deadline exceeded\", failed to \"KillPodSandbox\" for \"16bdee7d-159d-4344-b70c-d6cdd133520d\" with KillPodSandboxError: \"rpc error: code = DeadlineExceeded desc = context deadline exceeded\"]" pod="jenkins/node" podUID="16bdee7d-159d-4344-b70c-d6cdd133520d"
Nov 20 06:51:23 ip-10-130-21-94.ap-northeast-2.compute.internal kubelet[3828]: I1120 06:51:23.740012 3828 kuberuntime_container.go:779] "Killing container with a grace period" pod="jenkins/node" podUID="16bdee7d-159d-4344-b70c-d6cdd133520d" containerName="node" containerID="containerd://f7067ff850c7940e3c0c963e506810966ada83c1ed4b86760b421b5136734df7" gracePeriod=30
Nov 20 06:51:23 ip-10-130-21-94.ap-northeast-2.compute.internal kubelet[3828]: I1120 06:51:23.743690 3828 status_manager.go:863] "Pod was deleted and then recreated, skipping status update" pod="jenkins/node" oldPodUID="16bdee7d-159d-4344-b70c-d6cdd133520d" podUID="32f2e071-f648-46e7-b00c-ff2b1fc9258f"
When attempting to remove the container using crictl, the following logs were generated by containerd:
Nov 21 01:53:00 ip-10-130-3-113.ap-northeast-2.compute.internal containerd[1111853]: time="2024-11-21T01:53:00.777374390Z" level=info msg="Kill container \"298d33cc227f7cfe87259d109e904d026120e4c74332ed00c80e08648cc050d3\""
Nov 21 01:53:02 ip-10-130-3-113.ap-northeast-2.compute.internal containerd[1111853]: time="2024-11-21T01:53:02.777460445Z" level=error msg="StopContainer for \"298d33cc227f7\" failed" error="rpc error: code = DeadlineExceeded desc = an error occurs during waiting for container \"298d33cc227f7cfe87259d109e904d026120e4c74332ed00c80e08648cc050d3\" to be killed: wait container \"298d33cc227f7cfe87259d109e904d026120e4c74332ed00c80e08648cc050d3\": context deadline exceeded"
Nov 21 01:53:04 ip-10-130-3-113.ap-northeast-2.compute.internal containerd[1111853]: time="2024-11-21T01:53:04.284946616Z" level=error msg="StopPodSandbox for \"f793baa40a74beedd902140d007d16bab953a0c4ae1c8005f9216481f97db1df\" failed" error="rpc error: code = DeadlineExceeded desc = failed to stop container \"a30c5fecddce111da50a3cd3689d6b08f6e6d7e33f2428bebf5a64fbf3d0f22f\": an error occurs during waiting for container \"a30c5fecddce111da50a3cd3689d6b08f6e6d7e33f2428bebf5a64fbf3d0f22f\" to be killed: wait container \"a30c5fecddce111da50a3cd3689d6b08f6e6d7e33f2428bebf5a64fbf3d0f22f\": context deadline exceeded"
Nov 21 01:53:04 ip-10-130-3-113.ap-northeast-2.compute.internal containerd[1111853]: time="2024-11-21T01:53:04.311928723Z" level=info msg="StopContainer for \"a30c5fecddce111da50a3cd3689d6b08f6e6d7e33f2428bebf5a64fbf3d0f22f\" with timeout 180 (s)"
Nov 21 01:53:04 ip-10-130-3-113.ap-northeast-2.compute.internal containerd[1111853]: time="2024-11-21T01:53:04.312501683Z" level=info msg="Skipping the sending of signal terminated to container \"a30c5fecddce111da50a3cd3689d6b08f6e6d7e33f2428bebf5a64fbf3d0f22f\" because a prior stop with timeout>0 request already sent the signal"
What you expected to happen:
Containers should be terminated and created normally without interrupting the cronjob's execution.
How to reproduce it (as minimally and precisely as possible):
Set up a Kubernetes cluster with containerd version 1.7.22 or 1.7.23
Deploy a cronjob and wait a few hours
Observe the container termination and creation processes (cronjob lifecycle)
Look for DeadlineExceeded errors in the kubelet and containerd logs
Environment:
AWS Region: ap-northeast-2
Instance Type(s): m7i-flex
Cluster Kubernetes version: 1.30
Node Kubernetes version: v1.30.6-eks-94953ac
AMI Version: 1.30.6-20241115
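The last reproduction step (looking for DeadlineExceeded errors in the kubelet and containerd logs) could be sketched roughly as below. This is only an illustration, not tooling used in this report: the function name `deadline_exceeded_ids` is hypothetical, and it assumes container and sandbox IDs appear as 64-character hex strings, as they do in the log lines above.

```python
import re
from collections import defaultdict

# Assumption: container and sandbox IDs are printed as 64 lowercase hex
# characters, as in the kubelet and containerd lines quoted in this report.
ID_RE = re.compile(r"\b[0-9a-f]{64}\b")


def deadline_exceeded_ids(lines):
    """Count, per 64-hex ID, how many DeadlineExceeded log lines mention it."""
    counts = defaultdict(int)
    for line in lines:
        if "DeadlineExceeded" not in line:
            continue  # ignore informational lines and unrelated errors
        # A single line may repeat the same ID several times; count it once.
        for cid in set(ID_RE.findall(line)):
            counts[cid] += 1
    return dict(counts)
```

One way to use it is to save the node's journal to a file (for example with `journalctl -u kubelet -u containerd --no-pager`, assuming those are the unit names on the node) and pass the open file to `deadline_exceeded_ids`. IDs with a high count are the containers or sandboxes the runtime repeatedly failed to stop.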