dgraph errors on restart #1909
Comments
Without involving dgraph, locally, if I create a PVC and a pod using it with the standard storage class, write some data to it, and then restart docker, I see the pod come back with the data still there. Since the pods are running, I don't think this is a kind bug or even a local-path-storage bug; this sounds like either an issue with your specific setup (the preemptible VMs?) or dgraph.
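For anyone wanting to repeat that check, a minimal sketch of such a PVC and pod might look like the manifest below. The names `pvc-restart-check` and `restart-check`, the `busybox` image, and the log file path are all assumptions made for illustration; they do not come from this thread. Only the `standard` storage class and `ReadWriteOnce` access mode are taken from the issue.

```yaml
# Hypothetical names throughout (pvc-restart-check, restart-check); adjust freely.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-restart-check
spec:
  storageClassName: standard   # the storage class used in this issue
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: restart-check
spec:
  containers:
    - name: writer
      image: busybox
      # Append a timestamp on every container start, then idle.
      command: ["sh", "-c", "date >> /data/starts.log && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: pvc-restart-check
```

After applying this with `kubectl apply -f <file>` and running `sudo service docker restart`, `kubectl exec restart-check -- cat /data/starts.log` should still show the timestamps written before the restart if the volume persisted.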
@BenTheElder Hmm. Thanks for confirming. I am not sure where to start looking. In my case I am using the same boot disk and attached disk with the pre-emptible VM, but I don't think the pre-emptible VM should be the issue because this also occurs just during

Is there some place you would typically go to get the logs for debugging such failures? (I have already shared the Dgraph pod logs with the Dgraph team, anything else that would help in solving this?)

Btw, the data within the PVC does exist and is bound to the pod. For instance, if I exec inside, I find this:

If you notice the date, those are the folders from the PVC mounted to the Dgraph pod, which were created 1 day ago (before the pre-emptible VM restart). So, the data does exist in the PVC as you rightly mentioned, but Dgraph fails.

Thanks. I will keep this open until the Dgraph team comes up with their side of the puzzle, after which we can probably get an idea of where the issue is. Will continue debugging.
A rolling restart will normally not involve all of the pods simultaneously going down, something more like one at a time. The data in the PVC is ultimately backed by a docker volume at the node level, which should persist with the container through restarts as long as docker's storage is persisted on the host.
I don't think there's anything to debug on the kind end, and I don't work with dgraph.
This comment https://discuss.dgraph.io/t/dgraph-fails-to-start-on-restarts-with-kind-kubernetes/11104/13 makes it sound like dgraph cannot handle being shut down abruptly. There's not a lot kind can do there if that's the case; we have no control over your manipulation of the docker service. KIND will not itself shut things down abruptly (unless you delete the cluster), and we can't exactly catch KILL from docker or similar, so ... KIND can guarantee that the cluster survives restarts just fine, but if applications break under less than graceful restarts, there's nothing we can do. We never force such a restart ourselves.
@BenTheElder Thanks for looking at this issue. With some help from the Dgraph team as well, I have managed to get it working. Looks like the issue is neither with Kind nor with the Local path provisioner, as you rightly said. I have updated the details here: https://discuss.dgraph.io/t/dgraph-fails-to-start-on-restarts-with-kind-kubernetes/11104/14?u=tvvignesh

Thanks again. Closing this.
What happened:
Hi. I am using kind within a pre-emptible VM from GCP (so my VM, and thereby kind, restarts every 24 hours). I was trying to get Dgraph set up with Kind and it worked great. But on restart of the VM or the Docker service, the pod throws errors.
I am using the standard storage class (I guess it uses rancher's local path provisioner). I am not sure if it is an issue with Kind or Dgraph, so I have added all the details and logs here: https://discuss.dgraph.io/t/dgraph-fails-to-start-on-restarts-with-kind-kubernetes/11104
What you expected to happen:
Dgraph works consistently even after restarts. I guess according to #148 this should work.
How to reproduce it (as minimally and precisely as possible):
standard storage class in kind which uses rancher's provisioner (https://github.com/rancher/local-path-provisioner) with ReadWriteOnce set
sudo service docker restart
It works again only after I destroy the entire cluster and create it again.
Just to validate if normal pod restart works, I tried running
kubectl -n db rollout restart statefulset dgraph-dgraph-alpha
and everything was great.
Anything else we need to know?:
More details and logs have been added here: https://discuss.dgraph.io/t/dgraph-fails-to-start-on-restarts-with-kind-kubernetes/11104
Environment:
kind version: kind v0.9.0 go1.15.2 linux/amd64
kubectl version: v1.19.3
docker info: 19.03.13
/etc/os-release: Ubuntu 20.04.1 LTS