Description
Context
In Kubernetes, between the moment you issue `kubectl delete pod` and the moment the Pod is actually deleted, a few steps happen:
- the endpoint is removed from the Endpoints (k8s object);
- if the pod has a `preStop` hook set, it runs before `SIGTERM` is sent;
- the control plane fires an event to kube-proxy, CoreDNS, and the ingress controller to deregister the Pod's IP address, so that no further traffic is sent to it AND, in parallel, after the `preStop` hook finishes, the app receives `SIGTERM` and, if it is able to handle it, starts a graceful shutdown; otherwise, once the `terminationGracePeriodSeconds` (default 30s) period has passed, `SIGKILL` is fired;
- the pod is deleted;
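For reference, here is a minimal sketch of where the two knobs mentioned above live in a plain Pod manifest. The Pod name, container name, image, and the `echo` command are placeholders chosen for illustration and are not taken from this issue; only `terminationGracePeriodSeconds` and `lifecycle.preStop` are the actual Kubernetes fields under discussion.

```yaml
# Illustrative only: metadata, image, and the hook command are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: example-app
spec:
  # Time budget before SIGKILL is sent (30 is the Kubernetes default).
  terminationGracePeriodSeconds: 30
  containers:
    - name: app
      image: example.org/app:latest
      lifecycle:
        preStop:
          exec:
            # Runs to completion before the container receives SIGTERM.
            command: ["/bin/sh", "-c", "echo 'preStop running'"]
```

Note that the grace period starts counting when termination begins, so the time spent in the `preStop` hook is included in it.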
Graceful shutdown
In order to achieve a graceful shutdown, the following condition must hold: no traffic is sent to a non-existent IP (that is, to a pod that has already been deleted). That should be taken care of in Step 3 described above, but since those components (kube-proxy, CoreDNS, ingress controller) might be busy with something else, there is no guarantee that the IP will be removed from their state before the Pod is gone. How long does it take? It depends: some of them might take less than a second, others a bit longer.
Race condition
As mentioned, the deregistration of the IP from kube-proxy, CoreDNS, and the ingress controller happens in parallel with the SIGTERM being sent to the app, which can cause a few race conditions. One of them: what if the pod is deleted before the IP is deregistered? That would be a problem, since traffic might still be sent to a non-existent IP.
Issue statement
Graceful shutdown of SolrCloud. Currently, we use a `preStop` hook in which we run `solr stop -p 8983` (which, behind the scenes, sends SIGQUIT to the process); this stops the Solr instance running in the background on port 8983.
But, as we already know, the `preStop` hook (step 2) is executed before kube-proxy, CoreDNS, and the ingress controller have received the event to deregister the IP address from their local state (step 3). It will therefore stop the Solr instance before its IP is deregistered, and traffic will be sent to a non-existent IP.
A few ways to handle that:
- custom input for the `lifecycle.preStop` parameter that can be passed by the user and will overwrite the default `solr stop -p 8983`;
- an additional command before `solr stop -p 8983` in the `preStop` hook that can be passed by the user, i.e. `lifecycle.preStop.cmd: sleep 30`; when merged, we'll get something like `sleep 30 && solr stop -p 8983`;
- a bit harsh: do nothing and wait for `SIGKILL`; this way we'll have better chances that within `terminationGracePeriodSeconds` (default 30s) kube-proxy, CoreDNS, and the ingress controller will deregister the IP and no traffic is sent to the Pod, and when `SIGKILL` fires, the pod gets forcefully deleted.
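To make the second option concrete, the merged hook could look roughly like the sketch below. It is written as a plain Pod for illustration only; how the operator would actually expose this setting is exactly what this issue asks, and the Pod name, image tag, and the 60-second grace period are assumptions rather than existing defaults.

```yaml
# Sketch of option 2 on a plain Pod; name, image tag, and timings are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: solr-example
spec:
  # Raised above the 30s default: the preStop sleep counts against this budget,
  # so Solr still needs time to stop after the 30s sleep has finished.
  terminationGracePeriodSeconds: 60
  containers:
    - name: solr
      image: solr:9
      lifecycle:
        preStop:
          exec:
            # Give kube-proxy, CoreDNS, and the ingress controller time to
            # deregister the Pod IP, then stop Solr.
            command: ["/bin/sh", "-c", "sleep 30 && solr stop -p 8983"]
```

With the default 30-second grace period, a 30-second sleep would use up the whole budget before `solr stop` even runs, so any sleep added this way probably needs a matching increase in `terminationGracePeriodSeconds`.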
What could be the other available options?