-
Notifications
You must be signed in to change notification settings - Fork 42k
Description
When defining a liveness or readiness probe of type exec with a timeout, like this
readinessProbe:
exec:
command:
- "curl"
- "-f"
- "localhost:8080/slow?timeInSecs=10"
initialDelaySeconds: 5
periodSeconds: 3
timeoutSeconds: 1
where the response is taking longer than the defined timeoutSeconds, the pod is still considered running (in case of a liveness probe) resp. resp. reaches ready state (in case of a readiness probe).
What I would expect is that the command is aborted after the timeout is reached and that the probe is considered failed.
Instead, what I am observing when I exec into the container is that the readinessProbe command "hangs" for ten seconds, then completes
Every 0.2s: ps aux 2020-08-18 11:35:03
PID USER TIME COMMAND
1 root 0:07 /opt/openjdk-16/bin/java -jar /slow-1.0.0-exec.jar
50 root 0:00 bash
62 root 0:00 watch -d -n 0.2 ps aux
5368 root 0:00 curl -f localhost:8080/slow?timeInSecs=10
5386 root 0:00 ps aux
and the pod reaches ready state:
slow-deployment-5bbc7f667d-w2lgr 1/1 Running 0 14m
This was first observed on an EKS cluster (v1.15.11-eks), but I also tried various other Kubernetes versions via minikube, e.g., 1.16.14, 1.17.11, and 1.18.8, same behavior.
Attached you can find a YAML file (slow-deployment.zip) with a sample deployment, a ZIP with the very simple SpringBoot application (app-src.zip), and a ZIP with a Dockerfile (slow-docker.zip, plus a shell script which runs the same healthcheck).
For convenience, the image is also available on Dockerhub.
Note that the Dockerfile also defines a Docker healthcheck. Running this container on plain Docker gives the expected behavior:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS
e58e15f84fca slow:1.0.0 "/opt/openjdk-16/bin…" 51 seconds ago Up 22 seconds (unhealthy)