Pods termination grace period seconds are not executed as expected. #109352

@Daryl-He

What happened?

I'm using minikube to run a local cluster and testing graceful pod termination. According to https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination, terminationGracePeriodSeconds is the total time (including preStop time) that a container may keep running before it is killed. What I found is that the termination grace period is not enforced as expected.

I configured a 10s preStop hook and a terminationGracePeriodSeconds of 30s.

What I expect:
When the pod is being killed, the container should have at most 30s to finish in-flight requests before it is terminated.

What happened:
No matter how many times I tested, the pod was terminated after about 40s. That value appears to be the sum of the preStop time and terminationGracePeriodSeconds, which does not match https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/, where it says:

The Pod's termination grace period countdown begins before the PreStop hook is executed, so regardless of the outcome of the handler, the container will eventually terminate within the Pod's termination grace period.
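
To make the mismatch concrete, here is a small sketch of the arithmetic for the two readings, using the values from this report (30s grace period, 10s preStop). It is only an illustration, not the kubelet's actual logic, and the function names `documented_timeline` and `additive_total` are hypothetical.

```python
def documented_timeline(grace_period_s, prestop_s):
    """Timeline if the grace-period countdown starts before preStop runs.

    Returns (seconds spent in preStop, seconds left for the container
    after preStop finishes). Simplified: no extensions are modelled.
    """
    prestop_used = min(prestop_s, grace_period_s)
    remaining = grace_period_s - prestop_used
    return prestop_used, remaining


def additive_total(grace_period_s, prestop_s):
    """Total time if preStop were added on top of the grace period."""
    return prestop_s + grace_period_s


if __name__ == "__main__":
    prestop_used, remaining = documented_timeline(30, 10)
    print(prestop_used + remaining)   # documented upper bound: 30
    print(additive_total(30, 10))     # matches the observed ~40
```

Under the documented reading the container would have about 20s left after the 10s preStop hook, for 30s in total; the roughly 40s I observe matches the additive reading instead.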

What did you expect to happen?

When I delete the pod, it should be killed within 30s, as configured in terminationGracePeriodSeconds.

How can we reproduce it (as minimally and precisely as possible)?

I have a very simple Tornado application deployed to the local Kubernetes cluster. It handles a single GET request: a number is passed in the path, the handler sleeps for that many seconds, and then returns "Hello world". This simulates an application that is still processing a request. The detailed steps are:

  1. Create a Python project with a main.py:
import tornado.web
import tornado.ioloop
from tornado import gen


class IndexHandler(tornado.web.RequestHandler):
    """Homepage processing class """

    async def get(self, num):
        """get request """
        await gen.sleep(int(num))
        self.write(f'Hello world, sleep {num} seconds')


if __name__ == '__main__':
    app = tornado.web.Application([(r"/(?P<num>[0-9]+)", IndexHandler)])
    app.listen(8000)
    tornado.ioloop.IOLoop.current().start()
  2. Run eval $(minikube docker-env) so that minikube can use locally built images; for details see https://serverfault.com/questions/964307/kubernetes-deployment-failed-to-pull-image-with-local-registry-minikube

  3. Build the project with the Dockerfile below: docker build -t tornado_8000:local .

FROM python:3
USER root
RUN apt-get update && apt-get install -y sudo
COPY ./ ./
RUN pip install tornado
EXPOSE 8000
CMD ["python", "main.py"]
  4. Create the deployment YAML file named tornado_8000.yml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tornado-8000
  labels:
    app: tornado-8000
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tornado-8000
  template:
    metadata:
      labels:
        app: tornado-8000
    spec:
      containers:
      - name: tornado-8000
        image: tornado_8000:local
        ports:
        - containerPort: 8000
        imagePullPolicy: Never
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sleep", "10"]
      terminationGracePeriodSeconds: 30
  5. Create the Kubernetes cluster
    minikube start

  6. Deploy the application
    kubectl apply -f tornado_8000.yml

  7. Expose the service to the local machine
    kubectl expose deployment tornado-8000 --type=LoadBalancer --name=tornado-service --port 8000

  8. Access the service from the local machine
    curl http://localhost:8000/35
    Result: the "Hello world" string is returned after 35s.

  9. Delete the pod
    Result: the pod is terminated and replaced after about 40s, whether or not there are in-flight requests. The expected
    grace period was 30s, but the pod is always terminated after about 40s.
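
The "about 40s" figure can be measured rather than estimated by timing the delete command. The sketch below is one way to do that, assuming `kubectl` is on the PATH and relying on `kubectl delete` waiting for the resource to be gone (`--wait=true`, its default). The helper names `time_call` and `delete_pod_command` are hypothetical, and the pod name is passed in by hand.

```python
import subprocess
import sys
import time
from typing import Callable, List


def time_call(action: Callable[[], object]) -> float:
    """Run action() and return the elapsed wall-clock time in seconds."""
    start = time.monotonic()
    action()
    return time.monotonic() - start


def delete_pod_command(pod_name: str) -> List[str]:
    """Build a kubectl command that blocks until the pod is gone."""
    return ["kubectl", "delete", "pod", pod_name, "--wait=true"]


if __name__ == "__main__" and len(sys.argv) == 2:
    pod = sys.argv[1]  # pod name, e.g. copied from `kubectl get pods`
    elapsed = time_call(
        lambda: subprocess.run(delete_pod_command(pod), check=True)
    )
    print(f"pod {pod} deleted after {elapsed:.1f}s")
```

Saved under any file name and run with the pod name as its only argument, it prints the time between issuing the delete and the pod being removed.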

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.4", GitCommit:"b695d79d4f967c403a96986f1750a35eb75e75f1", GitTreeState:"clean", BuildDate:"2021-11-17T15:48:33Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.3", GitCommit:"816c97ab8cff8a1c72eccca1026f7820e93e0d25", GitTreeState:"clean", BuildDate:"2022-01-25T21:19:12Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

No cloud provider

OS version

# On Linux:
$ cat /etc/os-release
I don't have this file on my Mac, but I can give the detailed hardware info.
Hardware Overview:

  Model Name:	MacBook Pro
  Model Identifier:	MacBookPro15,1
  Processor Name:	6-Core Intel Core i9
  Processor Speed:	2.9 GHz
  Number of Processors:	1
  Total Number of Cores:	6
  L2 Cache (per Core):	256 KB
  L3 Cache:	12 MB
  Hyper-Threading Technology:	Enabled
  Memory:	32 GB
  System Firmware Version:	1554.140.20.0.0 (iBridge: 18.16.14759.0.1,0)
  Serial Number (system):	C02X94NRJGH5
  Hardware UUID:	D89C85F3-C8DE-5BA2-BA0F-191C75F0405D
  Provisioning UDID:	D89C85F3-C8DE-5BA2-BA0F-191C75F0405D
  Activation Lock Status:	Disabled

$ uname -a
Darwin DARYHE02M 20.6.0 Darwin Kernel Version 20.6.0: Mon Aug 30 06:12:21 PDT 2021; root:xnu-7195.141.6~3/RELEASE_X86_64 x86_64


Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

Labels

kind/bug: Categorizes issue or PR as related to a bug.
needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
sig/node: Categorizes an issue or PR as relevant to SIG Node.

Projects

Status

Done
