Proposal to simplify kubectl reapers #50340

@erictune

Summary of proposed code change

  1. Remove the selector-overlap detection code from the reapers that have it, because the ownerRef changes in 1.5 effectively prevent selector overlap.
  2. Remove the scale-down code from the reapers, since foreground GC can do it for you.
  3. Change the default timeout to either the number of pods times the pod grace period, or infinity (see the sketch below).
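
For item 3, a tiny sketch of the timeout rule (Go; the function name is illustrative and the exact default is not settled in this proposal):

package reapers

import "time"

// defaultReaperTimeout sketches the proposed rule: the default timeout is the
// number of pods times the per-pod grace period (the alternative being no
// timeout at all, i.e. "infinity").
func defaultReaperTimeout(numPods int, podGracePeriod time.Duration) time.Duration {
    return time.Duration(numPods) * podGracePeriod
}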

Changes in kubectl behavior

A Relnote will call out this change in behavior as seen by users:

In 1.7 and before, when you used kubectl delete to delete a replicaSet, replicationController, daemonSet, deployment, statefulSet, or job, the command returned after all the pods had begun deleting, but before they finished deleting. In 1.8, it will wait for all the pods to finish deleting. With this new behavior, the pod names involved (which statefulSet reuses) and the pods' resources are guaranteed to be freed when the command returns. The default timeout is increased to blah. If you do not want to wait, use a shorter timeout via the --blah flag.

After the Change:

Ideally, each reaper reduces to just:

c := Blah client
c.Delete(name, foregroundGCOption, veryLargeTimeout)
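
A hedged sketch of what that could look like with client-go (function and variable names are illustrative, not the actual kubectl code). Note that the Delete call itself returns as soon as the API server records the deletion, so the client still polls until the object is gone; with foreground GC, the object only disappears after its pods do:

package reapers

import (
    "context"
    "time"

    apierrors "k8s.io/apimachinery/pkg/api/errors"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/util/wait"
    "k8s.io/client-go/kubernetes"
)

// deleteReplicaSetForeground issues a foreground delete and waits until the
// ReplicaSet is gone. With foreground GC, "gone" implies its pods are gone.
func deleteReplicaSetForeground(cs kubernetes.Interface, ns, name string, timeout time.Duration) error {
    fg := metav1.DeletePropagationForeground
    opts := metav1.DeleteOptions{PropagationPolicy: &fg}
    if err := cs.AppsV1().ReplicaSets(ns).Delete(context.TODO(), name, opts); err != nil {
        return err
    }
    // The object keeps a foregroundDeletion finalizer until its dependents
    // are deleted, so poll until Get reports NotFound.
    return wait.PollImmediate(time.Second, timeout, func() (bool, error) {
        _, err := cs.AppsV1().ReplicaSets(ns).Get(context.TODO(), name, metav1.GetOptions{})
        if apierrors.IsNotFound(err) {
            return true, nil
        }
        return false, err
    })
}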

Summary of current reapers

All of the current reaper code is in pkg/kubectl/delete.go, in functions named Stop.

ReplicaSet Reaper

  • Make a replicaSet client.
  • Set the timeout based on the number of replicas.
  • Detect selector overlap and error out if found.
  • Scale to zero and wait for the scale-down to complete, subject to the timeout.
  • Delete the RS.
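
For reference, a hedged sketch of the scale-then-delete flow these steps describe and that the proposal would remove (client-go, illustrative names; the overlap check is omitted):

package reapers

import (
    "context"
    "time"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/util/wait"
    "k8s.io/client-go/kubernetes"
)

// stopReplicaSet mirrors the current reaper flow: scale to zero, wait for the
// scale-down to finish, then delete the now-empty ReplicaSet.
func stopReplicaSet(cs kubernetes.Interface, ns, name string, timeout time.Duration) error {
    rs, err := cs.AppsV1().ReplicaSets(ns).Get(context.TODO(), name, metav1.GetOptions{})
    if err != nil {
        return err
    }
    zero := int32(0)
    rs.Spec.Replicas = &zero
    if _, err := cs.AppsV1().ReplicaSets(ns).Update(context.TODO(), rs, metav1.UpdateOptions{}); err != nil {
        return err
    }
    // Wait until the ReplicaSet controller reports zero pods.
    if err := wait.PollImmediate(time.Second, timeout, func() (bool, error) {
        cur, err := cs.AppsV1().ReplicaSets(ns).Get(context.TODO(), name, metav1.GetOptions{})
        if err != nil {
            return false, err
        }
        return cur.Status.Replicas == 0, nil
    }); err != nil {
        return err
    }
    return cs.AppsV1().ReplicaSets(ns).Delete(context.TODO(), name, metav1.DeleteOptions{})
}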

ReplicationController Reaper

Same as RS

DaemonSet Reaper

  • make DS client
  • hacky scale to zero
  • wait for zero scheduled
  • delete DS
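
A hedged sketch of these steps (client-go, illustrative names). My understanding of the "hacky scale to zero" is that the pod template gets a nodeSelector no node satisfies; the exact label the real reaper uses is not reproduced here:

package reapers

import (
    "context"
    "time"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/util/wait"
    "k8s.io/client-go/kubernetes"
)

// stopDaemonSet sketches the current flow: point the DaemonSet at a node
// label that no node carries, wait until nothing is scheduled, then delete.
func stopDaemonSet(cs kubernetes.Interface, ns, name string, timeout time.Duration) error {
    ds, err := cs.AppsV1().DaemonSets(ns).Get(context.TODO(), name, metav1.GetOptions{})
    if err != nil {
        return err
    }
    // Illustrative placeholder label; the real reaper uses its own key.
    ds.Spec.Template.Spec.NodeSelector = map[string]string{"kubectl.example/scale-to-zero": "true"}
    if _, err := cs.AppsV1().DaemonSets(ns).Update(context.TODO(), ds, metav1.UpdateOptions{}); err != nil {
        return err
    }
    // Wait for zero scheduled (and zero misscheduled) daemon pods.
    if err := wait.PollImmediate(time.Second, timeout, func() (bool, error) {
        cur, err := cs.AppsV1().DaemonSets(ns).Get(context.TODO(), name, metav1.GetOptions{})
        if err != nil {
            return false, err
        }
        return cur.Status.CurrentNumberScheduled == 0 && cur.Status.NumberMisscheduled == 0, nil
    }); err != nil {
        return err
    }
    return cs.AppsV1().DaemonSets(ns).Delete(context.TODO(), name, metav1.DeleteOptions{})
}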

StatefulSet Reaper

  • make SS client
  • scale to zero
  • delete SS

TODO: check that the PVCs have no ownerRef to the StatefulSet, so that they are orphaned rather than deleted when the StatefulSet is GC'ed.
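
A hedged sketch of the check that TODO asks for (client-go, illustrative names): list the PVCs in the namespace and confirm that none has an ownerRef to the StatefulSet.

package reapers

import (
    "context"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

// pvcsOwnedByStatefulSet returns the names of PVCs that carry an ownerRef to
// the given StatefulSet; the TODO above expects this list to be empty.
func pvcsOwnedByStatefulSet(cs kubernetes.Interface, ns, ssName string) ([]string, error) {
    pvcs, err := cs.CoreV1().PersistentVolumeClaims(ns).List(context.TODO(), metav1.ListOptions{})
    if err != nil {
        return nil, err
    }
    var owned []string
    for _, pvc := range pvcs.Items {
        for _, ref := range pvc.OwnerReferences {
            if ref.Kind == "StatefulSet" && ref.Name == ssName {
                owned = append(owned, pvc.Name)
            }
        }
    }
    // PVCs without an ownerRef are orphaned, not deleted, when the
    // StatefulSet is garbage collected.
    return owned, nil
}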

Job reaper

  • make job and pods clients
  • set timeout based on number of pods
  • scale down
  • delete dead pods
  • delete job

TODO: check that dead pods still have an ownerRef to the job so that GC gets them.
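
A hedged sketch of the current flow (client-go, illustrative names). Jobs are "scaled down" by zeroing spec.parallelism, and the leftover dead pods are found via the job's label selector; with the proposal, foreground GC would clean them up instead, provided they keep an ownerRef to the job:

package reapers

import (
    "context"
    "time"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/util/wait"
    "k8s.io/client-go/kubernetes"
)

// stopJob mirrors the current reaper flow: stop new pods by zeroing
// parallelism, wait for active pods to finish, delete the job's remaining
// (dead) pods, then delete the job itself.
func stopJob(cs kubernetes.Interface, ns, name string, timeout time.Duration) error {
    job, err := cs.BatchV1().Jobs(ns).Get(context.TODO(), name, metav1.GetOptions{})
    if err != nil {
        return err
    }
    zero := int32(0)
    job.Spec.Parallelism = &zero
    if _, err := cs.BatchV1().Jobs(ns).Update(context.TODO(), job, metav1.UpdateOptions{}); err != nil {
        return err
    }
    // Wait until no pods are active for this job.
    if err := wait.PollImmediate(time.Second, timeout, func() (bool, error) {
        cur, err := cs.BatchV1().Jobs(ns).Get(context.TODO(), name, metav1.GetOptions{})
        if err != nil {
            return false, err
        }
        return cur.Status.Active == 0, nil
    }); err != nil {
        return err
    }
    // Delete the dead pods explicitly, selected by the job's label selector.
    sel := metav1.FormatLabelSelector(job.Spec.Selector)
    pods, err := cs.CoreV1().Pods(ns).List(context.TODO(), metav1.ListOptions{LabelSelector: sel})
    if err != nil {
        return err
    }
    for _, p := range pods.Items {
        if err := cs.CoreV1().Pods(ns).Delete(context.TODO(), p.Name, metav1.DeleteOptions{}); err != nil {
            return err
        }
    }
    return cs.BatchV1().Jobs(ns).Delete(context.TODO(), name, metav1.DeleteOptions{})
}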

Deployment Reaper

  • make Deployment and RS clients
  • pause it, set replicas to zero, and set revisionHistoryLimit to zero
  • wait for it to observe the pause
  • stop all RSes using RS reaper
  • delete Deployment
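
A hedged sketch of the pause-and-zero step (client-go, illustrative names; per the TODOs below, the explicit pause may turn out to be unnecessary):

package reapers

import (
    "context"
    "time"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/util/wait"
    "k8s.io/client-go/kubernetes"
)

// pauseAndZeroDeployment pauses the Deployment, zeroes replicas and revision
// history, and waits for the controller to observe the change. The reaper
// then stops each RS and deletes the Deployment (not shown here).
func pauseAndZeroDeployment(cs kubernetes.Interface, ns, name string, timeout time.Duration) error {
    d, err := cs.AppsV1().Deployments(ns).Get(context.TODO(), name, metav1.GetOptions{})
    if err != nil {
        return err
    }
    zero := int32(0)
    d.Spec.Paused = true
    d.Spec.Replicas = &zero
    d.Spec.RevisionHistoryLimit = &zero
    if _, err := cs.AppsV1().Deployments(ns).Update(context.TODO(), d, metav1.UpdateOptions{}); err != nil {
        return err
    }
    // "Observed the pause" == the controller has caught up to this generation.
    return wait.PollImmediate(time.Second, timeout, func() (bool, error) {
        cur, err := cs.AppsV1().Deployments(ns).Get(context.TODO(), name, metav1.GetOptions{})
        if err != nil {
            return false, err
        }
        return cur.Status.ObservedGeneration >= cur.Generation, nil
    })
}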

TODO: will the Deployment controller consider itself paused when it gets a deletion timestamp? If so, we don't need "paused".

TODO: will revision history get GC'ed?

Pod Reaper

  • Get pod's grace period
  • delete pod with grace period.
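
A hedged sketch (client-go, illustrative names): the pod's own terminationGracePeriodSeconds is passed back as the delete's grace period.

package reapers

import (
    "context"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

// deletePodWithOwnGracePeriod deletes the pod using the grace period declared
// in its own spec, if any.
func deletePodWithOwnGracePeriod(cs kubernetes.Interface, ns, name string) error {
    pod, err := cs.CoreV1().Pods(ns).Get(context.TODO(), name, metav1.GetOptions{})
    if err != nil {
        return err
    }
    opts := metav1.DeleteOptions{}
    if pod.Spec.TerminationGracePeriodSeconds != nil {
        opts.GracePeriodSeconds = pod.Spec.TerminationGracePeriodSeconds
    }
    return cs.CoreV1().Pods(ns).Delete(context.TODO(), name, opts)
}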

Version considerations

Kubectl supports one version of skew. Foreground GC and all the needed GC features are supported in 1.6, according to @caesarxuchao, so we could entirely replace the reaper code with the new, simpler code.

TODO: Need to verify that in 1.6 the controller loops all know to stop reconciling when a controller object has a deletion timestamp (this is delivered as an update, not a delete, via watch, so we need to make sure it is handled the right way).
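
The check in question is small; a sketch of the guard each sync loop needs (illustrative, shown for the ReplicaSet case):

package reapers

import (
    appsv1 "k8s.io/api/apps/v1"
)

// shouldReconcile sketches the guard the TODO asks about: once a ReplicaSet
// has a deletion timestamp, its controller must stop reconciling it (in
// particular, it must not create replacement pods while GC deletes them).
func shouldReconcile(rs *appsv1.ReplicaSet) bool {
    return rs.DeletionTimestamp == nil
}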

Client Failure or Timeout

If kubectl fails or times out before foreground GC is done, the GC proceeds anyway (according to the expert, @caesarxuchao). If the user reruns kubectl, it will try the delete again, and the second foreground delete should again wait on the deletion of that same object (again according to the expert).

TODO: make sure the delete command will allow re-deleting an object that is in a deleting state.
