PodDisruptionBudget "maxUnavailable" doesn't prevent from downtime #1706
Description
Currently, the PodDisruptionBudget allows configuring only maxUnavailable, which doesn't prevent downtime while shard pods are being evicted. This is reproducible with a simple example: a single shard with 2 replicas and the default maxUnavailable: 1.
layout:
  shards:
    - internalReplication: "true"
      replicas:
        - templates:
            podTemplate: clickhouse-in-zone-a
        - templates:
            podTemplate: clickhouse-in-zone-b
When both pods are evicted at around the same time, there is a short window in which the 2nd pod can be evicted while the 1st pod is already gone.
The status of the PDB before the eviction starts:
{
"conditions": [
{
"lastTransitionTime": "2025-05-11T16:05:17Z",
"message": "",
"observedGeneration": 1,
"reason": "SufficientPods",
"status": "True",
"type": "DisruptionAllowed"
}
],
"currentHealthy": 2, // <- main part
"desiredHealthy": 1, // <- main part
"disruptionsAllowed": 1, // <- main part
"expectedPods": 2,
"observedGeneration": 1
}
The status of the PDB after the 1st pod was deleted:
{
"conditions": [
{
"lastTransitionTime": "2025-05-11T16:05:58Z",
"message": "",
"observedGeneration": 1,
"reason": "SufficientPods",
"status": "True",
"type": "DisruptionAllowed"
}
],
"currentHealthy": 1, // <- main part
"desiredHealthy": 0, // <- main part
"disruptionsAllowed": 1, // <- main part
"expectedPods": 1,
"observedGeneration": 1
}
Notice that at some point the PDB observes "expectedPods": 1. While this is the case, the 2nd pod is allowed to be evicted, because doing so doesn't violate the disruption budget as computed from the current state.
Based on the k8s docs, we can't use maxUnavailable with the pod setup we have:
You can use a PDB with pods controlled by another resource, by an "operator", or bare pods, but with these restrictions:
- only .spec.minAvailable can be used, not .spec.maxUnavailable.
- only an integer value can be used with .spec.minAvailable, not a percentage.
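Under those restrictions, a PDB covering operator-managed pods would need an integer minAvailable. A minimal sketch, assuming hypothetical names and an operator-set label (not taken from the issue):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: clickhouse-pdb            # hypothetical name
spec:
  minAvailable: 1                 # integer, not a percentage, per the restrictions above
  selector:
    matchLabels:
      clickhouse.altinity.com/chi: my-installation   # assumed operator-managed label
```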
Additionally, the docs note that "The eviction API will disallow eviction of any pod covered by multiple PDBs, so most users will want to avoid overlapping selectors" — so we can't work around this by defining multiple overlapping PDBs.
Problem
- maxUnavailable cannot be used reliably while managing arbitrary pods
- default PDB creation is not configurable, which prevents customization
Solution
- allow disabling default PDB creation, enabling the user to define PDBs manually

Or

- create a minAvailable PDB per shard instead
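The per-shard option could look roughly like the sketch below; the names and the per-shard label are assumptions about what the operator sets on pods, not confirmed by the issue:

```yaml
# One PDB per shard, keyed by an assumed per-shard label.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: chi-my-installation-shard-0                  # hypothetical per-shard name
spec:
  minAvailable: 1                                    # always keep at least one replica of this shard
  selector:
    matchLabels:
      clickhouse.altinity.com/chi: my-installation   # assumed
      clickhouse.altinity.com/shard: "0"             # assumed per-shard label
```

With minAvailable: 1, even when expectedPods shrinks to 1, evicting the last remaining pod would drop currentHealthy below minAvailable, so the eviction API denies it — unlike maxUnavailable: 1, which keeps allowing one disruption based on the shrunken current state.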