Scale-out cannot be triggered by autoscaler for pods in the pipeline state #3000

@wangyang0616

Description

What happened:
In a large-scale cluster, multiple pods can remain in the Terminating state for a long time (because of node faults, application faults, or delayed manual handling). A pod waiting to be scheduled stays in the Pending state: Volcano puts it in the pipeline state and waits for the terminating pods to release their resources. As a result, the pod cannot run for a long time.

Even with the autoscaler's scale-out function enabled and configured to add capacity based on pending pods in the cluster, scale-out is not triggered for these pods.

What you expected to happen:
During Volcano scheduling, if pending pods exist in the cluster and the autoscaler is enabled, nodes should be added automatically so that the pending pods can be scheduled, regardless of whether those pods have entered the pipeline state.

How to reproduce it (as minimally and precisely as possible):

  1. Install the autoscaler component and configure it to scale out based on pending pods.
  2. Manually create a pod that stays in the Terminating state.
  3. Submit a pod that requests the same amount of resources as the terminating pod.

Anything else we need to know?:
The autoscaler scales out for a pending pod only when the pod it is watching is in the Pending state and the reason is Unschedulable.
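
The condition described above can be illustrated with a minimal sketch. This is not the autoscaler's actual code. It assumes the pod is represented as a plain dictionary shaped like the Kubernetes Pod status (`phase`, and `conditions` entries with `type`, `status`, and `reason`); the function name `is_unschedulable_pending` and both sample pods are hypothetical. The second sample assumes, for illustration only, that a pod held in the pipeline state carries no Unschedulable condition.

```python
from typing import Any, Dict


def is_unschedulable_pending(pod: Dict[str, Any]) -> bool:
    """Return True if the pod is Pending and marked Unschedulable.

    Hypothetical helper: mirrors the check described in the issue,
    not the autoscaler's real implementation.
    """
    status = pod.get("status", {})
    if status.get("phase") != "Pending":
        return False
    # Look for a PodScheduled condition that is False with reason Unschedulable.
    for cond in status.get("conditions", []):
        if (
            cond.get("type") == "PodScheduled"
            and cond.get("status") == "False"
            and cond.get("reason") == "Unschedulable"
        ):
            return True
    return False


# Hypothetical pod that a scheduler has marked as unschedulable.
unschedulable_pod = {
    "status": {
        "phase": "Pending",
        "conditions": [
            {"type": "PodScheduled", "status": "False", "reason": "Unschedulable"}
        ],
    }
}

# Hypothetical pod waiting in the pipeline state; assumed here to be
# Pending without an Unschedulable condition.
pipelined_pod = {"status": {"phase": "Pending", "conditions": []}}

print(is_unschedulable_pending(unschedulable_pod))
print(is_unschedulable_pending(pipelined_pod))
```

Under that assumption, the second pod is Pending but never matches the check, which would be consistent with scale-out not being triggered.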

Environment:

  • Volcano Version: master
  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:


Labels: kind/bug (categorizes issue or PR as related to a bug)
