Skip to content

Tasks set to queued by a backfill get cleared and rescheduled by the kubernetes executor, breaking the backfill #23356

@leeft95

Description

@leeft95

Apache Airflow version

2.2.5 (latest released)

What happened

A backfill launched from the scheduler pod, queues tasks as it should but while they are in the process of starting the kubernentes executor loop running in the scheduler clears these tasks and reschedules them via this function

def clear_not_launched_queued_tasks(self, session=None) -> None:

This causes the backfill to not queue any more tasks and enters an endless loop of waiting for the task it has queued to complete.

The way I have mitigated this is to set the AIRFLOW__KUBERNETES__WORKER_PODS_QUEUED_CHECK_INTERVAL to 3600, which is not ideal

What you think should happen instead

The function clear_not_launched_queued_tasks should respect tasks launched by a backfill process and not clear them.

How to reproduce

start a backfill with large number of tasks and watch as they get queued and then subsequently rescheduled by the kubernetes executor running in the scheduler pod

Operating System

Debian GNU/Linux 10 (buster)

Versions of Apache Airflow Providers


apache-airflow            2.2.5            py38h578d9bd_0    
apache-airflow-providers-cncf-kubernetes 3.0.2              pyhd8ed1ab_0    
apache-airflow-providers-docker 2.4.1              pyhd8ed1ab_0    
apache-airflow-providers-ftp 2.1.2              pyhd8ed1ab_0    
apache-airflow-providers-http 2.1.2              pyhd8ed1ab_0    
apache-airflow-providers-imap 2.2.3              pyhd8ed1ab_0    
apache-airflow-providers-postgres 3.0.0              pyhd8ed1ab_0    
apache-airflow-providers-sqlite 2.1.3              pyhd8ed1ab_0    

Deployment

Other 3rd-party Helm chart

Deployment details

Deployment is running the latest helm chart of Airflow Community Edition

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions