[KubernetesPodOperator] Detection of different timeouts for schedule and startup state #49784
Merged
jscheffl merged 1 commit into apache:main on May 12, 2025
Conversation
jscheffl
reviewed
Apr 25, 2025
providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/operators/pod.py
46520b1 to
f489bca
Compare
jscheffl
approved these changes
Apr 26, 2025
Contributor
jscheffl
left a comment
Looks good to me, but I would like to have another pair of eyes on this.
The failed compose test is a problem on main and seems to be unrelated to this PR.
f489bca to
1ab9178
Compare
Contributor
nevcohen
reviewed
May 6, 2025
providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/operators/pod.py
providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/utils/pod_manager.py
providers/cncf/kubernetes/src/airflow/providers/cncf/kubernetes/utils/pod_manager.py
Contributor
Smaller PRs == easier review :-D
Contributor
I totally agree, but in this case they are really dependent on each other and there isn't really much extra code. Anyway, it's not critical.
1ab9178 to
20f005e
Compare
…imeout and startup timeout
20f005e to
2d3bca9
Compare
This was referenced May 14, 2025
sanederchik pushed a commit to sanederchik/airflow that referenced this pull request Jun 7, 2025
…imeout and startup timeout (apache#49784) Co-authored-by: AutomationDev85 <AutomationDev85>
Overview
The idea behind this PR is to let the KubernetesPodOperator detect separate timeouts for the scheduling phase and the startup phase of a Pod.
For this we introduce the schedule_timeout_seconds parameter. This parameter defines the maximum time from creating the Pod until it reaches the scheduled state. With this timeout it is possible to catch, for example, a slow scale-up of Kubernetes nodes separately.
The startup_timeout_seconds timeout is then used for the time from entering the scheduled state until the Pod enters the running state. This makes it possible to set the time allowed for pulling an image more precisely.
With these two parameters it is possible to control the startup time of the Pod in more detail. A long-running scale-up of a node in the cluster no longer affects the timeout for pulling a large image.
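To illustrate the split, here is a minimal sketch of how the two timeouts could be applied to a Pod that is not yet running. It is not the code from this PR: the function `timed_out_phase` and its timestamp arguments are hypothetical names chosen for the example, and only `schedule_timeout_seconds` and `startup_timeout_seconds` come from the description above.

```python
from typing import Optional


def timed_out_phase(
    created_at: float,
    scheduled_at: Optional[float],
    now: float,
    schedule_timeout_seconds: float,
    startup_timeout_seconds: float,
) -> Optional[str]:
    """Return which timeout has expired for a not-yet-running Pod, or None.

    Hypothetical helper for illustration only; all times are in seconds.
    """
    if scheduled_at is None:
        # Pod is still waiting to be scheduled: measure from Pod creation.
        if now - created_at > schedule_timeout_seconds:
            return "schedule"
        return None
    # Pod is scheduled but not running yet: measure from the scheduling time,
    # so time spent waiting for a node does not count against image pulling.
    if now - scheduled_at > startup_timeout_seconds:
        return "startup"
    return None
```

In this sketch a Pod that waited 250 seconds for a node still gets the full startup_timeout_seconds for pulling its image, because the second clock only starts once the Pod is scheduled.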
As this could break users' current timeout settings, the idea is to define the new parameter schedule_timeout_seconds with a default of None instead of an int value. If the user does not set this parameter, the value of startup_timeout_seconds is reused for it. In the worst case this doubles the total timeout, but we think that is worth it for now to avoid a breaking change in the timeout behavior of the operator. What do you think about this?
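The proposed default can be sketched as follows. This is an assumption-laden illustration, not the operator's actual implementation: `resolve_timeouts` is a hypothetical helper, and only the two parameter names are taken from the description.

```python
from typing import Optional, Tuple


def resolve_timeouts(
    startup_timeout_seconds: int,
    schedule_timeout_seconds: Optional[int] = None,
) -> Tuple[int, int]:
    """Return (schedule_timeout, startup_timeout) after applying the fallback.

    Hypothetical helper for illustration only.
    """
    if schedule_timeout_seconds is None:
        # Parameter not set by the user: reuse the startup timeout.
        schedule_timeout_seconds = startup_timeout_seconds
    return schedule_timeout_seconds, startup_timeout_seconds


# Worst case with the fallback: both phases use up their full budget,
# so the total wait is twice the configured startup timeout.
schedule_timeout, startup_timeout = resolve_timeouts(120)
worst_case_total = schedule_timeout + startup_timeout
```

With a startup timeout of 120 seconds and no explicit schedule timeout, the worst-case total in this sketch is 240 seconds, which is the doubling mentioned above.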
Details of change: