@potiuk
Member

@potiuk potiuk commented Mar 14, 2021

Sometimes (very rarely) a 'wait for image' pulling step
loops forever, while steps from parallel jobs pulling the
same image have no problems.

Example:

Failed step:

* https://github.com/apache/airflow/runs/2106723280?check_suite_focus=true#step:5:349

A similar step in a parallel job had no problems retrieving the
same image earlier:

* https://github.com/apache/airflow/runs/2106723269?check_suite_focus=true#step:5:119

Both jobs pulled the same image:

docker.pkg.github.com/apache/airflow/master-python3.6-ci-v2:651461418

This change adds diagnostic output so that, if this happens again,
we can understand what is going on and mitigate the issue.
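
For illustration only, here is a minimal Python sketch of the general idea: a polling loop that records what each failed attempt returned and gives up after a bounded number of tries instead of looping forever. This is not the code changed in this PR (the CI scripts are not shown in this conversation). The names `wait_for_image`, `try_pull`, and `ImageNotAvailable` are hypothetical, and the bounded retry count is an assumption rather than something this PR is stated to introduce.

```python
import time
from typing import Callable, List, Tuple


class ImageNotAvailable(Exception):
    """Raised when the image never became available; carries per-attempt diagnostics."""

    def __init__(self, image: str, diagnostics: List[str]) -> None:
        super().__init__(f"{image} not available after {len(diagnostics)} attempts")
        self.diagnostics = diagnostics


def wait_for_image(
    image: str,
    try_pull: Callable[[str], Tuple[bool, str]],  # hypothetical: returns (success, raw output)
    max_attempts: int = 30,                       # assumed bound, not taken from the PR
    delay_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until try_pull succeeds; return the attempt number that succeeded."""
    diagnostics: List[str] = []
    for attempt in range(1, max_attempts + 1):
        ok, output = try_pull(image)
        if ok:
            return attempt
        # Keep the raw output of every failed attempt so a stuck job can be analysed later.
        diagnostics.append(f"attempt {attempt}: {output}")
        if attempt < max_attempts:
            sleep(delay_seconds)
    raise ImageNotAvailable(image, diagnostics)
```

Passing `try_pull` and `sleep` as parameters keeps the sketch independent of any particular registry client and makes it easy to exercise with a fake that fails a few times before succeeding.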



@potiuk potiuk requested a review from ashb as a code owner March 14, 2021 14:59
@potiuk potiuk force-pushed the better-diagnostics-for-image-waiting branch 2 times, most recently from 6756614 to 9d36ca0 Compare March 14, 2021 15:02
Sometimes (very rarely) a 'wait for image' pulling step
loops forever, while steps from parallel jobs pulling the
same image have no problems.

Example:

Failed step:

* https://github.com/apache/airflow/runs/2106723280?check_suite_focus=true#step:5:349

A similar step in a parallel job had no problems retrieving the
same image earlier:

* https://github.com/apache/airflow/runs/2106723269?check_suite_focus=true#step:5:119

Both jobs pulled the same image:

docker.pkg.github.com/apache/airflow/master-python3.6-ci-v2:651461418

This change adds diagnostic output so that, if this happens again,
we can understand what is going on and mitigate the issue.
@potiuk potiuk force-pushed the better-diagnostics-for-image-waiting branch from 9d36ca0 to e80b2fc Compare March 14, 2021 16:46
Contributor

@dstandish dstandish left a comment

LGTM

@github-actions github-actions bot added the "full tests needed" label Mar 14, 2021
@github-actions

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@potiuk potiuk merged commit 4cde47b into apache:master Mar 14, 2021
@potiuk potiuk deleted the better-diagnostics-for-image-waiting branch March 14, 2021 19:16
potiuk added a commit that referenced this pull request Mar 23, 2021

(cherry picked from commit 4cde47b)
ashb pushed a commit that referenced this pull request Apr 15, 2021

(cherry picked from commit 4cde47b)

Labels

area:dev-tools, full tests needed (We need to run full set of tests for this PR to merge)

2 participants