-
Notifications
You must be signed in to change notification settings - Fork 16.3k
Add docker health check to integrations #13446
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add docker health check to integrations #13446
Conversation
| interval: 10s | ||
| timeout: 10s | ||
| retries: 120 | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any specific reason why restart: always is not added for mysql and postgres?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I missed it. Added.
XD-DENG
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me.
But I would like to understand a bit more how the values for retries are determined , especially it varies quite much among different cases. For example, for mysql it is 120, which may be too big to me.
|
The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest master at your convenience, or amend the last commit of the PR, and push it with --force-with-lease. |
potiuk
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice. Much better than the in-container check (though the in-container one should stay as a 'sanity check's)
This value was copied from the internet. All health checks in interval 5s - breeze is used by users so make it responsive. timeout 30s has no reason whatsoever, but that makes sense to me. retries 50 is is the value copied from (in-container health checks)[https://github.com/apache/airflow/blob/6ef23aff802032e85ec42dabda83907bfd812b2c/scripts/in_container/check_environment.sh#L158-L174]. This value was determined experimentally and so far I have the impression that it works. |
|
@XD-DENG I updated the PR. Now all healt checks have set |
Thanks👍 |
|
Does docker do anything with these health checks now? Last time I looked (which was 12+ months ago) Docker itself didn't do anything with these. |
|
Oh yes I see in your screenshot it does. Cool. |
(cherry picked from commit 3341d21)
In order to avoid initialisation of database and other integrations when you are entering/leaving breeze, those are started when breeze starts but not stopped when you leave. Then stopping such running auxiliary containers should be done with `breeze stop` after you are done. However those who do not do it, and will restart their machine will find that the containers get restarted. This has been added as part of apache#13446 where health-checks are added. However "always" was not a good choice. It should have been "on-failure"
In order to avoid initialisation of database and other integrations when you are entering/leaving breeze, those are started when breeze starts but not stopped when you leave. Then stopping such running auxiliary containers should be done with `breeze stop` after you are done. However those who do not do it, and will restart their machine will find that the containers get restarted. This has been added as part of #13446 where health-checks are added. However "always" was not a good choice. It should have been "on-failure"
In order to avoid initialisation of database and other integrations when you are entering/leaving breeze, those are started when breeze starts but not stopped when you leave. Then stopping such running auxiliary containers should be done with `breeze stop` after you are done. However those who do not do it, and will restart their machine will find that the containers get restarted. This has been added as part of #13446 where health-checks are added. However "always" was not a good choice. It should have been "on-failure" (cherry picked from commit 337146d)
In order to avoid initialisation of database and other integrations when you are entering/leaving breeze, those are started when breeze starts but not stopped when you leave. Then stopping such running auxiliary containers should be done with `breeze stop` after you are done. However those who do not do it, and will restart their machine will find that the containers get restarted. This has been added as part of #13446 where health-checks are added. However "always" was not a good choice. It should have been "on-failure" (cherry picked from commit 337146d)
To constantly respond to container failures, I add a health check to all integrations. Thanks to this, Docker is able to detect the problem earlier and restart the container if necessary.
Information about the state of the containers is also available in

docker pscommand.^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.