Commit 4762396

Only rebuilds base python image when upgrading to newer deps (#14783)
The base python image is only updated when manually triggered and in case of checking for upgraded dependencies in master build.

While automated upgrade to latest Python image is good for security, it can cause a number of problems when run automatically in the CI:

* cache invalidation - thus longer builds
* sudden test failures

This happened in the past already quite a number of times so it is time to switch to a bit different mode.

Python images will only be automatically upgraded in those cases:

1) When Master CI build is run in scheduled nightly build - to check that tests still pass for latest version of the image
2) When manually refreshed with --force-pull-base-python-image
3) When DockerHub official images (from tags) are built.

The procedure to refresh the images manually in our CI has been added to the documentation.
1 parent 4cde47b commit 4762396

7 files changed, +141 -43 lines changed

.github/workflows/build-images-workflow-run.yml

Lines changed: 4 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -278,6 +278,8 @@ jobs:
278278
UPGRADE_TO_NEWER_DEPENDENCIES: ${{ needs.build-info.outputs.upgradeToNewerDependencies }}
279279
CONTINUE_ON_PIP_CHECK_FAILURE: "true"
280280
DOCKER_CACHE: ${{ needs.cancel-workflow-runs.outputs.cacheDirective }}
281+
FORCE_PULL_BASE_PYTHON_IMAGE: >
282+
${{ needs.cancel-workflow-runs.outputs.sourceEvent == 'schedule' && 'true' || 'false' }}
281283
steps:
282284
- name: >
283285
Checkout [${{ needs.cancel-workflow-runs.outputs.sourceEvent }}]
@@ -405,6 +407,8 @@ jobs:
405407
GITHUB_REGISTRY_PULL_IMAGE_TAG: ${{ github.event.workflow_run.id }}
406408
UPGRADE_TO_NEWER_DEPENDENCIES: ${{ needs.build-info.outputs.upgradeToNewerDependencies }}
407409
DOCKER_CACHE: ${{ needs.cancel-workflow-runs.outputs.cacheDirective }}
410+
FORCE_PULL_BASE_PYTHON_IMAGE: >
411+
${{ needs.cancel-workflow-runs.outputs.sourceEvent == 'schedule' && 'true' || 'false' }}
408412
VERSION_SUFFIX_FOR_PYPI: "dev"
409413
VERSION_SUFFIX_FOR_SVN: "dev"
410414
steps:
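
The `FORCE_PULL_BASE_PYTHON_IMAGE` value added in the two hunks above relies on the GitHub Actions idiom `condition && 'true' || 'false'`, which behaves like a ternary. As a rough illustration only, the same decision can be written in shell. The function name `force_pull_for_event` is hypothetical and is not part of the workflow or of breeze.

```shell
# Hypothetical helper mirroring the workflow expression:
#   sourceEvent == 'schedule' && 'true' || 'false'
# Prints "true" only for scheduled runs and "false" for any other event name.
force_pull_for_event() {
    if [ "$1" = "schedule" ]; then
        echo "true"
    else
        echo "false"
    fi
}

FORCE_PULL_BASE_PYTHON_IMAGE="$(force_pull_for_event "schedule")"
export FORCE_PULL_BASE_PYTHON_IMAGE
echo "schedule     -> ${FORCE_PULL_BASE_PYTHON_IMAGE}"
echo "pull_request -> $(force_pull_for_event "pull_request")"
```

In this sketch an empty event name also yields `false`, so the base image is only refreshed when the event is explicitly `schedule`.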

BREEZE.rst

Lines changed: 36 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -313,7 +313,7 @@ can check whether your problem is fixed.
313313

314314
1. If you are on macOS, check if you have enough disk space for Docker.
315315
2. Restart Breeze with ``./breeze restart``.
316-
3. Delete the ``.build`` directory and run ``./breeze build-image --force-pull-images``.
316+
3. Delete the ``.build`` directory and run ``./breeze build-image``.
317317
4. Clean up Docker images via ``breeze cleanup-image`` command.
318318
5. Restart your Docker Engine and try again.
319319
6. Restart your machine and try again.
@@ -1272,16 +1272,24 @@ This is the current syntax for `./breeze <./breeze>`_:
12721272
breeze build-image [FLAGS]
12731273
12741274
Builds docker image (CI or production) without entering the container. You can pass
1275-
additional options to this command, such as '--force-build-image',
1276-
'--force-pull-image', '--python', '--build-cache-local' or '-build-cache-pulled'
1277-
in order to modify build behaviour.
1275+
additional options to this command, such as:
1276+
1277+
Choosing python version:
1278+
'--python'
1279+
1280+
Choosing cache option:
1281+
'--build-cache-local', '--build-cache-pulled', or '--build-cache-disabled'
1282+
1283+
Choosing whether to force pull images or force build the image:
1284+
'--force-build-images',
1285+
'--force-pull-images', '--force-pull-base-python-image'
12781286
12791287
You can also pass '--production-image' flag to build production image rather than CI image.
12801288
1281-
For DockerHub pull --dockerhub-user and --dockerhub-repo flags can be used to specify
1282-
the repository to pull from. For GitHub repository, the --github-repository
1289+
For DockerHub pull, the '--dockerhub-user' and '--dockerhub-repo' flags can be used to specify
1290+
the repository to pull from. For GitHub repository, the '--github-repository'
12831291
flag can be used for the same purpose. You can also use
1284-
--github-image-id <COMMIT_SHA>|<RUN_ID> in case you want to pull the image with
1292+
'--github-image-id <COMMIT_SHA>|<RUN_ID>' in case you want to pull the image with
12851293
specific COMMIT_SHA tag or RUN_ID.
12861294
12871295
Flags:
@@ -1351,6 +1359,13 @@ This is the current syntax for `./breeze <./breeze>`_:
13511359
images are pulled by default only for the first time you run the
13521360
environment, later the locally built images are used as cache.
13531361
1362+
--force-pull-base-python-image
1363+
Forces pulling of Python base image from DockerHub before building to
1364+
populate cache. This should only be run in case we need to update to latest available
1365+
Python base image. This should be a rare and manually triggered event. Also this flag
1366+
is used in the scheduled run in CI when we rebuild all the images from scratch
1367+
and run the tests to check that the latest Python images do not break our tests.
1368+
13541369
Customization options:
13551370
13561371
-E, --extras EXTRAS
@@ -1999,6 +2014,13 @@ This is the current syntax for `./breeze <./breeze>`_:
19992014
images are pulled by default only for the first time you run the
20002015
environment, later the locally built images are used as cache.
20012016
2017+
--force-pull-base-python-image
2018+
Forces pulling of Python base image from DockerHub before building to
2019+
populate cache. This should only be run in case we need to update to latest available
2020+
Python base image. This should be a rare and manually triggered event. Also this flag
2021+
is used in the scheduled run in CI when we rebuild all the images from scratch
2022+
and run the tests to check that the latest Python images do not break our tests.
2023+
20022024
Customization options:
20032025
20042026
-E, --extras EXTRAS
@@ -2587,6 +2609,13 @@ This is the current syntax for `./breeze <./breeze>`_:
25872609
images are pulled by default only for the first time you run the
25882610
environment, later the locally built images are used as cache.
25892611
2612+
--force-pull-base-python-image
2613+
Forces pulling of Python base image from DockerHub before building to
2614+
populate cache. This should only be run in case we need to update to latest available
2615+
Python base image. This should be a rare and manually triggered event. Also this flag
2616+
is used in the scheduled run in CI when we rebuild all the images from scratch
2617+
and run the tests to check that the latest Python images do not break our tests.
2618+
25902619
Customization options:
25912620
25922621
-E, --extras EXTRAS

IMAGES.rst

Lines changed: 38 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -757,12 +757,9 @@ significant changes have been made to apt packages or even the base Python image
757757
Pulling the Latest Images
758758
-------------------------
759759

760-
Sometimes the image needs to be rebuilt from scratch. This is required, for example,
761-
when there is a security update of the Python version that all the images are based on and new version
762-
of the image is pushed to the repository. In this case it is usually faster to pull the latest
763-
images rather than rebuild them from scratch.
764-
765-
You can do it via the ``--force-pull-images`` flag to force pulling the latest images from the Docker Hub.
760+
Sometimes the image needs to be refreshed from the DockerHub registry because your local version is outdated.
761+
You can do this with the ``--force-pull-images`` flag, which forces pulling the latest images from
762+
DockerHub.
766763

767764
For production image:
768765

@@ -777,6 +774,41 @@ however you can also force it with the same flag.
777774
778775
./breeze build-image --force-pull-images
779776
777+
Refreshing Base Python images
778+
=============================
779+
780+
Python base images are updated from time to time, usually as a result of implementing security fixes.
781+
When you build your image locally using ``docker build``, you use the version of the image that you have locally.
782+
For the CI builds using ``breeze``, we use the image that is stored in our repository in order to use the cache
783+
efficiently. However, we can refresh the image to the latest available version by specifying
784+
``--force-pull-base-python-image`` and running the build manually (you need to have access to DockerHub and our
785+
GitHub Registries in order to be able to do that).
786+
787+
.. code-block:: bash
788+
789+
#!/bin/bash
790+
export DOCKERHUB_USER="apache"
791+
export GITHUB_REPOSITORY="apache/airflow"
792+
export FORCE_ANSWER_TO_QUESTIONS="true"
793+
export CI="true"
794+
795+
for python_version in "3.6" "3.7" "3.8"
796+
do
797+
./breeze build-image --python ${python_version} --build-cache-local \
798+
--force-pull-base-python-image --verbose
799+
./breeze build-image --python ${python_version} --build-cache-local \
800+
--production-image --verbose
801+
./breeze push-image
802+
./breeze push-image --github-registry ghcr.io
803+
./breeze push-image --github-registry docker.pkg.github.com
804+
./breeze push-image --production-image
805+
./breeze push-image --github-registry ghcr.io --production-image
806+
./breeze push-image --github-registry docker.pkg.github.com --production-image
807+
done
808+
809+
810+
811+
780812
Embedded image scripts
781813
======================
782814
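
Because the refresh procedure added to IMAGES.rst above pushes to shared registries, it can be useful to preview the commands before running them. The following is a minimal dry-run sketch, not part of breeze: the `run` helper, the `refresh_python_version` function, and the `DRY_RUN` variable are all hypothetical, and the registry-specific `push-image` variants from the documented script are left out for brevity.

```shell
# Dry-run sketch: prints the commands of the refresh procedure instead of running them.
# "run", "refresh_python_version" and DRY_RUN are hypothetical names, not breeze features.
DRY_RUN="${DRY_RUN:-true}"

run() {
    if [ "${DRY_RUN}" = "true" ]; then
        # Only show what would be executed.
        echo "+ $*"
    else
        "$@"
    fi
}

refresh_python_version() {
    # $1 is the Python version, for example "3.8".
    run ./breeze build-image --python "$1" --build-cache-local \
        --force-pull-base-python-image --verbose
    run ./breeze build-image --python "$1" --build-cache-local \
        --production-image --verbose
    run ./breeze push-image
    run ./breeze push-image --production-image
}

for python_version in "3.6" "3.7" "3.8"; do
    refresh_python_version "${python_version}"
done
```

With `DRY_RUN` left at its default, nothing is built or pushed. Setting `DRY_RUN=false` would execute the commands and therefore needs the same DockerHub and GitHub registry access as the documented procedure.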

breeze

Lines changed: 34 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -116,6 +116,10 @@ function breeze::setup_default_breeze_constants() {
116116
# This can be overridden by '--force-pull-images' flag
117117
export FORCE_PULL_IMAGES="false"
118118

119+
# By default we do not pull python base image. We should do that only when we run upgrade check in
120+
# CI master and when we manually refresh the images to latest versions
121+
export FORCE_PULL_BASE_PYTHON_IMAGE="false"
122+
119123
# Forward common host credentials to docker (gcloud, aws etc.).
120124
export FORWARD_CREDENTIALS="false"
121125

@@ -983,6 +987,15 @@ function breeze::parse_arguments() {
983987
export FORCE_ANSWER_TO_QUESTIONS="yes"
984988
shift
985989
;;
990+
--force-pull-base-python-image)
991+
echo "Force pulling base python image. Uses pulled images as cache."
992+
echo
993+
export FORCE_PULL_BASE_PYTHON_IMAGE="true"
994+
export FORCE_BUILD_IMAGES="true"
995+
# if you want to force build an image - assume you want to build it :)
996+
export FORCE_ANSWER_TO_QUESTIONS="yes"
997+
shift
998+
;;
986999
-I | --production-image)
9871000
export PRODUCTION_IMAGE="true"
9881001
export SQLITE_URL=
@@ -1719,16 +1732,24 @@ ${CMDNAME} build-docs [-- <EXTRA_ARGS>]
17191732
${CMDNAME} build-image [FLAGS]
17201733
17211734
Builds docker image (CI or production) without entering the container. You can pass
1722-
additional options to this command, such as '--force-build-image',
1723-
'--force-pull-image', '--python', '--build-cache-local' or '-build-cache-pulled'
1724-
in order to modify build behaviour.
1735+
additional options to this command, such as:
1736+
1737+
Choosing python version:
1738+
'--python'
1739+
1740+
Choosing cache option:
1741+
'--build-cache-local', '--build-cache-pulled', or '--build-cache-disabled'
1742+
1743+
Choosing whether to force pull images or force build the image:
1744+
'--force-build-images',
1745+
'--force-pull-images', '--force-pull-base-python-image'
17251746
17261747
You can also pass '--production-image' flag to build production image rather than CI image.
17271748
1728-
For DockerHub pull --dockerhub-user and --dockerhub-repo flags can be used to specify
1729-
the repository to pull from. For GitHub repository, the --github-repository
1749+
For DockerHub pull, the '--dockerhub-user' and '--dockerhub-repo' flags can be used to specify
1750+
the repository to pull from. For GitHub repository, the '--github-repository'
17301751
flag can be used for the same purpose. You can also use
1731-
--github-image-id <COMMIT_SHA>|<RUN_ID> in case you want to pull the image with
1752+
'--github-image-id <COMMIT_SHA>|<RUN_ID>' in case you want to pull the image with
17321753
specific COMMIT_SHA tag or RUN_ID.
17331754
17341755
Flags:
@@ -2580,6 +2601,13 @@ function breeze::flag_build_docker_images() {
25802601
images are pulled by default only for the first time you run the
25812602
environment, later the locally built images are used as cache.
25822603
2604+
--force-pull-base-python-image
2605+
Forces pulling of Python base image from DockerHub before building to
2606+
populate cache. This should only be run in case we need to update to latest available
2607+
Python base image. This should be a rare and manually triggered event. Also this flag
2608+
is used in the scheduled run in CI when we rebuild all the images from scratch
2609+
and run the tests to check that the latest Python images do not break our tests.
2610+
25832611
Customization options:
25842612
25852613
-E, --extras EXTRAS
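
The new `--force-pull-base-python-image` branch in `breeze::parse_arguments` sets three variables at once. The sketch below isolates that behaviour in a small `while`/`case` loop. The function name `parse_flags` is hypothetical, and the initial values of `FORCE_BUILD_IMAGES` and `FORCE_ANSWER_TO_QUESTIONS` are assumptions made for the example; only the `"false"` default of `FORCE_PULL_BASE_PYTHON_IMAGE` comes from the diff.

```shell
# Minimal sketch of the option handling added to breeze::parse_arguments.
# parse_flags is a hypothetical name; the real function handles many more flags.
parse_flags() {
    # Default taken from the diff.
    FORCE_PULL_BASE_PYTHON_IMAGE="false"
    # Assumed starting values, chosen only for this example.
    FORCE_BUILD_IMAGES="false"
    FORCE_ANSWER_TO_QUESTIONS=""
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --force-pull-base-python-image)
                # Refreshing the base image implies rebuilding on top of it
                # and answering "yes" to confirmation prompts.
                FORCE_PULL_BASE_PYTHON_IMAGE="true"
                FORCE_BUILD_IMAGES="true"
                FORCE_ANSWER_TO_QUESTIONS="yes"
                shift
                ;;
            *)
                # Every other argument is ignored in this sketch.
                shift
                ;;
        esac
    done
}

parse_flags --python 3.8 --force-pull-base-python-image
echo "pull base: ${FORCE_PULL_BASE_PYTHON_IMAGE}, force build: ${FORCE_BUILD_IMAGES}"
```

Coupling the flag to a forced build reflects the comment in the diff: a refreshed base image is only useful if the images built on top of it are rebuilt as well.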

breeze-complete

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -165,7 +165,7 @@ help python: backend: integration:
165165
kubernetes-mode: kubernetes-version: helm-version: kind-version:
166166
skip-mounting-local-sources mount-all-local-sources install-airflow-version: install-airflow-reference: db-reset
167167
verbose assume-yes assume-no assume-quit forward-credentials init-script:
168-
force-build-images force-pull-images production-image extras: force-clean-images skip-rebuild-check
168+
force-build-images force-pull-images force-pull-base-python-image production-image extras: force-clean-images skip-rebuild-check
169169
build-cache-local build-cache-pulled build-cache-disabled disable-pip-cache
170170
dockerhub-user: dockerhub-repo: use-github-registry github-registry: github-repository: github-image-id: generate-constraints-mode:
171171
postgres-version: mysql-version:

scripts/ci/images/ci_build_dockerhub.sh

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -111,6 +111,7 @@ else
111111
export INSTALL_PROVIDERS_FROM_SOURCES="false"
112112
export AIRFLOW_PRE_CACHED_PIP_PACKAGES="false"
113113
export DOCKER_CACHE="local"
114+
export FORCE_PULL_BASE_PYTHON_IMAGE="true"
114115
# Name the image based on the TAG rather than based on the branch name
115116
export FORCE_AIRFLOW_PROD_BASE_TAG="${DOCKER_TAG}"
116117
export INSTALL_AIRFLOW_VERSION="${DOCKER_TAG%-python*}"
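
The context line `INSTALL_AIRFLOW_VERSION="${DOCKER_TAG%-python*}"` uses shell suffix removal to derive the Airflow version from the DockerHub tag. A small sketch of how that expansion behaves follows. The tag value is a made-up example, and the `PYTHON_VERSION` line is an extra illustration that does not appear in the script.

```shell
# Hypothetical tag value; the real DOCKER_TAG is supplied by the DockerHub build.
DOCKER_TAG="2.0.1-python3.8"

# "%-python*" removes the shortest suffix matching "-python*",
# which leaves the Airflow version.
INSTALL_AIRFLOW_VERSION="${DOCKER_TAG%-python*}"

# "##*-python" removes the longest prefix ending in "-python",
# which leaves the Python version (illustration only, not in the script).
PYTHON_VERSION="${DOCKER_TAG##*-python}"

echo "${INSTALL_AIRFLOW_VERSION}"   # prints 2.0.1
echo "${PYTHON_VERSION}"            # prints 3.8
```

If the tag carries no `-python` suffix, the `%` expansion leaves the value unchanged, so a tag such as `2.0.1` maps to itself.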

scripts/ci/libraries/_push_pull_remove_images.sh

Lines changed: 27 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -104,53 +104,56 @@ function push_pull_remove_images::pull_image_github_dockerhub() {
104104
set -e
105105
}
106106

107-
# Force pulls the python base image
108-
function push_pull_remove_images::force_pull_python_base_image() {
107+
# Rebuilds python base image from the latest available Python version
108+
function push_pull_remove_images::rebuild_python_base_image() {
109+
echo
110+
echo "Rebuilding ${AIRFLOW_PYTHON_BASE_IMAGE} from latest ${PYTHON_BASE_IMAGE}"
111+
echo
109112
docker pull "${PYTHON_BASE_IMAGE}"
110-
echo "FROM ${PYTHON_BASE_IMAGE}" | \
113+
echo "FROM ${PYTHON_BASE_IMAGE}" | \
111114
docker build \
112115
--label "org.opencontainers.image.source=https://github.com/${GITHUB_REPOSITORY}" \
113116
-t "${AIRFLOW_PYTHON_BASE_IMAGE}" -
114117
}
115118

116119
# Pulls the base Python image. This image is used as base for CI and PROD images, depending on the parameters used:
117120
#
118-
# * if FORCE_PULL_IMAGES is true or UPGRADE_TO_NEWER_DEPENDENCIES != false, then it pulls the latest Python image available first and
119-
# adds `org.opencontainers.image.source` label to it, so that it is linked to Airflow repository when
120-
# we push it to GHCR registry
121+
# * if FORCE_PULL_BASE_PYTHON_IMAGE is true, then it rebuilds the image using the latest Python image available
122+
# and adds `org.opencontainers.image.source` label to it, so that it is linked to Airflow
123+
# repository when we push it to GHCR registry
121124
# * Otherwise it pulls the Python base image from either GitHub registry or from DockerHub
122125
# depending on USE_GITHUB_REGISTRY variable. In case we pull specific build image (via suffix)
123126
# it will pull the right image using the specified suffix
124127
function push_pull_remove_images::pull_base_python_image() {
128+
if [[ ${FORCE_PULL_BASE_PYTHON_IMAGE} == "true" ]] ; then
129+
push_pull_remove_images::rebuild_python_base_image
130+
return
131+
fi
125132
echo
126-
echo "Force pull python base image ${AIRFLOW_PYTHON_BASE_IMAGE}. Upgrade to newer dependencies: ${UPGRADE_TO_NEWER_DEPENDENCIES}"
133+
echo "Docker pulling base python image. Upgrade to newer deps: ${UPGRADE_TO_NEWER_DEPENDENCIES}"
127134
echo
128135
if [[ -n ${DETECTED_TERMINAL=} ]]; then
129-
echo -n "
130-
Docker pulling ${AIRFLOW_PYTHON_BASE_IMAGE}. Upgrade to newer dependencies ${UPGRADE_TO_NEWER_DEPENDENCIES}
136+
echo -n "Docker pulling base python image. Upgrade to newer deps: ${UPGRADE_TO_NEWER_DEPENDENCIES}
131137
" > "${DETECTED_TERMINAL}"
132138
fi
133-
if [[ "${FORCE_PULL_IMAGES}" == "true" || ${UPGRADE_TO_NEWER_DEPENDENCIES} != "false" ]]; then
134-
push_pull_remove_images::force_pull_python_base_image
135-
else
136-
if [[ ${USE_GITHUB_REGISTRY} == "true" ]]; then
137-
PYTHON_TAG_SUFFIX=""
138-
if [[ ${GITHUB_REGISTRY_PULL_IMAGE_TAG} != "latest" ]]; then
139-
PYTHON_TAG_SUFFIX="-${GITHUB_REGISTRY_PULL_IMAGE_TAG}"
140-
fi
141-
push_pull_remove_images::pull_image_github_dockerhub "${AIRFLOW_PYTHON_BASE_IMAGE}" \
142-
"${GITHUB_REGISTRY_PYTHON_BASE_IMAGE}${PYTHON_TAG_SUFFIX}"
143-
else
144-
docker pull "${AIRFLOW_PYTHON_BASE_IMAGE}"
139+
if [[ ${USE_GITHUB_REGISTRY} == "true" ]]; then
140+
PYTHON_TAG_SUFFIX=""
141+
if [[ ${GITHUB_REGISTRY_PULL_IMAGE_TAG} != "latest" ]]; then
142+
PYTHON_TAG_SUFFIX="-${GITHUB_REGISTRY_PULL_IMAGE_TAG}"
145143
fi
144+
push_pull_remove_images::pull_image_github_dockerhub "${AIRFLOW_PYTHON_BASE_IMAGE}" \
145+
"${GITHUB_REGISTRY_PYTHON_BASE_IMAGE}${PYTHON_TAG_SUFFIX}"
146+
else
147+
docker pull "${AIRFLOW_PYTHON_BASE_IMAGE}"
146148
fi
147149
}
148150

149151
# Pulls CI image in case caching strategy is "pulled" and the image needs to be pulled
150152
function push_pull_remove_images::pull_ci_images_if_needed() {
151153
local python_image_hash
152154
python_image_hash=$(docker images -q "${AIRFLOW_PYTHON_BASE_IMAGE}" 2> /dev/null || true)
153-
if [[ -z "${python_image_hash=}" || "${FORCE_PULL_IMAGES}" == "true" ]]; then
155+
if [[ -z "${python_image_hash=}" || "${FORCE_PULL_IMAGES}" == "true" || \
156+
${FORCE_PULL_BASE_PYTHON_IMAGE} == "true" ]]; then
154157
push_pull_remove_images::pull_base_python_image
155158
fi
156159
if [[ "${DOCKER_CACHE}" == "pulled" ]]; then
@@ -168,7 +171,8 @@ function push_pull_remove_images::pull_ci_images_if_needed() {
168171
function push_pull_remove_images::pull_prod_images_if_needed() {
169172
local python_image_hash
170173
python_image_hash=$(docker images -q "${AIRFLOW_PYTHON_BASE_IMAGE}" 2> /dev/null || true)
171-
if [[ -z "${python_image_hash=}" || "${FORCE_PULL_IMAGES}" == "true" ]]; then
174+
if [[ -z "${python_image_hash=}" || "${FORCE_PULL_IMAGES}" == "true" || \
175+
${FORCE_PULL_BASE_PYTHON_IMAGE} == "true" ]]; then
172176
push_pull_remove_images::pull_base_python_image
173177
fi
174178
if [[ "${DOCKER_CACHE}" == "pulled" ]]; then
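
The control flow of `pull_base_python_image` after this change can be summarised as an early return: rebuild from the upstream Python image when `FORCE_PULL_BASE_PYTHON_IMAGE` is `true`, otherwise pull the already published base image. The sketch below is a simplified illustration, not the library code. `docker` is replaced by a stub that only prints its arguments, the image names are hypothetical placeholders, and the GitHub registry branch and the `--label` argument are omitted.

```shell
# Stub so that the sketch runs without Docker: prints the command and,
# for "build", also prints the Dockerfile received on stdin.
docker() {
    echo "docker $*"
    if [ "$1" = "build" ]; then
        sed 's/^/  stdin: /'
    fi
}

# Hypothetical image names used only for this example.
PYTHON_BASE_IMAGE="python:3.8-slim-buster"
AIRFLOW_PYTHON_BASE_IMAGE="apache/airflow:python3.8-master"

rebuild_python_base_image() {
    # Pull upstream Python, then re-tag it through a one-line Dockerfile on stdin.
    docker pull "${PYTHON_BASE_IMAGE}"
    echo "FROM ${PYTHON_BASE_IMAGE}" | docker build -t "${AIRFLOW_PYTHON_BASE_IMAGE}" -
}

pull_base_python_image() {
    if [ "${FORCE_PULL_BASE_PYTHON_IMAGE:-false}" = "true" ]; then
        rebuild_python_base_image
        return
    fi
    # Default path: reuse the base image that is already published.
    docker pull "${AIRFLOW_PYTHON_BASE_IMAGE}"
}

FORCE_PULL_BASE_PYTHON_IMAGE="true"
pull_base_python_image
FORCE_PULL_BASE_PYTHON_IMAGE="false"
pull_base_python_image
```

Passing `-` as the build context makes `docker build` read the Dockerfile from standard input, which is why a single `FROM` line is enough to produce the re-tagged base image.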
