
Commit ad28f69

Switch to 'buildkit' to build Airflow images (#20664)
The "buildkit" is a much more modern docker build mechanism and supports multi-architecture builds, which makes it suitable for our future ARM support. It also has a nicer UI and much more sophisticated caching mechanisms, and it supports multi-segment builds better.

BuildKit has been promoted to official for quite a while and it is rather stable now. Also, we can now install the BuildKit plugin for docker, which adds capabilities for building and managing cache using dedicated builders (previously BuildKit cache was managed using rather complex external tools).

This gives us an opportunity to vastly simplify our build scripts, because BuildKit has a much more robust caching mechanism than the old docker build (which forced us to pull images before using them as cache). We had a lot of complexity involved in efficient caching, but with BuildKit all of that can be vastly simplified and we can get rid of:

* keeping base python images in our registry
* keeping build segments for prod image in our registry
* keeping manifest images in our registry
* deciding when to pull or pull&build image (not needed now, we can always build the image with --cache-from and buildkit will pull cached layers as needed)
* building the image when performing pre-commit (instead we simply encourage users to rebuild the image via the breeze command)
* pulling the images before building
* the separate 'build' cache kept in our registry (not needed any more, as buildkit allows keeping the cache for all segments of a multi-segmented build in a single cache)
* the manual spinner (the nice animated tty UI of buildkit eliminates the need for it)
* and a number of other complexities.

Depends on #20238
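As a rough illustration of the single-cache idea described in the message above, the sketch below assembles a `docker buildx build` invocation that reads from and writes to one registry cache reference. The image and cache names are placeholders, not the names used by Airflow's scripts, and the function only prints the command so it can be inspected without a Docker daemon.

```shell
#!/usr/bin/env bash
# Sketch only: prints a buildx command that uses one registry cache reference
# for all stages of a multi-stage build. All names are hypothetical.
set -euo pipefail

buildx_cache_command() {
    local image="$1"       # e.g. ghcr.io/example/airflow-ci:latest (hypothetical)
    local cache_ref="$2"   # e.g. ghcr.io/example/airflow-ci:cache (hypothetical)
    # mode=max asks BuildKit to export cache for intermediate stages as well.
    echo docker buildx build \
        --cache-from "type=registry,ref=${cache_ref}" \
        --cache-to "type=registry,ref=${cache_ref},mode=max" \
        --tag "${image}" \
        .
}

buildx_cache_command "ghcr.io/example/airflow-ci:latest" "ghcr.io/example/airflow-ci:cache"
```

Exporting cache to a registry typically needs a dedicated builder (for example one created with `docker buildx create`), which is likely why the message mentions dedicated builders; treat that link as an assumption rather than something the commit states.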
1 parent 730db3f commit ad28f69


42 files changed (+387, -1007 lines)

.github/workflows/build-images.yml

Lines changed: 0 additions & 5 deletions
@@ -29,7 +29,6 @@ permissions:
 env:
   MOUNT_SELECTED_LOCAL_SOURCES: "false"
   FORCE_ANSWER_TO_QUESTIONS: "yes"
-  FORCE_PULL_IMAGES: "false"
   CHECK_IMAGE_FOR_REBUILD: "true"
   SKIP_CHECK_REMOTE_IMAGE: "true"
   DB_RESET: "true"
@@ -179,8 +178,6 @@ jobs:
       PYTHON_MAJOR_MINOR_VERSION: ${{ matrix.python-version }}
       UPGRADE_TO_NEWER_DEPENDENCIES: ${{ needs.build-info.outputs.upgradeToNewerDependencies }}
       DOCKER_CACHE: ${{ needs.build-info.outputs.cacheDirective }}
-      CHECK_IF_BASE_PYTHON_IMAGE_UPDATED: >
-        ${{ github.event_name == 'pull_request_target' && 'false' || 'true' }}
       outputs: ${{toJSON(needs.build-info.outputs) }}
     steps:
       - uses: actions/checkout@v2
@@ -256,8 +253,6 @@ jobs:
       PYTHON_MAJOR_MINOR_VERSION: ${{ matrix.python-version }}
       UPGRADE_TO_NEWER_DEPENDENCIES: ${{ needs.build-info.outputs.upgradeToNewerDependencies }}
       DOCKER_CACHE: ${{ needs.build-info.outputs.cacheDirective }}
-      CHECK_IF_BASE_PYTHON_IMAGE_UPDATED: >
-        ${{ github.event_name == 'pull_request_target' && 'false' || 'true' }}
       VERSION_SUFFIX_FOR_PYPI: ".dev0"
       INSTALL_PROVIDERS_FROM_SOURCES: >
         ${{ needs.build-info.outputs.defaultBranch == 'main' && 'true' || 'false' }}

.github/workflows/ci.yml

Lines changed: 9 additions & 15 deletions
@@ -30,7 +30,6 @@ permissions:
 env:
   MOUNT_SELECTED_LOCAL_SOURCES: "false"
   FORCE_ANSWER_TO_QUESTIONS: "yes"
-  FORCE_PULL_IMAGES: "false"
   CHECK_IMAGE_FOR_REBUILD: "true"
   SKIP_CHECK_REMOTE_IMAGE: "true"
   DB_RESET: "true"
@@ -1380,12 +1379,11 @@ ${{ hashFiles('.pre-commit-config.yaml') }}"
           branch: ${{ steps.constraints-branch.outputs.branch }}
           directory: "repo"
 
-  # Push images to GitHub Registry in Apache repository, if all tests are successful and build
-  # is executed as result of direct push to "main" or one of the "test" branches
-  # It actually rebuilds all images using just-pushed constraints if they changed
-  # It will also check if a new python image was released and will pull the latest one if needed
-  # Same as build-images.yaml
-  push-images-to-github-registry:
+  # Push BuildX cache to GitHub Registry in Apache repository, if all tests are successful and build
+  # is executed as result of direct push to "main" or one of the "vX-Y-test" branches
+  # It rebuilds all images using just-pushed constraints using buildx and pushes them to registry
+  # It will automatically check if a new python image was released and will pull the latest one if needed
+  push-buildx-cache-to-github-registry:
     permissions:
       packages: write
     timeout-minutes: 40
@@ -1396,7 +1394,9 @@ ${{ hashFiles('.pre-commit-config.yaml') }}"
       - constraints
       - docs
     # Only run it for direct pushes and scheduled builds
-    if: github.event_name == 'push' || github.event_name == 'schedule'
+    if: >
+      (github.event_name == 'push' || github.event_name == 'schedule')
+      && github.repository == 'apache/airflow'
     strategy:
       matrix:
         python-version: ${{ fromJson(needs.build-info.outputs.pythonVersions) }}
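The new job condition in the hunk above can be read as a simple predicate: run only for push or schedule events, and only in the `apache/airflow` repository. The following is a minimal sketch of that logic in shell; the function name `should_push_cache` is made up for illustration and is not part of the workflow.

```shell
#!/usr/bin/env bash
# Sketch: mirrors the workflow's `if:` expression as a shell predicate.
# should_push_cache is a hypothetical helper, not something in the repository.
should_push_cache() {
    local event_name="$1" repository="$2"
    if [[ "${event_name}" == "push" || "${event_name}" == "schedule" ]] \
        && [[ "${repository}" == "apache/airflow" ]]; then
        return 0
    fi
    return 1
}

should_push_cache "push" "apache/airflow" && echo "would push cache"
should_push_cache "push" "someuser/airflow" || echo "skipped on fork"
```

The added repository check means forks that run the same workflow on push or schedule do not attempt to publish cache.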
@@ -1410,11 +1410,9 @@ ${{ hashFiles('.pre-commit-config.yaml') }}"
       # a new python image, we will rebuild it from scratch (same as during the "build-images.ci")
       GITHUB_REGISTRY_PULL_IMAGE_TAG: "latest"
       GITHUB_REGISTRY_PUSH_IMAGE_TAG: "latest"
-      PUSH_PYTHON_BASE_IMAGE: "true"
-      FORCE_PULL_IMAGES: "true"
-      CHECK_IF_BASE_PYTHON_IMAGE_UPDATED: "true"
       GITHUB_REGISTRY_WAIT_FOR_IMAGE: "false"
       UPGRADE_TO_NEWER_DEPENDENCIES: "false"
+      PREPARE_BUILDX_CACHE: "true"
     steps:
       - name: "Checkout ${{ github.ref }} ( ${{ github.sha }} )"
         uses: actions/checkout@v2
@@ -1435,7 +1433,3 @@ ${{ hashFiles('.pre-commit-config.yaml') }}"
         run: ./scripts/ci/images/ci_prepare_prod_image_on_ci.sh
         env:
           VERSION_SUFFIX_FOR_PYPI: ".dev0"
-      - name: "Push CI image ${{ env.PYTHON_MAJOR_MINOR_VERSION }}:latest"
-        run: ./scripts/ci/images/ci_push_ci_images.sh
-      - name: "Push PROD images ${{ env.PYTHON_MAJOR_MINOR_VERSION }}:latest"
-        run: ./scripts/ci/images/ci_push_production_images.sh

BREEZE.rst

Lines changed: 70 additions & 93 deletions
@@ -1146,10 +1146,10 @@ This is the current syntax for `./breeze <./breeze>`_:
     shell [Default]               Enters interactive shell in the container
     build-docs                    Builds documentation in the container
     build-image                   Builds CI or Production docker image
+    prepare-build-cache           Prepares CI or Production build cache
     cleanup-image                 Cleans up the container image created
     exec                          Execs into running breeze container in new terminal
     generate-constraints          Generates pinned constraint files
-    push-image                    Pushes images to registry
     initialize-local-virtualenv   Initializes local virtualenv
     prepare-airflow-packages      Prepares airflow packages
     setup-autocomplete            Sets up autocomplete for breeze
@@ -1254,10 +1254,7 @@ This is the current syntax for `./breeze <./breeze>`_:
         '--build-cache-local' or '-build-cache-pulled', or '--build-cache-none'
 
       Choosing whether to force pull images or force build the image:
-        '--force-build-image', '--force-pull-image'
-
-      Checking if the base python image has been updated:
-        '--check-if-base-python-image-updated'
+        '--force-build-image'
 
       You can also pass '--production-image' flag to build production image rather than CI image.
 
@@ -1300,17 +1297,6 @@ This is the current syntax for `./breeze <./breeze>`_:
           automatically for the first time or when changes are detected in
           package-related files, but you can force it using this flag.
 
-  -P, --force-pull-images
-          Forces pulling of images from GitHub Container Registry before building to populate cache.
-          The images are pulled by default only for the first time you run the
-          environment, later the locally build images are used as cache.
-
-  --check-if-base-python-image-updated
-          Checks if Python base image from DockerHub has been updated vs the current python base
-          image we store in GitHub Container Registry. Python images are updated regularly with
-          security fixes, this switch will check if a new one has been released and will pull and
-          prepare a new base python based on the latest one.
-
   --cleanup-docker-context-files
           Removes whl and tar.gz files created in docker-context-files before running the command.
           In case there are some files there it unnecessarily increases the context size and
@@ -1458,6 +1444,74 @@ This is the current syntax for `./breeze <./breeze>`_:
   ####################################################################################################
 
 
+  Detailed usage for command: prepare-build-cache
+
+
+  breeze prepare-build-cache [FLAGS]
+
+      Prepares build cache (CI or production) without entering the container. You can pass
+      additional options to this command, such as:
+
+      Choosing python version:
+        '--python'
+
+      You can also pass '--production-image' flag to build production image rather than CI image.
+
+      For GitHub repository, the '--github-repository' can be used to choose repository
+      to pull/push images. Cleanup docker context files and pull cache are forced. This command
+      requires buildx to be installed.
+
+  Flags:
+
+  -p, --python PYTHON_MAJOR_MINOR_VERSION
+          Python version used for the image. This is always major/minor version.
+
+          One of:
+
+                 3.7 3.8 3.9
+
+  -a, --install-airflow-version INSTALL_AIRFLOW_VERSION
+          Uses different version of Airflow when building PROD image.
+
+                 2.0.2 2.0.1 2.0.0 wheel sdist
+
+  -t, --install-airflow-reference INSTALL_AIRFLOW_REFERENCE
+          Installs Airflow directly from reference in GitHub when building PROD image.
+          This can be a GitHub branch like main or v2-2-test, or a tag like 2.2.0rc1.
+
+  --installation-method INSTALLATION_METHOD
+          Method of installing Airflow in PROD image - either from the sources ('.')
+          or from package 'apache-airflow' to install from PyPI.
+          Default in Breeze is to install from sources. One of:
+
+                 . apache-airflow
+
+  --upgrade-to-newer-dependencies
+          Upgrades PIP packages to latest versions available without looking at the constraints.
+
+  -I, --production-image
+          Use production image for entering the environment and builds (not for tests).
+
+  -g, --github-repository GITHUB_REPOSITORY
+          GitHub repository used to pull, push images.
+          Default: apache/airflow.
+
+  -v, --verbose
+          Show verbose information about executed docker, kind, kubectl, helm commands. Useful for
+          debugging - when you run breeze with --verbose flags you will be able to see the commands
+          executed under the hood and copy&paste them to your terminal to debug them more easily.
+
+          Note that you can further increase verbosity and see all the commands executed by breeze
+          by running 'export VERBOSE_COMMANDS="true"' before running breeze.
+
+  --dry-run-docker
+          Only show docker commands to execute instead of actually executing them. The docker
+          commands are printed in yellow color.
+
+
+  ####################################################################################################
+
+
   Detailed usage for command: cleanup-image
 
 
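To make the new help text above more concrete, here is a small sketch that composes a `prepare-build-cache` call from flags listed in it (`--python`, `--production-image`, `--dry-run-docker`). The helper `prepare_cache_args` is hypothetical and only prints the command line; it does not run breeze.

```shell
#!/usr/bin/env bash
# Sketch: builds a breeze command line from flags documented in the help text.
# prepare_cache_args is a hypothetical helper used for illustration only.
set -euo pipefail

prepare_cache_args() {
    local python_version="$1" production="${2:-false}"
    local args=(prepare-build-cache --python "${python_version}")
    if [[ "${production}" == "true" ]]; then
        args+=(--production-image)
    fi
    # --dry-run-docker is documented as showing docker commands instead of executing them.
    args+=(--dry-run-docker)
    echo "./breeze ${args[*]}"
}

prepare_cache_args 3.8
prepare_cache_args 3.9 true
```

Dropping `--dry-run-docker` from the printed command would run the real cache preparation, which per the help text requires buildx to be installed.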
@@ -1559,61 +1613,6 @@ This is the current syntax for `./breeze <./breeze>`_:
   ####################################################################################################
 
 
-  Detailed usage for command: push-image
-
-
-  breeze push_image [FLAGS]
-
-      Pushes images to GitHub registry.
-
-      You can add --github-repository to push to a different repository/organisation.
-      You can add --github-image-id <COMMIT_SHA> in case you want to push image with specific
-      SHA tag.
-      You can also add --production-image flag to switch to production image (default is CI one)
-
-      Examples:
-
-      'breeze push-image' or
-      'breeze push-image --production-image' - to push production image or
-      'breeze push-image \
-            --github-repository user/airflow' - to push to your user's fork
-      'breeze push-image \
-            --github-image-id 9a621eaa394c0a0a336f8e1b31b35eff4e4ee86e' - to push with COMMIT_SHA
-
-  Flags:
-
-  -g, --github-repository GITHUB_REPOSITORY
-          GitHub repository used to pull, push images.
-          Default: apache/airflow.
-
-
-
-
-  -s, --github-image-id COMMIT_SHA
-          <COMMIT_SHA> of the image. Images in GitHub registry are stored with those
-          to be able to easily find the image for particular CI runs. Once you know the
-          <COMMIT_SHA>, you can specify it in github-image-id flag and Breeze will
-          automatically pull and use that image so that you can easily reproduce a problem
-          that occurred in CI.
-
-          Default: latest.
-
-  -v, --verbose
-          Show verbose information about executed docker, kind, kubectl, helm commands. Useful for
-          debugging - when you run breeze with --verbose flags you will be able to see the commands
-          executed under the hood and copy&paste them to your terminal to debug them more easily.
-
-          Note that you can further increase verbosity and see all the commands executed by breeze
-          by running 'export VERBOSE_COMMANDS="true"' before running breeze.
-
-  --dry-run-docker
-          Only show docker commands to execute instead of actually executing them. The docker
-          commands are printed in yellow color.
-
-
-  ####################################################################################################
-
-
   Detailed usage for command: initialize-local-virtualenv
 
 
@@ -1903,17 +1902,6 @@ This is the current syntax for `./breeze <./breeze>`_:
           automatically for the first time or when changes are detected in
           package-related files, but you can force it using this flag.
 
-  -P, --force-pull-images
-          Forces pulling of images from GitHub Container Registry before building to populate cache.
-          The images are pulled by default only for the first time you run the
-          environment, later the locally build images are used as cache.
-
-  --check-if-base-python-image-updated
-          Checks if Python base image from DockerHub has been updated vs the current python base
-          image we store in GitHub Container Registry. Python images are updated regularly with
-          security fixes, this switch will check if a new one has been released and will pull and
-          prepare a new base python based on the latest one.
-
   --cleanup-docker-context-files
           Removes whl and tar.gz files created in docker-context-files before running the command.
           In case there are some files there it unnecessarily increases the context size and
@@ -2498,17 +2486,6 @@ This is the current syntax for `./breeze <./breeze>`_:
           automatically for the first time or when changes are detected in
           package-related files, but you can force it using this flag.
 
-  -P, --force-pull-images
-          Forces pulling of images from GitHub Container Registry before building to populate cache.
-          The images are pulled by default only for the first time you run the
-          environment, later the locally build images are used as cache.
-
-  --check-if-base-python-image-updated
-          Checks if Python base image from DockerHub has been updated vs the current python base
-          image we store in GitHub Container Registry. Python images are updated regularly with
-          security fixes, this switch will check if a new one has been released and will pull and
-          prepare a new base python based on the latest one.
-
   --cleanup-docker-context-files
           Removes whl and tar.gz files created in docker-context-files before running the command.
           In case there are some files there it unnecessarily increases the context size and

CI.rst

Lines changed: 0 additions & 16 deletions
@@ -149,22 +149,6 @@ You can use those variables when you try to reproduce the build locally.
 +-----------------------------------------+-------------+--------------+------------+-------------------------------------------------+
 |                                                           Force variables                                                           |
 +-----------------------------------------+-------------+--------------+------------+-------------------------------------------------+
-| ``FORCE_PULL_IMAGES``                   | true        | true         | true       | Determines if images are force-pulled,          |
-|                                         |             |              |            | no matter if they are already present           |
-|                                         |             |              |            | locally. This includes not only the             |
-|                                         |             |              |            | CI/PROD images but also the Python base         |
-|                                         |             |              |            | images. Note that if Python base images         |
-|                                         |             |              |            | change, also the CI and PROD images             |
-|                                         |             |              |            | need to be fully rebuild unless they were       |
-|                                         |             |              |            | already built with that base Python             |
-|                                         |             |              |            | image. This is false for local development      |
-|                                         |             |              |            | to avoid often pulling and rebuilding           |
-|                                         |             |              |            | the image. It is true for CI workflow in        |
-|                                         |             |              |            | case waiting from images is enabled             |
-|                                         |             |              |            | as the images needs to be force-pulled from     |
-|                                         |             |              |            | GitHub Registry, but it is set to               |
-|                                         |             |              |            | false when waiting for images is disabled.      |
-+-----------------------------------------+-------------+--------------+------------+-------------------------------------------------+
 | ``FORCE_BUILD_IMAGES``                  | false       | false        | false      | Forces building images. This is generally not   |
 |                                         |             |              |            | very useful in CI as in CI environment image    |
 |                                         |             |              |            | is built or pulled only once, so there is no    |

Dockerfile

Lines changed: 2 additions & 28 deletions
@@ -33,6 +33,8 @@
 # all the build essentials. This makes the image
 # much smaller.
 #
+# Use the same builder frontend version for everyone
+# syntax=docker/dockerfile:1.3
 ARG AIRFLOW_VERSION="2.2.2"
 ARG AIRFLOW_EXTRAS="amazon,async,celery,cncf.kubernetes,dask,docker,elasticsearch,ftp,google,google_auth,grpc,hashicorp,http,ldap,microsoft.azure,mysql,odbc,pandas,postgres,redis,sendgrid,sftp,slack,ssh,statsd,virtualenv"
 ARG ADDITIONAL_AIRFLOW_EXTRAS=""
@@ -327,34 +329,6 @@ RUN if [[ -f /docker-context-files/requirements.txt ]]; then \
         pip install --no-cache-dir --user -r /docker-context-files/requirements.txt; \
     fi
 
-ARG BUILD_ID
-ARG COMMIT_SHA
-ARG AIRFLOW_IMAGE_REPOSITORY
-ARG AIRFLOW_IMAGE_DATE_CREATED
-
-ENV BUILD_ID=${BUILD_ID} COMMIT_SHA=${COMMIT_SHA}
-
-LABEL org.apache.airflow.distro="debian" \
-  org.apache.airflow.distro.version="buster" \
-  org.apache.airflow.module="airflow" \
-  org.apache.airflow.component="airflow" \
-  org.apache.airflow.image="airflow-build-image" \
-  org.apache.airflow.version="${AIRFLOW_VERSION}" \
-  org.apache.airflow.build-image.build-id=${BUILD_ID} \
-  org.apache.airflow.build-image.commit-sha=${COMMIT_SHA} \
-  org.opencontainers.image.source=${AIRFLOW_IMAGE_REPOSITORY} \
-  org.opencontainers.image.created=${AIRFLOW_IMAGE_DATE_CREATED} \
-  org.opencontainers.image.authors="dev@airflow.apache.org" \
-  org.opencontainers.image.url="https://airflow.apache.org" \
-  org.opencontainers.image.documentation="https://airflow.apache.org/docs/docker-stack/index.html" \
-  org.opencontainers.image.version="${AIRFLOW_VERSION}" \
-  org.opencontainers.image.revision="${COMMIT_SHA}" \
-  org.opencontainers.image.vendor="Apache Software Foundation" \
-  org.opencontainers.image.licenses="Apache-2.0" \
-  org.opencontainers.image.ref.name="airflow-build-image" \
-  org.opencontainers.image.title="Build Image Segment for Production Airflow Image" \
-  org.opencontainers.image.description="Reference build-time dependencies image for production-ready Apache Airflow image"
-
 ##############################################################################################
 # This is the actual Airflow image - much smaller than the build one. We copy
 # installed Airflow and all it's dependencies from the build image to make it smaller.
