Commit ac9b224

Speed up Breeze experience on Mac OS (#23866)
This change should significantly speed up the Breeze experience, especially iterating over a change in Breeze, for MacOS users, regardless of whether you are using x86 or ARM architecture.

The problem with Docker on MacOS is the particularly slow filesystem used to map sources from the host to the Docker VM. It is particularly bad when many small files are involved.

The improvements come from two areas:

* removing duplicate pycache cleaning
* moving the MyPy cache to a Docker volume

When entering Breeze we clean, just in case, .pyc and __pycache__ files potentially generated outside of the Docker container. This is particularly useful if you use a local IDE and do not have bytecode generation disabled (we have it disabled in Breeze). Generating Python bytecode might lead to various problems when you are switching branches and Python versions, so for Breeze development, where the files change often anyway, disabling bytecode generation and removing the files when they are found is important. This happens when entering Breeze and might take a second or two, depending on whether you have such files generated locally.

It could happen that the __init script was called twice (depending on which script was called), so the time could be double what was actually needed. Also, if you ever generated provider packages, the time could be much longer, because node_modules generated in provider sources were not excluded from the search (and on MacOS that takes a LOT of time).

This also doubled the exit time, as the initialization code installed traps that were then also run twice. The traps, however, were rather fast, so they had no negative influence on performance.

The change adds a guard so that initialization is only ever executed once.

The second part of the change moves the mypy cache to a Docker volume rather than using it from the local source folder (the default when complete sources are mounted). We were already using selective mount to make sure MacOS filesystem slowness affects us as little as possible, but with this change the cache is stored in a Docker volume that does not suffer from the same problems as volumes mounted from the host. The Docker volume is preserved until the `breeze stop` command is run, which means that iterating over a change should be WAY faster now: the observed speed-up was around 5x for the MyPy pre-commit.
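The cleanup described above lives in Breeze's shell initialization scripts, which this page does not show. As a rough illustration only, the following Python sketch shows the same idea: remove bytecode under a source tree while never descending into excluded directories. The function name `clean_bytecode` and the `EXCLUDED_DIRS` constant are hypothetical; only `node_modules` is named in the commit message.

```python
import os
import shutil
from pathlib import Path
from typing import List

# Hypothetical exclusion list; the commit message only mentions node_modules.
EXCLUDED_DIRS = {"node_modules"}


def clean_bytecode(root: Path) -> List[Path]:
    """Remove __pycache__ directories and stray .pyc files under root.

    Returns the list of removed paths.
    """
    removed: List[Path] = []
    for current, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in EXCLUDED_DIRS]
        if "__pycache__" in dirnames:
            target = Path(current) / "__pycache__"
            shutil.rmtree(target)
            # The directory is gone, so do not try to walk into it.
            dirnames.remove("__pycache__")
            removed.append(target)
        for name in filenames:
            if name.endswith(".pyc"):
                target = Path(current) / name
                target.unlink()
                removed.append(target)
    return removed
```

Pruning `dirnames` in place is what avoids the expensive traversal: a large `node_modules` tree is skipped entirely instead of being walked and filtered afterwards.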
1 parent eff697a commit ac9b224

16 files changed, +86 -35 lines changed
BREEZE.rst

Lines changed: 8 additions & 0 deletions
@@ -258,6 +258,8 @@ If you have several checked out Airflow sources, Breeze will warn you if you are
 source tree and will offer you to re-install from those sources - to make sure that you are using the right
 version.
 
+You can skip Breeze's upgrade check by setting ``SKIP_BREEZE_UPGRADE_CHECK`` variable to non empty value.
+
 By default Breeze works on the version of Airflow that you run it in - in case you are outside of the
 sources of Airflow and you installed Breeze from a directory - Breeze will be run on Airflow sources from
 where it was installed.
@@ -1052,6 +1054,12 @@ command but it is very similar to current ``breeze`` command):
       </a>
     </div>
 
+.. note::
+
+    When you run static checks, some of the artifacts (mypy_cache) is stored in docker-compose volume
+    so that it can speed up static checks execution significantly. However, sometimes, the cache might
+    get broken, in which case you should run ``breeze stop`` to clean up the cache.
+
 
 Building the Documentation
 --------------------------

dev/breeze/src/airflow_breeze/commands/developer_commands.py

Lines changed: 1 addition & 1 deletion
@@ -523,7 +523,7 @@ def stop(verbose: bool, dry_run: bool, preserve_volumes: bool):
     command_to_execute = ['docker-compose', 'down', "--remove-orphans"]
     if not preserve_volumes:
         command_to_execute.append("--volumes")
-    shell_params = ShellParams(verbose=verbose)
+    shell_params = ShellParams(verbose=verbose, backend="all")
     env_variables = get_env_variables_for_docker_commands(shell_params)
     run_command(command_to_execute, verbose=verbose, dry_run=dry_run, env=env_variables)
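A likely reason for passing `backend="all"` here: `docker-compose down --volumes` removes only the named volumes declared in the compose files it is given, and this commit moves each database volume declaration into its own backend file. The sketch below isolates just the command construction from `stop`; the helper name `build_stop_command` is hypothetical and not part of Breeze.

```python
from typing import List


def build_stop_command(preserve_volumes: bool) -> List[str]:
    """Build the docker-compose command used to tear the environment down."""
    command = ["docker-compose", "down", "--remove-orphans"]
    if not preserve_volumes:
        # Also delete named volumes declared in the compose files in use.
        command.append("--volumes")
    return command
```

With `preserve_volumes=False`, the resulting command also drops the database volumes of every backend whose compose file is loaded.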

dev/breeze/src/airflow_breeze/params/shell_params.py

Lines changed: 16 additions & 12 deletions
@@ -165,19 +165,26 @@ def print_badge_info(self):
         get_console().print(f'[info]Backend: {self.backend} {self.backend_version}[/]')
         get_console().print(f'[info]Airflow used at runtime: {self.use_airflow_version}[/]')
 
+    def get_backend_compose_files(self, backend: str):
+        if backend == "mssql":
+            backend_docker_compose_file = (
+                f"{str(SCRIPTS_CI_DIR)}/docker-compose/backend-{backend}-{self.debian_version}.yml"
+            )
+        else:
+            backend_docker_compose_file = f"{str(SCRIPTS_CI_DIR)}/docker-compose/backend-{backend}.yml"
+        backend_port_docker_compose_file = f"{str(SCRIPTS_CI_DIR)}/docker-compose/backend-{backend}-port.yml"
+        return backend_docker_compose_file, backend_port_docker_compose_file
+
     @property
     def compose_files(self):
         compose_ci_file = []
         main_ci_docker_compose_file = f"{str(SCRIPTS_CI_DIR)}/docker-compose/base.yml"
-        if self.backend == "mssql":
-            backend_docker_compose_file = (
-                f"{str(SCRIPTS_CI_DIR)}/docker-compose/backend-{self.backend}-{self.debian_version}.yml"
-            )
+        if self.backend != "all":
+            backend_files = self.get_backend_compose_files(self.backend)
         else:
-            backend_docker_compose_file = f"{str(SCRIPTS_CI_DIR)}/docker-compose/backend-{self.backend}.yml"
-        backend_port_docker_compose_file = (
-            f"{str(SCRIPTS_CI_DIR)}/docker-compose/backend-{self.backend}-port.yml"
-        )
+            backend_files = []
+            for backend in ALLOWED_BACKENDS:
+                backend_files.extend(self.get_backend_compose_files(backend))
         local_docker_compose_file = f"{str(SCRIPTS_CI_DIR)}/docker-compose/local.yml"
         local_all_sources_docker_compose_file = f"{str(SCRIPTS_CI_DIR)}/docker-compose/local-all-sources.yml"
         files_docker_compose_file = f"{str(SCRIPTS_CI_DIR)}/docker-compose/files.yml"
@@ -194,17 +201,14 @@ def compose_files(self):
             compose_ci_file.append(
                 f"{str(SCRIPTS_CI_DIR)}/docker-compose/backend-mssql-docker-volume.yml"
             )
-        compose_ci_file.extend(
-            [main_ci_docker_compose_file, backend_docker_compose_file, files_docker_compose_file]
-        )
+        compose_ci_file.extend([main_ci_docker_compose_file, *backend_files, files_docker_compose_file])
 
         if self.mount_sources == MOUNT_SELECTED:
             compose_ci_file.extend([local_docker_compose_file])
         elif self.mount_sources == MOUNT_ALL:
             compose_ci_file.extend([local_all_sources_docker_compose_file])
         else:  # none
             compose_ci_file.extend([remove_sources_docker_compose_file])
-        compose_ci_file.extend([backend_port_docker_compose_file])
         if self.forward_credentials:
             compose_ci_file.append(forward_credentials_docker_compose_file)
         if self.use_airflow_version is not None:
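To make the selection logic in this hunk easier to follow, here is a standalone sketch of it outside the `ShellParams` class. The values of `ALLOWED_BACKENDS` and `COMPOSE_DIR`, the `"bullseye"` default, and the function names are assumptions made for illustration; the real constants are defined elsewhere in Breeze and may differ.

```python
from typing import List, Tuple

# Assumed values for illustration only; the real constants live elsewhere in Breeze.
ALLOWED_BACKENDS = ["sqlite", "mysql", "postgres", "mssql"]
COMPOSE_DIR = "scripts/ci/docker-compose"


def backend_compose_files(backend: str, debian_version: str) -> Tuple[str, str]:
    """Return the (backend, port) compose files for one backend."""
    if backend == "mssql":
        # The mssql compose file is additionally keyed by Debian version.
        main = f"{COMPOSE_DIR}/backend-{backend}-{debian_version}.yml"
    else:
        main = f"{COMPOSE_DIR}/backend-{backend}.yml"
    return main, f"{COMPOSE_DIR}/backend-{backend}-port.yml"


def select_backend_files(backend: str, debian_version: str = "bullseye") -> List[str]:
    """Return compose files for one backend, or for every backend when given "all"."""
    if backend != "all":
        return list(backend_compose_files(backend, debian_version))
    files: List[str] = []
    for name in ALLOWED_BACKENDS:
        files.extend(backend_compose_files(name, debian_version))
    return files
```

One visible side effect of the hunk is ordering: each port file now directly follows its backend file, instead of being appended after the source-mount files as before.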

dev/breeze/src/airflow_breeze/utils/docker_command_utils.py

Lines changed: 12 additions & 0 deletions
@@ -17,6 +17,7 @@
 """Various utils to prepare docker and docker compose commands."""
 import os
 import re
+import subprocess
 import sys
 from copy import deepcopy
 from random import randint
@@ -97,6 +98,16 @@
 ]
 
 
+def create_volume_if_missing(volume_name: str):
+    res_inspect = run_command(cmd=["docker", "inspect", volume_name], stdout=subprocess.DEVNULL, check=False)
+    if res_inspect.returncode != 0:
+        run_command(cmd=["docker", "volume", "create", volume_name], check=True)
+
+
+def create_static_check_volumes():
+    create_volume_if_missing("docker-compose_mypy-cache-volume")
+
+
 def get_extra_docker_flags(mount_sources: str) -> List[str]:
     """
     Returns extra docker flags based on the type of mounting we want to do for sources.
@@ -110,6 +121,7 @@ def get_extra_docker_flags(mount_sources: str) -> List[str]:
     elif mount_sources == MOUNT_SELECTED:
         for flag in NECESSARY_HOST_VOLUMES:
             extra_docker_flags.extend(["-v", str(AIRFLOW_SOURCES_ROOT) + flag])
+        extra_docker_flags.extend(['-v', "docker-compose_mypy-cache-volume:/opt/airflow/.mypy_cache/"])
     else:  # none
         extra_docker_flags.extend(["-v", f"{AIRFLOW_SOURCES_ROOT / 'empty'}:/opt/airflow/airflow"])
     extra_docker_flags.extend(["-v", f"{AIRFLOW_SOURCES_ROOT}/files:/files"])
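The "create only if missing" pattern from this hunk can be sketched without Breeze's `run_command` wrapper. The version below is an illustration, not the committed code: it uses plain `subprocess.run`, accepts an injectable `run` callable (a hypothetical addition so the logic can be exercised without Docker), and calls `docker volume inspect` rather than the generic `docker inspect` used in the commit, so that a container or image with the same name cannot be mistaken for the volume.

```python
import subprocess
from typing import Callable

# Signature-compatible with subprocess.run; injectable so tests need no Docker daemon.
Runner = Callable[..., subprocess.CompletedProcess]


def create_volume_if_missing(volume_name: str, run: Runner = subprocess.run) -> bool:
    """Create the named Docker volume unless it already exists.

    Returns True if the volume was created, False if it was already present.
    """
    inspect = run(
        ["docker", "volume", "inspect", volume_name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,  # A non-zero exit code just means the volume is absent.
    )
    if inspect.returncode == 0:
        return False
    run(["docker", "volume", "create", volume_name], check=True)
    return True
```

Calling `create_volume_if_missing("docker-compose_mypy-cache-volume")` before the static checks would then ensure the volume mounted at `/opt/airflow/.mypy_cache/` exists.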

dev/breeze/src/airflow_breeze/utils/path_utils.py

Lines changed: 7 additions & 3 deletions
@@ -62,7 +62,13 @@ def in_help() -> bool:
 
 
 def skip_upgrade_check():
-    return in_self_upgrade() or in_autocomplete() or in_help() or hasattr(sys, '_called_from_test')
+    return (
+        in_self_upgrade()
+        or in_autocomplete()
+        or in_help()
+        or hasattr(sys, '_called_from_test')
+        or os.environ.get('SKIP_BREEZE_UPGRADE_CHECK')
+    )
 
 
 def get_package_setup_metadata_hash() -> str:
@@ -235,7 +241,6 @@ def find_airflow_sources_root_to_operate_on() -> Path:
 BUILD_CACHE_DIR = AIRFLOW_SOURCES_ROOT / '.build'
 FILES_DIR = AIRFLOW_SOURCES_ROOT / 'files'
 MSSQL_DATA_VOLUME = AIRFLOW_SOURCES_ROOT / 'tmp_mssql_volume'
-MYPY_CACHE_DIR = AIRFLOW_SOURCES_ROOT / '.mypy_cache'
 LOGS_DIR = AIRFLOW_SOURCES_ROOT / 'logs'
 DIST_DIR = AIRFLOW_SOURCES_ROOT / 'dist'
 SCRIPTS_CI_DIR = AIRFLOW_SOURCES_ROOT / 'scripts' / 'ci'
@@ -253,7 +258,6 @@ def create_directories() -> None:
     BUILD_CACHE_DIR.mkdir(parents=True, exist_ok=True)
     FILES_DIR.mkdir(parents=True, exist_ok=True)
     MSSQL_DATA_VOLUME.mkdir(parents=True, exist_ok=True)
-    MYPY_CACHE_DIR.mkdir(parents=True, exist_ok=True)
     LOGS_DIR.mkdir(parents=True, exist_ok=True)
     DIST_DIR.mkdir(parents=True, exist_ok=True)
     OUTPUT_LOG.mkdir(parents=True, exist_ok=True)
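The new `SKIP_BREEZE_UPGRADE_CHECK` test relies on string truthiness, which matches the "non empty value" wording added to BREEZE.rst. The sketch below isolates that one condition; the function name `upgrade_check_skipped` and the `env` parameter are hypothetical and exist only to make the behavior easy to try out.

```python
import os
from typing import Mapping, Optional


def upgrade_check_skipped(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when SKIP_BREEZE_UPGRADE_CHECK is set to any non-empty string."""
    env = os.environ if env is None else env
    # .get() returns None when unset; both None and "" are falsy.
    return bool(env.get("SKIP_BREEZE_UPGRADE_CHECK"))
```

Note that values such as `"0"` or `"false"` are non-empty strings, so they also skip the check; only an unset or empty variable leaves it enabled.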

scripts/ci/docker-compose/backend-mssql-docker-volume.yml

Lines changed: 2 additions & 0 deletions
@@ -20,3 +20,5 @@ services:
   mssql:
     volumes:
       - mssql-db-volume:/var/opt/mssql
+volumes:
+  mssql-db-volume:

scripts/ci/docker-compose/backend-mysql.yml

Lines changed: 2 additions & 0 deletions
@@ -44,3 +44,5 @@ services:
     restart: always
     command: ['mysqld', '--character-set-server=utf8mb4',
               '--collation-server=utf8mb4_unicode_ci']
+volumes:
+  mysql-db-volume:

scripts/ci/docker-compose/backend-postgres.yml

Lines changed: 2 additions & 0 deletions
@@ -42,3 +42,5 @@ services:
       timeout: 10s
       retries: 5
     restart: always
+volumes:
+  postgres-db-volume:

scripts/ci/docker-compose/backend-sqlite.yml

Lines changed: 2 additions & 0 deletions
@@ -25,3 +25,5 @@ services:
     volumes:
       - /dev/urandom:/dev/random   # Required to get non-blocking entropy source
       - sqlite-db-volume:/root/airflow
+volumes:
+  sqlite-db-volume:

scripts/ci/docker-compose/base.yml

Lines changed: 0 additions & 5 deletions
@@ -90,8 +90,3 @@ services:
       - "${FLOWER_HOST_PORT}:5555"
     cap_add:
       - SYS_PTRACE
-volumes:
-  sqlite-db-volume:
-  postgres-db-volume:
-  mysql-db-volume:
-  mssql-db-volume:
