@potiuk potiuk commented Oct 14, 2020

In Airflow 2.0 we decided to split Airflow into separate providers.
This means that when you prepare the core airflow package, providers
are not installed by default. This is not very convenient for
local development, though, or for docker images built from sources,
where you would like to install all providers by default.

A new INSTALL_PROVIDERS_FROM_SOURCES environment variable controls
this behaviour now. If it is set to "true", all packages including
provider packages are installed. If missing or set to false, only
the core package is installed.

For Breeze, the default is set to "true", as in those cases you
want to install all providers in your environment. The same applies
when you build the production image from sources. However, when you
build the image from a GitHub tag or a pip package, you should specify
appropriate extras to install the required provider packages.

Note that if you install Airflow via 'pip install .' from sources
in a local virtualenv, provider packages are not going to be
installed unless you set INSTALL_PROVIDERS_FROM_SOURCES to "true".
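
To make the switch concrete, here is a minimal sketch of how an installation script could read the variable. Only the variable name INSTALL_PROVIDERS_FROM_SOURCES comes from this description; the helper names and the exclusion patterns are hypothetical, and treating only the exact string "true" as enabling is an assumption, not necessarily what setup.py does.

```python
import os

# Variable name taken from the PR description; everything else is illustrative.
ENV_VAR = "INSTALL_PROVIDERS_FROM_SOURCES"


def install_providers_from_sources(environ=None):
    """Return True only when the variable is exactly the string "true".

    A missing variable, "false", or any other value means "core only"
    (assumption: no other spellings such as "1" or "True" are accepted).
    """
    env = os.environ if environ is None else environ
    return env.get(ENV_VAR) == "true"


def excluded_package_patterns(environ=None):
    """Hypothetical helper: package patterns to leave out of the install.

    When providers are not requested, the provider packages are excluded
    so that only the core package gets installed.
    """
    if install_providers_from_sources(environ):
        return []
    return ["airflow.providers", "airflow.providers.*"]
```

A caller could pass the returned patterns to whatever package discovery the build uses. Passing a plain dict instead of the real environment keeps the helpers easy to check in isolation.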

Fixes #11489



potiuk commented Oct 14, 2020

Hey @ashb @kaxil @turbaszek, others. I looked at how to solve the installation problem described in #11489 and the failing Kubernetes builds (likely caused by the provider split), and I believe I found the best approach. I think we want to keep this:

  • in any setup where we use packages, we want to install airflow and providers separately
  • in any setup where we use sources, we want to install airflow + providers together for development convenience

The proposal I have is a variable INSTALL_PROVIDERS_FROM_SOURCES. When this flag is "true", airflow will install providers from sources; when it is missing or set to anything other than "true", it will install only airflow core. By default, in Breeze and when building images from sources, I set this variable to "true" (but it can be set to false with the --skip-installing-airflow-providers flag, so that you can also easily install a "bare" airflow in Breeze or prepare a "bare" image without any providers). Together with the in-progress #11464 (I will have to add conditional dependencies there and rebase on top of this), it will have exactly the desired effect:

  • when installing airflow with Breeze from sources (both prod and CI), we continue installing all providers from sources as we did in 1.10. Extras will not cause apache-airflow-providers-* packages to be installed, as those dependencies will be disabled in this case. You can also disable installing all providers with --skip-installing-airflow-providers

  • when installing airflow with Breeze from PyPI or GitHub, only the providers required by "extras" will be installed

  • when installing airflow locally with "." (without -e), only bare Airflow will be installed; providers will only be available if you happen to be in the airflow directory or if you install provider packages manually. You can also set INSTALL_PROVIDERS_FROM_SOURCES="true" before installation, and then all providers will be installed as in 1.10.

  • when installing airflow locally with -e '.', providers will be automatically installed and available. Since 'airflow' is taken directly from sources, providers are there and they will be importable.

I tested all the above scenarios and I think it makes perfect sense. Let me know what you think.
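
The four scenarios above can be summarised as a small decision table. This is only a sketch of the behaviour as described in this comment: the enum, the function name and the returned labels are hypothetical, and it is not code from the PR.

```python
from enum import Enum


class InstallMethod(Enum):
    """Hypothetical labels for the four scenarios listed above."""
    BREEZE_SOURCES = "breeze-sources"    # Breeze, prod or CI, from sources
    BREEZE_PACKAGE = "breeze-package"    # Breeze from PyPI or GitHub
    LOCAL_PIP = "local-pip"              # pip install .
    LOCAL_EDITABLE = "local-editable"    # pip install -e .


def providers_outcome(method, env_value=None, skip_providers=False):
    """Return a label describing which providers end up available.

    env_value stands for INSTALL_PROVIDERS_FROM_SOURCES as set by the user;
    skip_providers stands for --skip-installing-airflow-providers.
    """
    if method is InstallMethod.BREEZE_PACKAGE:
        # Only the providers pulled in by the requested extras.
        return "providers required by extras"
    if method is InstallMethod.LOCAL_EDITABLE:
        # airflow is used directly from sources, so providers are importable.
        return "all providers"
    if method is InstallMethod.BREEZE_SOURCES:
        # Breeze defaults the variable to "true" unless the skip flag is given,
        # so any user-supplied env_value is overridden here.
        env_value = "false" if skip_providers else "true"
    # BREEZE_SOURCES and LOCAL_PIP both depend on the variable's value.
    return "all providers" if env_value == "true" else "core only"
```

For example, `providers_outcome(InstallMethod.LOCAL_PIP)` yields "core only", while the same call with `env_value="true"` yields "all providers".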

@github-actions

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks$,^Build docs$,^Spell check docs$,^Backport packages$,^Checks: Helm tests$,^Test OpenAPI*.

@kaxil kaxil left a comment

Minor suggestions left

@potiuk potiuk force-pushed the add-install-all-airflow-providers-flag branch from 429530f to f4a2aa4 Compare October 14, 2020 21:00

potiuk commented Oct 14, 2020

All resolved @kaxil!

@potiuk potiuk force-pushed the add-install-all-airflow-providers-flag branch from f4a2aa4 to 6730b9c Compare October 14, 2020 21:08
@potiuk potiuk force-pushed the add-install-all-airflow-providers-flag branch from 6730b9c to 4bdb570 Compare October 14, 2020 22:07
In Airflow 2.0 we decided to split Airflow into separate providers.
This means that when you prepare the core airflow package, providers
are not installed by default. This is not very convenient for
local development, though, or for docker images built from sources,
where you would like to install all providers by default.

A new INSTALL_ALL_AIRFLOW_PROVIDERS environment variable controls
this behaviour now. If it is set to "true", all packages including
provider packages are installed. If missing or set to false, only
the core package is installed.

For Breeze, the default is set to "true", as for those cases you
want to install all providers in your environment. Similarly if you
build the production image from sources. However when you build
image using github tag or pip package, you should specify
appropriate extras to install the required provider packages.

Note that if you install Airflow via 'pip install .' from sources
in local virtualenv, provider packages are not going to be
installed unless you set INSTALL_ALL_AIRFLOW_PROVIDERS to "true".

Fixes apache#11489
@potiuk potiuk force-pushed the add-install-all-airflow-providers-flag branch from 4bdb570 to a46fac6 Compare October 16, 2020 20:01
@potiuk potiuk merged commit 925f761 into apache:master Oct 17, 2020
@potiuk potiuk deleted the add-install-all-airflow-providers-flag branch October 17, 2020 09:16
@potiuk potiuk linked an issue Oct 18, 2020 that may be closed by this pull request
potiuk added a commit to PolideaInternal/airflow that referenced this pull request Oct 22, 2020
There was a typo in the original file when the review was made in
apache#11529, but apparently this typo was still left in one place
and, as a result, providers were not installed in the
master Dockerfile.

Fixes apache#11695
potiuk added a commit that referenced this pull request Oct 22, 2020
michalmisiewicz pushed a commit to michalmisiewicz/airflow that referenced this pull request Oct 30, 2020
potiuk added a commit that referenced this pull request Nov 14, 2020
(cherry picked from commit eba1d91)
potiuk added a commit that referenced this pull request Nov 16, 2020
(cherry picked from commit eba1d91)
potiuk added a commit that referenced this pull request Nov 16, 2020
(cherry picked from commit eba1d91)
kaxil pushed a commit that referenced this pull request Nov 18, 2020
(cherry picked from commit eba1d91)
cfei18 pushed a commit to cfei18/incubator-airflow that referenced this pull request Mar 5, 2021
(cherry picked from commit eba1d91)
Development

Successfully merging this pull request may close these issues.

Make sure Docker works with installing core + providers. Make sure that local installation works with installing airflow + all providers.
