Skip to content

Conversation

@mjpieters
Copy link
Contributor

Later versions either add small features not used in Airflow, packaging updates and bug fixes or perf improvements. Moreover, newer Python releases already include the functionality needed.

It's generally a better idea to be permissive in dependency versions to prevent
accidentally limiting what other packages can be used with Airflow.


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.

Copy link
Member

@ashb ashb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a use of importlib_resources that should be correct too.

setup.cfg Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we not need to up the min requirements for argcomplete too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. argcomplete~=1.10 is the same thing as argcomplete>=1.10,<2, and so 1.22 is included in that range.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put differently: that'd be a separate task if someone wanted to update this explicitly. But unless you depend on the bugfixes in later releases, I'd leave this be. No need to make the dependency link tighter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Historical we've updated the intermediary rep, otherwise I thought:

Pip would leave argparse at 1.10, which has a dep of importlib_metadata, but then this new dep would install a newer importlib_metadata, leading to incompatible version errors.

Particularly for "older" pips. Am I misremembering?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you use pip install -U Apache-airflow to upgrade airflow, a newer argcomplete would be installed, automatically.

In my experience it takes installing additional packages with incompatible dependencies for things to break; newer pip versions warn when this happens however. Running pip check lets you list the problem packages.

Copy link
Member

@potiuk potiuk Nov 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed: @mjpieters @ashb

importlib_resources~=1.4 is a requirement of cnf-lint which is requirement of moto 1.3.16 which we have. So anything above 1 will cause version conflict. mot 1.3.16 is the latest available version, so I do not think we can do anything about it.

moto==1.3.16
  - aws-xray-sdk [required: >=0.93,!=0.96, installed: 2.6.0]
    - botocore [required: >=1.11.3, installed: 1.18.18]
      - jmespath [required: >=0.7.1,<1.0.0, installed: 0.10.0]
      - python-dateutil [required: >=2.1,<3.0.0, installed: 2.8.1]
        - six [required: >=1.5, installed: 1.15.0]
      - urllib3 [required: >=1.20,<1.26, installed: 1.25.11]
    - future [required: Any, installed: 0.18.2]
    - jsonpickle [required: Any, installed: 1.4.1]
      - importlib-metadata [required: Any, installed: 1.7.0]
        - zipp [required: >=0.5, installed: 3.4.0]
    - wrapt [required: Any, installed: 1.12.1]
  - boto [required: >=2.36.0, installed: 2.49.0]
  - boto3 [required: >=1.9.201, installed: 1.15.18]
    - botocore [required: >=1.18.18,<1.19.0, installed: 1.18.18]
      - jmespath [required: >=0.7.1,<1.0.0, installed: 0.10.0]
      - python-dateutil [required: >=2.1,<3.0.0, installed: 2.8.1]
        - six [required: >=1.5, installed: 1.15.0]
      - urllib3 [required: >=1.20,<1.26, installed: 1.25.11]
    - jmespath [required: >=0.7.1,<1.0.0, installed: 0.10.0]
    - s3transfer [required: >=0.3.0,<0.4.0, installed: 0.3.3]
      - botocore [required: >=1.12.36,<2.0a.0, installed: 1.18.18]
        - jmespath [required: >=0.7.1,<1.0.0, installed: 0.10.0]
        - python-dateutil [required: >=2.1,<3.0.0, installed: 2.8.1]
          - six [required: >=1.5, installed: 1.15.0]
        - urllib3 [required: >=1.20,<1.26, installed: 1.25.11]
  - botocore [required: >=1.12.201, installed: 1.18.18]
    - jmespath [required: >=0.7.1,<1.0.0, installed: 0.10.0]
    - python-dateutil [required: >=2.1,<3.0.0, installed: 2.8.1]
      - six [required: >=1.5, installed: 1.15.0]
    - urllib3 [required: >=1.20,<1.26, installed: 1.25.11]
  - cfn-lint [required: >=0.4.0, installed: 0.42.0]
    - aws-sam-translator [required: >=1.25.0, installed: 1.31.0]
      - boto3 [required: ~=1.5, installed: 1.15.18]
        - botocore [required: >=1.18.18,<1.19.0, installed: 1.18.18]
          - jmespath [required: >=0.7.1,<1.0.0, installed: 0.10.0]
          - python-dateutil [required: >=2.1,<3.0.0, installed: 2.8.1]
            - six [required: >=1.5, installed: 1.15.0]
          - urllib3 [required: >=1.20,<1.26, installed: 1.25.11]
        - jmespath [required: >=0.7.1,<1.0.0, installed: 0.10.0]
        - s3transfer [required: >=0.3.0,<0.4.0, installed: 0.3.3]
          - botocore [required: >=1.12.36,<2.0a.0, installed: 1.18.18]
            - jmespath [required: >=0.7.1,<1.0.0, installed: 0.10.0]
            - python-dateutil [required: >=2.1,<3.0.0, installed: 2.8.1]
              - six [required: >=1.5, installed: 1.15.0]
            - urllib3 [required: >=1.20,<1.26, installed: 1.25.11]
      - jsonschema [required: ~=3.2, installed: 3.2.0]
        - attrs [required: >=17.4.0, installed: 20.3.0]
        - importlib-metadata [required: Any, installed: 1.7.0]
          - zipp [required: >=0.5, installed: 3.4.0]
        - pyrsistent [required: >=0.14.0, installed: 0.17.3]
        - setuptools [required: Any, installed: 50.3.2]
        - six [required: >=1.11.0, installed: 1.15.0]
      - six [required: ~=1.15, installed: 1.15.0]
    - importlib-resources [required: ~=1.4, installed: 1.5.0]

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yet again pip's lack of a fully featured dependency solver bites us

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to test it out, I've added the labels "upgrade to latest dependencies" and "full tests needed". Can you please @mjpieters amend and push --force-with-lease to test my theory?

I rebased and pushed before I noticed that a bot had removed the tags again.

I've pushed a change now that ensures we only allow importlib-resources 1.x (starting with 1.4)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah - full tests needed only works after the approval status is "Green" (whch is actually good - we can see if it wiorks for 3.6 first and once it does, we approve and the label will be added and we rerun it again with full matrix of tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's see if it works. I will approve it tentatively, and please rebase @mjpieters - let's see if it will work.

@mjpieters
Copy link
Contributor Author

mjpieters commented Nov 29, 2020

I think there's a use of importlib_resources that should be correct too.

The single import I found is already guarded. Or were you thinking of something else?

setup.cfg Outdated
Copy link
Member

@potiuk potiuk Nov 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem might be that in our process (in master merges) we automatically do 'pip install .[all] --upgrade --upgrade-strategy eager` to automatically find set of latest constraints that are matching setup.py + setup.cfg limitations, Then we run pip check and then we run all tests and if it all passes we automatically commit the new constraints.

As far as I remember without ~1.* for importlib_resources in install_requires caused a problem as pip first tries to install the highest version matching in install_requires only - without looking at extras. Once it installs this version, it will only then try to do eager update on all other extras and one of our dependencies already has importib_resources ~= 1.4. Eager upgrade will never downgrade the packages installed by install_requires, so this results in version conflict.

@potiuk potiuk added upgrade to latest dependencies full tests needed We need to run full set of tests for this PR to merge labels Nov 29, 2020
@github-actions github-actions bot removed the full tests needed We need to run full set of tests for this PR to merge label Nov 29, 2020
@ashb
Copy link
Member

ashb commented Nov 29, 2020

I think there's a use of importlib_resources that should be correct too.

The single import I found is already guarded. Or were you thinking of something else?

Oh no that was it. Thought it wasn't guarded for some reason

@potiuk potiuk added the full tests needed We need to run full set of tests for this PR to merge label Nov 29, 2020
@github-actions github-actions bot removed the full tests needed We need to run full set of tests for this PR to merge label Nov 29, 2020
@ashb
Copy link
Member

ashb commented Nov 29, 2020

Opened aws-cloudformation/cfn-lint#1808 -- not sure how quickly that will get merged and released.

@potiuk
Copy link
Member

potiuk commented Nov 30, 2020

No broken requirements found. for 3.6 :)

Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved to get the label working. But let's not merge it until IP check succeeds for all python versions.

@potiuk potiuk added the full tests needed We need to run full set of tests for this PR to merge label Nov 30, 2020
@potiuk
Copy link
Member

potiuk commented Nov 30, 2020

Can you please rebase @jpieters ?

@github-actions
Copy link

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.

@mjpieters
Copy link
Contributor Author

@potiuk I don't know if the Google provider package import failure is related?

Traceback (most recent call last):
  File "/opt/airflow/dev/import_all_classes.py", line 71, in import_all_classes
    _module = importlib.import_module(modinfo.name)
  File "/usr/local/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/opt/airflow/airflow/providers/google/cloud/sensors/pubsub.py", line 25, in <module>
    from airflow.providers.google.cloud.hooks.pubsub import PubSubHook
  File "/opt/airflow/airflow/providers/google/cloud/hooks/pubsub.py", line 29, in <module>
    from google.cloud.pubsub_v1.types import (
ImportError: cannot import name 'RetryPolicy'

Would that be caused by an ancient version having been pulled in?

@mjpieters
Copy link
Contributor Author

Nope, it's that Google updated that package and moved the class, see googleapis/python-pubsub@6b9eec8

@potiuk
Copy link
Member

potiuk commented Nov 30, 2020

Nope, it's that Google updated that package and moved the class, see googleapis/python-pubsub@6b9eec8

I think we had <2.0.0 limit on all Google Python libraries. They released all libs in their 2.0 version - those are not backwards-compatibe and we have (or I thought so) limited those libs to <2.0 for all google lib (until we rewrite the operators)

@potiuk
Copy link
Member

potiuk commented Nov 30, 2020

Nope, it's that Google updated that package and moved the class, see googleapis/python-pubsub@6b9eec8

I think we had <2.0.0 limit on all Google Python libraries. They released all libs in their 2.0 version - those are not backwards-compatibe and we have (or I thought so) limited those libs to <2.0 for all google lib (until we rewrite the operators):

'google-cloud-pubsub>=1.0.0,<2.0.0',

@potiuk
Copy link
Member

potiuk commented Nov 30, 2020

I just downloaded the image built for your PR and inspecting it:

The good thing is that we are storing all the images from cI in the registry and we can always reproduce things locally:

./breeze --github-image-id 392290560 --python 3.6 in this case drops you inside the very image that was built in your PR

@potiuk
Copy link
Member

potiuk commented Nov 30, 2020

pip freeze |grep pubsub
google-cloud-pubsub==1.7.0

@potiuk
Copy link
Member

potiuk commented Nov 30, 2020

That works:

from google.cloud.pubsub_v1.types import RetryPolicy

@potiuk
Copy link
Member

potiuk commented Nov 30, 2020

Same here:

from airflow.providers.google.cloud.hooks.pubsub import PubSubHook

@potiuk
Copy link
Member

potiuk commented Nov 30, 2020

Interesting thing that in the same image all those work too:

import importlib
importlib.import_module('airflow.providers.google.cloud.example_dags.example_pubsub')
<module 'airflow.providers.google.cloud.example_dags.example_pubsub' from '/opt/airflow/airflow/providers/google/cloud/example_dags/example_pubsub.py'>
exit
Use exit() or Ctrl-D (i.e. EOF) to exit
exit()
root@02a0c7718f03:/opt/airflow# python
Python 3.6.12 (default, Oct 13 2020, 21:45:01)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
import importlib
importlib.import_module('airflow.providers.google.cloud.example_dags.example_pubsub')
<module 'airflow.providers.google.cloud.example_dags.example_pubsub' from '/opt/airflow/airflow/providers/google/cloud/example_dags/example_pubsub.py'>
from airflow.providers.google.cloud.operators.pubsub import (
... PubSubCreateSubscriptionOperator,
... PubSubCreateTopicOperator,
... PubSubDeleteSubscriptionOperator,
... PubSubDeleteTopicOperator,
... PubSubPublishMessageOperator,
... PubSubPullOperator,
... )

@potiuk
Copy link
Member

potiuk commented Nov 30, 2020

OK. I reproduced it : docker run -it --entrypoint /usr/local/bin/dumb-init docker.pkg.github.com/apache/airflow/master-python3.6-ci:392290560 -- /opt/airflow/scripts/in_container/run_prepare_provider_readme.sh

@potiuk
Copy link
Member

potiuk commented Nov 30, 2020

The problem is this line: "Installing remaining packages from 'all' extras"

We are running all remaining packages from "devel_all" extras: pip install -e ".[devel_all]". which installs "apache-beam" -but apparently this causes a host of downgrades:

  Attempting uninstall: cachetools
    Found existing installation: cachetools 4.1.1
    Uninstalling cachetools-4.1.1:
      Successfully uninstalled cachetools-4.1.1
  Attempting uninstall: httplib2
    Found existing installation: httplib2 0.18.1
    Uninstalling httplib2-0.18.1:
      Successfully uninstalled httplib2-0.18.1
  Attempting uninstall: google-resumable-media
    Found existing installation: google-resumable-media 1.1.0
    Uninstalling google-resumable-media-1.1.0:
      Successfully uninstalled google-resumable-media-1.1.0
  Attempting uninstall: pyarrow
    Found existing installation: pyarrow 2.0.0
    Uninstalling pyarrow-2.0.0:
      Successfully uninstalled pyarrow-2.0.0
  Attempting uninstall: mock
    Found existing installation: mock 4.0.2
    Uninstalling mock-4.0.2:
      Successfully uninstalled mock-4.0.2
  Attempting uninstall: google-cloud-bigquery-storage
    Found existing installation: google-cloud-bigquery-storage 2.1.0
    Uninstalling google-cloud-bigquery-storage-2.1.0:
      Successfully uninstalled google-cloud-bigquery-storage-2.1.0
  Attempting uninstall: google-cloud-bigquery
    Found existing installation: google-cloud-bigquery 2.4.0
    Uninstalling google-cloud-bigquery-2.4.0:
      Successfully uninstalled google-cloud-bigquery-2.4.0
  Attempting uninstall: fastavro
    Found existing installation: fastavro 1.2.0
    Uninstalling fastavro-1.2.0:
      Successfully uninstalled fastavro-1.2.0
  Attempting uninstall: dill
    Found existing installation: dill 0.3.3
    Uninstalling dill-0.3.3:
      Successfully uninstalled dill-0.3.3
  Attempting uninstall: google-cloud-vision
    Found existing installation: google-cloud-vision 1.0.0
    Uninstalling google-cloud-vision-1.0.0:
      Successfully uninstalled google-cloud-vision-1.0.0
  Attempting uninstall: google-cloud-videointelligence
    Found existing installation: google-cloud-videointelligence 1.16.1
    Uninstalling google-cloud-videointelligence-1.16.1:
      Successfully uninstalled google-cloud-videointelligence-1.16.1
  Attempting uninstall: google-cloud-spanner
    Found existing installation: google-cloud-spanner 1.19.1
    Uninstalling google-cloud-spanner-1.19.1:
      Successfully uninstalled google-cloud-spanner-1.19.1
  Attempting uninstall: google-cloud-pubsub
    Found existing installation: google-cloud-pubsub 1.7.0
    Uninstalling google-cloud-pubsub-1.7.0:
      Successfully uninstalled google-cloud-pubsub-1.7.0
  Attempting uninstall: google-cloud-dlp
    Found existing installation: google-cloud-dlp 1.0.0
    Uninstalling google-cloud-dlp-1.0.0:
      Successfully uninstalled google-cloud-dlp-1.0.0
  Attempting uninstall: google-cloud-bigtable
    Found existing installation: google-cloud-bigtable 1.6.0
    Uninstalling google-cloud-bigtable-1.6.0:
      Successfully uninstalled google-cloud-bigtable-1.6.0
  Attempting uninstall: google-cloud-storage
    Found existing installation: google-cloud-storage 1.33.0
    Uninstalling google-cloud-storage-1.33.0:
      Successfully uninstalled google-cloud-storage-1.33.0
  Attempting uninstall: apache-airflow
    Found existing installation: apache-airflow 2.0.0b3
    Uninstalling apache-airflow-2.0.0b3:
      Successfully uninstalled apache-airflow-2.0.0b3
  Running setup.py develop for apache-airflow
Successfully installed apache-airflow apache-beam-2.23.0 avro-python3-1.9.2.1 cachetools-3.1.1 crcmod-1.7 dill-0.3.1.1 
fastavro-0.23.6 fasteners-0.15 google-apitools-0.5.31 google-cloud-bigquery-1.24.0 google-cloud-bigquery-storage-1.1.0 
google-cloud-bigtable-1.0.0 google-cloud-datastore-1.7.4 google-cloud-dlp-0.13.0 google-cloud-pubsub-1.0.2 
google-cloud-spanner-1.13.0 google-cloud-storage-1.29.0 google-cloud-videointelligence-1.13.0
google-cloud-vision-0.42.0 google-resumable-media-0.5.1 httplib2-0.17.4 mock-2.0.0 monotonic-1.5 
oauth2client-3.0.0 pyarrow-0.17.1 pydot-1.4.1

Including google-cloud-pubsub-1.0.2

Apache-Beam is the only excluded package for CI image as it used to be known by really bad requirements, forcing many packages to downgrade.

# Those packages are excluded because they break tests (downgrading mock) and they are
# not needed to run our test suite.
PACKAGES_EXCLUDED_FOR_CI = [
    'apache-beam',
]

We are however installing it for provider package check to be able to import DataProc.

I was actually going to take a look at apache-beam for 2.0 now. I discussed it today with @mik-laj that the recent 'apache-beam' has better requirements and we should be able to install it without breaking too much - obviously not that version.

@potiuk
Copy link
Member

potiuk commented Nov 30, 2020

I will take a look at not excluding apache-beam tomorrow.

@potiuk
Copy link
Member

potiuk commented Nov 30, 2020

Also what might help is this PR which I have to finish: #12693

We are not using constraints in this install, which is the reason why so many packages got downgraded.

I will see to it tomorrow. Stay tuned.

@potiuk
Copy link
Member

potiuk commented Nov 30, 2020

BTW. It is super easy to reproduce it all thanks to the image storage in registry. This is exactly why I implemented this: #12718 - it will tell you exactly what command you should run to be dropped to the same image that is used in CI and you can run exactly what was run in CI and get perfect reproduction capabilities (I did it now and once we merge this PR, you will get those instructions how to get there:

See this: https://github.com/apache/airflow/pull/12718/checks?check_run_id=1475794405#step:4:564.

potiuk added a commit to PolideaInternal/airflow that referenced this pull request Nov 30, 2020
Previously the output of instaling remaining packges when testing
provider imports was only shown on error. However it is useful
to know what's going on even if it clutters the log.

Note that this installation is only needed until we include
apache-beam in the installed packages on CI.

Related to apache#12703

This PR shows the output always .
@potiuk
Copy link
Member

potiuk commented Nov 30, 2020

Added #12703 to show output of this "devel_all" installation in logs also on success (previously it was only shown on failure)

potiuk added a commit that referenced this pull request Dec 1, 2020
)

Previously the output of instaling remaining packges when testing
provider imports was only shown on error. However it is useful
to know what's going on even if it clutters the log.

Note that this installation is only needed until we include
apache-beam in the installed packages on CI.

Related to #12703

This PR shows the output always .
@ashb
Copy link
Member

ashb commented Dec 1, 2020

The change to cfn-lint has merged, from previous history it looks like they release every week or two.

Later versions either add small features not used in Airflow, packaging updates and bug fixes or perf improvements. Moreover, newer Python releases already include the functionality needed.
No point in keeping this here, as it is already part of the main dependencies.
There are provider dependencies that can't use anything newer, and pip is not sophisticated enough to downgrade packages based on new information.
@github-actions
Copy link

github-actions bot commented Dec 2, 2020

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.

@github-actions
Copy link

github-actions bot commented Dec 2, 2020

The Workflow run is cancelling this PR. Building image for the PR has been cancelled

@github-actions
Copy link

github-actions bot commented Dec 2, 2020

The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*.

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Feb 28, 2021
@github-actions github-actions bot closed this Mar 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:plugins full tests needed We need to run full set of tests for this PR to merge stale Stale PRs per the .github/workflows/stale.yml policy file

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants