-
Notifications
You must be signed in to change notification settings - Fork 16.3k
Replace pkg_resources with importlib.metadata to avoid VersionConflict errors #12694
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace pkg_resources with importlib.metadata to avoid VersionConflict errors #12694
Conversation
|
Please test this @potiuk -- I wasn't able to get my providers erroring even without this change, but was with plugins. Both have been changed. |
|
Will do timorrow. Cool if we can have it working. |
|
The Workflow run is cancelling this PR. It has some failed jobs matching ^Pylint$,^Static checks,^Build docs$,^Spell check docs$,^Backport packages$,^Provider packages,^Checks: Helm tests$,^Test OpenAPI*. |
…t errors Using `pkg_resources.iter_entry_points` validates the version constraints, and if any fail it will throw an Exception for that entrypoint. This sounds nice, but is a huge mis-feature. So instead of that, switch to using importlib.metadata (well, it's backport importlib_metadata) that just gives us the entrypoints - no other verification of requirements is performed. This has two advantages: 1. providers and plugins load much more reliably. 2. it's faster too Closes apache#12692
44709ec to
c136067
Compare
| self.assertTrue('test_plugin' in self.app.blueprints) | ||
| self.assertEqual(self.app.blueprints['test_plugin'].name, bp.name) | ||
|
|
||
| @mock.patch('airflow.plugins_manager.pkg_resources.iter_entry_points') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was in the "TestPluginsRBAC" class -- I moved to the "right" place.
| class TestPluginsDirectorySource(unittest.TestCase): | ||
| def test_should_provide_correct_attribute_values(self): | ||
| source = PluginsDirectorySource("./test_views.py") | ||
| self.assertEqual("$PLUGINS_FOLDER/../../test_views.py", str(source)) | ||
| self.assertEqual("<em>$PLUGINS_FOLDER/</em>../../test_views.py", source.__html__()) | ||
| self.assertEqual("../../test_views.py", source.path) | ||
|
|
||
|
|
||
| class TestEntryPointSource(unittest.TestCase): | ||
| def test_should_provide_correct_attribute_values(self): | ||
| mock_entrypoint = mock.Mock() | ||
| mock_entrypoint.dist = 'test-entrypoint-dist==1.0.0' | ||
| source = EntryPointSource(mock_entrypoint) | ||
| self.assertEqual("test-entrypoint-dist==1.0.0", source.dist) | ||
| self.assertEqual(str(mock_entrypoint), source.entrypoint) | ||
| self.assertEqual("test-entrypoint-dist==1.0.0: " + str(mock_entrypoint), str(source)) | ||
| self.assertEqual("<em>test-entrypoint-dist==1.0.0:</em> " + str(mock_entrypoint), source.__html__()) | ||
|
|
||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These were duplicating tests in test_plugins_manager.py -- they don't belong here.
|
How fast is it now? |
mjpieters
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love this change, this will help with startup times! Because we have to load plugins early in airflow celery worker we are paying a plugin scan price everywhere else. :-/
| funcsigs>=1.0.0, <2.0.0 | ||
| graphviz>=0.12 | ||
| gunicorn>=19.5.0, <20.0 | ||
| importlib_metadata~=1.7 # We could work with 3.1, but argparse needs <2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you meant argcomplete perhaps? I note that argcomplete has been updated to support importlib_metadata 3.x (and otherwise only has bugfixes, looks like a desireable upgrade).
And if 3.1 works, please widen the pin to >=1.7,<4; the argcomplete~=1.10 pin above already lets you use importlib_metadata 3.1 if you have the newer argcomplete installed.
Also, Python 3.8 doesn't need importlib_metadata anymore, and argcomplete correctly won't depend on it in 3.8 and up. Please do so here too:
importlib_metadata>=1.7,<4;python_version<="3.7"
and make the imports conditional:
try:
# 3.8 and up
from importlib import metadata as importlib_metadata
except ImportError:
# use the backport
import importlib_metadataIf you update the argcomplete dependency too, then you'd not even have to set the >=1.7 lower bound, ~=3.1 would do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whops, yes I did mean argcomplete. Though I also found virtualenv (dep of pylint) was also pinning this, so it may not help us yet.
This line in the readme https://github.com/python/importlib_metadata/ :
As of Python 3.8, this functionality has been added to the Python standard library. This package supplies backports of that functionality including improvements added to subsequent Python versions
Makes me wonder if/when we should continue to use it anyway on Py 3.8?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whops, yes I did mean argcomplete. Though I also found virtualenv (dep of pylint) was also pinning this, so it may not help us yet.
Ah though virtualenv has been updated too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes me wonder if/when we should continue to use it anyway on Py 3.8?
Your not using the 3.9 / importlib_metadata 1.6 feature: Added module and attr attributes to EntryPoint; everything else is bug fixing or perf improvements, which all have been backported to 3.8.
Even if you were using those extra attributes, using python_version<="3.8" (and inverting the import guard) would be a good idea to keep the dependencies flexible and avoid limiting what packages people can use with Airflow.
I don't see where virtualenv depends on importlib_metadata? Nor does pylint depend on virtualenv. Might be in older versions. pre-commit depends on virtualenv though. And jsonlint depends on importlib_metadata but sets no pin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
God in getting my dist names all in a mix today. I did mean pre-commit -> virtualenv. I'll have to check what version of virtualenv pre-commit specifies.
I'll open a new PR to add the guard as you suggest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've already opened a PR :-) See #12703
…t errors (#12694) Using `pkg_resources.iter_entry_points` validates the version constraints, and if any fail it will throw an Exception for that entrypoint. This sounds nice, but is a huge mis-feature. So instead of that, switch to using importlib.metadata (well, it's backport importlib_metadata) that just gives us the entrypoints - no other verification of requirements is performed. This has two advantages: 1. providers and plugins load much more reliably. 2. it's faster too Closes #12692 (cherry picked from commit 7ef9aa7)
…t errors (#12694) Using `pkg_resources.iter_entry_points` validates the version constraints, and if any fail it will throw an Exception for that entrypoint. This sounds nice, but is a huge mis-feature. So instead of that, switch to using importlib.metadata (well, it's backport importlib_metadata) that just gives us the entrypoints - no other verification of requirements is performed. This has two advantages: 1. providers and plugins load much more reliably. 2. it's faster too Closes #12692 (cherry picked from commit 7ef9aa7)
…t errors (#12694) Using `pkg_resources.iter_entry_points` validates the version constraints, and if any fail it will throw an Exception for that entrypoint. This sounds nice, but is a huge mis-feature. So instead of that, switch to using importlib.metadata (well, it's backport importlib_metadata) that just gives us the entrypoints - no other verification of requirements is performed. This has two advantages: 1. providers and plugins load much more reliably. 2. it's faster too Closes #12692 (cherry picked from commit 7ef9aa7)
- BugFix: Tasks with ``depends_on_past`` or ``task_concurrency`` are stuck (apache#12663) - Fix issue with empty Resources in executor_config (apache#12633) - Fix: Deprecated config ``force_log_out_after`` was not used (apache#12661) - Fix empty asctime field in JSON formatted logs (apache#10515) - [AIRFLOW-2809] Fix security issue regarding Flask SECRET_KEY (apache#3651) - [AIRFLOW-2884] Fix Flask SECRET_KEY security issue in www_rbac (apache#3729) - [AIRFLOW-2886] Generate random Flask SECRET_KEY in default config (apache#3738) - Add missing comma in setup.py (apache#12790) - Bugfix: Unable to import Airflow plugins on Python 3.8 (apache#12859) - Fix setup.py missing comma in ``setup_requires`` (apache#12880) - Don't emit first_task_scheduling_delay metric for only-once dags (apache#12835) - Update setup.py to get non-conflicting set of dependencies (apache#12636) - Rename ``[scheduler] max_threads`` to ``[scheduler] parsing_processes`` (apache#12605) - Add metric for scheduling delay between first run task & expected start time (apache#9544) - Add new-style 2.0 command names for Airflow 1.10.x (apache#12725) - Add Kubernetes cleanup-pods CLI command for Helm Chart (apache#11802) - Don't let webserver run with dangerous config (apache#12747) - Replace pkg_resources with importlib.metadata to avoid VersionConflict errors (apache#12694) - Clarified information about supported Databases
…t errors (apache#12694) Using `pkg_resources.iter_entry_points` validates the version constraints, and if any fail it will throw an Exception for that entrypoint. This sounds nice, but is a huge mis-feature. So instead of that, switch to using importlib.metadata (well, it's backport importlib_metadata) that just gives us the entrypoints - no other verification of requirements is performed. This has two advantages: 1. providers and plugins load much more reliably. 2. it's faster too Closes apache#12692 (cherry picked from commit 7ef9aa7)


Using
pkg_resources.iter_entry_pointsvalidates the versionconstraints, and if any fail it will throw an Exception for that
entrypoint.
This sounds nice, but is a huge mis-feature.
So instead of that, switch to using importlib.metadata (well, it's
backport importlib_metadata) that just gives us the entrypoints - no
other verification of requirements is performed.
This has two advantages:
Closes #12692
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.