00353: Original names for architectures with different names downstream by frenzymadness · Pull Request #15 · fedora-python/cpython

frenzymadness · 2020-08-03T10:40:59Z

Pythons in RHEL/Fedora use different names for some architectures
than upstream and other distros (for example ppc64 vs. powerpc64).
See patch 274.
That means that an extension built with the default upstream settings
(on other distro or as an manylinux wheel) cannot be found by Python
on RHEL/Fedora because it has a different suffix.
This patch adds the original names to importlib so Python is able
to import extensions with an original architecture name in its
file name.

Tested on ppc64le with a wheel from simple-manylinux-demo:

# pip3 install simple-manylinux-demo
WARNING: Running pip install with root privileges is generally not a good idea. Try `pip3 install --user` instead.
Collecting simple-manylinux-demo
  Downloading simple_manylinux_demo-1.0-cp39-cp39-manylinux2014_ppc64le.whl (17 kB)
Installing collected packages: simple-manylinux-demo
Successfully installed simple-manylinux-demo-1.0
# default Python
# python3 -m unittest discover dummyextension.tests
E
======================================================================
ERROR: dummyextension.tests.test_extension (unittest.loader._FailedTest)
----------------------------------------------------------------------
ImportError: Failed to import test module: dummyextension.tests.test_extension
Traceback (most recent call last):
  File "/usr/lib64/python3.9/unittest/loader.py", line 436, in _find_test_path
    module = self._get_module_from_name(name)
  File "/usr/lib64/python3.9/unittest/loader.py", line 377, in _get_module_from_name
    __import__(name)
  File "/usr/local/lib64/python3.9/site-packages/dummyextension/tests/test_extension.py", line 2, in <module>
    from dummyextension.extension import hello
ModuleNotFoundError: No module named 'dummyextension.extension'


----------------------------------------------------------------------
Ran 1 test in 0.000s

FAILED (errors=1)
# ll /usr/local/lib64/python3.9/site-packages/dummyextension/
total 88
-rwxr-xr-x. 1 root root 88832 Aug  3 04:14 extension.cpython-39-powerpc64le-linux-gnu.so
-rw-r--r--. 1 root root     0 Aug  3 04:14 __init__.py
drwxr-xr-x. 2 root root    37 Aug  3 04:14 __pycache__
drwxr-xr-x. 3 root root    69 Aug  3 04:14 tests
# Patched version
# python3 -m unittest discover dummyextension.tests
.
----------------------------------------------------------------------
Ran 1 test in 0.000s

OK
# New loader registered for the longer name
>>> import sys
>>> sys.path_importer_cache
{'/usr/lib64/python39.zip': None, '/usr/lib64/python3.9': FileFinder('/usr/lib64/python3.9'), '/usr/lib64/python3.9/encodings': FileFinder('/usr/lib64/python3.9/encodings'), '/usr/lib64/python3.9/lib-dynload': FileFinder('/usr/lib64/python3.9/lib-dynload'), '/usr/local/lib64/python3.9/site-packages': FileFinder('/usr/local/lib64/python3.9/site-packages'), '/usr/lib64/python3.9/site-packages': FileFinder('/usr/lib64/python3.9/site-packages'), '/usr/lib/python3.9/site-packages': FileFinder('/usr/lib/python3.9/site-packages'), '/root': FileFinder('/root')}
>>> ff = sys.path_importer_cache['/usr/lib64/python3.9/site-packages']
>>> ff._loaders
[('.cpython-39-ppc64le-linux-gnu.so', <class '_frozen_importlib_external.ExtensionFileLoader'>), ('.abi3.so', <class '_frozen_importlib_external.ExtensionFileLoader'>), ('.so', <class '_frozen_importlib_external.ExtensionFileLoader'>), ('.cpython-39-powerpc64le-linux-gnu.so', <class '_frozen_importlib_external.ExtensionFileLoader'>), ('.py', <class '_frozen_importlib_external.SourceFileLoader'>), ('.pyc', <class '_frozen_importlib_external.SourcelessFileLoader'>)]

This might be also implemented on C level somewhere around _PyImport_DynLoadFiletab but I think that this version is much clearer and easy to understand.

hroncok · 2020-08-03T11:17:51Z

I wanted to check on armv7hl, but apparently, there is no such wheel in simple-manylinux-demo. Can you please make one, or is there some obstacle?

hroncok · 2020-08-03T11:23:27Z

Oh, there is no manylinux image for armv7hl on https://quay.io/organization/pypa :/

frenzymadness · 2020-08-03T11:24:36Z

Exactly. I'd have to find a machine with that architecture and build the wheel manually there.

frenzymadness · 2020-08-03T11:26:10Z

manylinux2014 itself should support it: https://www.python.org/dev/peps/pep-0599/#the-manylinux2014-policy

frenzymadness · 2020-08-03T11:27:43Z

And to test it, I'd have to compile my own Python or build the wheel outside RHEL/Centos so the suffix is not configured by us.

frenzymadness · 2020-08-03T18:27:40Z

According to pep8, constants should be defined at the module level.

Another speed improvement I can think about is to return from the function immediately when it discovers a platform is a one we don't have an alternative name for. Because it isn't possible to do an import (platform) here, I'd implement it this way:

for suffix in suffixes:
        if "-x86_64-" in suffix or "-i386-" in suffix:
            return suffixes

But I don't think it's worth it because running this function thousand times with randomly shuffled suffixes takes only around 0,052 s (interpreter startup time included).

hroncok · 2020-08-03T18:40:56Z

Don't optimize it (yet). What do you measure? I think we need to measure interpreter startup with and without this patch.

frenzymadness · 2020-08-04T04:49:13Z

Don't optimize it (yet). What do you measure? I think we need to measure interpreter startup with and without this patch.

I've measured how long it takes for interpreter to start and call this function thousand times.

frenzymadness · 2020-08-04T05:54:21Z

Python has a special benchmark for importlib measuring how many times you can do a specific type of import within one second. The results for old (original Python) and new (with this change) are:

Comparing new vs. old

sys.modules : 67,151 vs. 69,773 (96.242099%)
Built-in module : 24,552 vs. 24,898 (98.610330%)
Source writing bytecode: small : 5,641 vs. 5,668 (99.523641%)
Source w/o bytecode: small : 7,692 vs. 7,796 (98.665983%)
Source w/ bytecode: small : 10,807 vs. 10,922 (98.947079%)
Source writing bytecode: tabnanny : 553 vs. 568 (97.359155%)
Source w/o bytecode: tabnanny : 684 vs. 535 (127.850467%)
Source w/ bytecode: tabnanny : 4,396 vs. 3,615 (121.604426%)
Source writing bytecode: decimal : 2,945 vs. 1,929 (152.669777%)
Source w/o bytecode: decimal : 3,799 vs. 3,014 (126.045123%)
Source w/ bytecode: decimal : 5,958 vs. 5,181 (114.997105%)

The numbers show how many times you can do an import within one second. I honestly think that the numbers are so high that the difference is hardly measurable. To be precise here would require a special isolated environment.

I've also tried python_startup benchmark from pyperformance project and the result is:

### python_startup ###
Mean +- std dev: 10.1 ms +- 0.9 ms -> 9.9 ms +- 0.4 ms: 1.02x faster
Significant (t=3.54)

Again, the startup time is so short that it's hard to measure the difference. Also, I've measured those numbers on x86_64 on which the function doesn't end sooner.

hroncok · 2020-08-04T08:55:57Z

The code looks reasonable. For easier backporting (and as a premature optimization), I'd consider having the strings in the dict already in the -{original}. form. I.e.:

_ARCH_MAP = {
    "-arm-linux-gnueabi.": "-arm-linux-gnueabihf.",

For the impact on speed, I'd very much like to hear @vstinner's opinion.

encukou · 2020-08-04T09:04:07Z

Will I do?
AFAICS, this runs once on startup. The impact on speed will be negligible.

frenzymadness · 2020-08-04T09:16:02Z

If you like the code now, I can do one more test on a machine with an affected architecture, and then we can merge this.

hroncok · 2020-08-04T09:22:49Z

Will I do?

I am afraid I don't understand this question.

hroncok · 2020-08-04T09:25:25Z

If you like the code now, I can do one more test on a machine with an affected architecture, and then we can merge this.

Plese rebase https://src.fedoraproject.org/rpms/python3.9/pull-request/24 to include this patch, so we get a build. You can use importpatches with --head fedora-3.9 (assuming your fedora-3.9 branch is up to date with this PR and not with the @fedora-python).

hroncok · 2020-08-04T10:50:49Z

Fox a while, the patch included the regenerated header file. Now it doesn't. Note that we need to be able to build Python --with bootstrap as well, so while ugly, this needs to stay.

frenzymadness · 2020-08-04T11:14:55Z

Fox a while, the patch included the regenerated header file. Now it doesn't. Note that we need to be able to build Python --with bootstrap as well, so while ugly, this needs to stay.

Could you please explain why do you need this feature to be enabled in a python interpreter between a bootstrap build and the first regular one? I thought that we can make this patch nicer and don't include the generated header file and let %make_build regenerate it.

hroncok · 2020-08-04T11:43:27Z

What do you mean by "between a bootstrap build and the first regular one"? The package needs to be able to build with bootstrap at any time. It is needed on several occasions (e.g. when the ABI compatibility is broken, or when we import the package to a RHEL module or SCL).

Pythons in RHEL/Fedora use different names for some architectures than upstream and other distros (for example ppc64 vs. powerpc64). See patch 274. That means that an extension built with the default upstream settings (on other distro or as an manylinux wheel) cannot be found by Python on RHEL/Fedora because it has a different suffix. This patch adds the original names to importlib so Python is able to import extensions with an original architecture name in its file name. WARNING: This patch has no effect on Python built with bootstrap enabled because Python/importlib_external.h is not regenerated and therefore Python during bootstrap contains importlib from upstream without this feature. It's possible to include Python/importlib_external.h to this patch but it'd make rebasing a nightmare because it's basically a binary file.

frenzymadness · 2020-08-04T12:05:40Z

Added a warning about zero effect of this patch during a bootstrap build.

frenzymadness · 2020-08-06T05:55:35Z

I am going to merge this and add fedora-3.9.0b5-5 tag.

frenzymadness · 2020-08-06T06:17:28Z

The tag is there. I see that there are no merge commits, should I remove the last one and force-push the fedora-3.9 branch? Do we want to disable merge commits in pull requests?

frenzymadness · 2020-08-06T07:08:53Z

The patch applies cleanly down to 3.6 and has a simple conflict on 3.5. May I tag and push all the branches or open pull requests for each one?

hroncok · 2020-08-06T08:45:51Z

The tag is there. I see that there are no merge commits, should I remove the last one and force-push the fedora-3.9 branch?

Yes and retag. I have no idea what does the automation do with such commits, it has never been tested and even if it accidentally works, it will break format-patch and other workflows. Please, merge without a merge commit next time.

Do we want to disable merge commits in pull requests?

If you think it will help to prevent this, make it so.

The patch applies cleanly down to 3.6 and has a simple conflict on 3.5. May I tag and push all the branches or open pull requests for each one?

I'd just put the commits in. The purpose of the PR here is to do patch review, not to waste time.

hroncok · 2020-08-06T08:46:26Z

And thanks for looking into the workflow even if it is more complicated for now ❤️

frenzymadness · 2020-08-06T10:15:48Z

The tag is there. I see that there are no merge commits, should I remove the last one and force-push the fedora-3.9 branch?

Yes and retag. I have no idea what does the automation do with such commits, it has never been tested and even if it accidentally works, it will break format-patch and other workflows. Please, merge without a merge commit next time.

Done, force-pushed with the appropriate tag

Do we want to disable merge commits in pull requests?

If you think it will help to prevent this, make it so.

Somebody did it already.

The patch applies cleanly down to 3.6 and has a simple conflict on 3.5. May I tag and push all the branches or open pull requests for each one?

I'd just put the commits in. The purpose of the PR here is to do patch review, not to waste time.

All branches down to 3.5 updated with appropriate tags.

encukou reviewed Aug 3, 2020

View reviewed changes

Comment thread Lib/importlib/_bootstrap_external.py Outdated

hroncok reviewed Aug 3, 2020

View reviewed changes

Comment thread Lib/importlib/_bootstrap_external.py Outdated

Comment thread Lib/importlib/_bootstrap_external.py Outdated

hroncok reviewed Aug 3, 2020

View reviewed changes

Comment thread Lib/importlib/_bootstrap_external.py Outdated

Comment thread Lib/importlib/_bootstrap_external.py Outdated

frenzymadness force-pushed the fedora-3.9 branch 2 times, most recently from 53f5ad5 to 9e83f1f Compare August 3, 2020 18:10

frenzymadness force-pushed the fedora-3.9 branch from 9e83f1f to 5823489 Compare August 4, 2020 09:14

frenzymadness force-pushed the fedora-3.9 branch from 5823489 to 03e858c Compare August 4, 2020 10:05

frenzymadness force-pushed the fedora-3.9 branch from 03e858c to b27679a Compare August 4, 2020 12:02

frenzymadness merged this pull request into fedora-python:fedora-3.9 Aug 6, 2020

frenzymadness deleted the fedora-3.9 branch August 6, 2020 05:56

Conversation

frenzymadness commented Aug 3, 2020

Uh oh!

hroncok commented Aug 3, 2020

Uh oh!

hroncok commented Aug 3, 2020

Uh oh!

frenzymadness commented Aug 3, 2020

Uh oh!

frenzymadness commented Aug 3, 2020

Uh oh!

frenzymadness commented Aug 3, 2020

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

frenzymadness commented Aug 3, 2020

Uh oh!

hroncok commented Aug 3, 2020

Uh oh!

frenzymadness commented Aug 4, 2020

Uh oh!

frenzymadness commented Aug 4, 2020

Uh oh!

hroncok commented Aug 4, 2020

Uh oh!

encukou commented Aug 4, 2020

Uh oh!

frenzymadness commented Aug 4, 2020

Uh oh!

hroncok commented Aug 4, 2020

Uh oh!

hroncok commented Aug 4, 2020

Uh oh!

hroncok commented Aug 4, 2020

Uh oh!

frenzymadness commented Aug 4, 2020

Uh oh!

hroncok commented Aug 4, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

frenzymadness commented Aug 4, 2020

Uh oh!

frenzymadness commented Aug 6, 2020

Uh oh!

frenzymadness commented Aug 6, 2020

Uh oh!

frenzymadness commented Aug 6, 2020

Uh oh!

hroncok commented Aug 6, 2020

Uh oh!

hroncok commented Aug 6, 2020

Uh oh!

frenzymadness commented Aug 6, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

hroncok commented Aug 4, 2020 •

edited

Loading