Skip to content

00353: Original names for architectures with different names downstream#15

Merged
frenzymadness merged 1 commit intofedora-python:fedora-3.9from
frenzymadness:fedora-3.9
Aug 6, 2020
Merged

00353: Original names for architectures with different names downstream#15
frenzymadness merged 1 commit intofedora-python:fedora-3.9from
frenzymadness:fedora-3.9

Conversation

@frenzymadness
Copy link
Copy Markdown
Member

Pythons in RHEL/Fedora use different names for some architectures
than upstream and other distros (for example ppc64 vs. powerpc64).
See patch 274.
That means that an extension built with the default upstream settings
(on other distro or as an manylinux wheel) cannot be found by Python
on RHEL/Fedora because it has a different suffix.
This patch adds the original names to importlib so Python is able
to import extensions with an original architecture name in its
file name.

Tested on ppc64le with a wheel from simple-manylinux-demo:

# pip3 install simple-manylinux-demo
WARNING: Running pip install with root privileges is generally not a good idea. Try `pip3 install --user` instead.
Collecting simple-manylinux-demo
  Downloading simple_manylinux_demo-1.0-cp39-cp39-manylinux2014_ppc64le.whl (17 kB)
Installing collected packages: simple-manylinux-demo
Successfully installed simple-manylinux-demo-1.0
# default Python
# python3 -m unittest discover dummyextension.tests
E
======================================================================
ERROR: dummyextension.tests.test_extension (unittest.loader._FailedTest)
----------------------------------------------------------------------
ImportError: Failed to import test module: dummyextension.tests.test_extension
Traceback (most recent call last):
  File "/usr/lib64/python3.9/unittest/loader.py", line 436, in _find_test_path
    module = self._get_module_from_name(name)
  File "/usr/lib64/python3.9/unittest/loader.py", line 377, in _get_module_from_name
    __import__(name)
  File "/usr/local/lib64/python3.9/site-packages/dummyextension/tests/test_extension.py", line 2, in <module>
    from dummyextension.extension import hello
ModuleNotFoundError: No module named 'dummyextension.extension'


----------------------------------------------------------------------
Ran 1 test in 0.000s

FAILED (errors=1)
# ll /usr/local/lib64/python3.9/site-packages/dummyextension/
total 88
-rwxr-xr-x. 1 root root 88832 Aug  3 04:14 extension.cpython-39-powerpc64le-linux-gnu.so
-rw-r--r--. 1 root root     0 Aug  3 04:14 __init__.py
drwxr-xr-x. 2 root root    37 Aug  3 04:14 __pycache__
drwxr-xr-x. 3 root root    69 Aug  3 04:14 tests
# Patched version
# python3 -m unittest discover dummyextension.tests
.
----------------------------------------------------------------------
Ran 1 test in 0.000s

OK
# New loader registered for the longer name
>>> import sys
>>> sys.path_importer_cache
{'/usr/lib64/python39.zip': None, '/usr/lib64/python3.9': FileFinder('/usr/lib64/python3.9'), '/usr/lib64/python3.9/encodings': FileFinder('/usr/lib64/python3.9/encodings'), '/usr/lib64/python3.9/lib-dynload': FileFinder('/usr/lib64/python3.9/lib-dynload'), '/usr/local/lib64/python3.9/site-packages': FileFinder('/usr/local/lib64/python3.9/site-packages'), '/usr/lib64/python3.9/site-packages': FileFinder('/usr/lib64/python3.9/site-packages'), '/usr/lib/python3.9/site-packages': FileFinder('/usr/lib/python3.9/site-packages'), '/root': FileFinder('/root')}
>>> ff = sys.path_importer_cache['/usr/lib64/python3.9/site-packages']
>>> ff._loaders
[('.cpython-39-ppc64le-linux-gnu.so', <class '_frozen_importlib_external.ExtensionFileLoader'>), ('.abi3.so', <class '_frozen_importlib_external.ExtensionFileLoader'>), ('.so', <class '_frozen_importlib_external.ExtensionFileLoader'>), ('.cpython-39-powerpc64le-linux-gnu.so', <class '_frozen_importlib_external.ExtensionFileLoader'>), ('.py', <class '_frozen_importlib_external.SourceFileLoader'>), ('.pyc', <class '_frozen_importlib_external.SourcelessFileLoader'>)]

This might be also implemented on C level somewhere around _PyImport_DynLoadFiletab but I think that this version is much clearer and easy to understand.

@hroncok
Copy link
Copy Markdown
Member

hroncok commented Aug 3, 2020

I wanted to check on armv7hl, but apparently, there is no such wheel in simple-manylinux-demo. Can you please make one, or is there some obstacle?

@hroncok
Copy link
Copy Markdown
Member

hroncok commented Aug 3, 2020

Oh, there is no manylinux image for armv7hl on https://quay.io/organization/pypa :/

@frenzymadness
Copy link
Copy Markdown
Member Author

Exactly. I'd have to find a machine with that architecture and build the wheel manually there.

@frenzymadness
Copy link
Copy Markdown
Member Author

@frenzymadness
Copy link
Copy Markdown
Member Author

And to test it, I'd have to compile my own Python or build the wheel outside RHEL/Centos so the suffix is not configured by us.

Comment thread Lib/importlib/_bootstrap_external.py Outdated
Comment thread Lib/importlib/_bootstrap_external.py Outdated
Comment thread Lib/importlib/_bootstrap_external.py Outdated
Comment thread Lib/importlib/_bootstrap_external.py Outdated
Comment thread Lib/importlib/_bootstrap_external.py Outdated
@frenzymadness frenzymadness force-pushed the fedora-3.9 branch 2 times, most recently from 53f5ad5 to 9e83f1f Compare August 3, 2020 18:10
@frenzymadness
Copy link
Copy Markdown
Member Author

According to pep8, constants should be defined at the module level.

Another speed improvement I can think about is to return from the function immediately when it discovers a platform is a one we don't have an alternative name for. Because it isn't possible to do an import (platform) here, I'd implement it this way:

for suffix in suffixes:
        if "-x86_64-" in suffix or "-i386-" in suffix:
            return suffixes

But I don't think it's worth it because running this function thousand times with randomly shuffled suffixes takes only around 0,052 s (interpreter startup time included).

@hroncok
Copy link
Copy Markdown
Member

hroncok commented Aug 3, 2020

Don't optimize it (yet). What do you measure? I think we need to measure interpreter startup with and without this patch.

@frenzymadness
Copy link
Copy Markdown
Member Author

Don't optimize it (yet). What do you measure? I think we need to measure interpreter startup with and without this patch.

I've measured how long it takes for interpreter to start and call this function thousand times.

@frenzymadness
Copy link
Copy Markdown
Member Author

Python has a special benchmark for importlib measuring how many times you can do a specific type of import within one second. The results for old (original Python) and new (with this change) are:

Comparing new vs. old

sys.modules : 67,151 vs. 69,773 (96.242099%)
Built-in module : 24,552 vs. 24,898 (98.610330%)
Source writing bytecode: small : 5,641 vs. 5,668 (99.523641%)
Source w/o bytecode: small : 7,692 vs. 7,796 (98.665983%)
Source w/ bytecode: small : 10,807 vs. 10,922 (98.947079%)
Source writing bytecode: tabnanny : 553 vs. 568 (97.359155%)
Source w/o bytecode: tabnanny : 684 vs. 535 (127.850467%)
Source w/ bytecode: tabnanny : 4,396 vs. 3,615 (121.604426%)
Source writing bytecode: decimal : 2,945 vs. 1,929 (152.669777%)
Source w/o bytecode: decimal : 3,799 vs. 3,014 (126.045123%)
Source w/ bytecode: decimal : 5,958 vs. 5,181 (114.997105%)

The numbers show how many times you can do an import within one second. I honestly think that the numbers are so high that the difference is hardly measurable. To be precise here would require a special isolated environment.

I've also tried python_startup benchmark from pyperformance project and the result is:

### python_startup ###
Mean +- std dev: 10.1 ms +- 0.9 ms -> 9.9 ms +- 0.4 ms: 1.02x faster
Significant (t=3.54)

Again, the startup time is so short that it's hard to measure the difference. Also, I've measured those numbers on x86_64 on which the function doesn't end sooner.

@hroncok
Copy link
Copy Markdown
Member

hroncok commented Aug 4, 2020

The code looks reasonable. For easier backporting (and as a premature optimization), I'd consider having the strings in the dict already in the -{original}. form. I.e.:

_ARCH_MAP = {
    "-arm-linux-gnueabi.": "-arm-linux-gnueabihf.",

For the impact on speed, I'd very much like to hear @vstinner's opinion.

@encukou
Copy link
Copy Markdown
Member

encukou commented Aug 4, 2020

Will I do?
AFAICS, this runs once on startup. The impact on speed will be negligible.

@frenzymadness
Copy link
Copy Markdown
Member Author

If you like the code now, I can do one more test on a machine with an affected architecture, and then we can merge this.

@hroncok
Copy link
Copy Markdown
Member

hroncok commented Aug 4, 2020

Will I do?

I am afraid I don't understand this question.

@hroncok
Copy link
Copy Markdown
Member

hroncok commented Aug 4, 2020

If you like the code now, I can do one more test on a machine with an affected architecture, and then we can merge this.

Plese rebase https://src.fedoraproject.org/rpms/python3.9/pull-request/24 to include this patch, so we get a build. You can use importpatches with --head fedora-3.9 (assuming your fedora-3.9 branch is up to date with this PR and not with the @fedora-python).

@hroncok
Copy link
Copy Markdown
Member

hroncok commented Aug 4, 2020

Fox a while, the patch included the regenerated header file. Now it doesn't. Note that we need to be able to build Python --with bootstrap as well, so while ugly, this needs to stay.

@frenzymadness
Copy link
Copy Markdown
Member Author

Fox a while, the patch included the regenerated header file. Now it doesn't. Note that we need to be able to build Python --with bootstrap as well, so while ugly, this needs to stay.

Could you please explain why do you need this feature to be enabled in a python interpreter between a bootstrap build and the first regular one? I thought that we can make this patch nicer and don't include the generated header file and let %make_build regenerate it.

@hroncok
Copy link
Copy Markdown
Member

hroncok commented Aug 4, 2020

What do you mean by "between a bootstrap build and the first regular one"? The package needs to be able to build with bootstrap at any time. It is needed on several occasions (e.g. when the ABI compatibility is broken, or when we import the package to a RHEL module or SCL).

Pythons in RHEL/Fedora use different names for some architectures
than upstream and other distros (for example ppc64 vs. powerpc64).
See patch 274.
That means that an extension built with the default upstream settings
(on other distro or as an manylinux wheel) cannot be found by Python
on RHEL/Fedora because it has a different suffix.
This patch adds the original names to importlib so Python is able
to import extensions with an original architecture name in its
file name.

WARNING: This patch has no effect on Python built with bootstrap
enabled because Python/importlib_external.h is not regenerated
and therefore Python during bootstrap contains importlib from
upstream without this feature. It's possible to include
Python/importlib_external.h to this patch but it'd make rebasing
a nightmare because it's basically a binary file.
@frenzymadness
Copy link
Copy Markdown
Member Author

Added a warning about zero effect of this patch during a bootstrap build.

@frenzymadness
Copy link
Copy Markdown
Member Author

I am going to merge this and add fedora-3.9.0b5-5 tag.

@frenzymadness frenzymadness merged this pull request into fedora-python:fedora-3.9 Aug 6, 2020
@frenzymadness frenzymadness deleted the fedora-3.9 branch August 6, 2020 05:56
@frenzymadness
Copy link
Copy Markdown
Member Author

The tag is there. I see that there are no merge commits, should I remove the last one and force-push the fedora-3.9 branch? Do we want to disable merge commits in pull requests?

@frenzymadness
Copy link
Copy Markdown
Member Author

The patch applies cleanly down to 3.6 and has a simple conflict on 3.5. May I tag and push all the branches or open pull requests for each one?

@hroncok
Copy link
Copy Markdown
Member

hroncok commented Aug 6, 2020

The tag is there. I see that there are no merge commits, should I remove the last one and force-push the fedora-3.9 branch?

Yes and retag. I have no idea what does the automation do with such commits, it has never been tested and even if it accidentally works, it will break format-patch and other workflows. Please, merge without a merge commit next time.

Do we want to disable merge commits in pull requests?

If you think it will help to prevent this, make it so.

The patch applies cleanly down to 3.6 and has a simple conflict on 3.5. May I tag and push all the branches or open pull requests for each one?

I'd just put the commits in. The purpose of the PR here is to do patch review, not to waste time.

@hroncok
Copy link
Copy Markdown
Member

hroncok commented Aug 6, 2020

And thanks for looking into the workflow even if it is more complicated for now ❤️

@frenzymadness
Copy link
Copy Markdown
Member Author

The tag is there. I see that there are no merge commits, should I remove the last one and force-push the fedora-3.9 branch?

Yes and retag. I have no idea what does the automation do with such commits, it has never been tested and even if it accidentally works, it will break format-patch and other workflows. Please, merge without a merge commit next time.

Done, force-pushed with the appropriate tag

Do we want to disable merge commits in pull requests?

If you think it will help to prevent this, make it so.

Somebody did it already.

The patch applies cleanly down to 3.6 and has a simple conflict on 3.5. May I tag and push all the branches or open pull requests for each one?

I'd just put the commits in. The purpose of the PR here is to do patch review, not to waste time.

All branches down to 3.5 updated with appropriate tags.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants