[2.7] bpo-35757: subprocess.Popen: optimize close_fds for Linux #11584
kolyshkin wants to merge 1 commit into python:2.7 from
Conversation
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA). Our records indicate we have not received your CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. If you have recently signed the CLA, please wait at least one business day. You can check yourself to see if the CLA has been received. Thanks again for your contribution; we look forward to reviewing it!
asvetlov left a comment
Please avoid bare excepts.
I know the old code uses them, but there is no reason to follow that practice in new code.
In case close_fds=True is passed to subprocess.Popen()
or its users (subprocess.call() etc.), it might spend
considerable time closing file descriptors that were never opened,
as demonstrated by the following snippet from strace:
close(3) = -1 EBADF (Bad file descriptor)
close(5) = -1 EBADF (Bad file descriptor)
close(6) = -1 EBADF (Bad file descriptor)
close(7) = -1 EBADF (Bad file descriptor)
...
close(1021) = -1 EBADF (Bad file descriptor)
close(1022) = -1 EBADF (Bad file descriptor)
close(1023) = -1 EBADF (Bad file descriptor)
This happens because the code in _close_fds() iterates from 3 up to
MAX_FDS = os.sysconf("SC_OPEN_MAX").
Now, syscalls are cheap, but SC_OPEN_MAX (also known as RLIMIT_NOFILE
or ulimit -n) can be quite high, for example:
$ docker run --rm python python3 -c \
$'import os\nprint(os.sysconf("SC_OPEN_MAX"))'
1048576
This means a million syscalls before spawning a child process, which
can result in a major delay, around 0.1s as measured on my fast and mostly
idle laptop. Here is the comparison with python3 (which, apparently, and
as confirmed by strace, does not have this problem):
$ docker run --rm python python3 -c $'import subprocess\nimport time\ns = time.time()\nsubprocess.check_call([\'/bin/true\'], close_fds=True)\nprint(time.time() - s)\n'
0.0009245872497558594
$ docker run --rm python python2 -c $'import subprocess\nimport time\ns = time.time()\nsubprocess.check_call([\'/bin/true\'], close_fds=True)\nprint(time.time() - s)\n'
0.0964419841766
The solution (for Linux, at least) is easy: instead of iterating through
all the _possibly_ opened files, get the list of _actually_ opened
files (from /proc/self/fd/) and iterate through those. In case that
doesn't work (for example, when /proc is not available, or under draconian
selinux/apparmor restrictions), fall back to the older method.
[v2: do not use bare exceptions]
Signed-off-by: Kir Kolyshkin <[email protected]>
@asvetlov Thanks for your review (as this is my first piece of Python code longer than two lines, your review is very much appreciated)! I believe I have addressed your comments. Ideally, I'd write code without using exceptions at all if something like
Oooh, I missed the fact that the patch is for Python 2.7. @benjaminp, would you accept it, or is the change not worth the potential problems?
...which is very similar to this implementation (i.e. it also iterates through entries found in
Thank you for the PR. Closing per the comment on the issue.
https://bugs.python.org/issue35757