Skip to content

use the 'fork' start method for multiprocessing, even on macOS with Python>=3.8#18124

Merged
scheibelp merged 16 commits intospack:developfrom
r-xue:py38-macos-multiprocessing
Sep 2, 2020
Merged

use the 'fork' start method for multiprocessing, even on macOS with Python>=3.8#18124
scheibelp merged 16 commits intospack:developfrom
r-xue:py38-macos-multiprocessing

Conversation

@r-xue
Copy link
Copy Markdown
Contributor

@r-xue r-xue commented Aug 17, 2020

This is intended to fix #14102, only relevant to macOS with Python=3.8.

Python 3.8 in macOS switches to the spawn start method in multiprocessing as default, so the resources of the parent process may not be inherited by the child process. Forcing "fork" start might not be a permanent solution (see here), but at least get things working on macOS for now.

I came across a similar issue in a different setting, which was reported here.

@adamjstewart
Copy link
Copy Markdown
Member

For the macOS tests in .github/workflows, can you update them from Python 3.7 to 3.8? That way we can test all of Spack to make sure there are no other Python 3.8 + macOS bugs.

@scheibelp
Copy link
Copy Markdown
Member

I am actively working on #14187 and intend to finish it within a week, so I would be inclined to use that instead of this (for the reason mentioned in the link you provided).

@adamjstewart
Copy link
Copy Markdown
Member

@scheibelp my comment above applies to your PR as well. Let me know if I can help test or anything. Would really love to see this bug fixed soon.

@r-xue
Copy link
Copy Markdown
Contributor Author

r-xue commented Aug 18, 2020

@adamjstewart @scheibelp close this PR.
I didn't see the effort in #14187 at first. It's better to have something safer than forcing "fork" on macOS.
Look forward to your fix.

@r-xue r-xue closed this Aug 18, 2020
@scheibelp
Copy link
Copy Markdown
Member

#18205 is taking longer than desired (although it is still my priority) so I will be reopening this and following the suggestion in #18124 (comment) to run the Mac OS tests with python 3.8.

@scheibelp scheibelp reopened this Aug 28, 2020
@adamjstewart
Copy link
Copy Markdown
Member

Found and fixed a few more uses of multiprocessing.Process. There are also many uses of multiprocessing.Pool and multiprocessing.Queue. I don't know enough about how these work to know if we need to make the same change there.

@adamjstewart
Copy link
Copy Markdown
Member

adamjstewart commented Sep 1, 2020

I noticed an interesting side effect of this PR. When I run spack test, it appears to hang once it hits lib/spack/spack/test/install.py. In reality, it doesn't hang at all, it just stops reporting output. Then, once it hits the GPG tests, it prompts me for my passphrase, but since it doesn't output anything, I don't know that it's needed.

EDIT: Okay, it's not actually a side-effect of this PR. It only occurs with Python 3.8. It's much better than the existing behavior (tests crash).

@scheibelp
Copy link
Copy Markdown
Member

EDIT: Okay, it's not actually a side-effect of this PR. It only occurs with Python 3.8. It's much better than the existing behavior (tests crash).

I assume when you say "existing behavior" you mean "prior to 42a04f5"?

once it hits the GPG tests, it prompts me for my passphrase, but since it doesn't output anything, I don't know that it's needed.

Also to make sure, it looks like this issue is intermittent? It appears you were able to "make it go away" by just rerunning the tests.

@adamjstewart
Copy link
Copy Markdown
Member

adamjstewart commented Sep 1, 2020

I assume when you say "existing behavior" you mean "prior to 42a04f5"?

I mean on develop. But yes, before that commit the tests still crashed.

Also to make sure, it looks like this issue is intermittent? It appears you were able to "make it go away" by just rerunning the tests.

Although I can't see the prompt that tells me to enter my passphrase, I can still enter it. The issue does not occur in our CI as GPG is not installed, or doesn't require a passphrase. I had to remove gpg from my PATH to get the tests to work.

@adamjstewart
Copy link
Copy Markdown
Member

Hmm, I think we want to use this method for multiprocessing.Pool and friends too. Let me add that...

@adamjstewart
Copy link
Copy Markdown
Member

@scheibelp the reason I went crazy and wrapped everything was that I didn't trust the unit tests to cover every line of code. I'm happy to revert that commit and go with the bare minimum needed to get the tests to pass. I think most if not all of the calls to multiprocessing.Process need to be replaced. None of the calls to multiprocessing.* other than Process needed to be replaced to get the tests to pass.

@scheibelp
Copy link
Copy Markdown
Member

I'm happy to revert that commit and go with the bare minimum needed to get the tests to pass

That is my preference: I think that it's confusing to access cpu_count via fork_context for example. Also I think it's worth leaving invocations of multprocessing.Process that don't need to be touched (since otherwise a reader may assume that using the spawn method would fail in that case).


args = (b,) + tuple(kwargs.get('extra_args', ()))
procs = [Process(target=f, args=args, name=f.__name__)
procs = [fork_context.Process(target=f, args=args, name=f.__name__)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is definitely required:

AttributeError: Can't pickle local object 'test_lock_debug_output.<locals>.p2'

input_stream = None # just don't forward input if this fails

self.process = multiprocessing.Process(
self.process = fork_context.Process(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is needed:

E       AssertionError: assert False
E        +  where False = <function exists at 0x101dcbd40>('/private/var/folders/21/hwq39zyj4g36x6zjfyl5l8080000gn/T/pytest-of-Adam/pytest-79/test_foreground_background_tes7/log.txt')
E        +    where <function exists at 0x101dcbd40> = <module 'posixpath' from '/Users/Adam/.spack/darwin/.spack-env/view/lib/python3.7/posixpath.py'>.exists
E        +      where <module 'posixpath' from '/Users/Adam/.spack/darwin/.spack-env/view/lib/python3.7/posixpath.py'> = os.path

input_stream = os.fdopen(os.dup(sys.stdin.fileno()))

p = multiprocessing.Process(
p = fork_context.Process(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is definitely needed, otherwise you can't install anything:

AttributeError: Can't pickle local object 'fork.<locals>.child_process'


"""
self.proc = multiprocessing.Process(
self.proc = fork_context.Process(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is also needed:

E           RuntimeError: 
E                   An attempt has been made to start a new process before the
E                   current process has finished its bootstrapping phase.
E           
E                   This probably means that you are not using fork to start your
E                   child processes and you have forgotten to use the proper idiom
E                   in the main module:
E           
E                       if __name__ == '__main__':
E                           freeze_support()
E                           ...
E           
E                   The "freeze_support()" line can be omitted if the program
E                   is not going to be frozen to produce an executable.

@adamjstewart
Copy link
Copy Markdown
Member

@scheibelp alright, I think I removed all non-essential changes from this PR. This should be the smallest possible diff needed to get the tests to pass. Once this is merged, I plan on using Spack with Python 3.8 on macOS, so if there are any other places that need this fix, I'll probably notice them. I think this is ready to merge now if you want to re-review.

@ax3l
Copy link
Copy Markdown
Member

ax3l commented Sep 2, 2020

Just to keep you posted: since this PR ran a couple of very long-running macOS workflows on GH actions, I aborted all but the last commit to unblock other waiting PRs :)

@scheibelp scheibelp merged commit d9b945f into spack:develop Sep 2, 2020
@scheibelp
Copy link
Copy Markdown
Member

Thanks @r-xue, @adamjstewart, @tgamblin, and @ax3l for your work on this!

@ax3l
Copy link
Copy Markdown
Member

ax3l commented Sep 2, 2020

This is wonderful, thank you all! 🚀 ✨

@adamjstewart
Copy link
Copy Markdown
Member

Thanks @r-xue, what a valuable first-time contribution!

@tgamblin
Copy link
Copy Markdown
Member

tgamblin commented Sep 2, 2020

Thanks everyone!

capitalaslash pushed a commit to capitalaslash/spack that referenced this pull request Sep 3, 2020
…ack#18124)

As detailed in https://bugs.python.org/issue33725, starting new
processes with 'fork' on Mac OS is not guaranteed to work in general.
As of Python 3.8 the default process spawning mechanism was changed
to avoid this issue.

Spack depends on the fork-based method to preserve file descriptors
transparently, to preserve global state, and to avoid pickling some
objects. An effort is underway to remove dependence on fork-based
process spawning (see spack#18205). In the meantime, this allows Spack to
run with Python 3.8 on Mac OS by explicitly choosing to use 'fork'.

Co-authored-by: Peter Josef Scheibel <[email protected]>
Co-authored-by: Adam J. Stewart <[email protected]>
Co-authored-by: Todd Gamblin <[email protected]>
lorddavidiii added a commit to lorddavidiii/spack that referenced this pull request Sep 24, 2020
This is based on
spack@d9b945f
without the changes in the .github dir and should be reverted before
updating to a new spack version

original commit message:
Mac OS: support Python >= 3.8 by using fork-based multiprocessing (spack#18124)

As detailed in https://bugs.python.org/issue33725, starting new
processes with 'fork' on Mac OS is not guaranteed to work in general.
As of Python 3.8 the default process spawning mechanism was changed
to avoid this issue.

Spack depends on the fork-based method to preserve file descriptors
transparently, to preserve global state, and to avoid pickling some
objects. An effort is underway to remove dependence on fork-based
process spawning (see spack#18205). In the meantime, this allows Spack to
run with Python 3.8 on Mac OS by explicitly choosing to use 'fork'.
ChristianTackeGSI pushed a commit to FairRootGroup/spack that referenced this pull request Sep 24, 2020
…ack#18124)

As detailed in https://bugs.python.org/issue33725, starting new
processes with 'fork' on Mac OS is not guaranteed to work in general.
As of Python 3.8 the default process spawning mechanism was changed
to avoid this issue.

Spack depends on the fork-based method to preserve file descriptors
transparently, to preserve global state, and to avoid pickling some
objects. An effort is underway to remove dependence on fork-based
process spawning (see spack#18205). In the meantime, this allows Spack to
run with Python 3.8 on Mac OS by explicitly choosing to use 'fork'.

Co-authored-by: Peter Josef Scheibel <[email protected]>
Co-authored-by: Adam J. Stewart <[email protected]>
Co-authored-by: Todd Gamblin <[email protected]>
(cherry picked from commit d9b945f)
(modified to apply to 0.15.4)
ChristianTackeGSI added a commit to ChristianTackeGSI/FairSoft that referenced this pull request Sep 24, 2020
* Backported the fix from spack/spack#18124
  to our spack
* Switch to that version in our spack.
dennisklein pushed a commit to FairRootGroup/FairSoft that referenced this pull request Sep 25, 2020
* Backported the fix from spack/spack#18124
  to our spack
* Switch to that version in our spack.
alalazo added a commit to alalazo/spack that referenced this pull request Aug 26, 2021
This object was introduced in spack#18124 which
was later superseded by spack#18205 and removed
any use if the object.
scheibelp pushed a commit that referenced this pull request Aug 26, 2021
This object was introduced in #18124, and was later superseded by
#18205 and removed any use if the object.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

MacOS Mojave, Python 3.8: "AttributeError: Can't pickle local object 'fork.<locals>.child_process'"

5 participants