Add multiprocessing to buildcache command #9440
obreitwi wants to merge 1 commit into spack:develop from
Conversation
@alalazo @scheibelp could you take a look at this?
* Add -w/--without-dependencies to suppress dependency resolution.
* Add -j/--jobs to install/create several packages from/to buildcache in parallel (defaults to 1).
* Applied some flake8 fixes
* Added (and compressed) tests

Reasoning:

The current buildcache implementation lacks the ability to install packages/create tarballs in bulk:

* Dependencies are *always* checked for and included
* The package index is updated after EVERY installed package instead of once after everything is installed

In order to reduce the build time of our Spack-based Singularity container, we want to pre-compute all hashes to be installed from the buildcache in advance (i.e. we do not want Spack to perform additional checks), and we want this installation to be possible with more than one core (decompressing is CPU-intensive and therefore not completely IO-bound). The same goes for creating tarballs that are put into the buildcache after container creation. Hence this commit introduces these two possibilities.

By default the behavior remains the same, i.e. Spack fails after encountering its first error. When using several workers we gather all errors that occurred and print them afterwards. The only (minor) difference is that since it is not possible to raise exceptions in multiprocessing.Pool workers, we have to convert all encountered SpackErrors to strings and print them manually.

In theory parallelization could also be achieved by using tools such as xargs and spawning multiple Python processes, but I think it would be nicer to have this possibility out of the box in Spack (in the same way as you can install packages with multiple cores).

Change-Id: I97b1c14bf7563ac9d68b904b0c38eeb08b8d210e
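The error-gathering behavior described above can be sketched as follows. This is an illustrative stand-in, not the PR's actual code: `SpackError`, `install_tarball`, and the package names are hypothetical, and a thread pool is used here only to keep the sketch self-contained, where the PR maps over a process-based pool.

```python
from multiprocessing.pool import ThreadPool

class SpackError(Exception):
    """Stand-in for spack.error.SpackError (hypothetical)."""

def install_tarball(spec_name):
    """Hypothetical worker body; fails for one package to show the flow."""
    if spec_name == "broken-pkg":
        raise SpackError("installation of broken-pkg failed")

def f_install(spec_name):
    # Instead of letting an exception abort the whole pool.map, convert
    # every SpackError to a string and return it to the parent process.
    try:
        install_tarball(spec_name)
        return None
    except SpackError as e:
        return str(e)

# A ThreadPool keeps this sketch self-contained; the PR itself uses a
# process-based pool (NoDaemonPool).
pool = ThreadPool(2)
errors = [e for e in pool.map(f_install, ["zlib", "broken-pkg", "cmake"]) if e]
pool.close()
pool.join()
print(errors)  # all collected errors are reported after the whole batch ran
```

This keeps the remaining packages installing after one fails, and every failure is reported at the end instead of only the first one.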
f351742 to eaa8ba4
Interestingly,
scheibelp left a comment
Hi @obreitwi, I have a few notes/requests:

- Most importantly, I think this will conflict with #8718, which we are prioritizing, so I recommend waiting on that before syncing
- I'm not sure why you are overriding `Process` to disable daemon functionality - could you explain why you're doing that?
- Comments include a few suggestions for improving robustness
```diff
-    spec.dag_hash(7) +
-    ' to install "%s"' %
-    spec.format())
+    spec.dag_hash(7) + ' to install "%s"' % spec.format())
```
As long as flake wasn't complaining before, I prefer that style changes be minimized (minimally it would be useful to separate out style changes into a separate commit)
I did these changes in response to Travis complaints, but I will separate them out into another commit if they should reappear in the rewrite.
```python
class NoDaemonProcess(Process):
    """Daemon processes are not allowed to create child processes.
    Yet sometimes that feature is needed."""
```
It would be best to mention when it is needed - ideally whatever reason this is being used for in Spack (for example if there are many, just list the most relevant ones for its use in Spack).
See my reply below. Will update the comment string in the rework.
```python
def _get_daemon(self):
    return False

def _set_daemon(self, value):
```
I'm leery of preventing processes from becoming daemons by overriding this and treating it as a no-op. Are you trying to prevent multiprocessing.Pool from doing this automatically? I didn't get that sense from skimming https://docs.python.org/2/library/multiprocessing.html
If somewhere in the Spack code is setting processes as daemons, I think it would be good to address that directly rather than intercepting and dropping the daemon requests at this level
It is the other way around - daemon just means that processes can't have children and are cleaned up along with their parent process (see below). We need the processes to have children for unpacking packages, so the daemon flag needs to be set to False.
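A minimal sketch of the Python 2-era recipe behind this (the full PR also plugs the class into a `NoDaemonPool`; on newer Pythons, Pool routes worker creation through a context object, so that hook looks different):

```python
import multiprocessing

class NoDaemonProcess(multiprocessing.Process):
    """Pool marks its workers as daemons, and daemonic processes may not
    create children; unpacking packages needs child processes, so the
    daemon flag is forced to stay False."""

    def _get_daemon(self):
        return False

    def _set_daemon(self, value):
        pass  # silently ignore Pool's attempt to set daemon=True

    daemon = property(_get_daemon, _set_daemon)

# The worker stays non-daemonic even after Pool tries to daemonize it:
p = NoDaemonProcess(target=print)
p.daemon = True
print(p.daemon)  # still False
```

Because the setter is a no-op, `Pool`'s internal `worker.daemon = True` has no effect and the workers keep the ability to fork children.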
```python
# currently, specs cause an infinite recursion bug when pickled
# -> as multiprocessing uses pickle internally, we need to transform
# specs prior to distributing the work via worker pool
specs = [s.to_dict() for s in specs]
```
I think you'll want to set all_deps=True for to_dict so that build deps are included in this
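The dict round-trip being discussed looks roughly like this. The `Spec` class here is a bare hypothetical stand-in for `spack.spec.Spec`, not its real implementation; the `all_deps` flag just mirrors the reviewer's suggestion:

```python
class Spec(object):
    """Bare stand-in for spack.spec.Spec (hypothetical)."""

    def __init__(self, name, deps=None):
        self.name = name
        self.deps = deps or []

    def to_dict(self, all_deps=False):
        # all_deps=True, as suggested above, keeps build deps in the dict
        deps = self.deps if all_deps else []
        return {"name": self.name,
                "deps": [d.to_dict(all_deps) for d in deps]}

    @classmethod
    def from_dict(cls, d):
        return cls(d["name"], [cls.from_dict(x) for x in d["deps"]])

# Specs are converted to plain dicts before pool.map (dicts pickle fine),
# then reconstructed inside each worker:
spec = Spec("zlib", [Spec("cmake")])
wire = spec.to_dict(all_deps=True)
rebuilt = Spec.from_dict(wire)
print(rebuilt.name, [d.name for d in rebuilt.deps])
```

The point of the workaround is only that plain dicts pickle cleanly, so they can cross the pool boundary where the Spec objects themselves cannot.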
```python
# Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
##############################################################################
import argparse
import copy
```
@obreitwi @scottwittenburg this and #8718 will likely conflict, I haven't yet determined how extensively. We are prioritizing #8718 because it is helping us do more extensive testing of building packages with Spack.
```python
args = ['create', '-d', mirror_path, "-j", jobs, str(spec)]
if not signed:
    args.insert(-1, "-u")
elif not key_created:
```
It appears key_created may not be defined at this point (reading to the beginning of the function, there appear to be code paths where it isn't set)
I did not touch the GPG key handling with my PR in any way - this could be a side effect of rebasing. For the rework I'll again rebase our working solution onto the current develop branch.
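One way to rule out the undefined-variable path the reviewer points at is to initialize the flag before any branching. The surrounding logic here is a hypothetical sketch (only the `signed`/`key_created` names follow the diff):

```python
def push_spec(signed, have_default_key):
    # Initialize up front so a later `elif not key_created:` check can
    # never hit a NameError, whatever path the signing logic takes.
    key_created = False
    if not signed:
        mode = "unsigned"
    else:
        if not have_default_key:
            key_created = True  # e.g. a throwaway key was generated
        mode = "signed"
    return mode, key_created

print(push_spec(signed=True, have_default_key=True))  # ('signed', False)
```

With the default set first, every code path defines `key_created`, so the branch ordering in the original diff can stay as it is.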
```python
pool = NoDaemonPool(args.jobs)
retvals = pool.map(f_create, specs, chunksize=1)
# TODO: every f_create-call updates the index -> unnecessary
```
Is this still a TODO? The last two lines are phrased as though the TODO is handled.
Ah, yes, it is handled - I will remove the TODO.
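The fix the stale TODO alludes to - rebuilding the index once per batch rather than once per tarball - can be sketched like this. `update_index` and `f_create` are stand-ins, not Spack's real functions, and plain `map` stands in for the worker pool:

```python
index_updates = []

def update_index():
    """Stand-in for rebuilding the buildcache package index."""
    index_updates.append(1)

def f_create(spec_name):
    """Stand-in worker: create one tarball, without touching the index."""
    return "tarball-%s" % spec_name

specs = ["zlib", "cmake", "openssl"]
# the PR maps f_create over a worker pool; plain map keeps the sketch simple
tarballs = list(map(f_create, specs))
update_index()  # one index rebuild for the whole batch, not one per package

print(tarballs, len(index_updates))
```

Moving the index update out of the worker also sidesteps concurrent writes to the index from parallel workers.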
Agreed! As far as I can tell I would also need to do some work when rebasing to the current
Quoting the documentation:
As described here,
Will answer them in place.
Organisational question: Should I rebase this PR or open a new one for the new rebase/rework?

Both approaches are OK with me. If you think the changes will be significant, I think it would be reasonable to close this and then refer to it in a new PR - we've done that with many Spack PRs and I think it can be a useful way to refocus the conversation.
@obreitwi sorry for the slew of commits. I was trying to resolve conflicts using the web interface, which led to all sorts of issues that needed to be fixed.

No problem. I was meaning to ask what the current state regarding these changes is. It seems #8718 (or rather #10274) is merged. Does that mean I can start with the rework, or are you already in the process of rewriting these changes, @gartung ?
I just merged #10756 |
25aa30b to eaa8ba4
I don't mind either way. We recently discovered (after almost 6 months of using the feature) that installing a lot of packages with many processes takes a lot of memory, and I suspect that either there is a memory leak somewhere or memory usage can be improved drastically. I would prefer to conclude my investigation before doing the rework. But since this only occurs when installing ~400 packages at the same time, I don't know if it is a show-stopper.
Can you give an update on the state of this PR?
It is effectively dead, as even with multiprocessing, processing all files in Python to apply

@obreitwi Thanks for getting back so quickly!
Note: We already use the proposed modifications without problems. I only got around to porting the modifications to the current develop branch now and added some tests while I was at it.