Skip rebuilding installed packages#16724
Conversation
9fe161f to
20b3e41
Compare
|
@tgamblin @scheibelp RFR. The three commits "should" be sufficiently distinct to retain separately. |
To make sure, say we have two packages X and Y where X depends on Y; X is installed, Y is not. Are you saying that in this case, that Spack would reinstall X? Or are you saying that Spack would attempt to install Y as part of installing X? |
Given X has Y as a dependency with X installed and Y not installed, Thanks for pointing out the wording issue. I have updated the wording in the description. |
|
To clarify: I think you are describing the behavior of Spack with this PR applied. In #16724 (comment) I was more curious what the behavior of Spack was before this PR: would X be reinstalled? Or was the problem that Y was being installed even though it "wasn't needed"? |
Ah. IIRC -- I heard second hand -- the problem was Y was being installed when not needed. |
|
OK: I'll note that generally speaking we won't always want to skip the installation of Y, even if X is installed. This depends on whether
If Y is not a build dependency, then we probably still want to install it; the exception is if X is an external, in which case we assume that all dependencies are taken care of. Is that accounted for here? |
We were already skipping externals, though I would have to review to determine if the build was skipping the dependencies of externals. (That is an excellent point, thanks.) Perhaps I misunderstood the change. @tgamblin What are your thoughts on this? |
scheibelp
left a comment
There was a problem hiding this comment.
In addition to my concern at #16724 (comment) i have the following questions/requests. Let me know if any seem off-track.
| # If Package.install is called after this point, it will fail | ||
| pkg.succeed = False | ||
| pkg.do_install() | ||
| monkeypatch.setattr(spack.package.Package, 'remove_prefix', |
There was a problem hiding this comment.
I think it makes sense to use monkeypatch for remove_prefix - that is more in keeping with how methods are temporarily adjusted for other tests. I'm curious though which changes in this PR prevent setting pkg.succeed.
There was a problem hiding this comment.
Noticed a number of differences with the tests here and develop so removed the commit that led to these test changes and am re-running the tests.
| self._update_installed(spec.package) | ||
| tty.debug('Assuming dependencies of {0} are also installed' | ||
| .format(spec.package.name)) | ||
| for dep in spec.dependencies(): |
There was a problem hiding this comment.
I'm concerned about a predicate function with side effects. Also if we have packages X, Y, Z such that
- X depends on Z
- Y depends on Z
- X is installed
- Y and Z are not installed
Then I think this would not be the right decision to mark Z as installed.
There was a problem hiding this comment.
Thanks for the counter example. I will have to give this more thought.
| # Traverse dependencies from the bottom-up so any that are flagged | ||
| # as installed can be readily removed | ||
| for dep in self.pkg.spec.traverse(root=False, | ||
| predicate=not_installed): |
There was a problem hiding this comment.
I'm assuming that adding this predicate would be sufficient for avoiding the installation of dependencies when a dependent is installed (with the caveat mentioned elsewhere that generally we may still have to install non-build deps). What is the purpose of additionally marking the dependencies as installed (i.e. in the above not_installed function)?
There was a problem hiding this comment.
IIRC the core issue spawning this work was the undesirable affect of having dependencies of installed packages being installed.
So flagging dependencies of installed packages as being installed should help ensure there is no attempt to install the dependencies.
| # uninstalled dependencies. This is necessary at this point | ||
| # since bootstrap compilers may not be listed as dependents | ||
| # of packages. | ||
| task.flag_installed(self.installed) |
There was a problem hiding this comment.
If the priority is the number of uninstalled dependents, then I'm concerned there may be queue ordering issues if we have to readjust the priority at this point: ideally after any task is completed, the priority of all related tasks should be updated at that time.
In other words, the priority of tasks in the queue should be updated when we enqueue/complete tasks, not when we retrieve them.
There was a problem hiding this comment.
Not sure I follow.
Queue ordering isn't solely dependent on the priority. BuildTasks are ordered by the tuple (# uninstalled dependencies, sequence), where sequence is a counter used to preserve the order in which the task is added (or re-added) to the queue.
There was a problem hiding this comment.
Just to be clear, I believe the code is doing what you think it should in terms of build task priorities.
That is, the priority of the build task is the number of uninstalled dependencies at the time the task is added or re-queued plus a sequence number. The build task of a dependent package is re-queued when it is determined that one of its dependencies has been "installed". The sequence number is used to ensure build tasks with the same number of uninstalled dependencies are sequenced by the order in which they were added to the queue. Sequencing the build tasks helps support parallel builds by allowing a process to cycle through those that have no uninstalled dependencies are being built by other processes.
There was a problem hiding this comment.
I'll need to read through this code again to refresh my understanding and see if I still have this concern.
If the task is being requeued when a dependency is installed, I think that would address my concern (and if that is already happening then perhaps I missed it - could you point me to where this happens?). I wasn't sure that was happening though because it didn't look like any requeueing was happening when an installation was finished.
lib/spack/spack/installer.py
Outdated
| # Determine state of installation artifacts and adjust accordingly. | ||
| self._prepare_for_install(task, keep_prefix, keep_stage, | ||
| restage) | ||
| # Possession of a read lock is required/assumed. |
There was a problem hiding this comment.
Given that a read lock was just acquired, IMO this comment is unnecessary.
(UPDATE: Was #16654 but renamed to fix typo in the branch name.)
This PR addresses an issue raised in
slackwhere a package that is already installed but has uninstalled dependencies was having its dependencies re-installed even when those dependencies were not needed to use the package.Todd recommended using pre- (versus post-) order traversals to prune installed dependencies when initializing the build queue so that has been included.
There is a third commit to replace
try-finallyblocks with attribute assignments that attempt to do the equivalent ofmonkeypatchbut does not always work as expected. For example, at one point tests usingcanfailwith initial installs withsucceedset toFalsethen run after settingsucceedtoTruewould fail with complaints thatsucceedwas [still]False. Using the testing framework'smonkeypatchseemed to solve this problem.