Skip rebuilding installed packages by tldahlgren · Pull Request #16724 · spack/spack

tldahlgren · 2020-05-19T18:36:08Z

(UPDATE: Was #16654 but renamed to fix typo in the branch name.)

This PR addresses an issue raised in Slack, where a package that is already installed but has uninstalled dependencies was having its dependencies re-installed even when those dependencies were not needed to use the package.

Todd recommended using pre- (versus post-) order traversals to prune installed dependencies when initializing the build queue, so that has been included.
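For readers unfamiliar with why traversal order matters here, the following is a minimal sketch of pre-order pruning. It is not Spack's traverse implementation: `traverse_preorder`, the `deps` graph, and the package names are hypothetical, chosen only to show that visiting a node before its children lets a predicate cut off the whole subtree.

```python
def traverse_preorder(node, children, keep):
    """Yield nodes depth-first, visiting each node before its children.

    When keep(node) is false, the node is skipped and its subtree is never
    entered. A post-order walk would already have visited the children by
    the time it reached the node, so it could not prune them this way.
    """
    if not keep(node):
        return
    yield node
    for child in children.get(node, ()):
        yield from traverse_preorder(child, children, keep)


# Hypothetical graph: root depends on X and V, X on Y, Y on W.
deps = {"root": ["X", "V"], "X": ["Y"], "Y": ["W"], "V": []}
installed = {"X"}

# X is installed, so X, Y, and W are all pruned from the build list.
to_build = list(traverse_preorder("root", deps, lambda n: n not in installed))
```

The sketch does not track visited nodes, so a dependency shared by two parents would be yielded once per path that reaches it.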

A third commit replaces try-finally blocks, which use attribute assignments to do the equivalent of monkeypatch, because that approach does not always work as expected. For example, at one point, tests using canfail that ran initial installs with succeed set to False and then ran again after setting succeed to True would fail with complaints that succeed was [still] False. Using the testing framework's monkeypatch seemed to solve this problem.
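To make the two patterns concrete, here is a small sketch contrasting a hand-written try-finally reset with pytest's `monkeypatch` fixture. `CanFail` is a hypothetical stand-in for the canfail test package, not the real one, and the sketch does not attempt to reproduce the failure described above, whose cause is not stated here.

```python
class CanFail:
    """Hypothetical stand-in for a test package whose install can fail."""

    succeed = False

    def install(self):
        if not self.succeed:
            raise RuntimeError("Expected failure")
        return "installed"


def install_with_try_finally(pkg):
    # Manual pattern: set the attribute, then restore it by hand.
    saved = pkg.succeed
    try:
        pkg.succeed = True
        return pkg.install()
    finally:
        pkg.succeed = saved


def test_install_succeeds(monkeypatch):
    # pytest restores the patched attribute at test teardown,
    # even if the test body raises.
    monkeypatch.setattr(CanFail, "succeed", True)
    assert CanFail().install() == "installed"
```

With the fixture, the reset is handled by the test framework rather than by each test, which removes one place where state can leak between tests.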


tldahlgren · 2020-05-28T17:24:26Z

@tgamblin @scheibelp RFR. The three commits "should" be sufficiently distinct to retain separately.

scheibelp · 2020-05-28T18:28:54Z

> This PR addresses an issue raised in slack where a package that is already installed but has uninstalled dependencies was being re-installed even when those dependencies were not needed to use the package.

To make sure: say we have two packages, X and Y, where X depends on Y; X is installed and Y is not. Are you saying that, in this case, Spack would reinstall X? Or are you saying that Spack would attempt to install Y as part of installing X?

tldahlgren · 2020-05-28T18:39:14Z

> This PR addresses an issue raised in slack where a package that is already installed but has uninstalled dependencies was being re-installed even when those dependencies were not needed to use the package.
>
> To make sure: say we have two packages, X and Y, where X depends on Y; X is installed and Y is not. Are you saying that, in this case, Spack would reinstall X? Or are you saying that Spack would attempt to install Y as part of installing X?

Given X has Y as a dependency with X installed and Y not installed, spack install X would result in no further action than to report X is installed. In other words, we would no longer attempt to install Y in this case.

Thanks for pointing out the wording issue. I have updated the wording in the description.

scheibelp · 2020-05-28T18:47:16Z

To clarify: I think you are describing the behavior of Spack with this PR applied. In #16724 (comment) I was more curious what the behavior of Spack was before this PR: would X be reinstalled? Or was the problem that Y was being installed even though it "wasn't needed"?

tldahlgren · 2020-05-28T18:49:38Z

> To clarify: I think you are describing the behavior of Spack with this PR applied. In #16724 (comment) I was more curious what the behavior of Spack was before this PR: would X be reinstalled? Or was the problem that Y was being installed even though it "wasn't needed"?

Ah. IIRC (I heard this secondhand), the problem was that Y was being installed when not needed.

scheibelp · 2020-05-28T18:51:15Z

OK: I'll note that, generally speaking, we won't always want to skip the installation of Y, even if X is installed. This depends on whether

* Y is a build dependency
* X is an external

If Y is not a build dependency, then we probably still want to install it; the exception is if X is an external, in which case we assume that all dependencies are taken care of. Is that accounted for here?
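The rule described above can be sketched as a small predicate. This is only an illustration of the reviewer's stated conditions, not code from this PR: `should_install_dependency` and its parameters are hypothetical, and `dep_types` is assumed to be a set of Spack-style dependency type names such as "build", "link", and "run".

```python
def should_install_dependency(dep_types, parent_installed, parent_external):
    """Return True if dependency Y still needs installing for its parent X."""
    if parent_external:
        # Externals are assumed to have all dependencies taken care of.
        return False
    if not parent_installed:
        # X must still be built, so everything it depends on is needed.
        return True
    # X is installed: a build-only dependency is no longer needed, but a
    # dependency with any other type (link, run, ...) still is.
    return any(dep_type != "build" for dep_type in dep_types)


build_only = should_install_dependency({"build"}, True, False)
build_and_link = should_install_dependency({"build", "link"}, True, False)
```

Here `build_only` is False and `build_and_link` is True, matching the two cases distinguished in the comment.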

tldahlgren · 2020-05-28T19:14:39Z

> OK: I'll note that, generally speaking, we won't always want to skip the installation of Y, even if X is installed. This depends on whether
>
> * Y is a build dependency
> * X is an external
>
> If Y is not a build dependency, then we probably still want to install it; the exception is if X is an external, in which case we assume that all dependencies are taken care of. Is that accounted for here?

We were already skipping externals, though I would have to review to determine if the build was skipping the dependencies of externals. (That is an excellent point, thanks.)

Perhaps I misunderstood the change. @tgamblin What are your thoughts on this?

scheibelp

In addition to my concern at #16724 (comment), I have the following questions/requests. Let me know if any seem off-track.

scheibelp · 2020-05-28T17:54:55Z

lib/spack/spack/test/install.py

-        # If Package.install is called after this point, it will fail
-        pkg.succeed = False
-        pkg.do_install()
+    monkeypatch.setattr(spack.package.Package, 'remove_prefix',


I think it makes sense to use monkeypatch for remove_prefix - that is more in keeping with how methods are temporarily adjusted for other tests. I'm curious though which changes in this PR prevent setting pkg.succeed.

I noticed a number of differences between the tests here and those on develop, so I removed the commit that led to these test changes and am re-running the tests.


scheibelp · 2020-05-28T18:38:38Z

lib/spack/spack/installer.py

+                self._update_installed(spec.package)
+                tty.debug('Assuming dependencies of {0} are also installed'
+                          .format(spec.package.name))
+                for dep in spec.dependencies():


I'm concerned about a predicate function with side effects. Also, if we have packages X, Y, and Z such that

* X depends on Z
* Y depends on Z
* X is installed
* Y and Z are not installed

then I think marking Z as installed would not be the right decision.

Thanks for the counterexample. I will have to give this more thought.
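The counterexample can be worked through with a toy dependency graph. Everything below is a hypothetical sketch rather than the installer's code: `mark_deps_installed` imitates the rule under review for direct dependencies only, and `install_order` is a plain post-order planner.

```python
deps = {"X": ["Z"], "Y": ["Z"], "Z": []}
installed = {"X"}


def mark_deps_installed(deps, installed):
    """Treat every direct dependency of an installed package as installed."""
    assumed = set(installed)
    for pkg in installed:
        assumed.update(deps[pkg])
    return assumed


def install_order(root, deps, done):
    """Return the packages still to build for root, dependencies first."""
    order = []

    def visit(pkg):
        if pkg in done or pkg in order:
            return
        for dep in deps[pkg]:
            visit(dep)
        order.append(pkg)

    visit(root)
    return order


assumed = mark_deps_installed(deps, installed)
plan_with_marking = install_order("Y", deps, assumed)       # Z is skipped
plan_without_marking = install_order("Y", deps, installed)  # Z is built first
```

With the marking applied, the plan for Y omits Z even though Z was never installed, which is the situation the comment warns about.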

scheibelp · 2020-05-28T19:08:28Z

lib/spack/spack/installer.py

+            # Traverse dependencies from the bottom-up so any that are flagged
+            # as installed can be readily removed
+            for dep in self.pkg.spec.traverse(root=False,
+                                              predicate=not_installed):


I'm assuming that adding this predicate would be sufficient for avoiding the installation of dependencies when a dependent is installed (with the caveat mentioned elsewhere that generally we may still have to install non-build deps). What is the purpose of additionally marking the dependencies as installed (i.e. in the above not_installed function)?

IIRC the core issue spawning this work was the undesirable effect of dependencies of installed packages being installed.

So flagging dependencies of installed packages as being installed should help ensure there is no attempt to install the dependencies.

scheibelp · 2020-05-28T19:18:40Z

lib/spack/spack/installer.py

+                # uninstalled dependencies.  This is necessary at this point
+                # since bootstrap compilers may not be listed as dependents
+                # of packages.
+                task.flag_installed(self.installed)


If the priority is the number of uninstalled dependents, then I'm concerned there may be queue ordering issues if we have to readjust the priority at this point: ideally after any task is completed, the priority of all related tasks should be updated at that time.

In other words, the priority of tasks in the queue should be updated when we enqueue/complete tasks, not when we retrieve them.

Not sure I follow.

Queue ordering isn't solely dependent on the priority. BuildTasks are ordered by the tuple (# uninstalled dependencies, sequence), where sequence is a counter used to preserve the order in which the task is added (or re-added) to the queue.

Just to be clear, I believe the code is doing what you think it should in terms of build task priorities.

That is, the priority of a build task is the number of uninstalled dependencies at the time the task is added or re-queued, plus a sequence number. The build task of a dependent package is re-queued when it is determined that one of its dependencies has been "installed". The sequence number ensures that build tasks with the same number of uninstalled dependencies are sequenced by the order in which they were added to the queue. Sequencing the build tasks helps support parallel builds by allowing a process to cycle through those that have no uninstalled dependencies but are being built by other processes.
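As a rough illustration of the ordering described above, the sketch below keys a heap on the tuple (number of uninstalled dependencies, sequence). It is not the installer's implementation: `BuildQueue` and its methods are hypothetical, and discarding stale entries lazily on pop is a choice made for this example only.

```python
import heapq
import itertools


class BuildQueue:
    """Toy priority queue keyed on (uninstalled dependency count, sequence)."""

    def __init__(self):
        self._heap = []
        self._sequence = itertools.count()
        self._current = {}  # task name -> key of its live heap entry

    def push(self, name, num_uninstalled):
        # Re-pushing a name supersedes its earlier entry.
        key = (num_uninstalled, next(self._sequence))
        self._current[name] = key
        heapq.heappush(self._heap, (key, name))

    def pop(self):
        while self._heap:
            key, name = heapq.heappop(self._heap)
            if self._current.get(name) == key:  # skip superseded entries
                del self._current[name]
                return key, name
        raise IndexError("pop from empty build queue")


queue = BuildQueue()
queue.push("app", 2)   # waits on liba and libb
queue.push("liba", 0)
queue.push("libb", 0)

first = queue.pop()    # liba: zero uninstalled deps, added before libb
queue.push("app", 1)   # re-queue the dependent once liba is installed
second = queue.pop()   # libb
queue.push("app", 0)   # re-queue again once libb is installed
third = queue.pop()    # app, now with no uninstalled dependencies
```

Because the sequence number is unique, two tasks with the same dependency count are always ordered by when they were added or re-added, and the heap never has to compare task names.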

I'll need to read through this code again to refresh my understanding and see if I still have this concern.

If the task is being requeued when a dependency is installed, I think that would address my concern (and if that is already happening then perhaps I missed it - could you point me to where this happens?). I wasn't sure that was happening though because it didn't look like any requeueing was happening when an installation was finished.

scheibelp · 2020-05-28T19:21:31Z

lib/spack/spack/installer.py

            # Determine state of installation artifacts and adjust accordingly.
-            self._prepare_for_install(task, keep_prefix, keep_stage,
-                                      restage)
+            # Possession of a read lock is required/assumed.


Given that a read lock was just acquired, IMO this comment is unnecessary.

tldahlgren self-assigned this May 19, 2020

tldahlgren added WIP build labels May 19, 2020

tldahlgren changed the title ~~Snapshot of skipping installed packages and pre-order traversal~~ Skip rebuilding installed packages May 19, 2020

tldahlgren closed this May 21, 2020

tldahlgren force-pushed the features/skip-deps-when-installed branch from 9fe161f to 20b3e41 Compare May 21, 2020 01:11

tldahlgren added 3 commits May 27, 2020 19:28

Add a predicate check to traverse

74daf63

Use new traverse predicate; skip installed packages

35d0a66

Replace try-finally with monkeypatch to ensure unit tests always properly reset package attrs

95b9fae

tldahlgren reopened this May 28, 2020

tldahlgren removed the WIP label May 28, 2020

tldahlgren requested review from alalazo, scheibelp and tgamblin and removed request for alalazo May 28, 2020 17:21

scheibelp self-assigned this May 28, 2020

scheibelp requested changes May 28, 2020


Removed unused install_compilers argument and comment (per PR feedback)

46b3bc2

tldahlgren mentioned this pull request Sep 25, 2020

Support parallel environment builds #18131

Merged

11 tasks

tgamblin closed this in #18131 Nov 17, 2020
