Skip to content

Bootstrap most of Spack dependencies using environments#34029

Merged
alalazo merged 23 commits intospack:developfrom
alalazo:bootstrap/refactor-dev-dependencies
Dec 6, 2022
Merged

Bootstrap most of Spack dependencies using environments#34029
alalazo merged 23 commits intospack:developfrom
alalazo:bootstrap/refactor-dev-dependencies

Conversation

@alalazo
Copy link
Copy Markdown
Member

@alalazo alalazo commented Nov 21, 2022

The main objective of this PR is to rework the bootstrapping procedure to use Spack environments as much as possible.

While doing so, the spack.bootstrap module has been reorganized into a Python package. For reviewers, the new code is under spack.bootstrap.environment (most of the other changes are just adding docs or linting).

In this PR a distinction is made among "core" Spack dependencies (clingo, GnuPG, patchelf) and other dependencies. For a number of reasons, explained in the spack.bootstrap.core module docstring, "core" dependencies are bootstrapped with the current ad-hoc method. All the other dependencies are instead bootstrapped using a Spack environment that lives in a directory specific to the interpreter and the architecture being used. An example of the directory structure is:

$ tree $(spack bootstrap root) -L 3
/home/culpo/.spack/bootstrap
├── config
│   └── linux
│       ├── bootstrap.yaml
│       ├── compilers.yaml
│       └── config.yaml
├── environments
│   └── python3.8-x86_64-57f95
│       ├── Makefile
│       ├── spack.lock
│       ├── spack.yaml
│       └── view -> /home/culpo/.spack/bootstrap/environments/python3.8-x86_64-57f95/._view/dgsyi3525yl3vzeakomxqrekchzd7l6s
└── store
    ├── bin
    │   └── sbang
    ├── linux-centos7-x86_64
    │   └── gcc-10.2.1
    └── linux-ubuntu20.04-x86_64
        └── gcc-9.4.0

The bootstrapping output has been reduced and now looks like:
Screenshot from 2022-11-21 07-00-59

Modifications:

  • Bootstrap most of Spack dependencies using a Spack environment
  • Add pytest to the list of bootstrapped dependencies
  • Reduced verbosity of bootstrapping output
  • Added a --dev argument to spack bootstrap now
  • Reorganized spack.bootstrap into a package and linted it

@spackbot-app spackbot-app bot added commands core PR affects Spack core functionality environments tests General test capability(ies) utilities labels Nov 21, 2022
@alalazo
Copy link
Copy Markdown
Member Author

alalazo commented Nov 21, 2022

A few notes in addition to the description:

  1. This PR makes it possible to set the bootstrap store and the user store in the same place. I still think it's a very bad idea to mix installations needed by Spack for its own working with installations needed by the user, but if people want to experiment with that it is now possible (@psakievich might be interested given the discussions during community meetings)
  2. After this PR is in, we can delete pytest from the Python packages we vendor (see Use "vendoring" to maintain Spack's pure Python dependencies #33951)
  3. The default environment modifications in
    def update_syspath_and_environ(self):
    """Update ``sys.path`` and the PATH, PYTHONPATH environment variables to point to
    the environment view.
    """
    # Do minimal modifications to sys.path and environment variables. In particular, pay
    # attention to have the smallest PYTHONPATH / sys.path possible, since that may impact
    # the performance of the current interpreter
    sys.path.extend(self.pythonpaths())
    os.environ["PATH"] = os.pathsep.join(
    [str(x) for x in self.bin_dirs()] + os.environ.get("PATH", "").split(os.pathsep)
    )
    os.environ["PYTHONPATH"] = os.pathsep.join(
    os.environ.get("PYTHONPATH", "").split(os.pathsep)
    + [str(x) for x in self.pythonpaths()]
    )
    are manual, since it seems environments by default add both views and "store" paths. In the case of PYTHONPATH this can result in a large number of directories to be searched, which clutters the environment and may slow-down Python.
  4. To bootstrap on rhel8 with platform-python and no Python headers we require both d5142d1 (install 2 Python extensions from wheels, otherwise we'd fail on headers being missing) and 2ca9680 (fall back on the Python command to be /usr/libexec/platform-python if a series of unfortunate circumstances occur 🙂 ). Installing from wheels is needed for 2 mypy dependencies, otherwise so far all the other Python packages we depend on are pure Python code.

@psakievich
Copy link
Copy Markdown
Contributor

@alalazo I agree with you that there are a lot of dangers in mixing the stores. I would primarily like to reuse the bootstrapped dependencies from spack style and soon to be spack unit-test. I'm not totally sure of what else is involved beyond those and the "core" dependencies you've outlined in the PR description. I don't have the time to really dig into a PR this big right now so I'll ask a couple of questions instead.

  1. Are the "core" dependencies (clingo, patchelf, GnuPG) included in the store mixing? I think it would be best to keep those in a separate store/install_tree, and keep them off limits. From reading the comments and doc string I believe that is correct?
  2. How is the "store mixing" done? Will spack uninstall --all blow away the bootstrap store, or will it be treated like an upstream with some level of additional protection? I guess if it's just some python packages that get blown away then I can live with that. If it is something more fundamental, then that is less ideal.

@alalazo
Copy link
Copy Markdown
Member Author

alalazo commented Nov 22, 2022

@psakievich

I'm not totally sure of what else is involved beyond those and the "core" dependencies you've outlined in the PR description.

"""Bootstrap Spack core dependencies from binaries.
This module contains logic to bootstrap software required by Spack from binaries served in the
bootstrapping mirrors. The logic is quite different from an installation done from a Spack user,
because of the following reasons:
1. The binaries are all compiled on the same OS for a given platform (e.g. they are compiled on
``centos7`` on ``linux``), but they will be installed and used on the host OS. They are also
targeted at the most generic architecture possible. That makes the binaries difficult to reuse
with other specs in an environment without ad-hoc logic.
2. Bootstrapping has a fallback procedure where we try to install software by default from the
most recent binaries, and proceed to older versions of the mirror, until we try building from
sources as a last resort. This allows us not to be blocked on architectures where we don't
have binaries readily available, but is also not compatible with the working of environments
(they don't have fallback procedures).
3. Among the binaries we have clingo, so we can't concretize that with clingo :-)
4. clingo, GnuPG and patchelf binaries need to be verified by sha256 sum (all the other binaries
we might add on top of that in principle can be verified with GPG signatures).
"""

Are the "core" dependencies (clingo, patchelf, GnuPG) included in the store mixing? [ ... ]

To be clear:

  1. There is no mixing by default, things remain exactly as on develop with that respect. I just removed a check that was preventing people from setting the "bootstrapping store" and config:install_tree:root to the same place.
  2. The mixing is done as you like, if you want to experiment with that. You can use the bootstrap store as an upstream, or you can set both stores to the same place.1

Footnotes

  1. I wouldn't recommend any of the two, but they are possible

@alalazo alalazo requested a review from haampie November 22, 2022 15:06
"-C",
str(self.environment_root()),
"-j",
str(spack.build_environment.determine_number_of_jobs(parallel=True)),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One issue here is that it will regenerate the view for every package being installed

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in any case much faster than a serial installation, so I guess we can optimize later to generate a depfile that can install without generating the view?1

Footnotes

  1. It's also ugly that we have to spawn another process to generate the Makefile, but given the difference in performance I think it's preferable to env.install_all().

@trws trws self-assigned this Nov 22, 2022
@alalazo alalazo force-pushed the bootstrap/refactor-dev-dependencies branch from 3a3ebef to c22012b Compare November 23, 2022 12:01
@alalazo alalazo force-pushed the bootstrap/refactor-dev-dependencies branch from 753d836 to f415929 Compare November 23, 2022 14:02
Comment on lines +152 to +156
with spack.environment.no_active_environment():
with spack.platforms.prevent_cray_detection():
with spack.platforms.use_platform(spack.platforms.real_host()):
with spack.repo.use_repositories(spack.paths.packages_path):
with spack.store.use_store(bootstrap_store_path):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thoughts on maybe collapsing this to a single with statement with multiple values? The semantics are the same, but avoiding the extra four levels of nesting might be nice.

@trws
Copy link
Copy Markdown
Contributor

trws commented Nov 29, 2022

All my testing shows this working well, mac, linux, arm, x86, I have yet to find a problem with bootstrapping. The one thing that struck me as a bit odd though, I'm getting output describing what's going to be built after it's built rather than before but I'm not seeing obviously why that is:

==> clingo-bootstrap: Successfully installed clingo-bootstrap-spack-ls7pgf6b6ytnlgtgim4cpdsfvm4hrmqo
  Stage: 0.33s.  Cmake: 2.03s.  Build: 28.68s.  Install: 0.55s.  Total: 31.73s
[+] /Users/scogland1/.spack/bootstrap/store/darwin-monterey-aarch64/apple-clang-14.0.0/clingo-bootstrap-spack-ls7pgf6b6ytnlgtgim4cpdsfvm4hrmqo
==> [BOOTSTRAPPING] Spack has missing dependencies, creating a bootstrapping environment
==> [BOOTSTRAPPING] Installing dependencies ([email protected]:, [email protected]:, py-black, py-flake8, py-pytest)

@alalazo
Copy link
Copy Markdown
Member Author

alalazo commented Nov 30, 2022

Took me a while to find, but the unit tests failing here are because some environment modification leak, fixed in b8f0964, of tests running before them. The unit tests have been modified in #30882

The one thing that struck me as a bit odd though, I'm getting output describing what's going to be built after it's built rather than before but I'm not seeing obviously why that is: [ ... ]

That's because clingo, patchelf and GnuPG are treated differently. We first bootstrap them with an ad-hoc procedure, then use the environment for everything else (currently [email protected]:, [email protected]:, py-black, py-flake8, py-pytest) - so what you see is the bootstrapping of "core" software before the environment.

@alalazo alalazo requested a review from trws November 30, 2022 12:30
@trws
Copy link
Copy Markdown
Contributor

trws commented Nov 30, 2022

Got it, doing some more testing, but I'm reminded of a related bootstrap question. Do we need to keep using a development version of clingo? As it is, we're forcing an increase in build dependencies to do it, is there no release new enough to get rid of the re2c and bison dependencies? Especially with an external cmake and python available, the size of the build set is substantially larger because of it:

# spack version
clingo-bootstrap@spack%[email protected]~docs~ipo+python~static_libstdcpp build_system=cmake build_type=Release arch=darwin-monterey-m1
    ^[email protected]%[email protected] build_system=autotools arch=darwin-monterey-m1
        ^[email protected]%[email protected] build_system=autotools arch=darwin-monterey-m1
            ^[email protected]%[email protected] build_system=autotools libs=shared,static arch=darwin-monterey-m1
        ^gnuconfig@2022-09-17%[email protected] build_system=generic arch=darwin-monterey-m1
        ^[email protected]%[email protected]+sigsegv build_system=autotools patches=9dc5fbd,bfdffa7 arch=darwin-monterey-m1
            ^[email protected]%[email protected] build_system=autotools arch=darwin-monterey-m1
        ^[email protected]%[email protected]+cpanm+shared+threads build_system=generic arch=darwin-monterey-m1
            ^[email protected]%[email protected]+cxx~docs+stl build_system=autotools patches=26090f4,b231fcc arch=darwin-monterey-m1
            ^[email protected]%[email protected]~debug~pic+shared build_system=generic arch=darwin-monterey-m1
            ^[email protected]%[email protected] build_system=autotools arch=darwin-monterey-m1
                ^[email protected]%[email protected] build_system=autotools arch=darwin-monterey-m1
                    ^[email protected]%[email protected]~symlinks+termlib abi=none build_system=autotools arch=darwin-monterey-m1
                        ^[email protected]%[email protected] build_system=autotools arch=darwin-monterey-m1
            ^[email protected]%[email protected]+optimize+pic+shared build_system=makefile arch=darwin-monterey-m1
    ^[email protected]%[email protected]~doc+ncurses+ownlibs~qt build_system=generic build_type=Release arch=darwin-monterey-m1
    ^[email protected]%[email protected]+bz2+crypt+ctypes+dbm~debug+libxml2+lzma+nis~optimizations+pic+pyexpat~pythoncmd+readline+shared+sqlite3+ssl+tix+tkinter+uuid+zlib build_system=generic patches=0d98e93,f2fd060 arch=darwin-monterey-m1
    ^[email protected]%[email protected] build_system=generic arch=darwin-monterey-m1

# release version
[email protected]%[email protected]~docs~ipo+python~static_libstdcpp build_system=cmake build_type=Release arch=darwin-monterey-m1
    ^[email protected]%[email protected]~doc+ncurses+ownlibs~qt build_system=generic build_type=Release arch=darwin-monterey-m1
    ^[email protected]%[email protected] build_system=python_pip arch=darwin-monterey-m1
        ^[email protected]%[email protected] build_system=autotools arch=darwin-monterey-m1
            ^gnuconfig@2022-09-17%[email protected] build_system=generic arch=darwin-monterey-m1
        ^[email protected]%[email protected] build_system=autotools arch=darwin-monterey-m1
        ^[email protected]%[email protected] build_system=generic arch=darwin-monterey-m1
        ^[email protected]%[email protected] build_system=python_pip arch=darwin-monterey-m1
        ^[email protected]%[email protected] build_system=generic arch=darwin-monterey-m1
        ^[email protected]%[email protected] build_system=generic arch=darwin-monterey-m1
    ^[email protected]%[email protected]+bz2+crypt+ctypes+dbm~debug+libxml2+lzma+nis~optimizations+pic+pyexpat~pythoncmd+readline+shared+sqlite3+ssl+tix+tkinter+uuid+zlib build_system=generic patches=0d98e93,f2fd060 arch=darwin-monterey-m1

@alalazo alalazo force-pushed the bootstrap/refactor-dev-dependencies branch from b8f0964 to 49df9dc Compare December 5, 2022 21:51
@alalazo
Copy link
Copy Markdown
Member Author

alalazo commented Dec 5, 2022

@trws Is there any further request for changes?

@trws
Copy link
Copy Markdown
Contributor

trws commented Dec 5, 2022

I don't see any more problems with this, but now that core bootstrap is split out it looks like it should be a great deal easier to solve the bootstrap locking problem. In fact I think you have solved it for everything but the core set because of the way the environment install works. Would you be willing to take a stab at adding locking to the bootstrap.core end, perhaps shortly after this merges?

@alalazo
Copy link
Copy Markdown
Member Author

alalazo commented Dec 6, 2022

Would you be willing to take a stab at adding locking to the bootstrap.core end, perhaps shortly after this merges?

Sure!

@alalazo
Copy link
Copy Markdown
Member Author

alalazo commented Dec 6, 2022

@spackbot run pipeline

@spackbot-app
Copy link
Copy Markdown

spackbot-app bot commented Dec 6, 2022

I've started that pipeline for you!

@alalazo
Copy link
Copy Markdown
Member Author

alalazo commented Dec 6, 2022

Merging, since the spec only that is reported broken is also broken in develop, see https://gitlab.spack.io/spack/spack/-/jobs/4817633

@alalazo
Copy link
Copy Markdown
Member Author

alalazo commented Dec 6, 2022

I'll work on fixing that spec next. For the record the pipeline finished here https://gitlab.spack.io/spack/spack/-/pipelines/233092 and so far never reported back to the PR status:
Screenshot from 2022-12-06 11-51-05

@alalazo alalazo merged commit 4c05fe5 into spack:develop Dec 6, 2022
@alalazo alalazo deleted the bootstrap/refactor-dev-dependencies branch December 6, 2022 10:54
amd-toolchain-support pushed a commit to amd-toolchain-support/spack that referenced this pull request Feb 16, 2023
This commit reworks the bootstrapping procedure to use Spack environments 
as much as possible.

The `spack.bootstrap` module has also been reorganized into a Python package. 
A distinction is made among "core" Spack dependencies (clingo, GnuPG, patchelf)
and other dependencies. For a number of reasons, explained in the `spack.bootstrap.core`
module docstring, "core" dependencies are bootstrapped with the current ad-hoc
method. 

All the other dependencies are instead bootstrapped using a Spack environment
that lives in a directory specific to the interpreter and the architecture being used.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants