Environments: Add support for include URLs#29026
Conversation
09dc184 to
83a0e4e
Compare
83a0e4e to
7553cdc
Compare
ea3bacf to
c21d2d1
Compare
b010a86 to
761b04e
Compare
500e110 to
bb676ef
Compare
|
I wasn't sure when I first saw this, but the more I think about it the more I love this idea. It has the feel of nix flakes almost. Any chance of a convenient way to give it a git url to clone or checkout a given revision to grab the file from and build resulting components from? It seems like pointing to a github raw URL would probably work to fetch things, but then it would have to have the information in two places in a way, so made me wonder. |
The idea was to keep it simple. Allow the configuration information to be maintained in a repo, for instance, so we can save people from having to fetch and store it locally before they can re-use it.
I could see the convenience of having Spack build software from an environment file (and associated configuration if suitably configured) that it clones from a URL. But IMO that would be the "next" step and another PR. What did you have in mind? |
ffe1ec6 to
7b014e8
Compare
scheibelp
left a comment
There was a problem hiding this comment.
I have questions/comments about the overall approach (so I'm
avoiding commenting on the individual changes for now):
Firstly, I assume this is just for SingleFileScopes (i.e. does not
support URLs that point to a directory of configuration files).
Secondly, these changes seem more extensive than I would expect: I would
assume you could modify Environment.included_config_scopes to download
a URL to a temporary file and then consume it like any other
SingleFileScope e.g. in included_config_scopes instead of:
local_path = url.path if url.scheme == 'file' else None
do
# download url to some temporary file
local_path = the_temporary_file_path
In that case, you don't have to "prepare" for cases where config was
previously a path but in the current changes could be a URL (e.g. in
config.read_config_file).
8c3e1f1 to
44eab25
Compare
e4dfcf4 to
b681499
Compare
|
@tgamblin @scheibelp The only check that's failing ATM are the windows tests. |
scheibelp
left a comment
There was a problem hiding this comment.
I have some requests related to:
- Use of real URLs (and internet communication) as part of the tests
- Possible bug in
raw_urland also a potential issue withfetch_remote_configs
Let me know if any of this seems off-track.
I was using real URLs but @tgamblin said not to use them, or he didn't like that I could modify the ones I have for that Purpose.
Ooh. I'll have to check.
|
|
@spackbot fix style |
|
Let me see if I can fix that for you! |
|
I was able to run spack style --fix==> Running style checks on spack
selected: isort, mypy, black, flake8
==> Modified files
lib/spack/spack/binary_distribution.py
lib/spack/spack/cmd/buildcache.py
lib/spack/spack/config.py
lib/spack/spack/environment/environment.py
lib/spack/spack/fetch_strategy.py
lib/spack/spack/package_base.py
lib/spack/spack/stage.py
lib/spack/spack/test/bootstrap.py
lib/spack/spack/test/cmd/env.py
lib/spack/spack/test/cmd/spec.py
lib/spack/spack/test/config.py
lib/spack/spack/test/conftest.py
lib/spack/spack/test/gcs_fetch.py
lib/spack/spack/test/packaging.py
lib/spack/spack/test/s3_fetch.py
lib/spack/spack/test/stage.py
lib/spack/spack/test/url_fetch.py
lib/spack/spack/util/path.py
lib/spack/spack/util/url.py
lib/spack/spack/util/web.py
==> Running isort checks
Fixing /tmp/tmps69v_yt9/spack/lib/spack/spack/test/config.py
Fixing /tmp/tmps69v_yt9/spack/lib/spack/spack/test/conftest.py
isort checks were clean
==> Running mypy checks
Success: no issues found in 555 source files
mypy checks were clean
==> Running black checks
reformatted lib/spack/spack/config.py
reformatted lib/spack/spack/fetch_strategy.py
reformatted lib/spack/spack/environment/environment.py
reformatted lib/spack/spack/util/web.py
All done! ✨ 🍰 ✨
4 files reformatted, 16 files left unchanged.
black checks were clean
==> Running flake8 checks
flake8 checks were clean
==> spack style checks were clean
I've updated the branch with isort fixes. |
|
ping @scheibelp |
1 similar comment
|
ping @scheibelp |
scheibelp
left a comment
There was a problem hiding this comment.
This looks nearly done. My only concerns are related to documentation and fail-fast behavior. Let me know if you disagree with any of the comments.
| else [] | ||
| ) | ||
| basename = os.path.basename(config_path) | ||
| if basename in staged_configs: |
There was a problem hiding this comment.
It looks like this now allows multiple URL-based includes as long as:
- They are for different sections
- They are distinct directories
So for example it looks like if you have:
* include: http://example.com/1/packages.yaml
* include: http://example.com/2/packages.yaml
that only the first would be used. This is fine as long as it is documented.
There was a problem hiding this comment.
Are you suggesting a directory hierarchy for the staged files would be preferred?
There was a problem hiding this comment.
Sorry no I didn't mean to imply any particular solution. If we wanted to change this, then one possibility is that you could create a new filename that is a hash of the URL path (we do similar things e.g. for package source downloads to avoid collisions.
To be clear though changing the behavior is unnecessary here: my original request is just that this is clarified in the documentation about environment includes.
There was a problem hiding this comment.
Modified the comment.
lib/spack/spack/test/cmd/env.py
Outdated
|
|
||
| def test_env_with_included_config_scope(tmpdir, packages_file): | ||
| """Test inclusion of a 'remote' package configuration file 'staged' in the | ||
| environment's configuration stage directory.""" |
There was a problem hiding this comment.
I'm having trouble relating this docstring to this test:
- I don't see how this is remote: it's a file (and in particular it's not a URL)
- I assume the file might become staged as part of the process but not sure if that's an essential detail (however the environment wants to manage this "under the hood" is fine, not something that needs to be explained here).
- That being said, when I read through the code in
environment.py, I don't see local files that exist being placed in the config stage directory (that only appears to happen for URLs).
There was a problem hiding this comment.
I was trying to convey that this test is intended to represent the case of including a previously staged (from a remote URL) package configuration file. Will re-word.
Local files are not staged since they are already available on the file system so their existing file path is sufficient.
There was a problem hiding this comment.
Modified the comment.
97dc373 to
a172b95
Compare
|
@spackbot fix style |
|
Let me see if I can fix that for you! |
|
I was able to run spack style --fix==> Running style checks on spack
selected: isort, mypy, black, flake8
==> Modified files
lib/spack/spack/binary_distribution.py
lib/spack/spack/cmd/buildcache.py
lib/spack/spack/config.py
lib/spack/spack/environment/environment.py
lib/spack/spack/fetch_strategy.py
lib/spack/spack/package_base.py
lib/spack/spack/stage.py
lib/spack/spack/test/bootstrap.py
lib/spack/spack/test/cmd/env.py
lib/spack/spack/test/cmd/spec.py
lib/spack/spack/test/config.py
lib/spack/spack/test/conftest.py
lib/spack/spack/test/gcs_fetch.py
lib/spack/spack/test/packaging.py
lib/spack/spack/test/s3_fetch.py
lib/spack/spack/test/stage.py
lib/spack/spack/test/url_fetch.py
lib/spack/spack/util/path.py
lib/spack/spack/util/url.py
lib/spack/spack/util/web.py
==> Running isort checks
isort checks were clean
==> Running mypy checks
Success: no issues found in 558 source files
mypy checks were clean
==> Running black checks
reformatted lib/spack/spack/environment/environment.py
All done! ✨ 🍰 ✨
1 file reformatted, 19 files left unchanged.
black checks were clean
==> Running flake8 checks
flake8 checks were clean
==> spack style checks were clean
I've updated the branch with isort fixes. |
This involves staging remote configuration files to a subdir of the environment. At this time, each remote file is staged ONCE to avoid overwriting user customizations to the configuration file.
6af25b2 to
1d043c7
Compare
1d043c7 to
3af08ca
Compare
3af08ca to
73ba0eb
Compare
scheibelp
left a comment
There was a problem hiding this comment.
I think there may have been an issue with naming tests, but otherwise this looks good to me (see comment).
* Preliminary support for include URLs in spack.yaml (environment) files This commit adds support in environments for external configuration files obtained from a URL with a preference for grabbing raw text from GitHub and gitlab for efficient downloads of the relevant files. The URL can also be a link to a directory that contains multiple configuration files. Remote configuration files are retrieved and cached for the environment. Configuration files with the same name will not be overwritten once cached.
| errors = [] | ||
| for url in self.candidate_urls: | ||
| if not self._existing_url(url): | ||
| if not web_util.url_exists(url, self.curl): |
There was a problem hiding this comment.
I think using self.curl triggers:
spack/lib/spack/spack/fetch_strategy.py
Lines 295 to 302 in 99fcc57
which in turn has the required=True argument. I didn't test it, but by just reading the code I think this might be the cause of the behavior reported in #34029 (comment) @trws @tldahlgren
There was a problem hiding this comment.
If the above is correct, this PR makes curl a requirement even if Spack is configured not to use it.
Fixes #28840
FYI. @adrienbernede
This PR adds support in environments for external configuration files obtained from a URL with a preference for grabbing raw text from
GitHubandgitlabfor efficient downloads of the relevant files.Spack currently assumes these files reside on the local file system but projects and facilities put a lot of effort into defining standard, or base, configuration files for their systems. Spack should make it easier to use configuration files fetched from the web.
Some have system-specific configurations, more inclusive environment files, or both. Examples can be found at:
You can provide the URL for a folder, which will result in all
.yamlfiles directly under it to be cached (under the environment), or a specific (.yaml) file. Consequently, this PR works for managed/named and independent environments.See https://spack.readthedocs.io/en/latest/environments.html#included-configurations for use.
STAGING/CACHING
This PR will not re-stage or overwrite configuration files with the same name if they have already been cached. This will ensure the configuration is unchanged and, should someone intentionally modify their cached file(s), any local changes are not replaced.
TBD
spack.*files and copied over.)TODO: