Skip to content

Environments: Add support for include URLs#29026

Merged
tldahlgren merged 4 commits intospack:developfrom
tldahlgren:add-named-config-support
Aug 30, 2022
Merged

Environments: Add support for include URLs#29026
tldahlgren merged 4 commits intospack:developfrom
tldahlgren:add-named-config-support

Conversation

@tldahlgren
Copy link
Copy Markdown
Contributor

@tldahlgren tldahlgren commented Feb 17, 2022

Fixes #28840

FYI. @adrienbernede

This PR adds support in environments for external configuration files obtained from a URL with a preference for grabbing raw text from GitHub and gitlab for efficient downloads of the relevant files.

Spack currently assumes these files reside on the local file system but projects and facilities put a lot of effort into defining standard, or base, configuration files for their systems. Spack should make it easier to use configuration files fetched from the web.

Some have system-specific configurations, more inclusive environment files, or both. Examples can be found at:

You can provide the URL for a folder, which will result in all .yaml files directly under it to be cached (under the environment), or a specific (.yaml) file. Consequently, this PR works for managed/named and independent environments.

See https://spack.readthedocs.io/en/latest/environments.html#included-configurations for use.

STAGING/CACHING
This PR will not re-stage or overwrite configuration files with the same name if they have already been cached. This will ensure the configuration is unchanged and, should someone intentionally modify their cached file(s), any local changes are not replaced.

TBD

  • What changes are needed to support sharing environments from open to closed networks? (Note the environment's configuration subdirectory could be archived with the spack.* files and copied over.)
  • What would be the syntax for including a repository URL and identifying the relevant configuration file(s)?

TODO:

@spackbot-app spackbot-app bot added the tests General test capability(ies) label Feb 17, 2022
@tldahlgren tldahlgren force-pushed the add-named-config-support branch from 09dc184 to 83a0e4e Compare February 25, 2022 03:38
@tldahlgren tldahlgren force-pushed the add-named-config-support branch from 83a0e4e to 7553cdc Compare February 26, 2022 02:18
@tldahlgren tldahlgren force-pushed the add-named-config-support branch 4 times, most recently from ea3bacf to c21d2d1 Compare March 2, 2022 00:01
@tldahlgren tldahlgren force-pushed the add-named-config-support branch from b010a86 to 761b04e Compare March 12, 2022 00:17
@tldahlgren tldahlgren force-pushed the add-named-config-support branch 2 times, most recently from 500e110 to bb676ef Compare March 25, 2022 01:41
@tldahlgren tldahlgren requested a review from scheibelp March 28, 2022 18:02
@scheibelp scheibelp self-assigned this Mar 28, 2022
@trws
Copy link
Copy Markdown
Contributor

trws commented Mar 30, 2022

I wasn't sure when I first saw this, but the more I think about it the more I love this idea. It has the feel of nix flakes almost. Any chance of a convenient way to give it a git url to clone or checkout a given revision to grab the file from and build resulting components from? It seems like pointing to a github raw URL would probably work to fetch things, but then it would have to have the information in two places in a way, so made me wonder.

@tldahlgren
Copy link
Copy Markdown
Contributor Author

I wasn't sure when I first saw this, but the more I think about it the more I love this idea. It has the feel of nix flakes almost. Any chance of a convenient way to give it a git url to clone or checkout a given revision to grab the file from and build resulting components from? It seems like pointing to a github raw URL would probably work to fetch things, but then it would have to have the information in two places in a way, so made me wonder.

The idea was to keep it simple. Allow the configuration information to be maintained in a repo, for instance, so we can save people from having to fetch and store it locally before they can re-use it.

Any chance of a convenient way to give it a git url to clone or checkout a given revision to grab the file from and build resulting components from?

I could see the convenience of having Spack build software from an environment file (and associated configuration if suitably configured) that it clones from a URL. But IMO that would be the "next" step and another PR.

What did you have in mind?

@tldahlgren tldahlgren force-pushed the add-named-config-support branch from ffe1ec6 to 7b014e8 Compare March 31, 2022 23:05
Copy link
Copy Markdown
Member

@scheibelp scheibelp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have questions/comments about the overall approach (so I'm
avoiding commenting on the individual changes for now):

Firstly, I assume this is just for SingleFileScopes (i.e. does not
support URLs that point to a directory of configuration files).

Secondly, these changes seem more extensive than I would expect: I would
assume you could modify Environment.included_config_scopes to download
a URL to a temporary file and then consume it like any other
SingleFileScope e.g. in included_config_scopes instead of:

local_path = url.path if url.scheme == 'file' else None

do

# download url to some temporary file
local_path = the_temporary_file_path

In that case, you don't have to "prepare" for cases where config was
previously a path but in the current changes could be a URL (e.g. in
config.read_config_file).

@tldahlgren tldahlgren force-pushed the add-named-config-support branch 3 times, most recently from 8c3e1f1 to 44eab25 Compare May 12, 2022 22:55
@tldahlgren tldahlgren marked this pull request as ready for review May 12, 2022 22:57
@tldahlgren tldahlgren force-pushed the add-named-config-support branch 2 times, most recently from e4dfcf4 to b681499 Compare May 18, 2022 00:48
@tldahlgren
Copy link
Copy Markdown
Contributor Author

@tgamblin @scheibelp The only check that's failing ATM are the windows tests.

@tldahlgren tldahlgren requested review from scheibelp and tgamblin May 19, 2022 00:04
Copy link
Copy Markdown
Member

@scheibelp scheibelp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some requests related to:

  • Use of real URLs (and internet communication) as part of the tests
  • Possible bug in raw_url and also a potential issue with fetch_remote_configs

Let me know if any of this seems off-track.

@tldahlgren
Copy link
Copy Markdown
Contributor Author

I have some requests related to:

  • Use of real URLs (and internet communication) as part of the tests

I was using real URLs but @tgamblin said not to use them, or he didn't like that
I skipped them is not connected to internet.

I could modify the ones I have for that Purpose.

  • Possible bug in raw_url and also a potential issue with fetch_remote_configs

Ooh. I'll have to check.

Let me know if any of this seems off-track.

@tldahlgren
Copy link
Copy Markdown
Contributor Author

@spackbot fix style

@spackbot-app
Copy link
Copy Markdown

spackbot-app bot commented Aug 10, 2022

Let me see if I can fix that for you!

@spackbot-app
Copy link
Copy Markdown

spackbot-app bot commented Aug 10, 2022

I was able to run spack style --fix for you!

spack style --fix
==> Running style checks on spack
  selected: isort, mypy, black, flake8
==> Modified files
  lib/spack/spack/binary_distribution.py
  lib/spack/spack/cmd/buildcache.py
  lib/spack/spack/config.py
  lib/spack/spack/environment/environment.py
  lib/spack/spack/fetch_strategy.py
  lib/spack/spack/package_base.py
  lib/spack/spack/stage.py
  lib/spack/spack/test/bootstrap.py
  lib/spack/spack/test/cmd/env.py
  lib/spack/spack/test/cmd/spec.py
  lib/spack/spack/test/config.py
  lib/spack/spack/test/conftest.py
  lib/spack/spack/test/gcs_fetch.py
  lib/spack/spack/test/packaging.py
  lib/spack/spack/test/s3_fetch.py
  lib/spack/spack/test/stage.py
  lib/spack/spack/test/url_fetch.py
  lib/spack/spack/util/path.py
  lib/spack/spack/util/url.py
  lib/spack/spack/util/web.py
==> Running isort checks
Fixing /tmp/tmps69v_yt9/spack/lib/spack/spack/test/config.py
Fixing /tmp/tmps69v_yt9/spack/lib/spack/spack/test/conftest.py
  isort checks were clean
==> Running mypy checks
Success: no issues found in 555 source files
  mypy checks were clean
==> Running black checks
reformatted lib/spack/spack/config.py
reformatted lib/spack/spack/fetch_strategy.py
reformatted lib/spack/spack/environment/environment.py
reformatted lib/spack/spack/util/web.py
All done! ✨ 🍰 ✨
4 files reformatted, 16 files left unchanged.
  black checks were clean
==> Running flake8 checks
  flake8 checks were clean
==> spack style checks were clean
Keep in mind that I cannot fix your flake8 or mypy errors, so if you have any you'll need to fix them and update the pull request. If I was able to push to your branch, if you make further changes you will need to pull from your updated branch before pushing again.

I've updated the branch with isort fixes.

@tldahlgren
Copy link
Copy Markdown
Contributor Author

ping @scheibelp

1 similar comment
@tldahlgren
Copy link
Copy Markdown
Contributor Author

ping @scheibelp

Copy link
Copy Markdown
Member

@scheibelp scheibelp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks nearly done. My only concerns are related to documentation and fail-fast behavior. Let me know if you disagree with any of the comments.

else []
)
basename = os.path.basename(config_path)
if basename in staged_configs:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this now allows multiple URL-based includes as long as:

  • They are for different sections
  • They are distinct directories

So for example it looks like if you have:

* include: http://example.com/1/packages.yaml
* include: http://example.com/2/packages.yaml

that only the first would be used. This is fine as long as it is documented.

Copy link
Copy Markdown
Contributor Author

@tldahlgren tldahlgren Aug 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you suggesting a directory hierarchy for the staged files would be preferred?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry no I didn't mean to imply any particular solution. If we wanted to change this, then one possibility is that you could create a new filename that is a hash of the URL path (we do similar things e.g. for package source downloads to avoid collisions.

To be clear though changing the behavior is unnecessary here: my original request is just that this is clarified in the documentation about environment includes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified the comment.


def test_env_with_included_config_scope(tmpdir, packages_file):
"""Test inclusion of a 'remote' package configuration file 'staged' in the
environment's configuration stage directory."""
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having trouble relating this docstring to this test:

  • I don't see how this is remote: it's a file (and in particular it's not a URL)
  • I assume the file might become staged as part of the process but not sure if that's an essential detail (however the environment wants to manage this "under the hood" is fine, not something that needs to be explained here).
  • That being said, when I read through the code in environment.py, I don't see local files that exist being placed in the config stage directory (that only appears to happen for URLs).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to convey that this test is intended to represent the case of including a previously staged (from a remote URL) package configuration file. Will re-word.

Local files are not staged since they are already available on the file system so their existing file path is sufficient.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified the comment.

@tldahlgren tldahlgren force-pushed the add-named-config-support branch from 97dc373 to a172b95 Compare August 26, 2022 00:34
@tldahlgren
Copy link
Copy Markdown
Contributor Author

@spackbot fix style

@spackbot-app
Copy link
Copy Markdown

spackbot-app bot commented Aug 26, 2022

Let me see if I can fix that for you!

@spackbot-app
Copy link
Copy Markdown

spackbot-app bot commented Aug 26, 2022

I was able to run spack style --fix for you!

spack style --fix
==> Running style checks on spack
  selected: isort, mypy, black, flake8
==> Modified files
  lib/spack/spack/binary_distribution.py
  lib/spack/spack/cmd/buildcache.py
  lib/spack/spack/config.py
  lib/spack/spack/environment/environment.py
  lib/spack/spack/fetch_strategy.py
  lib/spack/spack/package_base.py
  lib/spack/spack/stage.py
  lib/spack/spack/test/bootstrap.py
  lib/spack/spack/test/cmd/env.py
  lib/spack/spack/test/cmd/spec.py
  lib/spack/spack/test/config.py
  lib/spack/spack/test/conftest.py
  lib/spack/spack/test/gcs_fetch.py
  lib/spack/spack/test/packaging.py
  lib/spack/spack/test/s3_fetch.py
  lib/spack/spack/test/stage.py
  lib/spack/spack/test/url_fetch.py
  lib/spack/spack/util/path.py
  lib/spack/spack/util/url.py
  lib/spack/spack/util/web.py
==> Running isort checks
  isort checks were clean
==> Running mypy checks
Success: no issues found in 558 source files
  mypy checks were clean
==> Running black checks
reformatted lib/spack/spack/environment/environment.py
All done! ✨ 🍰 ✨
1 file reformatted, 19 files left unchanged.
  black checks were clean
==> Running flake8 checks
  flake8 checks were clean
==> spack style checks were clean
Keep in mind that I cannot fix your flake8 or mypy errors, so if you have any you'll need to fix them and update the pull request. If I was able to push to your branch, if you make further changes you will need to pull from your updated branch before pushing again.

I've updated the branch with isort fixes.

This involves staging remote configuration files to a subdir of the environment.
At this time, each remote file is staged ONCE to avoid overwriting user
customizations to the configuration file.
@tldahlgren tldahlgren force-pushed the add-named-config-support branch from 6af25b2 to 1d043c7 Compare August 26, 2022 21:00
@spackbot-app spackbot-app bot added the documentation Improvements or additions to documentation label Aug 26, 2022
@tldahlgren tldahlgren force-pushed the add-named-config-support branch from 1d043c7 to 3af08ca Compare August 26, 2022 21:55
@tldahlgren tldahlgren force-pushed the add-named-config-support branch from 3af08ca to 73ba0eb Compare August 26, 2022 22:15
@tldahlgren tldahlgren requested a review from scheibelp August 27, 2022 00:44
Copy link
Copy Markdown
Member

@scheibelp scheibelp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there may have been an issue with naming tests, but otherwise this looks good to me (see comment).

@tldahlgren tldahlgren merged commit 3894cee into spack:develop Aug 30, 2022
ma595 pushed a commit to ma595/spack that referenced this pull request Sep 13, 2022
* Preliminary support for include URLs in spack.yaml (environment) files

This commit adds support in environments for external configuration files obtained from a URL with a preference for grabbing raw text from GitHub and gitlab for efficient downloads of the relevant files. The URL can also be a link to a directory that contains multiple configuration files.

Remote configuration files are retrieved and cached for the environment. Configuration files with the same name will not be overwritten once cached.
errors = []
for url in self.candidate_urls:
if not self._existing_url(url):
if not web_util.url_exists(url, self.curl):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using self.curl triggers:

@property
def curl(self):
if not self._curl:
try:
self._curl = which("curl", required=True)
except CommandNotFoundError as exc:
tty.error(str(exc))
return self._curl

which in turn has the required=True argument. I didn't test it, but by just reading the code I think this might be the cause of the behavior reported in #34029 (comment) @trws @tldahlgren

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the above is correct, this PR makes curl a requirement even if Spack is configured not to use it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦 Great catch.

Copy link
Copy Markdown

@Fredlt87 Fredlt87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finalizar versión

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

binary-packages broken-develop commands core PR affects Spack core functionality documentation Improvements or additions to documentation environments fetching radiuss stage tests General test capability(ies) utilities

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Environments: Add support for include URLs

7 participants