
Add binary distribution cache manager #18359

Merged
scottwittenburg merged 3 commits into spack:develop from scottwittenburg:add-binary-distribution-cache-manager
Sep 30, 2020
Conversation

@scottwittenburg
Contributor

Provides the same features/benefits as #18212 but without having to change the spec hash and equality operators.

@scottwittenburg
Contributor Author

FYI @tgamblin @zackgalbreath

@scottwittenburg
Contributor Author

Just a heads up @gartung that this refactor of binary cache management is in the works and could be ready for review fairly soon. I'd like to add you as a reviewer when it's ready.

@scottwittenburg scottwittenburg force-pushed the add-binary-distribution-cache-manager branch 4 times, most recently from 5d6cd8d to 2d11afe Compare September 10, 2020 19:01
@scottwittenburg scottwittenburg changed the title WIP: Add binary distribution cache manager Add binary distribution cache manager Sep 10, 2020
@scottwittenburg
Contributor Author

fyi @zackgalbreath @opadron @aashish24

def find_built_spec(self, spec):
"""Find the built spec corresponding to ``spec``.

If the spec can be found among the confighured binary mirrors, an
Member


s/confighured/configured

"""

def __init__(self, cache_root=None):
self._cache_root = default_manager_root
Contributor


Recommend lines 60-62:

Suggested change
self._cache_root = default_manager_root
self._cache_root = cache_root or default_manager_root
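For context, the suggested `cache_root or default_manager_root` pattern falls back to the default whenever the argument is None (or otherwise falsy). A minimal sketch with hypothetical values, not Spack's actual defaults:

```python
# Hypothetical default illustrating the suggested fallback idiom;
# default_manager_root here is a placeholder, not Spack's real value.
default_manager_root = "/var/cache/manager"

def resolve_cache_root(cache_root=None):
    # `or` falls back to the default when cache_root is None (or empty).
    return cache_root or default_manager_root

print(resolve_cache_root())          # /var/cache/manager
print(resolve_cache_root("/tmp/c"))  # /tmp/c
```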

cache_path = self._index_file_cache.cache_path(cache_key)

self._local_index_cache = {}
if os.path.exists(cache_path) and os.path.isfile(cache_path):
Contributor

@tldahlgren tldahlgren Sep 15, 2020


It is not necessary to call exists and isfile on the same path. Suggest simply checking isfile to avoid the extra function call:

Suggested change
if os.path.exists(cache_path) and os.path.isfile(cache_path):
if os.path.isfile(cache_path):

since os.path.isfile already returns False when the path does not exist.
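A minimal sketch (hypothetical paths) showing that `os.path.isfile` returns False for a nonexistent path, so the preceding `exists` check adds nothing:

```python
import os
import tempfile

def check_isfile(path):
    # os.path.isfile already returns False for a nonexistent path,
    # so a preceding os.path.exists call is redundant.
    return os.path.isfile(path)

with tempfile.TemporaryDirectory() as d:
    missing = os.path.join(d, "no-such-file")
    present = os.path.join(d, "index.json")
    with open(present, "w") as f:
        f.write("{}")
    print(check_isfile(missing))   # False
    print(check_isfile(present))   # True
```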

@scottwittenburg scottwittenburg force-pushed the add-binary-distribution-cache-manager branch 2 times, most recently from de92621 to 7449cad Compare September 22, 2020 00:06
if os.path.isfile(cache_path):
with self._index_file_cache.read_transaction(
cache_key) as cache_file:
self._local_index_cache = json.load(cache_file)
Contributor


It would be good to have a test that covers this line since this is new code.

locally_computed_hash = compute_hash(index_object_str)

if fetched_hash is not None and locally_computed_hash != fetched_hash:
msg_tmpl = ('Computed hash ({0}) did not match remote ({1}), '
Contributor

@tldahlgren tldahlgren Sep 25, 2020


Can you add a test to exercise this code?

if spec_dag_hash not in self._built_spec_cache:
self._built_spec_cache[spec_dag_hash] = found_list
else:
current_list = self._built_spec_cache[spec_dag_hash]
Contributor


Can you add a test to exercise this path?

current_list = self._built_spec_cache[spec_dag_hash]
for new_entry in found_list:
for cur_entry in current_list:
if new_entry['mirror_url'] == cur_entry['mirror_url']:
Contributor


Can you add unit tests to cover these paths?

Contributor

@tldahlgren tldahlgren left a comment


LGTM overall. While coverage is within the allowance, there are a few key places (marked) where it would be good to have unit tests exercising the (new) code.

@scottwittenburg
Contributor Author

Thanks @tldahlgren I'll work on covering the lines you indicated.

Contributor

@tldahlgren tldahlgren left a comment


Extra tests for increasing patch coverage would be good to have; however, not worth holding up other important work.

This key used to live in the buildcache_spec schema.  This change moves
it into the spec schema so it can exist with each spec anywhere that
schema is used (just buildcache_spec and database_index at the moment).
This change adds a class which provides two levels of caching to the
module.

At one level, it caches buildcache index files from mirrors,
and uses a hash of the index contents to track whether the local caches
need to be updated.

Then the cached index files can be used to answer questions like "what
binary specs are available to me, based on my configured mirrors".  In
that case, the cached index files are used to populate the other level
of cache, the built spec cache.

The built spec cache in the cache manager replaces the previous global
"_cached_specs" variable with a cache that keeps track of which mirrors
contain which concrete specs.  However, if this cache does not contain
a target spec, the direct approach to fetching a spec.yaml from a mirror
is still supported.
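The two levels described above might be sketched roughly as follows; all names (the class, its methods, and the index layout with a top-level "specs" list) are illustrative assumptions, not the PR's actual API:

```python
import hashlib
import json

def compute_hash(data):
    # Content hash used to detect whether a mirror's index changed.
    return hashlib.sha256(data.encode("utf-8")).hexdigest()

class BinaryCacheIndex:
    """Illustrative two-level cache; not the PR's actual class."""

    def __init__(self):
        # Level 1: fetched index contents per mirror, with a hash
        # recording which version of the index is cached locally.
        self._index_cache = {}
        # Level 2: built specs keyed by DAG hash, listing the mirrors
        # on which each concrete spec is available.
        self._built_spec_cache = {}

    def update_mirror_index(self, mirror_url, index_str):
        # Re-parse the index only when its content hash changed.
        new_hash = compute_hash(index_str)
        cached = self._index_cache.get(mirror_url)
        if cached and cached["hash"] == new_hash:
            return False  # local copy is already up to date
        self._index_cache[mirror_url] = {
            "hash": new_hash,
            "index": json.loads(index_str),
        }
        return True

    def regenerate_built_specs(self):
        # Populate the built spec cache from the cached index files.
        self._built_spec_cache = {}
        for mirror_url, entry in self._index_cache.items():
            for dag_hash in entry["index"].get("specs", []):
                self._built_spec_cache.setdefault(dag_hash, []).append(
                    {"mirror_url": mirror_url})

    def find_built_spec(self, dag_hash):
        # Returns the mirrors providing this spec, or None if unknown.
        return self._built_spec_cache.get(dag_hash)
```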

One goal of this work is to support a new install option which will tell
spack to avoid installation from binary unless the full_hash of the binary
spec on the mirror matches that of the local concrete spec.
This option defaults to false, preserving the previous behavior, which
was to install the first binary package found that matched the DAG hash
of the target spec.  If this option is provided, spack will install the
spec from source if it cannot find a binary spec on a remote mirror with
a full_hash that matches that of the local target spec.
@scottwittenburg scottwittenburg force-pushed the add-binary-distribution-cache-manager branch from 7449cad to a1b07aa Compare September 30, 2020 18:14
@scottwittenburg scottwittenburg force-pushed the add-binary-distribution-cache-manager branch from a1b07aa to 45e1d82 Compare September 30, 2020 19:14
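The fallback behavior described above can be sketched as a small decision function; the name `choose_install_source` and its parameters are hypothetical, not Spack's actual install code:

```python
def choose_install_source(local_full_hash, remote_full_hash,
                          require_full_hash_match=False):
    """Return 'binary' or 'source' per the policy described above.

    Illustrative sketch only: with the option off (the default), any
    binary whose DAG hash matched is accepted; with it on, the binary
    is used only when its full_hash matches the local concrete spec's.
    """
    if not require_full_hash_match:
        return "binary"  # previous behavior: first DAG-hash match wins
    if remote_full_hash is not None and remote_full_hash == local_full_hash:
        return "binary"
    return "source"  # no matching full_hash found; build from source
```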
@scottwittenburg scottwittenburg merged commit 075c3e0 into spack:develop Sep 30, 2020
@scottwittenburg scottwittenburg deleted the add-binary-distribution-cache-manager branch September 30, 2020 22:37
robertu94 pushed a commit to robertu94/spack that referenced this pull request Oct 2, 2020
…bution-cache-manager

Add binary distribution cache manager
scottwittenburg added a commit that referenced this pull request Oct 2, 2020
When we attempt to determine whether a remote spec (in a binary mirror)
is up-to-date or needs to be rebuilt, we compare the full_hash stored in
the remote spec.yaml file against the full_hash computed from the local
concrete spec.  Since the full_hash moved into the spec (and is no longer
at the top level of the spec.yaml), we need to look there for it.  This
oversight from #18359 was causing all specs to get rebuilt when the
full_hash wasn't found at the expected location.
scottwittenburg added a commit that referenced this pull request Oct 2, 2020
scottwittenburg added a commit that referenced this pull request Oct 2, 2020
@adamjstewart adamjstewart mentioned this pull request Oct 4, 2020
scottwittenburg added a commit that referenced this pull request Oct 5, 2020
…tribution-cache-manager"

This reverts commit 075c3e0.
scheibelp pushed a commit that referenced this pull request Oct 5, 2020
This reverts #18359 and follow-on PRs intended to address issues with
#18359 because that PR changes the hash of all specs. A future PR will
reintroduce the changes.

* Revert "Fix location in spec.yaml where we look for full_hash (#19132)"
* Revert "Fix fetch of spec.yaml files from buildcache (#19101)"
* Revert "Merge pull request #18359 from scottwittenburg/add-binary-distribution-cache-manager"
scottwittenburg added a commit that referenced this pull request Oct 6, 2020
scottwittenburg added a commit to scottwittenburg/spack that referenced this pull request Oct 16, 2020
scottwittenburg added a commit to scottwittenburg/spack that referenced this pull request Oct 23, 2020
scottwittenburg added a commit to scottwittenburg/spack that referenced this pull request Oct 29, 2020