WIP: Package hash check binaries #8444

scottwittenburg wants to merge 10 commits into spack:develop
Conversation
tgamblin left a comment:
@scottwittenburg: this looks mostly good. The logic is good.
I think the main questions I have are how we'll farm out the jobs from a release spec-set, and how this command fits into GitLab CI.
Any ideas? The spec-set can figure out what has changed, but then runners need to be scheduled to build the different dependencies. I wish we were actually having a telcon today so that we could discuss this, but let's do it on GitHub and/or email.
```yaml
---
spec-set:
    include: [ape, atompaw, transset]
    exclude: [binutils, tk]
```
I forget if this is the case, but if there is an include list and an exclude list, the thing should start with what's in include and exclude should exclude from that set. So in this example, the exclude list wouldn't be necessary.
Makes sense, I'll keep that in mind. In the sample release file I provided (just as a start), I just left them both empty for the moment. I figured I would start populating the exclude list as I discover that some things don't work together.
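To make the include/exclude semantics concrete, here is a minimal sketch of the filtering described above: start from the `include` list (or everything, if it is empty) and then drop anything named in `exclude`. The function name and arguments are hypothetical, not Spack's actual API.

```python
def select_packages(all_packages, include=None, exclude=None):
    """Start from the include list (or everything, if include is empty),
    then remove anything named in the exclude list."""
    selected = list(include) if include else list(all_packages)
    excluded = set(exclude or [])
    return [name for name in selected if name not in excluded]
```

With this behavior, an `exclude` list only matters when it overlaps the `include` list (or when `include` is empty), matching the semantics suggested in the comment.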
```yaml
#
# This YAML file describes a Spack test suite
#
spec-set:
```
```rst
CDash test suites
-----------------

The ``spack test-suite`` command reads in a specially formatted YAML file
```
Are you going to keep test-suite? I think it needs a rework, but it would make sense if it is part of what drives the CI. The issue I see is that it's sequential, and if we want to test 2700 packages quickly, we need to farm them out to separate runners.
Well, I was going to keep it, thinking it was in some ways separate from the release process. I'll give it another look, but like you say, we'll need to parallelize the builds. My feeling is that we'll want to "fan out" to as many little jobs as possible so we have maximum flexibility in how we distribute them. I have spent quite a bit of time discussing approaches in this area with @opadron, so it will be good when we can meet and discuss them. I'll make another comment on our current thinking along those lines somewhere else.
I think the test-suite command can be retired and we can start from scratch, given that you guys have much more flexible ideas about this.
lib/spack/spack/cmd/checkbinaries.py (outdated)

```python
level = "long"

_buildcache_path = os.path.join('build_cache')
```
I don't think this needs an os.path.join. Shouldn't this location be taken from binary_distribution? Oh... I just looked at that. It's not defined as a variable there, it's just a string written in lots of places... ugh. Can you make it into a variable in binary_distribution?
Yes, I was thinking the same thing here. I'll do it in my next pass.
lib/spack/spack/cmd/checkbinaries.py (outdated)

```python
_pkg_yaml_match_regex = re.compile(r'<a href=[^>]+\>\s*([^<]+)\s*\<')


def changed_package_files(args):
```
This should be moved to some common place like spack.util.git so that this command and spack flake8 can use the same code.
I agree with you here, if we keep this logic I'll move it somewhere it can be shared by both. However, just recently I've been thinking that using git to determine what has changed is flawed in general. Maybe you can straighten me out though. If we only look at changed packages, then will we not also need, for each changed package, to somehow walk back up the dependency tree and add all dependent packages to the rebuild list? I've recently been wondering if we just need to do a full_hash on every package in the release spec-set, since that will truly discover if any package needs to be rebuilt. To avoid a lot of network traffic, we could think about keeping a slightly more detailed index on the binary mirror so we can fetch all the full hashes in a single request.
If we only look at changed packages, then will we not also need, for each changed package, to somehow walk back up the dependency tree and add all dependent packages to the rebuild list?
Yes, that's true. Look at spack dependents for some code that does that -- it constructs an inverted dependency dict and uses that to get the static dependency information, which will tell you which packages could have changed.
I think computing the full hash for the full spec set is the best way to go. We need to do something equivalent to that. It may still be a win to run the code from spack dependents to figure out which packages could have changed, then only re-hash the things from the spec set that have those names, but it's probably only a small win at this point.
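The inverted-dependency approach described above can be sketched in plain Python. This is an illustration of the idea, not the actual `spack dependents` code; the static dependency data and function names are hypothetical.

```python
def invert_dependencies(dependencies):
    """Given {pkg: [deps]}, build {dep: set of packages depending on it}."""
    dependents = {}
    for pkg, deps in dependencies.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(pkg)
    return dependents


def possibly_changed(changed, dependents):
    """Walk up from changed packages to everything that could be affected."""
    affected = set(changed)
    stack = list(changed)
    while stack:
        pkg = stack.pop()
        for parent in dependents.get(pkg, ()):
            if parent not in affected:
                affected.add(parent)
                stack.append(parent)
    return affected
```

The `possibly_changed` set could then be used to narrow which specs from the spec set need to be re-hashed, as suggested above.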
Modifying the index to have more info is definitely a good idea.
lib/spack/spack/cmd/checkbinaries.py (outdated)

```python
def parse_yaml_list(index_html_contents):
    result = []
    match_list = _pkg_yaml_match_regex.findall(index_html_contents)
```
This looks good but I think adding a parseable json file alongside index.html would be good.
Yes, I agree. While I did add a yaml index to the buildcache command, I didn't yet update this to read it. Based on your comments on that PR (#8451) though, I think I'll change that to a json representation before updating this code.
Basically the undocumented policy we've been using is json for machine-generated stuff, yaml for human-edited stuff (config files). #2189 has a pretty lengthy discussion of how much faster json is.
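Following that json-for-machine-generated convention, one way to sketch the proposed mirror index is a single JSON file mapping spec names to full hashes, so all hashes can be fetched in one request. The file name and layout here are assumptions for illustration, not the format the PR settled on.

```python
import json


def write_index(path, specs):
    """Write a machine-readable mirror index mapping spec name -> full hash.

    `specs` is an iterable of (name, full_hash) pairs for everything
    on the mirror.
    """
    index = {name: full_hash for name, full_hash in specs}
    with open(path, 'w') as f:
        json.dump({'full_hashes': index}, f, indent=2)


def read_index(path):
    """Fetch every full hash on the mirror in a single read."""
    with open(path) as f:
        return json.load(f)['full_hashes']
```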
```python
    except URLError as urlErr:
        tty.error('Unable to open url %s' % index_path)
        tty.error(urlErr)
        return None
```
These `return None`s should probably raise some type of SpackError. Any SpackError raised will end the command and print the message from the error, or a stack trace in debug mode.
Sounds good.
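Raising instead of returning `None` might look like the sketch below. `SpackError` itself is real, but the error class name, the `opener` parameter, and the function shown here are made up for illustration.

```python
class CheckBinariesError(Exception):
    """Stand-in for a SpackError subclass; raising one ends the command
    and prints the message (or a stack trace in debug mode)."""


def fetch_index(index_path, opener):
    """Fetch the mirror index, raising on failure instead of returning None."""
    try:
        return opener(index_path)
    except OSError as err:
        # Raise so the CLI reports the failure rather than silently
        # propagating a None result.
        raise CheckBinariesError(
            'Unable to open url %s: %s' % (index_path, err))
```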
lib/spack/spack/cmd/checkbinaries.py (outdated)

```python
    try:
        pkg_yaml.index('-%s-' % pkg_name)
        tty.msg(' %s' % pkg_yaml)
    except Exception as expn:
```
Maybe a warning here? I dunno why this would happen.
At the time I wasn't quite sure that I could count on the form of the file name. Just a warning sounds fine though.
lib/spack/spack/cmd/checkbinaries.py (outdated)

```python
    return result


def get_mirror_rebuilds(mirror_name, mirror_url, pkg_change_set):
```
logic in here looks good to me -- code could use some polish.
```python
        help="configuration scope containing mirrors to check")


def checkbinaries(parser, args):
```
`check-binaries` would be better -- the function would be called `check_binaries` in that case. Does that work?
Fine with me, I don't see why it wouldn't work.
@tgamblin Thanks for the review. As I mentioned somewhere above, I wanted to describe what @opadron and I are currently thinking about farming out the build jobs. Please chime in, Omar, if I've misspoken. For one thing, we thought it would be nice if we were not tied too tightly to running on Kubernetes, so a purely gitlab-ci approach seemed like a good idea (though we might have more fine-grained control if we did want to tie ourselves to Kubernetes). Then, via the magic of the work @opadron has done, those jobs can be run in parallel on the cluster. I've probably left out some important details in that high-level description, so feel free to poke holes in it to determine whether I just forgot to mention something, or if there are details we really haven't considered yet. One thing I'm still not 100% clear on is where the different OSes come in. Should OS be added as another dimension in the build matrix, or should we simply run the matrix once for each of a static set of images specified up front?

ping @aashish24 @bryonbean
@scottwittenburg your description is pretty spot-on, imo. Depending on how many builds we end up running, and how deep our pockets for AWS resources are, we may want to limit the number of concurrent builds. For the OS question, I think your latter suggestion -- that we simply run our build matrix once for each of a static set of images that we specify up front -- is a perfectly fine first step that we can adjust later if needed.
@scottwittenburg @opadron: Sorry for the late reply. I, like @opadron, think the plan is great. I like the idea of generating the gitlab YAML file from a spec set. I also agree that we shouldn't tie the workflow to Kube, though kube is a useful place to run cloud containers. I agree with @opadron about where the OS configurations should come from: handling OSes by running for multiple images works for me. I don't think we can really handle that elegantly at the spec set level -- let something else run Spack and have it build for the platform it's running on.
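The "generate the gitlab YAML from a spec set" idea discussed above could be sketched roughly as follows. The job names, image list, and `spack` invocations here are all assumptions for illustration; the real generator would emit a `.gitlab-ci.yml` with whatever commands the final workflow settles on.

```python
def generate_gitlab_ci_jobs(specs, images):
    """Emit one independent job per (spec, image) pair, so gitlab runners
    can pick them up in parallel, once per OS image."""
    jobs = {}
    for image in images:
        for spec in specs:
            job_name = "%s / %s" % (spec, image)
            jobs[job_name] = {
                "image": image,
                "script": [
                    # each job first checks the mirror and exits early if a
                    # binary with a matching full hash already exists
                    "spack check-binaries --spec '%s' || spack install '%s'"
                    % (spec, spec),
                ],
            }
    return jobs
```

Serializing this dict as YAML would give one gitlab-ci job per cell of the (spec, image) matrix, which matches the "fan out to as many little jobs as possible" goal mentioned earlier.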
- add CombinatorialSpecSet in spack.util.spec_set module.
- class is iterable and encapsulates YAML parsing and validation.
- Adjust YAML format to be more generic
- YAML test-suite format now has a `matrix` section, which can contain
  multiple lists of specs, generated in different ways, including:
- specs: a raw list of specs.
- packages: a list of package names and versions
- compilers: a list of compiler names and versions
- All of the elements of `matrix` are dimensions for the build matrix;
we take the cartesian product of these lists of specs to generate a
build matrix. This means we can add things like [^mpich, ^openmpi]
to get builds with different MPI versions. It also means we can
multiply the build matrix out with lots of different parameters.
- Add a schema format for test-suites
- Bug fixes:
- fix bug with constraining an anonymous spec by name, add a test.
This reflects the more general usefulness of the class for other purposes including, for example, specifying the set of specs that comprise a release.
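The cartesian-product behavior described in the commit message can be illustrated with `itertools.product`; the spec strings and helper below are just examples, not the CombinatorialSpecSet implementation.

```python
import itertools


def expand_matrix(dimensions):
    """Take the cartesian product of the matrix dimensions and join each
    combination into a single spec string."""
    return [" ".join(combo) for combo in itertools.product(*dimensions)]


# hypothetical matrix: 2 packages x 2 compilers x 2 MPI providers = 8 builds
matrix = [
    ["hdf5", "zlib"],           # packages
    ["%gcc@7", "%clang@6"],     # compilers
    ["^mpich", "^openmpi"],     # MPI versions, as in the commit message
]
```

Each extra dimension multiplies the build matrix out, which is how a short YAML file can describe a large number of builds.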
Now it no longer uses the git diff to find changes, but rather just iterates over everything in the release spec_set. Also relies on updates to the buildcache command in another PR.
Also brings the check_binaries.py command from spack#8444, and refactors it a bit to allow it to check a single spec against a single remote binary mirror. This way each job can check its own spec against the mirror to see if it needs to do its work or just quit.
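The per-job check described here boils down to comparing a locally computed full hash against the one recorded on the mirror. A minimal sketch, assuming a simple name-to-hash index (the index format is hypothetical):

```python
def needs_rebuild(spec_name, local_full_hash, mirror_index):
    """Return True if the mirror has no binary for this spec, or has one
    built from different package or dependency content (full hash mismatch)."""
    remote_hash = mirror_index.get(spec_name)
    return remote_hash != local_full_hash
```

A job that finds `needs_rebuild(...) == False` can quit immediately instead of building.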
I'm closing this because it has been mostly superseded by #8718.
A new command addressing the steps documented in #8387