CI boilerplate reduction by kwryankrattiger · Pull Request #34272 · spack/spack

kwryankrattiger · 2022-12-02T17:41:25Z

Reduce boilerplate in the CI yaml by allowing composition based on pipeline targets OS/arch

blue42u · 2022-12-04T06:28:44Z

@kwryankrattiger Thanks for working on this! One high-level question, based on ci_process_includes it seems like the included configurations are merged and then mappings are applied later. If so, this has the quirky nuance I alluded to before and was hoping to avoid. Given this configuration sketch:

---  # common/ci.yaml
ci:
  variables:
    SIZE: "small"
  mappings:
  - match: [llvm]
    runner-attributes:
      variables:
        SIZE: "medium"

---  # big-env/spack.yaml
spack:
  spec:
  - llvm targets=all
  - paraview
  ci:
    include: common
    variables:
      SIZE: "large"  # All the packages in this environment are large

In the generated pipeline, a paraview build job will be SIZE:"large", but an llvm build job will be SIZE:"medium". That doesn't make intuitive sense to me, if I set the SIZE in spack.yaml I would expect it to override the variable set in the common configuration, no matter "where" it was set.

If this PR doesn't already, is there anything preventing the included configurations from being applied in order, instead of merged?

kwryankrattiger · 2022-12-05T16:05:37Z

@blue42u So the ci_process_includes function is no longer used in favor of adding a "ci.yaml" and "cdash.yaml" and "view.yaml" to spack. That said, you concern persists even with spacks default merging strategy.

To address the quirky behavior, I think that has more to do with how the "scope" is defined. At the top most level, you don't specify the value the must be applied to each job, you specify the "default" value that is populated to all jobs, including the service jobs and signing job.

The next scope that is applied is the job type scope. For the build jobs the build-job-attributes section is applied.

The last scope that is applied for build jobs is the mapping scope which will override and append to everything defined in the previous scopes.

This is all relatively consistent with how things were done before, but now configurations are allowed to span multiple files to move boilerplate out of the specific stack files.

To achieve the desired result of forcing all build jobs to set a variable you could either change the custom configuration scopes passed to the generate step to exclude the default mappings (subtle) or specify a mapping to all discovered packages that sets the variables:SIZE:"large" (less subtle).

One of the things that I am less thrilled about with the default spack merging is that script is going to merge using the prepend strategy which was the major difference between the spack.config.merge_yaml and spack.ci._merge_attributes routines. For before_script and after_script we probably want to use an append strategy and for script we always want to override.

I think maybe I am just reading the config stuff backwards and "source" prepending means the the lower precedence configs are processed last and therefore everything after is appended to it. I will run some tests once I get this working again.

But the non-override behavior for script is an issue. @tgamblin @scottwittenburg if either of you have ideas or comments on how this should work.

kwryankrattiger · 2022-12-05T16:20:55Z

One the override behavior, I think I found the solution, script:: [] should allow overriding values it seems, not sure if there is more needed for that, but that is what the docs seem to imply.

blue42u · 2022-12-05T16:44:33Z

@kwryankrattiger It worries me that you're trying to add key-specific merging behavior here, in #32300 I was avoiding that completely (to decouple as much as possible the Spack configuration format from GitLab CI's own schema). There are also weird edge cases I do not believe the Spack config-merge can or should handle, e.g. cache: can be a list or a dict (interpreted as a 1-element list).

Regarding "scope," while I understand the explanation and know that the proposed workaround works, I still find it very unintuitive and confusing. It introduces cross-dependencies where one config needs to know where values are set in "previous" configs to even work. Which means a setup using "mixin" configs is more-or-less unmaintainable (or at the very least, really confusing).

kwryankrattiger · 2022-12-05T17:05:26Z

I do not believe the Spack config-merge can or should handle, e.g. cache: can be a list or a dict (interpreted as a 1-element list).

I did check this, and it is handled by checking if one of them is a list, converting the dict value to a list, and then applying the merge. This should be okay.

where one config needs to know where values are set in "previous" configs to even work.

I don't think there is really a way around this. The way spack configs work in general is read -> merge -> get processed by spack command.

Even if each yaml was read and processed in order in the CI module, which is what we decided not to do by moving to the spack config merge instead of the ci_process_includes, what is the correct logic for overriding values at different scopes across different configuration precedence? I feel like the rules would be even more convoluted in that workflow than they are using the default spack workflow.

Having to know which configs are being used, the order of precedence they are specified in, and what values they set I think is an okay requirement since that is already a requirement when using configuration scopes in spack. The biggest difference is the CI module has scopes to make it easier to say "do this for everything" once at a high level rather than having to always specify configs at the lower level. Things like SIZE that get specified at a mapping level in general will just always need to be specified at a mapping level.

kwryankrattiger · 2022-12-05T17:08:26Z

It worries me that you're trying to add key-specific merging behavior here

The most recent changes are trying not to do this because your concerns were echoed by others. The scripts are the only ones that I am a little concerned about the default behavior being applied, but they will probably be fine.

blue42u · 2022-12-05T17:08:38Z

Crazy half-baked idea, what if it was a list of "modifications" applied to the generated pipeline in reverse order:

--- # common/ci.yaml
ci:
  pipeline-gen:  # TODO: Better name for this key
  - match_behavior: first
    submapping:
    - match: [llvm]
      build-job-attributes:
        +variables: {SIZE: "medium"}
  - build-job-attributes:
      +variables: {SIZE: "small"}

--- # big-env/spack.yaml
spack:
  include: common/ci.yaml
  spec:
  - llvm targets=all
  - paraview
  ci:
    pipeline-gen:
    - +variables: {SIZE: "large"}

After the Spack config-merge, the above two should become the unified config:

spack:
  spec:
  - llvm targets=all
  - paraview
  ci:
    pipeline-gen:
    - +variables: {SIZE: "large"}
    - match_behavior: first
      submapping:
      - match: [llvm]
        build-job-attributes:
          +variables: {SIZE: "medium"}
    - build-job-attributes:
        +variables: {SIZE: "small"}

The entries below pipeline-gen: can then be "applied" to the pipeline in reverse order using a custom merge (e.g. the + merge prefix from #32300), and the final end result correctly prefers spack.yaml over common/ci.yaml.

Still a bit confusing from the reversed order (it would need to be explained as something like "value from first matching key" and "+ prefix indicates use next matching sub-key"). IMHO it's still less confusing (and more maintainable) than 3 different "scopes" at which changes could be made.

kwryankrattiger · 2022-12-05T17:18:13Z

Crazy half-baked idea, what if it was a list of "modifications" applied to the generated pipeline in reverse order:

From what I can tell, this is exactly like what is already there, but less verbose. The biggest change this is making I think is the contents of the mappings list can either be just a runner attributes dict, or a submapping/matches specification. I do like submapping over matches.

I like the idea of having a better shorthand for saying "everything gets this mapping if it gets here" that is not the somewhat odd match: ["@:"] hack to catch all of the things.

kwryankrattiger · 2022-12-05T17:29:40Z

actually...I think I am seeing the more subtle part of this proposal. There would be no top level runner attributes, except for maybe a subset which we can chose like image.

I think the way this would look then would be

--- # common/ci.yaml
ci:
  pipeline-gen:  # TODO: Better name for this key
  - match_behavior: first
    submapping:
    - match: [llvm]
      build-job-attributes:
        +variables: {SIZE: "medium"}
  # build-job-attrbutes apply all block
  -  +variables: {SIZE: "small"}

--- # big-env/spack.yaml
spack:
  include: common/ci.yaml
  spec:
  - llvm targets=all
  - paraview
  ci:
    pipeline-gen:
    # stack override apply all block
    - +variables: {SIZE: "large"}

blue42u · 2022-12-05T17:36:52Z

I wanted to keep the build-job-attributes: key to apply a modification only to the build jobs (and not rebuild-index, no-specs-to-rebuild, etc.). It is also needed to properly distinguish between submapping: the Spack key and submapping: the potential GitLab CI key.

But otherwise, exactly!

kwryankrattiger · 2022-12-05T17:56:07Z

Mappings always only apply to the build jobs, so maybe it would make sense to nest the mappings under that section? If we go this route, the idea of having a build-job-attributes section kind of becomes redundant. It would make sense to me to just unify all of the job attribute mapping to the _find_matching_config routine and call it done. Then I could remove the extra step of merging the build-job-attributes section into the top level configs when initializing build job configs.

blue42u · 2022-12-05T18:15:27Z

There's also signing-job-attributes:, and from #32300 noop-job-attributes:, reindex-job-attributes: and pipeline-attributes:. I would like to give them different attributes so having separate keys is important in pipeline-gen: entries.

The problem with submapping under build-job-attribute: is that it conflicts with the potential GitLab key submapping:, while if you remove build-job-attributes: from the match: clause you conflict with the potential GitLab key match:. It's redundant but necessary.

Basically, for #32300 the parts of the config that are GitLab CI attributes should be considered (practically) a opaque blob of data (except in _merge_attributes):

--- # common/ci.yaml
ci:
  pipeline-gen:  # TODO: Better name for this key
  - match_behavior: first
    submapping:
    - match: [llvm]
      build-job-attributes: GITLAB BITS
  - build-job-attributes: GITLAB BITS

--- # big-env/spack.yaml
spack:
  include: common/ci.yaml
  spec:
  - llvm targets=all
  - paraview
  ci:
    pipeline-gen:
    - build-job-attributes: GITLAB BITS

So, I'm pretty sure all the build-job-attributes: keys are needed to make this work properly.

(It looks like there was a missing build-job-attributes: in my first example, sorry. This one is correct.)

kwryankrattiger · 2022-12-05T20:51:13Z

Okay, I think I am seeing more where you are going with this, your idea is taking this quite a bit further than I originally was thinking, but I think that is a good thing. It certainly leans more into your work with handling then merge semantics, which I think is good.

I like this idea of the pipeline-gen that pushes all of the configuration blocks into a list, rather than trying to track map levels and mixed scope level between configs. In the end, it makes it more consistent when trying to reason about what you actually get in the end and validating that configs are being scoped in the correct order.

This will make the schema a tad more complicated, because there will be three types of items allowed now instead of one, global, <job>-attributes, and submapping items. But, in the ci module code, this could be modeled by looping the pipeline-gen list and calling handlers based on the type of scope encountered which would likely simplify parts of the CI module quite a bit since applying configs to jobs will always be done the same, it is only the job selection that would be different.

I would like to get more feedback from @scottwittenburg on this, but I think this is a good direction.

adamjstewart

I like this a lot! This resolves most of my complaints about adding new pipelines. Some additional ideas (alternatives?) worth considering, either in this PR or more likely a follow-up PR:

Platform based on directory name

I wonder if we could condition arch on pipeline location. For example, move:

share/spack/gitlab/cloud_pipelines/stacks/ml-linux-x86_64-cpu/spack.yaml

to:

share/spack/gitlab/cloud_pipelines/linux/x86_64/ml-cpu/spack.yaml

and have the arch automatically detected based on the file location.

Matrix of platforms

For a lot of (most?) pipelines, we want to build a set of packages on most (all?) platforms. For example, for the ML stack, I'm planning on testing the following cross product:

platform: linux, darwin
target: x86_64, aarch64, ppc64le
variants: +cuda, +rocm, ~~cuda~~rocm

This requires me to duplicate the list of packages I want to build in 18 files. If I could instead specify a matrix in spack.yaml and exclude known invalid combinations (no such thing as darwin ppc64le, tensorflow doesn't build on aarch64) that would make it much easier to maintain. As is, if I want to add a new package to the list, I would have to update 18 files.

See #32894 and #32893 for details on this.

P.S. Not trying to hold up this wonderful PR by suggesting more work, just wanted to put these ideas out here so people have them in mind while working on this refactor. I'm guessing these will end up being follow-up PRs in the end.

Will review again after this is rebased, the ML pipelines have changed quite a bit since this PR branch was first created and I want to make sure nothing is lost in the refactor.

blue42u · 2022-12-30T16:05:24Z

Thanks @adamjstewart for pushing this PR forward!

I'd like to request one change to be made before this is merged. Currently this PR changes the *script: fields representation to strings containing YAML code, I would much prefer if they were instead recursive lists of strings as is supported by the GitLab CI format. This allows sharing script-fragments between parts of the configuration by using YAML anchors, which I use frequently in my spack ci configuration, e.g.:

  ci:
    pipeline-gen:
    - reindex-job:
        script:
          - &init_spack
            - git clone ...
            - source ...
          - spack buildcache update-index ...
    - build-job:
        tags: [foo]
        script:
          - *init_spack
          - spack ci rebuild ...

I already raised this request with @kwryankrattiger over Slack DMs, we settled on an idea to reduce surprises when using this format: *script: keys default to override (*script::) unless explicitly indicated as merge (*script+:), all other keys merge by default (unless explicitly indicated as override). Since "merge" for lists is confusingly prepend (instead of append) and *script: keys are uniquely sensitive to order, this reduces surprises by making "merge" opt-in.

spackbot-app · 2023-02-03T16:15:06Z

I've started that pipeline for you!

kwryankrattiger · 2023-02-03T21:16:32Z

@spackbot rebuild everything

spackbot-app · 2023-02-03T21:16:56Z

I've started that pipeline for you!

kwryankrattiger · 2023-02-06T17:06:35Z

Itemizing all of the failures in "rebuild everything".
@scottwittenburg

Hash issue. May a GitHub fix gone stale? @haampie I think you have seen this before here

Other

https://gitlab.spack.io/spack/spack/-/jobs/5447338 - Network issue
https://gitlab.spack.io/spack/spack/-/jobs/5447682 - paraview Log too long
https://gitlab.spack.io/spack/spack/-/jobs/5447581 - py-tensorlfow log too long
https://gitlab.spack.io/spack/spack/-/jobs/5447695 - py-tensorlfow log too long
https://gitlab.spack.io/spack/spack/-/jobs/5442935 - pdt%oneapi, cannot reproduce locally

Didn't run rebuild script for some reason, no artifacts

Lost runner (@zackgalbreath )

Deprecated zlib packages

Configuration: - New notation for list concatenation (prepend/append) - New notation for string concatenation (prepend/append) - Break out configuration files for: ci.yaml, cdash.yaml, view.yaml - Spack CI section refactored to improve self-consistency and composability - Scripts are now lists of lists and/or lists of strings - Job attributes are now listed under precedence ordered list that are composed/merged using Spack config merge rules. - "service-jobs" are identified explicitly rather than as a batch CI: - Consolidate common, platform, and architecture configurations for all CI stacks into composable configuration files - Make padding consistent across all stacks (256) - Merge all package -> runner mappings to be consistent across all stacks Unit Test: - Refactor CI module unit-tests for refactor configuration Docs: - Add docs for new notations in configuration.rst - Rewrite docs on CI pipelines to be consistent with refactored CI workflow

kwryankrattiger · 2023-03-09T20:39:01Z

@spackbot run pipeline

spackbot-app · 2023-03-09T20:39:07Z

I've started that pipeline for you!

kwryankrattiger · 2023-03-10T18:24:44Z

@scottwittenburg quick, review this, CI passed!

haampie · 2023-03-10T18:48:47Z

Sorry for reviewing this only so late, but regarding +: and -:, I would sort of like the opposite... so that it mimics

>>> a = [1, 2, 3]
>>> a += [4]
>>> a
[1, 2, 3, 4]

So, key+: would be append.

It's time. One feature I have been waiting to be merged in upstream Spack (separate CI configuration: spack/spack#34272) has just been merged. The other feature (local binary cache: spack/spack#32136) remains stagant, but the older install-tree caching (https://gitlab.com/hpctoolkit/hpctoolkit/-/merge_requests/828) and the Buildah shared layer cache (https://gitlab.com/hpctoolkit/hpctoolkit/-/merge_requests/810) serve a similar purpose. This commit rips out all use of my custom Spack fork (from November), instead using upstream Spack develop. This unfreezes all the versions of dependencies tested in CI, meaning we will *really* be testing with the latest of all packages. Always.

* CI configuration boilerplate reduction and refactor Configuration: - New notation for list concatenation (prepend/append) - New notation for string concatenation (prepend/append) - Break out configuration files for: ci.yaml, cdash.yaml, view.yaml - Spack CI section refactored to improve self-consistency and composability - Scripts are now lists of lists and/or lists of strings - Job attributes are now listed under precedence ordered list that are composed/merged using Spack config merge rules. - "service-jobs" are identified explicitly rather than as a batch CI: - Consolidate common, platform, and architecture configurations for all CI stacks into composable configuration files - Make padding consistent across all stacks (256) - Merge all package -> runner mappings to be consistent across all stacks Unit Test: - Refactor CI module unit-tests for refactor configuration Docs: - Add docs for new notations in configuration.rst - Rewrite docs on CI pipelines to be consistent with refactored CI workflow * Script verbose environ, dev bootstrap * Port spack#35409

kwryankrattiger requested a review from scottwittenburg December 2, 2022 17:41

spackbot-app bot added commands core PR affects Spack core functionality gitlab Issues related to gitlab integration tests General test capability(ies) labels Dec 2, 2022

blue42u mentioned this pull request Dec 7, 2022

Various binary cache improvements #34371

Open

kwryankrattiger force-pushed the ci_boiler_plate_reduction branch from 5788af4 to 286d9fe Compare December 8, 2022 16:44

zackgalbreath mentioned this pull request Dec 8, 2022

gitlab ci: more resources for paraview and py-torch #34412

Merged

This was referenced Dec 8, 2022

[WIP] ML CI stack: matrix of targets #34353

Closed

binary cache: do not create separate spec.json.sig files #34350

Closed

blue42u mentioned this pull request Dec 11, 2022

[WIP] Refine CI configuration #34454

Closed

kwryankrattiger force-pushed the ci_boiler_plate_reduction branch from 7fb751b to 31475bf Compare December 13, 2022 21:21

blue42u mentioned this pull request Dec 20, 2022

spack ci: Refine job attribute configuration #32300

Closed

kwryankrattiger marked this pull request as ready for review December 29, 2022 21:57

tgamblin changed the title ~~Ci boiler plate reduction~~ CI boiler plate reduction Dec 29, 2022

adamjstewart previously approved these changes Dec 30, 2022

View reviewed changes

kwryankrattiger force-pushed the ci_boiler_plate_reduction branch from f63411f to 6e5c619 Compare February 3, 2023 17:36

tgamblin changed the title ~~CI boiler plate reduction~~ CI boilerplate reduction Feb 5, 2023

kwryankrattiger force-pushed the ci_boiler_plate_reduction branch 3 times, most recently from e32e16b to 223c915 Compare February 17, 2023 17:46

kwryankrattiger force-pushed the ci_boiler_plate_reduction branch from d836348 to 0d08dbc Compare February 24, 2023 16:07

kwryankrattiger added 3 commits March 7, 2023 15:22

Script verbose environ, dev bootstrap

d9ff46d

Port spack#35409

fee0e36

kwryankrattiger force-pushed the ci_boiler_plate_reduction branch from 0d08dbc to fee0e36 Compare March 7, 2023 21:22

kwryankrattiger enabled auto-merge (squash) March 9, 2023 20:39

kwryankrattiger mentioned this pull request Mar 10, 2023

Set build_jobs dynamically in CI to avoid oversubscription #35996

Merged

scottwittenburg self-requested a review March 10, 2023 19:25

scottwittenburg approved these changes Mar 10, 2023

View reviewed changes

kwryankrattiger merged commit f3595da into spack:develop Mar 10, 2023

haampie mentioned this pull request Mar 11, 2023

Drop --dev and drop set -x #36010

Merged

alalazo mentioned this pull request Mar 17, 2023

Ci backwards compat #36045

Merged

kwryankrattiger mentioned this pull request Mar 24, 2023

Missing backwards compatibility for gitlab-ci section after refactor #36409

Closed

3 tasks

alalazo mentioned this pull request Apr 13, 2023

Docker reproducer for pipelines fails due to path mismatch #36833

Closed

3 tasks

blue42u mentioned this pull request May 13, 2023

[WIP] Add an HPCToolkit stack #32327

Closed

3 tasks

Conversation

kwryankrattiger commented Dec 2, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

blue42u commented Dec 4, 2022

Uh oh!

kwryankrattiger commented Dec 5, 2022

Uh oh!

kwryankrattiger commented Dec 5, 2022

Uh oh!

blue42u commented Dec 5, 2022

Uh oh!

kwryankrattiger commented Dec 5, 2022

Uh oh!

kwryankrattiger commented Dec 5, 2022

Uh oh!

blue42u commented Dec 5, 2022

Uh oh!

kwryankrattiger commented Dec 5, 2022

Uh oh!

kwryankrattiger commented Dec 5, 2022

Uh oh!

blue42u commented Dec 5, 2022

Uh oh!

kwryankrattiger commented Dec 5, 2022

Uh oh!

blue42u commented Dec 5, 2022

Uh oh!

kwryankrattiger commented Dec 5, 2022

Uh oh!

adamjstewart left a comment

Choose a reason for hiding this comment

Platform based on directory name

Matrix of platforms

Uh oh!

blue42u commented Dec 30, 2022

Uh oh!

spackbot-app bot commented Feb 3, 2023

Uh oh!

kwryankrattiger commented Feb 3, 2023

Uh oh!

spackbot-app bot commented Feb 3, 2023

Uh oh!

kwryankrattiger commented Feb 6, 2023

Uh oh!

kwryankrattiger commented Mar 9, 2023

Uh oh!

spackbot-app bot commented Mar 9, 2023

Uh oh!

kwryankrattiger commented Mar 10, 2023

Uh oh!

haampie commented Mar 10, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

kwryankrattiger commented Dec 2, 2022 •

edited

Loading