Ci reactivate darwin pipelines #48453

Merged
eugeneswalker merged 10 commits into spack:develop from eugeneswalker:ci-reactivate-darwin-pipelines
Feb 26, 2025

@eugeneswalker
Contributor

Re-activating the darwin CI pipelines

@spackbot-app bot added the core (PR affects Spack core functionality) and gitlab (Issues related to gitlab integration) labels on Jan 7, 2025
@eugeneswalker
Contributor Author

@spackbot run pipeline

@spackbot-app

spackbot-app bot commented Jan 10, 2025

I've started that pipeline for you!

@eugeneswalker
Contributor Author

@spackbot run pipelines - tweaked local runner environment

@spackbot-app

spackbot-app bot commented Jan 11, 2025

I've started that pipeline for you!

@eugeneswalker
Contributor Author

@spackbot-app

spackbot-app bot commented Jan 13, 2025

I've started that pipeline for you!

@eugeneswalker force-pushed the ci-reactivate-darwin-pipelines branch from 693f8f3 to 42d4f0e on January 21, 2025 15:49
@eugeneswalker
Contributor Author

There is an issue with the stage dir and permissions such that files are being created which cannot be deleted!

-r--r--r--  1 gitlab-runner-0  staff  609 Jan 21 10:40 tmp/stage/spack-stage-fzf-0.57.0-xa5f3xwhin2xgkvbpkhck74xzdbauvsz/pkg/mod/github.com/mattn/[email protected]/isatty_bsd.go

One of the first actions the gitlab-runner takes on processing a new job is to delete files from the old job, and that is failing because of the permission shown above:

...
warning: failed to remove tmp/stage/spack-stage-fzf-0.57.0-xa5f3xwhin2xgkvbpkhck74xzdbauvsz/pkg/mod/golang.org/x/[email protected]/term_unix_other.go: Permission denied
warning: failed to remove tmp/stage/spack-stage-fzf-0.57.0-xa5f3xwhin2xgkvbpkhck74xzdbauvsz/pkg/mod/golang.org/x/[email protected]/go.mod: Permission denied
warning: failed to remove tmp/stage/spack-stage-fzf-0.57.0-xa5f3xwhin2xgkvbpkhck74xzdbauvsz/pkg/mod/golang.org/x/[email protected]/terminal_test.go: Permission denied
...

We do want to keep the stage dir as configured below, but the permissions need to allow the files to be cleaned up and deleted!

config:
  build_stage:
  - $spack/tmp/stage

@eugeneswalker force-pushed the ci-reactivate-darwin-pipelines branch from 82e8eb0 to 3e9eb2c on January 24, 2025 19:53
@eugeneswalker
Contributor Author

> There is an issue with the stage dir and permissions such that files are being created which cannot be deleted!
>
> * [[email protected] /au2dystiuwtwgkx52wmjkqbkiwq4wsdw Spack CI](https://gitlab.spack.io/spack/spack/-/jobs/14744491)
>
> -r--r--r--  1 gitlab-runner-0  staff  609 Jan 21 10:40 tmp/stage/spack-stage-fzf-0.57.0-xa5f3xwhin2xgkvbpkhck74xzdbauvsz/pkg/mod/github.com/mattn/[email protected]/isatty_bsd.go

The issue here appears to be that Go module files are created with read-only permissions, by design. We can simply invoke spack clean at the end of darwin jobs so that the staged files get removed, rather than leaving them for the gitlab-runner to clean up.
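
As a rough illustration of the cleanup problem (not the change made in this PR), the sketch below restores the owner-write bit on a stage tree before deleting it. It assumes the failures come from read-only directories and files like the ones the Go module cache creates. `make_tree_writable` and `remove_stage` are hypothetical helper names, not Spack functions.

```python
import os
import shutil
import stat


def make_tree_writable(root):
    """Add the owner-write bit to every directory and file under root.

    Hypothetical helper: deleting a file needs write permission on its
    parent directory, so read-only directories block cleanup.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        paths = [dirpath] + [os.path.join(dirpath, name) for name in filenames]
        for path in paths:
            if os.path.islink(path):
                # os.chmod follows symlinks; leave link targets alone.
                continue
            mode = os.stat(path).st_mode
            os.chmod(path, mode | stat.S_IWUSR)


def remove_stage(root):
    """Make the tree writable, then delete it."""
    make_tree_writable(root)
    shutil.rmtree(root)
```

The same idea could run from a job's cleanup step, so that whatever the job leaves behind can still be deleted by the runner at the start of the next job.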

@eugeneswalker
Contributor Author

@spackbot run pipeline

@spackbot-app

spackbot-app bot commented Jan 26, 2025

I've started that pipeline for you!

@eugeneswalker
Contributor Author

@spackbot run pipeline

@spackbot-app

spackbot-app bot commented Jan 26, 2025

I've started that pipeline for you!

@eugeneswalker force-pushed the ci-reactivate-darwin-pipelines branch 2 times, most recently from 0b23a3b to c4c32bf on January 29, 2025 15:18
@eugeneswalker
Contributor Author

There are a few build errors blocking this:

From: bootstrap-aarch64-darwin-build

  1. clingo-bootstrap@spack /x2a3vkokdagmv3c3de5hvduocmqshbru Spack CI
...
error: /Library/Developer/CommandLineTools/usr/bin/install_name_tool: for: /Users/gitlab-runner-0/builds/t1_TDJH3y/0/spack/spack/opt/spack/[padded-to-256-chars]/darwin-sequoia-aarch64/apple-clang-16.0.0/clingo-bootstrap-spack-x2a3vkokdagmv3c3de5hvduocmqshbru/lib/libclingo.4.0.dylib (for architecture arm64) option "-add_rpath /Users/gitlab-runner-0/builds/t1_TDJH3y/0/spack/spack/opt/spack/[padded-to-256-chars]/darwin-sequoia-aarch64/apple-clang-16.0.0/clingo-bootstrap-spack-x2a3vkokdagmv3c3de5hvduocmqshbru/lib" would duplicate path, file already has LC_RPATH for: /Users/gitlab-runner-0/builds/t1_TDJH3y/0/spack/spack/opt/spack/[padded-to-256-chars]/darwin-sequoia-aarch64/apple-clang-16.0.0/clingo-bootstrap-spack-x2a3vkokdagmv3c3de5hvduocmqshbru/lib
...
  The C compiler
    "/Users/gitlab-runner-0/builds/t1_TDJH3y/0/spack/spack/lib/spack/env/clang/clang"
  is not able to compile a simple test program.
  It fails with the following output:
    Change Dir: '/Users/gitlab-runner-0/builds/t1_TDJH3y/0/spack/spack/tmp/stage/spack-stage-clingo-bootstrap-spack-x2a3vkokdagmv3c3de5hvduocmqshbru/spack-build-x2a3vko/CMakeFiles/CMakeScratch/TryCompile-9cNMK2'
...
    error: Error in reading profile /Users/gitlab-runner-0/builds/t1_TDJH3y/0/spack/spack/tmp/stage/spack-stage-clingo-bootstrap-spack-x2a3vkokdagmv3c3de5hvduocmqshbru/spack-src/reports/default.profdata: No such file or directory
    gmake[1]: *** [CMakeFiles/cmTC_260ce.dir/build.make:82: CMakeFiles/cmTC_260ce.dir/testCCompiler.c.o] Error 1
    gmake[1]: Leaving directory '/Users/gitlab-runner-0/builds/t1_TDJH3y/0/spack/spack/tmp/stage/spack-stage-clingo-bootstrap-spack-x2a3vkokdagmv3c3de5hvduocmqshbru/spack-build-x2a3vko/CMakeFiles/CMakeScratch/TryCompile-9cNMK2'
    gmake: *** [Makefile:134: cmTC_260ce/fast] Error 2
...
  1. [email protected] /2pvlbhhqutmkbmrjo7sgeliijfprmzgm Spack CI
...
error: /Library/Developer/CommandLineTools/usr/bin/install_name_tool: for: /Users/gitlab-runner-0/builds/t1_TDJH3y/0/spack/spack/opt/spack/[padded-to-256-chars]/darwin-sequoia-aarch64/apple-clang-16.0.0/clingo-bootstrap-5.7.1-2pvlbhhqutmkbmrjo7sgeliijfprmzgm/lib/libclingo.4.0.dylib (for architecture arm64) option "-add_rpath /Users/gitlab-runner-0/builds/t1_TDJH3y/0/spack/spack/opt/spack/[padded-to-256-chars]/darwin-sequoia-aarch64/apple-clang-16.0.0/clingo-bootstrap-5.7.1-2pvlbhhqutmkbmrjo7sgeliijfprmzgm/lib" would duplicate path, file already has LC_RPATH for: /Users/gitlab-runner-0/builds/t1_TDJH3y/0/spack/spack/opt/spack/[padded-to-256-chars]/darwin-sequoia-aarch64/apple-clang-16.0.0/clingo-bootstrap-5.7.1-2pvlbhhqutmkbmrjo7sgeliijfprmzgm/lib
...
  The C compiler
    "/Users/gitlab-runner-0/builds/t1_TDJH3y/0/spack/spack/lib/spack/env/clang/clang"
  is not able to compile a simple test program.
  It fails with the following output:
    Change Dir: '/Users/gitlab-runner-0/builds/t1_TDJH3y/0/spack/spack/tmp/stage/spack-stage-clingo-bootstrap-5.7.1-2pvlbhhqutmkbmrjo7sgeliijfprmzgm/spack-build-2pvlbhh/CMakeFiles/CMakeScratch/TryCompile-Zh6pVg'
...
    error: Error in reading profile /Users/gitlab-runner-0/builds/t1_TDJH3y/0/spack/spack/tmp/stage/spack-stage-clingo-bootstrap-5.7.1-2pvlbhhqutmkbmrjo7sgeliijfprmzgm/spack-src/reports/default.profdata: No such file or directory
    gmake[1]: *** [CMakeFiles/cmTC_480d8.dir/build.make:82: CMakeFiles/cmTC_480d8.dir/testCCompiler.c.o] Error 1
    gmake[1]: Leaving directory '/Users/gitlab-runner-0/builds/t1_TDJH3y/0/spack/spack/tmp/stage/spack-stage-clingo-bootstrap-5.7.1-2pvlbhhqutmkbmrjo7sgeliijfprmzgm/spack-build-2pvlbhh/CMakeFiles/CMakeScratch/TryCompile-Zh6pVg'
    gmake: *** [Makefile:134: cmTC_480d8/fast] Error 2

From: developer-tools-darwin-build

  1. [email protected] /3fuavwi5f3v7wweq6fvsyasg6kglrxuh Developer Tools Darwin
...
ld: duplicate LC_RPATH '/usr/local/spack-tools/spack/opt/spack/darwin-sonoma-aarch64/apple-clang-16.0.0/gcc-14.2.0-dxv7yb3pasmje7wnkq4eizyvyvsdl2d3/lib' in '/Users/gitlab-runner-0/builds/t1_a84b7x/0/spack/spack/opt/spack/[padded-to-256-chars]/darwin-sequoia-aarch64/apple-clang-16.0.0/lua-lpeg-1.1.0-1-zfvdag6bm4qshhipfblhtukfj66iht5e/lib/lua/5.1/lpeg.so'
clang: error: linker command failed with exit code 1 (use -v to see invocation)
...
     239    ld: duplicate LC_RPATH '/usr/local/spack-tools/spack/opt/spack/darw
            in-sonoma-aarch64/apple-clang-16.0.0/gcc-14.2.0-dxv7yb3pasmje7wnkq4
            eizyvyvsdl2d3/lib' in '/Users/gitlab-runner-0/builds/t1_a84b7x/0/sp
            ack/spack/opt/spack/__spack_path_placeholder__/__spack_path_placeho
            lder__/__spack_path_placeholder__/__spack_path_placeholder__/__spac
            k_path_placeholder__/__spack_path_placeholder__/__spack_path_placeh
            older__/__s/darwin-sequoia-aarch64/apple-clang-16.0.0/lua-lpeg-1.1.
            0-1-zfvdag6bm4qshhipfblhtukfj66iht5e/lib/lua/5.1/lpeg.so'
  >> 240    clang: error: linker command failed with exit code 1 (use -v to see
             invocation)
...
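
All of the rpath errors above complain about an rpath that is, or would be, present more than once in the same binary. A minimal sketch of the de-duplication that avoids this, assuming rpaths are compared as exact strings. `rpaths_to_add` is a hypothetical helper, not part of Spack or install_name_tool.

```python
def rpaths_to_add(existing, wanted):
    """Return the entries of wanted that are not already in existing.

    Order is preserved and repeats inside wanted are dropped, so each
    returned path would be added at most once.
    """
    seen = set(existing)
    result = []
    for path in wanted:
        if path not in seen:
            seen.add(path)
            result.append(path)
    return result
```

Only the paths returned by such a filter would then be passed on as `-add_rpath` arguments.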

@eugeneswalker
Contributor Author

OpenMPI seems to hang forever:

@eugeneswalker
Contributor Author

@spackbot run pipeline

@spackbot-app

spackbot-app bot commented Jan 31, 2025

I've started that pipeline for you!

@eugeneswalker
Contributor Author

OpenMPI build freezes, never completes, resulting in job timeout:

I would disable OpenMPI until this is figured out, but many packages in the machine-learning-mps stack depend on it. @adamjstewart do we really need openmpi specifically here? Can we use a different MPI until the openmpi issue is resolved?

packages:
  ...
  mpi:
    require: openmpi
  ...

Many, if not all, of the clingo-bootstrap builds fail in bootstrap-aarch64-darwin. @alalazo Any idea what the issue may be?
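
For chasing the OpenMPI hang locally, one option is to wrap the build command in a hard timeout so that a frozen build fails quickly instead of running until the CI job limit. A minimal sketch using Python's standard subprocess module. `run_with_timeout` is a hypothetical helper, and any timeout value is arbitrary.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (finished, returncode).

    finished is False and returncode is None if the command had to be
    killed after timeout_s seconds.
    """
    try:
        proc = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising.
        return False, None
    return True, proc.returncode
```

For example, `run_with_timeout(["spack", "install", "openmpi"], 3600)` would give up after an hour rather than waiting for the job timeout.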

@adamjstewart
Member

For all ML pipelines, we can use any MPI provider. I recall having some kind of problem with whatever was being chosen by default and had to force it to be OpenMPI, but feel free to undo that. It may not even be needed on macOS.

@eugeneswalker
Contributor Author

@spackbot run pipeline

@spackbot-app

spackbot-app bot commented Feb 3, 2025

I've started that pipeline for you!

@eugeneswalker
Contributor Author

Only two build problems remain here. The first is clingo-bootstrap (see above). The second is OpenMPI: numerous packages in the darwin-mps stack do seem to depend specifically on openmpi, and the openmpi build hangs indefinitely for an as-yet-undetermined reason. If we can solve these, we can merge this and have our darwin pipelines back!

@adamjstewart
Member

What happens when you require mpich?

@eugeneswalker
Contributor Author

eugeneswalker commented Feb 25, 2025

I commented out the handful of failing packages and all CI is green now. I think we should merge this now to bring the darwin pipelines back. Separately, I will follow up with a PR re-enabling the failing specs where we can iterate on fixes. Let's bring it back! @adamjstewart @kwryankrattiger

adamjstewart previously approved these changes Feb 25, 2025
Member

@adamjstewart adamjstewart left a comment

Only approving my own pipelines, don't fully understand the other changes.

adamjstewart previously approved these changes Feb 26, 2025
@eugeneswalker
Contributor Author

eugeneswalker commented Feb 26, 2025

All CI is going to be green here. Only one gitlab job remains and it is update-index. Any final reservations @kwryankrattiger? If not, I will turn on auto-merge and we will be back in business.

Contributor

@kwryankrattiger kwryankrattiger left a comment

Two minor things. I think both can be pulled out into follow-ups, as neither poses an immediate risk, and it may be worth experimenting with the /tmp changes more before reverting to the default.

adamjstewart previously approved these changes Feb 26, 2025
@eugeneswalker
Contributor Author

eugeneswalker commented Feb 26, 2025

OK. All fixed up based on recent comments. Turning on auto-merge. Will follow up with a PR to re-enable the failing py-torch* related specs. All CI is passing now.

@eugeneswalker eugeneswalker merged commit 677caec into spack:develop Feb 26, 2025
37 checks passed
@alalazo
Member

alalazo commented Feb 27, 2025

Is it just me, or has no darwin pipeline actually been reactivated? I can't find any darwin pipeline in https://gitlab.spack.io/spack/spack/-/pipelines/995843 whereas in https://gitlab.spack.io/spack/spack/-/pipelines/989714 I can see them (they are the only ones running there) 🤔

@alalazo
Member

alalazo commented Feb 27, 2025

CI was good here:

but then these changes 9b90ef5 3253a87 seem to have disabled darwin pipelines again.

white238 pushed a commit that referenced this pull request Mar 3, 2025
* ci: darwin stacks: update tags following system updates

* disable SPACK_CI_DISABLE_STACKS; only enable *darwin* stacks for testing

* manually chmod u+w tmp/ before cleanup due to issue#49147

* comment out failing specs for now

* re-enable logic for disabling stacks

* add explanatory comment for darwin after_script additions

* remove more darwin-only targetting

* restore build_stage to default location

* move build-job-remove out of individual darwin stacks into darwin top level config

* keep build_stage in $spack/tmp for now