Create image for use in ML pipelines #50

Merged: alalazo merged 4 commits into main from images/ml on Apr 29, 2024
adamjstewart (Member) commented Apr 16, 2024

The idea is to create an image that can be used across all Linux ML pipelines (x86_64, aarch64, ppc64le) with a newer OS/GCC version. Some criteria:

  1. TF requires GCC 9.4+
  2. TF requires /usr/bin/python3
  3. TF requires ROCm to be installed in a specific system directory
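The criteria above could be sanity-checked inside a candidate image with something like the following. This is only a sketch, not part of the PR: the function names `version_ge` and `check_ml_image` are made up for illustration, the version comparison assumes GNU `sort -V` is available, and `ROCM_DIR` is a hypothetical variable you would set yourself, since the required ROCm directory is not named here.

```shell
#!/bin/sh
# Sketch: check the ML image criteria (GCC 9.4+, /usr/bin/python3, ROCm dir).

# version_ge A B: succeeds when version A >= version B.
# Assumes GNU coreutils `sort -V` (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_ml_image: prints one line per check, returns non-zero on any failure.
check_ml_image() {
    status=0

    # Fall back to "0" when gcc is missing so the comparison fails cleanly.
    gcc_version=$(gcc -dumpfullversion 2>/dev/null || echo 0)
    if version_ge "$gcc_version" 9.4; then
        echo "ok: gcc $gcc_version"
    else
        echo "FAIL: gcc $gcc_version is older than 9.4"
        status=1
    fi

    if [ -x /usr/bin/python3 ]; then
        echo "ok: /usr/bin/python3"
    else
        echo "FAIL: /usr/bin/python3 is missing"
        status=1
    fi

    # ROCM_DIR is hypothetical: the check is skipped when it is unset.
    if [ -n "${ROCM_DIR:-}" ] && [ ! -d "$ROCM_DIR" ]; then
        echo "FAIL: ROCm directory $ROCM_DIR not found"
        status=1
    fi

    return "$status"
}
```

Sourcing this file in the container and calling `check_ml_image` would report each criterion separately, which may be easier to debug than a failed TF build.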

P.S. I don't know how to write or test a Dockerfile so if you have any suggestions, it's probably quicker to push to my branch than it is to explain it to me 😅

adamjstewart (Member Author) commented Apr 20, 2024

spack/spack#43751 passed successfully so I think this image is good to go. There are still a few issues with spack/spack#39666 but those are due to aarch64, not due to the image.

@adamjstewart adamjstewart requested a review from alalazo April 20, 2024 16:43

kwryankrattiger (Collaborator) commented

Not sure if there is something more robust, but usually I just build the image locally, then run the target stack install on it to make sure it passes, and call it good.

If there are any changes to the spack.yaml that are needed, like pointing to the ROCm/Python externals, then you can make sure everything is working before pushing the image, and it makes the Spack CI PR much easier.
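The local workflow described above (build the image, then run the target stack install in it) might look roughly like this. It is a sketch under assumptions: `run` and `test_image_locally` are hypothetical helper names, the four arguments (build context, image tag, Spack checkout, directory containing the stack's spack.yaml) are placeholders you supply, and the container is assumed to be able to run Spack from a mounted checkout.

```shell
#!/bin/sh
# Sketch: build an image locally and try a Spack stack install inside it.

# run CMD...: echo the command, then execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

# test_image_locally CONTEXT_DIR TAG SPACK_ROOT STACK_DIR
#   CONTEXT_DIR: directory containing the Dockerfile
#   TAG:         local tag for the built image
#   SPACK_ROOT:  a Spack checkout on the host (mounted at /spack)
#   STACK_DIR:   directory holding the stack's spack.yaml (mounted at /stack)
test_image_locally() {
    context_dir=$1
    tag=$2
    spack_root=$3
    stack_dir=$4

    run docker build -t "$tag" "$context_dir" || return 1

    # Override the entrypoint so the command runs regardless of how the
    # image is configured; `spack -D DIR` uses DIR as the environment.
    run docker run --rm \
        -v "$spack_root:/spack" \
        -v "$stack_dir:/stack" \
        --entrypoint /bin/sh \
        "$tag" \
        -c '. /spack/share/spack/setup-env.sh && spack -D /stack install' \
        || return 1
}
```

Setting `DRY_RUN=1` before calling `test_image_locally` prints the docker commands without executing them, which is a cheap way to review the mounts and tag first.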

adamjstewart (Member Author) commented

For now, the image I created here is good enough. It doesn't yet support external ROCm, but I'm told that AMD is working on supporting Spack-installed ROCm in the future. I'll leave ROCm to someone else.

rsync \
unzip \
wget \
zlib1g-dev \
Collaborator

Just to double-check: are all of these needed from the system?

autoconf
automake
gettext
libffi-dev
libssl-dev
libxml2-dev
m4
ncurses-dev
zlib1g-dev

kwryankrattiger (Collaborator) left a comment

LGTM

@alalazo alalazo merged commit 8c44af9 into main Apr 29, 2024
@alalazo alalazo deleted the images/ml branch April 29, 2024 15:39