
[Feature] Adding Support for aarch64 Based Architecture #4637

Merged
SwarnaBharathiMantena merged 1 commit into GoogleCloudPlatform:develop from ljqg:cos_mft_on_aarach64
Sep 15, 2025


@ljqg (Contributor) commented Sep 10, 2025

Release Notes:

  1. Multi-architecture support:
  • The tool now supports COS-based GCE VM instances on both x86_64 and aarch64 architectures, so it can run on ARM-based GPU VM instances such as a4x-highgpu-4g.
  • The Dockerfile (e.g. docker build commands), as well as certain code and comments, have been updated to reflect this change.
  2. Documentation update:
  • The Quick Start section now provides a more versatile command that mounts the GPU device volumes dynamically, based on the devices available on the system. It replaces the previously hardcoded command, because the number of GPUs varies among VM shapes.
  3. Improved logic for exporting the MFT logs to a GCS bucket:
  • The tool now looks up the bucket and checks whether it exists, instead of creating the bucket directly and relying on try/except to handle the case where the bucket already exists. This is expected to improve code health and avoid some edge-case failure patterns from GCS bucket creation.
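
To illustrate the documentation change in note 2, here is a minimal Python sketch of building the device flags dynamically instead of hardcoding one flag per GPU. It assumes the NVIDIA device nodes appear as `/dev/nvidia*` and that they are passed to the container with Docker's `--device` flag. The helper name `gpu_device_flags` is hypothetical, and the actual Quick Start command may be written differently (for example as a shell one-liner).

```python
import glob
import os


def gpu_device_flags(dev_dir="/dev"):
    """Return docker `--device` flags for every NVIDIA device node found.

    Assumption: GPU device nodes are named nvidia* (for example nvidia0,
    nvidiactl, nvidia-uvm) and sit directly under dev_dir.
    """
    flags = []
    for path in sorted(glob.glob(os.path.join(dev_dir, "nvidia*"))):
        # Skip subdirectories such as nvidia-caps; only pass plain nodes.
        if os.path.isdir(path):
            continue
        flags += ["--device", f"{path}:{path}"]
    return flags


if __name__ == "__main__":
    # Prints the flags for this machine; empty output on a host without GPUs.
    print(" ".join(gpu_device_flags()))
```

Because the list is derived from whatever nodes exist, the same invocation works on VM shapes with different GPU counts.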
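
The lookup-first pattern described in note 3 can be sketched as follows. This is not the code from the PR: the helper name `ensure_bucket` is hypothetical, and it assumes that the tool creates the bucket when the lookup finds nothing. The `client` argument is assumed to behave like `google.cloud.storage.Client`, whose `lookup_bucket()` returns `None` for a missing bucket.

```python
def ensure_bucket(client, bucket_name, location=None):
    """Return the bucket, creating it only when a lookup finds nothing.

    Assumption: `client` exposes lookup_bucket() and create_bucket() with
    the same behavior as google.cloud.storage.Client.
    """
    # Check for the bucket first instead of creating it and catching
    # the "already exists" error afterwards.
    bucket = client.lookup_bucket(bucket_name)
    if bucket is not None:
        return bucket
    return client.create_bucket(bucket_name, location=location)
```

With the real library this would be called as `ensure_bucket(storage.Client(), "my-log-bucket")`, where the bucket name is a placeholder. A lookup followed by a create still leaves a small window in which another process could create the bucket, so callers that run concurrently may want to keep handling that error as well.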

Submission Checklist

NOTE: Community submissions can take up to 2 weeks to be reviewed.

Please take the following actions before submitting this pull request.

  • Fork your PR branch from the Toolkit "develop" branch (not main)
  • Test all changes with pre-commit in a local branch
  • Confirm that "make tests" passes all tests
  • Add or modify unit tests to cover code changes
  • Ensure that unit test coverage remains above 80%
  • Update all applicable documentation
  • Follow Cluster Toolkit Contribution guidelines

@ljqg requested review from a team and samskillman as code owners September 10, 2025 05:59
@ljqg changed the title from "Added support for aarch64 based architecture" to "[Feature] Adding Support for aarch64 Based Architecture" Sep 10, 2025
@SwarnaBharathiMantena added the release-key-new-features label (Added to release notes under the "Key New Features" heading.) Sep 10, 2025
@ljqg force-pushed the cos_mft_on_aarach64 branch from 317c438 to 33bf143 September 10, 2025 06:42
@SwarnaBharathiMantena (Contributor) left a comment

LGTM

@SwarnaBharathiMantena merged commit 23f8345 into GoogleCloudPlatform:develop on Sep 15, 2025
12 of 62 checks passed
