Pathways cluster config #5370
- Introduces `enable_pathways` to the `gke-cluster` module to provision the `cpu-np` node pool (`n2-standard-64`) with the necessary GCP scopes.
- Introduces `enable_pathways` to the `kubectl-apply` module to template and apply Kueue quotas (`cpu-user` ResourceFlavor, 480 CPU, 2000G memory) automatically.
- Adds the `examples/pathways-gke.yaml` blueprint, demonstrating the integration without the deprecated PathwaysJob CRD.
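As a rough illustration of the Kueue quota described above, the manifests below sketch what a `cpu-user` ResourceFlavor and a matching ClusterQueue quota could look like. This is not the template shipped in the PR: the ClusterQueue name, the empty `namespaceSelector`, and the node label used to select the `cpu-np` pool are assumptions; only the flavor name and the 480 CPU / 2000G memory figures come from the description.

```yaml
# Sketch only. The flavor name and quota values are taken from the PR description;
# everything else is an assumption made for illustration.
apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: cpu-user
spec:
  nodeLabels:
    # Assumed selector: pins this flavor to the cpu-np node pool.
    cloud.google.com/gke-nodepool: cpu-np
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: cluster-queue  # hypothetical name
spec:
  namespaceSelector: {}  # assumed: admit workloads from all namespaces
  resourceGroups:
    - coveredResources: ["cpu", "memory"]
      flavors:
        - name: cpu-user
          resources:
            - name: cpu
              nominalQuota: 480
            - name: memory
              nominalQuota: 2000G
```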
Summary of Changes (from Gemini Code Assist)
This pull request introduces support for deploying and configuring Google Cloud's Pathways on GKE clusters within the toolkit. It provides a new example blueprint, integrates Pathways-specific Kueue configurations, and adds a dedicated CPU node pool, allowing users to provision GKE environments optimized for Pathways workloads.
Code Review
The pull request introduces a new GKE Pathways cluster configuration, enabling dedicated CPU node pools and Kueue integration. The changes include a new example blueprint, a Kubernetes template for Kueue, and modifications to the kubectl-apply and gke-cluster modules to support the new enable_pathways feature. While the overall functionality is a good addition, there are opportunities to improve the configurability and flexibility of the newly introduced resources.
SwarnaBharathiMantena
left a comment
I have limited knowledge of Kueue configurations, so I added a few questions to make sure nothing is missing.
This prevents 'rendered manifests contain a resource that already exists' Helm errors when a custom Kueue template defining a ClusterQueue is passed while enable_pathways is also set to true, by automatically grouping and merging their resourceGroups.
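To make the merge described above concrete, here is a hedged sketch of what a single merged ClusterQueue might look like when a custom template already defines TPU quota and the Pathways CPU quota is added to it. The exact merge logic is not shown in this conversation; the queue name, the `tpu-flavor` flavor, and the TPU quota value are hypothetical placeholders, and only the `cpu-user` group reflects values from the PR description.

```yaml
# Sketch only: one ClusterQueue carrying both resource groups, instead of two
# ClusterQueue objects with the same name (which would collide at apply time).
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: cluster-queue  # hypothetical name
spec:
  namespaceSelector: {}
  resourceGroups:
    # Group assumed to come from the user's custom Kueue template.
    - coveredResources: ["google.com/tpu"]
      flavors:
        - name: tpu-flavor  # hypothetical flavor name
          resources:
            - name: google.com/tpu
              nominalQuota: 16  # hypothetical value
    # Group contributed by the Pathways configuration.
    - coveredResources: ["cpu", "memory"]
      flavors:
        - name: cpu-user
          resources:
            - name: cpu
              nominalQuota: 480
            - name: memory
              nominalQuota: 2000G
```

Kueue requires that each resource appear in only one resource group, so a sketch like this only holds if the custom template does not already cover `cpu` or `memory`.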
7c5504c to 9175e4e
This renaming clarifies that Pathways coordination infrastructure (cpu-np and its Kueue flavor) should only be enabled when TPU node pools are also being deployed, helping prevent misconfigurations where users might enable it on purely CPU or standalone environments.
/gcbrun
Running tests using babysit
/gcbrun
SUCCESS PR-test-gke go/ghpc-cb/6a2419e0-336f-4535-9c59-0dfe8302c59b
Head branch was pushed to by a user without write access
766bb48 to 28e7dbe
561ab9c into GoogleCloudPlatform:develop
Co-authored-by: Swarna Bharathi Mantena <[email protected]> and Neelabh94
Adds the configuration that Pathways requires at cluster creation to Cluster Toolkit.
Submission Checklist
NOTE: Community submissions can take up to 2 weeks to be reviewed.
Please take the following actions before submitting this pull request.