Skip to content

Conversation

@rueian
Copy link
Collaborator

@rueian rueian commented Feb 25, 2025

Why are these changes needed?

Currently, KubeRay adds different suffixes to make different services for a RayCluster. It will also trim these service names if they are too long to fit the 63-character limitation by k8s. However, the silent cut is not user-friendly as users can't easily know the resulting service names beforehand.

As a part of #3076: we are going to add a length validation to RayCluster names in v1.4 to make sure the silent cut does not take place while keeping sufficient room for the names specified by users, this PR shortens the -headless-worker-svc suffix to just -headless, saving 11 characters for users to use (63-len("-serve-svc") = 53 characters).

Related issue number

#3076

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

@kevin85421
Copy link
Member

can you split this into two PRs, one for headless service and one for RayCluster?

@rueian rueian force-pushed the less-suffix-for-more-resource-name branch from cb99863 to d67af8c Compare February 25, 2025 03:57
@rueian rueian changed the title [feat] shorten HeadlessServiceSuffix and RayClusterSuffix to have more space for CR names [feat] shorten HeadlessServiceSuffix to have more space for CR names Feb 25, 2025
@rueian rueian changed the title [feat] shorten HeadlessServiceSuffix to have more space for CR names [feat][RayCluster] shorten HeadlessServiceSuffix to have more space for CR names Feb 25, 2025
@rueian rueian force-pushed the less-suffix-for-more-resource-name branch from d67af8c to 536d17a Compare February 25, 2025 03:58
@rueian rueian marked this pull request as ready for review February 25, 2025 05:51
@rueian
Copy link
Collaborator Author

rueian commented Feb 25, 2025

can you split this into two PRs, one for headless service and one for RayCluster?

Sure, another PR is here: #3102

@rueian
Copy link
Collaborator Author

rueian commented Feb 25, 2025

Hi @andrewsykim,

We'd like to add a length validation to RayCluster names in the next release to provide non-trimmed service names for a better user experience. But the current headless service suffix for the TPU webhook, -headless-worker-svc, is already too long, we'd like to shorten it to just -headless to leave more space for RayCluster names.

Please let me know if you have any concerns about this. Thanks!

@kevin85421
Copy link
Member

#3101 (comment)

cc @ryanaoleary

@ryanaoleary
Copy link
Collaborator

#3101 (comment)

cc @ryanaoleary

I'll need to change the logic of the webhook here before this change so that TPU_WORKER_HOSTNAMES are still generated correctly. I'll change it to look up the headless service name based on the RayCluster label.

@kevin85421
Copy link
Member

GoogleCloudPlatform/ai-on-gke#1003 is merged.

@kevin85421 kevin85421 merged commit 326ff65 into ray-project:master Feb 28, 2025
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants