Releases: ray-project/kuberay
v1.5.1
Highlights
This release adds support for Ray token authentication for RayCluster, RayJob, and RayService.
You can enable Ray token authentication using the following API:
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: ray-cluster-with-auth
spec:
  rayVersion: '2.52.0'
  authOptions:
    mode: token
You must set spec.rayVersion to 2.52.0 or newer. See the full example at ray-cluster.auth.yaml.
Bug fixes
- Fix a bug in the NewClusterWithIncrementalUpgrade strategy where the Active (old) cluster's Serve configuration cache was incorrectly updated to the new ServeConfigV2 during an upgrade. #4212 @ryanaoleary
- Surface pod-level container failures to RayCluster status #4196 @spencer-p
- Fix a bug where RayJob status is not updated if failure happens in Initializing phase #4191 @spencer-p
- Fix a bug where RayCluster status was not always propagated to RayJob status #4192 @machichima
v1.5.0
Highlights
Ray Label Selector API
Ray v2.49 introduced a label selector API. Correspondingly, KubeRay v1.5 now features a top-level API for defining Ray labels and resources. This new top-level API is the preferred method going forward, replacing the previous practice of setting labels and custom resources within rayStartParams.
The new API will be consumed by the Ray autoscaler, improving autoscaling decisions based on task and actor label selectors. Furthermore, labels configured through this API are mirrored directly into the Pods. This mirroring allows users to more seamlessly combine Ray label selectors with standard Kubernetes label selectors when managing and interacting with their Ray clusters.
You can use the new API in the following way:
apiVersion: ray.io/v1
kind: RayCluster
spec:
  ...
  headGroupSpec:
    rayStartParams: {}
    resources:
      Custom1: "1"
    labels:
      ray.io/zone: us-west-2a
      ray.io/region: us-west-2
  workerGroupSpecs:
  - replicas: 1
    rayStartParams: {}
    resources:
      Custom1: "1"
    labels:
      ray.io/zone: us-west-2a
      ray.io/region: us-west-2
RayJob Sidecar submission mode
The RayJob resource now supports a new value for spec.submissionMode called SidecarMode.
Sidecar mode directly addresses a key limitation in both K8sJobMode and HttpMode: the network connectivity requirement from an external Pod or the KubeRay operator for job submission. With Sidecar mode, job submission is orchestrated by injecting a sidecar container into the Head Pod. This eliminates the need for an external client to handle the submission process and reduces job submission failures caused by network issues.
To use this feature, set spec.submissionMode to SidecarMode in your RayJob:
apiVersion: ray.io/v1
kind: RayJob
metadata:
  name: my-rayjob
spec:
  submissionMode: "SidecarMode"
  ...
Advanced deletion policies for RayJob
KubeRay now supports a more advanced and flexible API for expressing deletion policies within the RayJob specification. This new design moves beyond the singular boolean field, spec.shutdownAfterJobFinishes, and allows users to define different cleanup strategies using configurable TTL values based on the Ray job's status.
This API unlocks new use cases that require specific resource retention after a job completes or fails. For example, users can now implement policies that:
- Preserve only the Head Pod for a set duration after job failure to facilitate debugging.
- Retain the entire Ray Cluster for a longer TTL after a successful run for post-analysis or data retrieval.
By linking specific TTLs to Ray job statuses (e.g., success, failure) and strategies (e.g. DeleteWorkers, DeleteCluster, DeleteSelf), users gain fine-grained control over resource cleanup and cost management.
Below is an example of how to use this new, flexible API structure:
apiVersion: ray.io/v1
kind: RayJob
metadata:
  name: rayjob-deletion-rules
spec:
  deletionStrategy:
    deletionRules:
    - policy: DeleteWorkers
      condition:
        jobStatus: FAILED
        ttlSeconds: 100
    - policy: DeleteCluster
      condition:
        jobStatus: FAILED
        ttlSeconds: 600
    - policy: DeleteCluster
      condition:
        jobStatus: SUCCEEDED
        ttlSeconds: 0
This feature is disabled by default and requires enabling the RayJobDeletionPolicy feature gate.
Incremental upgrade support for RayService
KubeRay v1.5 introduces the capability to enable zero-downtime incremental upgrades for RayServices. This new feature improves the upgrade process by leveraging the Gateway API and Ray autoscaling to incrementally migrate user traffic from the existing Ray cluster to the newly upgraded one.
This approach is more efficient and reliable compared to the former mechanism. The previous method required creating the upgraded Ray cluster at its full capacity and then shifting all traffic at once, which could lead to disruptions and unnecessary resource usage. By contrast, the incremental approach gradually scales up the new cluster and migrates traffic in smaller, controlled steps, resulting in improved stability and resource utilization during upgrade.
To enable this feature, set the following fields in RayService:
apiVersion: ray.io/v1
kind: RayService
metadata:
  name: example-rayservice
spec:
  upgradeStrategy:
    type: "NewClusterWithIncrementalUpgrade"
    clusterUpgradeOptions:
      maxSurgePercent: 40
      stepSizePercent: 5
      intervalSeconds: 10
      gatewayClassName: "cluster-gateway"
This feature is disabled by default and requires enabling the RayServiceIncrementalUpgrade feature gate.
Improved multi-host support for RayCluster
Previous KubeRay versions supported multi-host worker groups via the numOfHosts API, but this support lacked fundamental capabilities required for managing multi-host accelerators. First, there was no logical grouping of worker Pods belonging to the same multi-host unit (or slice), so it was not possible to run operations like "replace all workers in this group". In addition, there was no ordered indexing, which is often required for coordinating multi-host workers when using TPUs.
When using multi-host worker groups in KubeRay v1.5, the operator automatically sets the following labels on multi-host Ray workers:
labels:
  ray.io/worker-group-replica-name: tpu-group-af03de
  ray.io/worker-group-replica-index: 0
  ray.io/replica-host-index: 1
Below is a description of each label and its purpose:
- ray.io/worker-group-replica-name: this label provides a unique identifier for each replica (i.e. host group or slice) in a worker group. It enables KubeRay to rediscover all other Pods in the same group and apply group-level operations.
- ray.io/worker-group-replica-index: this label is an ordered replica index within the worker group. It is particularly important for cases like multi-slice TPUs, where each slice must be aware of its slice index.
- ray.io/replica-host-index: this label is an ordered host index per replica (host group or slice).
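Because these labels are mirrored directly onto the Pods, you can combine them with standard Kubernetes label selectors. A minimal sketch (the replica name tpu-group-af03de is illustrative) for inspecting or replacing one multi-host replica:
# List all worker Pods belonging to the same multi-host replica (slice).
kubectl get pods -l ray.io/worker-group-replica-name=tpu-group-af03de
# Delete the whole replica at once, for example to replace a faulty slice.
kubectl delete pods -l ray.io/worker-group-replica-name=tpu-group-af03de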
These changes collectively enable reliable, production-level scaling and management of multi-host GPU workers or TPU slices.
This feature is disabled by default and requires enabling the RayMultiHostIndexing feature gate.
Breaking Changes
For RayCluster objects created by a RayJob, KubeRay will no longer attempt to recreate the Head Pod if it fails or is deleted after its initial successful provisioning. To retry failed jobs, use spec.backoffLimit, which results in KubeRay provisioning a new RayCluster for each retry.
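For example, a minimal RayJob sketch that retries a failed job up to two times, each retry provisioning a fresh RayCluster (the name and entrypoint are illustrative):
apiVersion: ray.io/v1
kind: RayJob
metadata:
  name: rayjob-with-retries
spec:
  backoffLimit: 2
  entrypoint: python my_script.py
  ...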
CHANGELOG
- [release-1.5] update version to v1.5.0 (#4177, @andrewsykim)
- [CherryPick][Feature Enhancement] Set ordered replica index label to support mult… (#4171, @ryanaoleary)
- [releasey-1.5] update version to v1.5.0-rc.1 (#4170, @andrewsykim)
- [release-1.5] fix: dashboard build for kuberay 1.5.0 (#4169, @andrewsykim)
- [release-1.5] update versions to v1.5.0-rc.0 (#4155, @andrewsykim)
- [Bug] Sidecar mode shouldn't restart head pod when head pod is delete… (#4156, @rueian)
- Bump Kubernetes dependencies to v0.34.x (#4147, @mbobrovskyi)
- [Chore] Remove duplicate test-e2e-rayservice in Makefile (#4145, @seanlaii)
- [Scheduler] Replace AddMetadataToPod with AddMetadataToChildResource across all schedulers (#4123, @win5923)
- [Feature] Add initializing timeout for RayService (#4143, @seanlaii)
- [RayService] Support Incremental Zero-Downtime Upgrades (#3166, @ryanaoleary)
- Example RayCluster spec with Labels and label_selector API (#4136, @ryanaoleary)
- [RayCluster] Fix for multi-host indexing worker creation (#4139, @chiayi)
- Support uppercase default resource names for top-level Resources (#4137, @ryanaoleary)
- [Bug] [KubeRay Dashboard] Misclassifies RayCluster type (#4135, @CheyuWu)
- [RayCluster] Add multi-host indexing labels (#3998, @chiayi)
- [Grafana] Use Range option instead of instant for RayCluster Provisioned Duration panel (#4062, @win5923)
- [Feature] Separate controller namespace and CRD namespaces for KubeRay-Operator Dashboard (#4088, @400Ping)
- Update grafana dashboards to ray 2.49.2 + add README instructions on how to update (#4111, @alanwguo)
- fix: update broken and outdated links (#4129, @ErikJiang)
- [Feature] Provide multi-arch images for apiserver and security proxy (#4131, @seanlaii)
- test: add LastTransition to fix test (#4132, @machichima)
- Add top-level Labels and Resources structured fields to HeadGroupSpec and WorkerGroupSpec ([#4106](#4...
v1.5.0-rc.1
[releasey-1.5] update version to v1.5.0-rc.1 (#4170, @andrewsykim): update KubeRay versions to v1.5.0-rc.1 and regenerate Helm docs.
v1.5.0-rc.0
update versions to v1.5.0-rc.0 (#4155, @andrewsykim)
v1.4.2
v1.4.1
v1.4.0
Highlights
Enhanced Kubectl Plugin
KubeRay v1.4.0 introduces major improvements to the Kubectl Plugin:
- Added a new scale command to scale worker groups in a RayCluster.
- Extended the get command to support listing Ray nodes and worker groups.
- Improved the create command:
  - Allows overriding default values in config files.
  - Supports additional fields such as Kubernetes labels and annotations, node selectors, ephemeral storage, ray start parameters, TPUs, autoscaler version, and more.
See Using the Kubectl Plugin (beta) for more details.
KubeRay Dashboard (alpha)
Starting from v1.4.0, you can use the open source dashboard UI for KubeRay. This component is still experimental and not considered ready for production, but feedback is welcome.
KubeRay dashboard is a web-based UI that allows you to view and manage KubeRay resources running on your Kubernetes cluster. It's different from the Ray dashboard, which is a part of the Ray cluster itself. The KubeRay dashboard provides a centralized view of all KubeRay resources.
See Use KubeRay dashboard (experimental) for more information. (The link will be replaced with the documentation website once the PR is merged.)
Integration with kubernetes-sigs/scheduler-plugins
Starting with v1.4.0, KubeRay integrates with an additional scheduler, kubernetes-sigs/scheduler-plugins, to support gang scheduling for RayCluster resources. Currently, only single-scheduler mode is supported.
See KubeRay integration with scheduler plugins for details.
KubeRay APIServer V2 (alpha)
The new APIServer v2 provides an HTTP proxy interface compatible with the Kubernetes API. It enables users to manage Ray resources using standard Kubernetes clients.
Key features:
- Full compatibility with Kubernetes OpenAPI Spec and CRDs.
- Available as a Go library for building custom proxies with pluggable HTTP middleware.
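For example, because the proxy follows Kubernetes OpenAPI conventions, you can list RayCluster resources with a plain HTTP request once the APIServer v2 service is reachable; the port below is illustrative and depends on how you expose the service:
# Assumes the APIServer v2 service has been port-forwarded to localhost:31888.
curl http://localhost:31888/apis/ray.io/v1/namespaces/default/rayclusters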
APIServer v1 is now in maintenance mode and will no longer receive new features. v2 is still in alpha. Contributions and feedback are encouraged.
Service Level Indicator (SLI) Metrics
KubeRay now includes SLI metrics to help monitor the state and performance of KubeRay resources.
See KubeRay Metrics Reference for details.
Breaking Changes
Default to Non-Login Bash Shell
Prior to v1.4.0, KubeRay ran most commands using a login shell. Starting from v1.4.0, the default shell is a non-login Bash shell. You can temporarily revert to the login shell behavior using the ENABLE_LOGIN_SHELL environment variable, but doing so is not recommended, and this environment variable will be removed in a future release. (#3679)
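A minimal sketch for temporarily reverting, assuming the variable is read by the KubeRay operator Deployment (named kuberay-operator in a default Helm install; see #3679 for the exact mechanism):
# Set ENABLE_LOGIN_SHELL on the operator to restore the old login-shell behavior.
kubectl set env deployment/kuberay-operator ENABLE_LOGIN_SHELL=true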
If you encounter any issues with the new default behavior, please report them in #3822 instead of opening new issues.
Resource Name Changes and Length Validation
Before v1.4.0, KubeRay silently truncated resource names that were too long to fit Kubernetes' 63-character limit. Starting from v1.4.0, KubeRay no longer implicitly truncates resource names. Instead, it emits an invalid spec event if the names are too long. (#3083)
We also shortened some of the resource names to loosen the length limitation. The following changes were made:
- The suffix of the headless service for RayCluster changes from headless-worker-svc to headless. (#3101)
- The suffix of the RayCluster name changes from -raycluster-xxxxx to -xxxxx. (#3102)
- The suffix of the head pod for RayCluster changes from -head-xxxxx to -head. (#3028)
Updated Autoscaler v2 configuration
Starting from v1.4.0, autoscaler v2 is now configured using:
spec:
  autoscalerOptions:
    version: v2
You should not use the old RAY_enable_autoscaler_v2 environment variable.
See Autoscaler v2 Configuration for guidance.
Changelog
- [Release] Update KubeRay version references for 1.4.0 (#3816, @MortalHappiness)
- [kubeclt-plugin] fix get cluster all namespace (#3809, @fscnick)
- [Docs] Add kubectl plugin create cluster sample yaml config files (#3804, @MortalHappiness)
- [Helm Chart] Set honorLabel of serviceMonitor to true (#3805, @owenowenisme)
- [Metrics] Remove serviceMonitor.yaml (#3795, @owenowenisme)
- [Chore][Sample-yaml] Upgrade pytorch-lightning to 1.8.5 for ray-job.pytorch-distributed-training.yaml (#3796, @MortalHappiness)
- Use ImplementationSpecific in ray-cluster.separate-ingress.yaml (#3781, @troychiu)
- Remove vLLM examples in favor of Ray Serve LLM (#3786, @kevin85421)
- Update update-ray-job.kueue-toy-sample.yaml (#3782, @troychiu)
- [Feat] Add e2e test for applying ray-job.interactive-mode.yaml (#3779, @CheyuWu)
- [Release] Update KubeRay version references for 1.4.0-rc.2 (#3784, @MortalHappiness)
- [Doc][Fix] correct the indention of storageClass in ray-cluster.persistent-redis.yaml (#3780, @rueian)
- [doc] Improve APIServer v2 doc (#3773, @kevin85421)
- [Release] Reset ray-operator version in root go.mod to v0.0.0 (#3774, @MortalHappiness)
- Revert "Fix issue where unescaped semicolons caused task execution failures. (#3691)" (#3771, @MortalHappiness)
- support scheduler plugins (#3612, @KunWuLuan)
- Added Ray-Serve Config For LLMs (#3517, @Blaze-DSP)
- [Release] Fix helm chart tag missing "v" prefix and release rc1 (#3757, @MortalHappiness)
- [Release] Update KubeRay version references for 1.4.0-rc.0 (#3698, @MortalHappiness)
- Improve Grafana Dashboard (#3734, @troychiu)
- [Fix][CI] Fix ray operator image build error by setting up docker buildx (#3750, @MortalHappiness)
- [Test][Autoscaler] deflaky unexpected dead actors in tests by setting max_restarts=-1 (#3700, @rueian)
- add go.mod for operator (#3735, @troychiu)
- [fix][operator] RayJob.Status.RayJobStatusInfo.EndTime nil deref error (#3742, @davidxia)
- [operator] fix TPU multi-host RayJob and RayCluster samples (#3733, @davidxia)
- [chore] upgrade Ray to 2.46.0 in remaining places (#3724, @davidxia)
- chore: run yamlft pre-commit hook (#3729, @davidxia)
- [Grafana] Update Grafana dashboard (#3726, @win5923)
- [Test][Autoscaler] deflaky autoscaler idle timeout e2e tests by a longer timeout (#3727, @rueian)
- [Chore] Upgrade Ray to 2.46.0 follow-up (#3722, @MortalHappiness)
- [doc] Update API server v1 doc (#3723, @kevin85421)
- feat: upgrade to Ray 2.46.0 (#3547, @davidxia)
- [Test][Autoscaler] deflaky unexpected dead actors in tests by higher resource requests (#3707, @rueian)
- [Doc] add ray cluster uv sample yaml (#3720, @fscnick)
- [apiserver] Use ClusterIP instead of NodePort for KubeRay API server service (#3708, @machichima)
- Bump next from 15.2.3 to 15.2.4 in /dashboard (#3709, @dependabot[bot])
- [Feat][apiserver] Support CORS config (#3711, @MortalHappiness)
- Add kuberay operator servicemonitor (#3717, @troychiu)
- [CI] Split Autoscaler e2e tests into 2 buildkite runners (#3715, @kevin85421)
- Add Grafana Dashboard for KubeRay Operator (#3676, @win5923)
- [Fix][Release] Fix KubeRay dahsboard image build pipeline (#3702, @MortalHappiness)
- Fix issue where unescaped semicolons caused task execution failures. (#3691, @xianlubird)
- [refactor] Refactor enable login shell (#3704, @kevin85421)
- [chore] Update use...
v1.3.2
Bug fixes
- [RayJob] Use --no-wait for job submission to avoid carrying the error return code to the log tailing (#3216)
- [kubectl-plugin] kubectl ray job submit: provide entrypoint to preserve compatibility with v1.2.2 (#3186)
Improvements
- [kubectl-plugin] Add head/worker node selector option (#3228)
- [kubectl-plugin] add node selector option for kubectl plugin create worker group (#3235)
Changelog
- [RayJob][Fix] Use --no-wait for job submission to avoid carrying the error return code to the log tailing (#3216)
- kubectl ray job submit: provide entrypoint (#3186)
- [kubectl-plugin] Add head/worker node selector option (#3228)
- add node selector option for kubectl plugin create worker group (#3235)
- [Chore][CI] Limit the release-image-build github workflow to only take tag as input (#3117)
- [CI] Remove create tag step from release (#3249)
v1.3.1
v1.3.0
Highlights
RayCluster Conditions API
The RayCluster conditions API is graduating to Beta status in v1.3. The new API provides more details about the RayCluster’s observable state that were not possible to express in the old API. The following conditions are supported for v1.3: AllPodRunningAndReadyFirstTime, RayClusterPodsProvisioning, HeadPodNotFound and HeadPodRunningAndReady. We will be adding more conditions in future releases.
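Because these conditions follow standard Kubernetes conventions, tools such as kubectl wait can consume them directly. For example, a sketch that blocks until the head Pod is ready (the cluster name is illustrative):
# Wait until the RayCluster reports its head Pod as running and ready.
kubectl wait raycluster/raycluster-sample --for=condition=HeadPodRunningAndReady --timeout=300s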
Ray Kubectl Plugin
The Ray Kubectl Plugin is graduating to Beta status. The following commands are supported with KubeRay v1.3:
- kubectl ray logs <cluster-name>: download Ray logs to a local directory
- kubectl ray session <cluster-name>: initiate a port-forwarding session to the Ray head
- kubectl ray create <cluster>: create a Ray cluster
- kubectl ray job submit: create a RayJob and submit a job using a local working directory
See the Ray Kubectl Plugin docs for more details.
RayJob Stability Improvements
Several improvements have been made to enhance the stability of long-running RayJobs. In particular, when using submissionMode=K8sJobMode, job submissions will no longer fail due to duplicate submission IDs. Now, if a submission ID already exists, the logs of the existing job are retrieved instead.
RayService API Improvements
RayService strives to deliver zero-downtime serving. When changes in the RayService spec cannot be applied in place, it attempts to migrate traffic to a new RayCluster in the background. However, users might not always have sufficient resources for a new RayCluster. Beginning with KubeRay 1.3, users can customize this behavior using the new UpgradeStrategy option within the RayServiceSpec.
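For example, a minimal sketch that opts out of creating a new RayCluster during upgrades (the metadata name is illustrative; the supported strategy types NewCluster and None are listed under Features below):
apiVersion: ray.io/v1
kind: RayService
metadata:
  name: example-rayservice
spec:
  upgradeStrategy:
    type: None
  ...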
Previously, the serviceStatus field in RayService was inconsistent and did not accurately represent the actual state. Starting with KubeRay v1.3.0, we have introduced two conditions, Ready and UpgradeInProgress, to RayService. Following the approach taken with RayCluster, we have decided to deprecate serviceStatus. In the future, serviceStatus will be removed, and conditions will serve as the definitive source of truth. For now, serviceStatus remains available but is limited to two possible values: "Running" or an empty string.
GCS Fault Tolerance API Improvements
The new GcsFaultToleranceOptions field in the RayCluster now provides a streamlined way for users to enable GCS Fault Tolerance on a RayCluster. This eliminates the previous need to distribute related settings across Pod annotations, container environment variables, and the RayStartParams. Furthermore, users can now specify their Redis username in the newly introduced field (requires Ray 2.4.1 or later). To see the impact of this change on a YAML configuration, please refer to the example manifest.
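A minimal sketch of the new field; the Redis address, username, and secret names are illustrative, and the example manifest linked above remains the authoritative reference:
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: raycluster-external-redis
spec:
  gcsFaultToleranceOptions:
    redisAddress: redis:6379
    redisUsername:
      value: ray-user
    redisPassword:
      valueFrom:
        secretKeyRef:
          name: redis-password-secret
          key: password
  ...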
Breaking Changes
RayService API
Starting from KubeRay v1.3.0, we have removed all possible values of RayService.Status.ServiceStatus except Running, so the only valid values for ServiceStatus are Running and empty. If ServiceStatus is Running, it means that RayService is ready to serve requests. In other words, ServiceStatus is equivalent to the Ready condition. It is strongly recommended to use the Ready condition instead of ServiceStatus going forward.
Features
- RayCluster Conditions API is graduating to Beta status. The feature gate RayClusterStatusConditions is now enabled by default.
- New events were added for RayCluster, RayJob and RayService for improved observability
- Various improvements to Ray autoscaler v2
- Introduce a new API in RayService, spec.upgradeStrategy. The upgrade strategy type can be set to NewCluster or None to modify the behavior of zero-downtime upgrades for RayService.
- Add RayCluster controller expectations to mitigate stale informer caches
- RayJob now supports submission mode InteractiveMode. Use this submission mode when you want to submit jobs from a local working directory on your laptop.
- RayJob now supports the spec.deletionPolicy API. This feature requires the RayJobDeletionPolicy feature gate to be enabled. Initial deletion policies are DeleteCluster, DeleteWorkers, DeleteSelf, and DeleteNone.
- KubeRay now detects TPU and Neuron Core resources and specifies them as custom resources in ray start parameters
- Introduce RayClusterSuspending and RayClusterSuspended conditions
- Container CPU requests are now used for Ray --num-cpus if a CPU limit is not specified
- Various example manifests for using TPU v6 with KubeRay
- Add ManagedBy field in RayJob and RayCluster. This is required for Multi-Kueue support.
- Add support for the kubectl ray create cluster command
- Add support for the kubectl ray create workergroup command
Guides & Tutorials
- Use Ray Kubectl Plugin
- New sample manifests using TPU v6e chips
- Tuning Redis for a Persistent Fault Tolerant GCS
- Reducing image pull latency on Kubernetes
- Configure Ray clusters with authentication and access control using KubeRay
- RayService + vLLM examples updated to use vLLM v0.6.2
- All YAML samples in the KubeRay repo have been updated to use Ray v2.41.0
Changelog
- [Fix][RayCluster] fix missing pod name in CreatedWorkerPod and Failed… (#3057, @rueian)
- [Refactor] Use constants for image tag, image repo, and versions in golang to avoid hard-coded strings (#2978, @400Ping)
- Update TPU Ray CR manifests to use Ray 2.41.0 (#2965, @ryanaoleary)
- Update samples to use Ray 2.41.0 images (#2964, @andrewsykim)
- [Test] Use GcsFaultToleranceOptions in test and backward compatibility (#2972, @fscnick)
- [chore][docs] enable Markdownlint rule MD004 (#2973, @davidxia)
- [release] Update Volcano YAML files to Ray 2.41 (#2976, @win5923)
- [release] Update Yunikorn YAML file to Ray 2.41 (#2969, @kenchung285)
- [CI] Change Pre-commit-shellcheck-to-shellcheck-py (#2974, @owenowenisme)
- [chore][docs] enable Markdownlint rule MD010 (#2975, @davidxia)
- [Release] Upgrade ray-job.batch-inference.yaml image to 2.41 (#2971, @MortalHappiness)
- [RayService] adapter vllm 0.6.1.post2 (#2823, @pxp531)
- [release][9/N] Update text summarizer RayService to Ray 2.41 (#2961, @kevin85421)
- [RayService] Deflaky RayService envtest (#2962, @kevin85421)
- [RayJob] Deflaky RayJob e2e tests (#2963, @kevin85421)
- [fix][kubectl-plugin] set worker group CPU limit (#2958, @davidxia)
- [docs][kubectl-plugin] fix incorrect example commands (#2951, @davidxia)
- [release][8/N] Upgrade Stable Diffusion RayService to Ray 2.41 (#2960, @kevin85421)
- [kubectl-plugin] Fix panic when GPU resource is not set (#2954, @win5923)
- [docs][kubectl-plugin] improve help messages (#2952, @davidxia)
- [CI] Enable testifylint len rule (#2945, @LeoLiao123)
- [release][7/N] Update RayService YAMLs (#2956, @kevin85421)
- [Fix][RayJob] Invalid quote for RayJob submitter (#2949, @MortalHappiness)
- [chore][kubectl-plugin] use consistent capitalization (#2950, @davidxia)
- [chore] add Markdown linting pre-commit hook (#2953, @davidxia)
- [chore][kubectl-plugin] use better test assertions (#2955, @davidxia)
- [CI] Add shellcheck and fix error of it (#2933, @owenowenisme)
- [docs][kubectl-plugin] add dev docs (#2912, @davidxia)
- [release][6/N] Remove unnecessary YAMLs (#2946, @kevin85421)
- [release][5/N] Update some RayJob YAMLs from Ray 2.9 to Ray 2.41 (#2941, @kevin85421)
- [release][4/N] Update Ray images / versions in kubectl plugin (#2938, @kevin85421)
- [release][3/N] Update RayService e2e tests YAML files from Ray 2.9 to Ray 2.41 ([#2937](https://github.com...