Deploying Ray on Kubernetes

Ray is a framework for building and operating distributed applications that need performance, scalability, and fault tolerance. It offers an API for distributed computing and many libraries and tools for machine learning, reinforcement learning, hyperparameter tuning, and more. You can write code once and run it across machines, clusters, or clouds with Ray.

Ray works with Kubernetes to automate provisioning, scaling, and monitoring of Ray applications. Running Ray on Kubernetes also lets you use Kubernetes features such as Secrets, the Horizontal Pod Autoscaler, and the Metrics Server.

KubeRay is a Kubernetes operator that simplifies deploying and managing Ray applications on Kubernetes. It provides three Custom Resource Definitions (CRDs) for different use cases:

  1. RayCluster: This CRD defines the configuration of a Ray cluster on Kubernetes. It includes the cluster name, size, node types, autoscaling policies, and customization options. You can create, modify, or remove a RayCluster with kubectl or Helm. Use this CRD to manage your Ray clusters and run Ray applications on a single cluster.

  2. RayJob: This CRD defines the entrypoint, environment, and shutdown behavior of a Ray job on Kubernetes. You can submit a Ray job with the Ray Jobs CLI or the Python SDK. KubeRay automatically creates and deletes a temporary Ray cluster for each Ray job. Use this CRD to run individual Ray applications on Kubernetes.

  3. RayService: This CRD manages the service name, type, port, and backend configuration of a Ray service. You can create, update, and delete a RayService with kubectl or the Ray Serve CLI. KubeRay supports zero-downtime upgrades and high availability for Ray services. Use this CRD to deploy and manage Ray Serve applications on Kubernetes.
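To make the RayJob description above more concrete, the following is a minimal sketch of a RayJob manifest, wrapped in a shell function so it can be piped to kubectl. The function name, the resource name rayjob-example, the Ray version, and the container images are assumptions for illustration, not values taken from the Ray project's sample file. The apiVersion follows the v1alpha1 naming of the definition files used later in this article; newer KubeRay releases may expect a different version.

```shell
# Hypothetical helper: print a minimal RayJob manifest to stdout.
# All names, versions, and images below are illustrative assumptions.
rayjob_manifest() {
    cat <<'EOF'
apiVersion: ray.io/v1alpha1
kind: RayJob
metadata:
  name: rayjob-example
spec:
  # Entrypoint: the command the job runs on the temporary cluster.
  entrypoint: 'python -c "import ray; ray.init(); print(ray.cluster_resources())"'
  # Shutdown behavior: delete the temporary RayCluster after the job ends.
  shutdownAfterJobFinishes: true
  ttlSecondsAfterFinished: 60
  rayClusterSpec:
    rayVersion: "2.7.0"
    headGroupSpec:
      rayStartParams:
        dashboard-host: "0.0.0.0"
      template:
        spec:
          containers:
            - name: ray-head
              image: rayproject/ray:2.7.0
    workerGroupSpecs:
      - groupName: small-group
        replicas: 1
        minReplicas: 1
        maxReplicas: 1
        rayStartParams: {}
        template:
          spec:
            containers:
              - name: ray-worker
                image: rayproject/ray:2.7.0
EOF
}
```

Once the KubeRay operator is installed, a manifest like this could be applied with rayjob_manifest | kubectl apply -f -.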

This article explains how to deploy a RayCluster, a RayJob, and a RayService on Kubernetes with the KubeRay Operator.


Before you start, install Helm:

$ sudo snap install helm --classic

Deploy the Ray Workloads

Ray workloads can run on any machine, cluster, or cloud. In this article, you use KubeRay to deploy the three workloads on a VKE cluster with the Helm charts and resource definition files published by the KubeRay project.

First, install the KubeRay operator.

  1. Add the KubeRay Helm repository.

    $ helm repo add kuberay
  2. Update the Helm repositories.

    $ helm repo update
  3. Install the KubeRay operator.

    $ helm install kuberay-operator kuberay/kuberay-operator --version 1.0.0

    It could take up to 2 minutes for the installation to complete.

  4. Check if the operator is running.

    $ kubectl get pods


    NAME                                READY   STATUS    RESTARTS   AGE
    kuberay-operator-678c7d7997-v4ppc   1/1     Running   0          78s
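Instead of reading the pod list by eye, the check can be scripted. The sketch below assumes that the Helm release created a Deployment named kuberay-operator in the current namespace and that the chart registered the CRDs rayclusters.ray.io, rayjobs.ray.io, and rayservices.ray.io. The function name check_kuberay_operator is hypothetical.

```shell
# Hypothetical helper: confirm that the operator Deployment has rolled out
# and that the three KubeRay CRDs are registered with the API server.
check_kuberay_operator() {
    # Assumption: the Deployment is named "kuberay-operator".
    kubectl rollout status deployment/kuberay-operator --timeout=120s || return 1

    for crd in rayclusters.ray.io rayjobs.ray.io rayservices.ray.io; do
        if kubectl get crd "$crd" >/dev/null 2>&1; then
            echo "found: $crd"
        else
            echo "missing: $crd" >&2
            return 1
        fi
    done
}
```

Calling check_kuberay_operator returns a non-zero status if the rollout does not finish within the timeout or if any of the three CRDs is missing.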

Deploy a RayCluster

Now that the KubeRay operator is running, you can deploy a RayCluster in the default namespace. Install the RayCluster from its Helm chart, then confirm that the cluster and its pods are running, as described in the steps below.

  1. Install the RayCluster from the Helm chart repository.

    $ helm install raycluster kuberay/ray-cluster --version 1.0.0

    The installation may take up to 10 minutes to complete.

  2. Confirm that the RayCluster is running.

    $ kubectl get rayclusters


    NAME                 DESIRED WORKERS   AVAILABLE WORKERS   STATUS   AGE
    raycluster-kuberay   1                 1                   ready    10m33s

    The KubeRay operator starts the RayCluster and creates head and worker pods.

  3. View the pods of the RayCluster named “raycluster-kuberay” to confirm that they are running.

    $ kubectl get pods
    NAME                                          READY   STATUS    RESTARTS   AGE
    raycluster-kuberay-head-6sldq                 1/1     Running   0          13m
    raycluster-kuberay-worker-workergroup-jz2k7   1/1     Running   0          13m
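Because the installation can take several minutes, it may be convenient to block until the pods are ready rather than re-running kubectl get pods. The sketch below assumes that KubeRay labels each pod with ray.io/cluster set to the RayCluster name. The function name wait_for_raycluster and its default values are hypothetical.

```shell
# Hypothetical helper: block until every pod of a RayCluster is Ready.
# Assumption: pods carry the label ray.io/cluster=<cluster name>.
wait_for_raycluster() {
    cluster="${1:-raycluster-kuberay}"
    timeout="${2:-600s}"
    kubectl wait pod --for=condition=Ready \
        --selector="ray.io/cluster=${cluster}" --timeout="${timeout}"
}
```

Note that kubectl wait reports an error when no pods match the selector yet, so run it only after the RayCluster resource exists and its pods have been created.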

Deploy a RayJob

Create a RayJob from the Ray project’s RayJob resource definition file. Applying the file creates the RayJob together with a RayCluster that has the specified configuration. The steps below use jq to parse the JSON output of the RayJob resource.

  1. Download the RayJob resource definition file, ray_v1alpha1_rayjob.yaml.

    $ curl -LO
  2. Start a RayJob by applying the downloaded file.

    $ kubectl apply -f ray_v1alpha1_rayjob.yaml
  3. View the available RayJob resources.

    $ kubectl get rayjob


    NAME            AGE
    rayjob-sample   32s
  4. Wait at least 3 minutes for the RayJob to start, then view the available RayCluster resources.

    $ kubectl get raycluster


    NAME                                 DESIRED WORKERS   AVAILABLE WORKERS   STATUS   AGE
    rayjob-sample-raycluster-4fmtr       1                 1                   ready    2m27s
  5. View the available pods.

    $ kubectl get pods


    NAME                                                      READY   STATUS      RESTARTS   AGE
    kuberay-operator-678c7d7997-l6dhq                         1/1     Running     0          91m
    rayjob-sample-4vtkg                                       0/1     Completed   0          2m49s
    rayjob-sample-raycluster-4fmtr-head-q5scw                 1/1     Running     0          3m46s
    rayjob-sample-raycluster-4fmtr-worker-small-group-xqhrt   1/1     Running     0          3m46s
  6. Install jq.

    $ sudo apt install jq

    You need jq to parse the JSON output of kubectl get rayjob.

  7. Verify that the job has finished.

    $ kubectl get rayjob rayjob-sample -o json | jq '.status.jobStatus'


  8. View RayJob output.

    $ kubectl logs -l=job-name=rayjob-sample


    2024-01-23 06:50:44,384 INFO -- Job submission server address: http://rayjob-sample-raycluster-9c546-head-svc.default.svc.cluster.local:8265
    2024-01-23 06:50:44,385 SUCC -- ------------------------------------------------
    2024-01-23 06:50:44,386 SUCC -- Job 'rayjob-sample-4fmtr' submitted successfully
    2024-01-23 06:50:44,387 SUCC -- ------------------------------------------------
    2024-01-23 06:50:44,388 INFO -- Next steps
    2024-01-23 06:50:44,389 INFO -- Query the logs of the job:
    2024-01-23 06:50:44,390 INFO -- ray job logs rayjob-sample-4fmtr
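The status check in the steps above can also be turned into a polling loop, so that a script waits until the job finishes. The sketch below reads the same .status.jobStatus field with kubectl's built-in jsonpath output, which avoids the jq dependency. It assumes the field eventually reports SUCCEEDED, FAILED, or STOPPED. The function name wait_for_rayjob, its defaults, and its exit codes are hypothetical.

```shell
# Hypothetical helper: poll a RayJob until it reaches a terminal status.
# Returns 0 on SUCCEEDED, 1 on FAILED or STOPPED, 2 on timeout.
wait_for_rayjob() {
    job="${1:-rayjob-sample}"
    attempts="${2:-60}"    # maximum number of polls
    interval="${3:-5}"     # seconds to sleep between polls
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # Same field that the jq filter '.status.jobStatus' reads.
        status=$(kubectl get rayjob "$job" -o jsonpath='{.status.jobStatus}')
        case "$status" in
            SUCCEEDED) echo "$job: $status"; return 0 ;;
            FAILED|STOPPED) echo "$job: $status" >&2; return 1 ;;
        esac
        i=$((i + 1))
        sleep "$interval"
    done
    echo "$job: timed out (last status: ${status:-unknown})" >&2
    return 2
}
```

With the defaults, wait_for_rayjob rayjob-sample polls every 5 seconds for up to 5 minutes.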

Deploy a RayService

Deploy a RayService with the Ray project’s RayService definition file. Applying this file to your VKE cluster creates the RayService together with a RayCluster resource, which the KubeRay operator manages on behalf of the RayService.

  1. Download the RayService resource definition file, ray_v1alpha1_rayservice.yaml.

    $ curl -LO
  2. Start a RayService by applying the downloaded file.

    $ kubectl apply -f ray_v1alpha1_rayservice.yaml
  3. View the available RayService resources.

    $ kubectl get rayservice


    NAME                AGE
    rayservice-sample   42s
  4. Wait at least 3 minutes for the RayService to start, then view the available RayCluster resources.

    $ kubectl get raycluster


    NAME                                 DESIRED WORKERS   AVAILABLE WORKERS   STATUS   AGE
    rayservice-sample-raycluster-zpjwg   1                 1                   ready    2m27s
  5. View the available pods.

    $ kubectl get pods


    NAME                                                          READY   STATUS    RESTARTS   AGE
    rayservice-sample-raycluster-zpjwg-worker-small-group-vfvjb   1/1     Running   0          3m52s
    rayservice-sample-raycluster-zpjwg-head-mscgh                 1/1     Running   0          3m52s
  6. View the available Ray services.

    $ kubectl get services


    NAME                                          TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                                                   AGE
    rayservice-sample-head-svc                    ClusterIP                <none>        10001/TCP,8265/TCP,52365/TCP,6379/TCP,8080/TCP,8000/TCP   4m58s
    rayservice-sample-raycluster-zpjwg-head-svc   ClusterIP                <none>        10001/TCP,8265/TCP,52365/TCP,6379/TCP,8080/TCP,8000/TCP   5m25s
    rayservice-sample-serve-svc                   ClusterIP                <none>        8000/TCP                                                  2m48s
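The services listed above are of type ClusterIP, so they are reachable only from inside the cluster. One way to reach the serve service from your workstation is a port-forward, sketched below. The function name forward_serve is hypothetical, and the defaults are taken from the sample output (service rayservice-sample-serve-svc, port 8000).

```shell
# Hypothetical helper: forward a local port to a Ray Serve service.
# Defaults mirror the sample output: rayservice-sample-serve-svc on 8000.
forward_serve() {
    svc="${1:-rayservice-sample-serve-svc}"
    port="${2:-8000}"
    # Runs in the foreground until interrupted with Ctrl+C.
    kubectl port-forward "service/${svc}" "${port}:${port}"
}
```

While the forward is running, requests sent to localhost on the chosen port from another terminal reach the serve service. The request paths depend on the Serve applications defined in the RayService file.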

Test the RayCluster

Submit a job to the RayCluster to execute it directly on the head pod by following these steps:

  1. Get the name of the head pod.

    $ export HEAD_POD=$(kubectl get pods --selector=ray.io/node-type=head,ray.io/cluster=raycluster-kuberay -o custom-columns=POD:metadata.name --no-headers)
  2. View the name of the head pod.

    $ echo $HEAD_POD


  3. Submit a simple Ray job to print the cluster resources.

    $ kubectl exec -it $HEAD_POD -- python -c "import ray; ray.init(); print(ray.cluster_resources())"


    2024-01-23 10:57:46,041 INFO -- Connecting to existing Ray cluster at address:
    2024-01-23 10:57:46,126 INFO -- Connected to Ray cluster. View the dashboard at 
    {'memory': 3000000000.0, 'object_store_memory': 743061503.0, 'node:': 1.8, 'CPU': 2.0, 'node:': 1.0, 'node:__internal_head__': 1.0}
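After the earlier sections, several RayClusters run in the same namespace, each with its own head pod, so a head pod lookup can return more than one name. The sketch below wraps the lookup in a function that fails unless exactly one head pod matches. It assumes that KubeRay labels pods with ray.io/node-type and ray.io/cluster. The function name head_pod_for is hypothetical.

```shell
# Hypothetical helper: print the head pod of one RayCluster, and fail
# unless exactly one pod matches.
# Assumption: pods carry the labels ray.io/node-type and ray.io/cluster.
head_pod_for() {
    cluster="${1:-raycluster-kuberay}"
    pods=$(kubectl get pods \
        --selector="ray.io/node-type=head,ray.io/cluster=${cluster}" \
        -o custom-columns=POD:metadata.name --no-headers)
    # Count non-empty lines; grep exits non-zero when the count is 0.
    count=$(printf '%s\n' "$pods" | grep -c . || true)
    if [ "$count" -ne 1 ]; then
        echo "expected 1 head pod for ${cluster}, found ${count}" >&2
        return 1
    fi
    printf '%s\n' "$pods"
}
```

For example, HEAD_POD=$(head_pod_for raycluster-kuberay) sets the variable only when the lookup is unambiguous, and the kubectl exec command from the steps above can then use it unchanged.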


You have deployed a RayCluster, a RayJob, and a RayService on a Kubernetes cluster with the KubeRay operator. Ray workloads support features such as distributed data loading, distributed training, fault tolerance, batch inference, multi-model serving, and dynamic scaling. For more information about Ray, see the official documentation.

