
Deploy DeepSeek-R1 using Ollama-Operator on Kubernetes


In this demo, we will walk through the steps to deploy the quantized DeepSeek-R1 model on your machine using the Ollama Operator for Kubernetes.
The operator makes it easy to run large language models on your cluster. The demo below covers installing the operator, deploying a model, and accessing it.

Prerequisites

  • A running Kubernetes cluster (minikube, kind, or any other distribution)
  • kubectl configured to interact with your cluster
  • Go installed (needed to install the kollama CLI)
  • At least 8GB RAM on your node (more for larger models)

Make sure your environment is ready. You can verify your cluster status as shown below:

[Screenshot: Kubernetes Cluster Status]
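
If you prefer the terminal over a screenshot, a couple of standard kubectl commands confirm the cluster is reachable:

# confirm the control plane answers
kubectl cluster-info

# confirm at least one node is Ready
kubectl get nodes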

Step 1: Install the Ollama Operator

First, install the operator on your Kubernetes cluster by applying the installation YAML. Open your terminal and run:

kubectl apply --server-side=true -f https://raw.githubusercontent.com/nekomeowww/ollama-operator/v0.10.1/dist/install.yaml

You should see output confirming the creation of resources as seen below.

[Screenshot: Kubernetes Operator Ready Output]
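
Beyond the apply output, you can list what the operator installed. The namespace below matches the wait command in the next step; the CRD name is an assumption inferred from the ollama.ayaka.io/v1 apiVersion used later in this post:

# the controller manager lives in its own namespace
kubectl get pods -n ollama-operator-system

# the operator registers a Model custom resource definition (name assumed)
kubectl get crd models.ollama.ayaka.io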

Step 2: Wait for the Operator to be Ready

After installation, wait for the operator to be fully ready with the following command:

kubectl wait -n ollama-operator-system --for=jsonpath='{.status.readyReplicas}'=1 deployment/ollama-operator-controller-manager

This command blocks until the operator’s controller manager reports a ready replica. A successful wait might look like:

[Screenshot: Operator Ready Output]
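
kubectl wait gives up after 30 seconds by default, which can be too short while images are still pulling. The standard --timeout flag extends it:

# same wait, but allow up to five minutes
kubectl wait -n ollama-operator-system --for=jsonpath='{.status.readyReplicas}'=1 deployment/ollama-operator-controller-manager --timeout=300s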

Step 3: Install the Kollama CLI

The kollama CLI makes it easier to interact with the operator. Install it using Go:

go install github.com/nekomeowww/ollama-operator/cmd/kollama@latest

[Screenshot: kollama Go download]

Step 3.1: Locate the Binary

By default, Go installs binaries in the $HOME/go/bin directory. Check if the kollama binary is present by running:

ls $HOME/go/bin/kollama

If the binary exists as seen below, proceed to the next step.

[Screenshot: kollama binary path]
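
If the binary is not there, Go may be installing to a custom location. These queries show where go install actually puts binaries (GOBIN, when set, takes precedence over GOPATH/bin):

# empty unless you have overridden the install directory
go env GOBIN

# binaries default to $GOPATH/bin, typically $HOME/go/bin
go env GOPATH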

Step 3.2: Add the Go Binary Directory to Your PATH

You need to update your PATH environment variable to include $HOME/go/bin. You can do this by adding the following line to your shell’s configuration file (e.g., .bashrc or .zshrc):

export PATH=$PATH:$HOME/go/bin

After adding the line, reload your configuration. For example, if you are using bash, run:

source ~/.bashrc
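
You can confirm the directory made it onto your PATH before retrying kollama:

# print PATH entries one per line and look for go/bin
echo "$PATH" | tr ':' '\n' | grep go/bin

# or ask the shell where it resolves kollama from
command -v kollama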

Step 3.3: Verify the Installation

Now verify that the CLI resolves by checking its version:

kollama version

This should display the installed version of kollama. If the command is still not found, check for typos in your configuration file and confirm the installation path is correct.

Additional Notes

  • If you are using a different shell (like zsh), update the corresponding configuration file (e.g., ~/.zshrc).
  • You can also temporarily update your PATH in the current session by running the export command directly in your terminal.

With these steps, the kollama command should be recognized from any new terminal session, allowing you to continue with your deployment tasks.

[Screenshot: Kollama CLI Version]

Step 4: Deploy the DeepSeek-R1 Model

You can deploy a model using the kollama CLI. This walkthrough uses the lightweight phi model as its running example (a DeepSeek-R1 variant follows below). Deploy it and expose it on a NodePort:

kollama deploy phi --expose --node-port 30001

[Screenshot: kollama deploy output]
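
To deploy the DeepSeek-R1 model from this post's title, the same flags should work with the deepseek-r1 image from the Ollama library; a different NodePort is used here so it does not clash with the phi deployment above:

# the default deepseek-r1 tag is several GB, so ensure the node has enough RAM and disk
kollama deploy deepseek-r1 --expose --node-port 30002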

Note: If you are using kind (or any environment with a StorageClass that supports only ReadWriteOnce), you might need to customize your Model Custom Resource (CR). Here is a sample YAML file (ollama-model-phi.yaml):

apiVersion: ollama.ayaka.io/v1
kind: Model
metadata:
  name: phi
spec:
  image: phi
  persistentVolume:
    accessMode: ReadWriteOnce

Apply the Model CR to your cluster:

kubectl apply -f ollama-model-phi.yaml

Wait for the model to be ready:

kubectl wait --for=jsonpath='{.status.readyReplicas}'=1 deployment/ollama-model-phi

[Screenshot: Deploy Model Output]
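
The operator also reports status on the Model resource itself, so you can inspect it directly (resource names follow from the CR above; exact output columns may vary by operator version):

# list Model resources and their state
kubectl get models

# inspect events if the model stays unready
kubectl describe model phi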

Step 5: Access and Interact with the Model

Once the model is deployed and ready, forward the ports to access it:

kubectl port-forward svc/ollama-model-phi 8080:80

Now you can interact with the model using the ollama CLI. By default the CLI targets 127.0.0.1:11434, so point it at the forwarded port first:

OLLAMA_HOST=127.0.0.1:8080 ollama run phi

[Screenshot: Access Model via Port Forwarding]
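
The service also speaks the standard Ollama HTTP API, so you can exercise the model without the CLI while the port-forward is running:

# one-shot generation through the forwarded service
curl http://127.0.0.1:8080/api/generate -d '{"model": "phi", "prompt": "Why is the sky blue?", "stream": false}'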

Step 6: Clean Up Resources

After finishing your demo, it’s important to clean up the deployed resources. Follow these steps:

Remove the Deployed Model

Delete the model custom resource:

kubectl delete -f ollama-model-phi.yaml

Wait until the resources are removed. You can verify with the quick checks below.
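
These standard kubectl queries confirm nothing is left behind (resource names as used earlier in the demo):

# the Model resource should be gone
kubectl get models

# its deployment and service should disappear shortly after
kubectl get deployments,services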

Uninstall the Ollama Operator

Remove the operator by deleting the installed YAML:

kubectl delete -f https://raw.githubusercontent.com/nekomeowww/ollama-operator/v0.10.1/dist/install.yaml

This command cleans up all operator-related resources.

[Screenshot: Cleanup Output]

Stop Port Forwarding

If you have an active port-forward session, stop it by pressing Ctrl+C in the terminal window where it is running.

Additional Information and Resources

For further details on the Ollama Operator, check out the GitHub repository and the release notes for v0.10.1 (the version installed above).

The operator supports multiple models on the same cluster, and you can customize deployments using additional options like scaling replicas, setting the imagePullPolicy, and more. A full configuration example is provided below:

apiVersion: ollama.ayaka.io/v1
kind: Model
metadata:
  name: phi
spec:
  replicas: 2                      # run two serving replicas
  image: phi                       # model image from the Ollama registry
  imagePullPolicy: IfNotPresent    # skip the pull when the image is already cached
  storageClassName: local-path     # StorageClass for the model volume
  persistentVolumeClaim: your-pvc  # reuse an existing PVC by name (placeholder value)
  persistentVolume:
    accessMode: ReadWriteOnce
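
Saved to a file (the name below is illustrative), the manifest is applied like any other Kubernetes resource, and the operator reconciles the change:

# apply the customized Model and watch it converge
kubectl apply -f ollama-model-phi-full.yaml
kubectl get models -w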

Wrapping Up

With these steps, you have successfully deployed a model on your Kubernetes cluster using the Ollama Operator; this walkthrough used phi, and the same workflow applies to DeepSeek-R1. Enjoy exploring and scaling your large language models in a clustered environment!

Have Queries? Join https://launchpass.com/collabnix

Adesoji Alu
Adesoji brings a proven ability to apply machine learning (ML) and data science techniques to solve real-world problems. He has experience working with a variety of cloud platforms, including AWS, Azure, and Google Cloud Platform. He has strong skills in software engineering, data science, and machine learning. He is passionate about using technology to make a positive impact on the world.