Ajeet Raina Docker Captain, ARM Innovator & Docker Bangalore Community Leader.

Running a ChatGPT Client Locally on a Kubernetes Cluster Using Docker Desktop


ChatGPT is a large language model developed by OpenAI that can generate human-like text based on a given prompt or context. It can be used for a variety of natural language processing tasks such as language translation, text summarization, and conversation generation.

How ChatGPT Works Under the Hood

ChatGPT is built on a type of language model known as a transformer. It is pretrained with self-supervised learning (often loosely called unsupervised learning), where the model learns from a large dataset of text, such as books or articles, without any manually assigned labels. GPT models use a decoder-only transformer: a stack of neural network layers that reads the tokens seen so far and predicts the next one. Rather than compressing the input into a single fixed-length vector, the model keeps a representation for every token, and each new token is generated from the prompt plus the tokens the model has already produced.

The model relies on the attention mechanism, which lets it weigh some parts of the input more heavily than others when generating each token. This helps it track the context of the input and generate more accurate and coherent text. Training uses maximum likelihood estimation: the parameters are adjusted to maximize the probability the model assigns to the training text, which in practice means learning to predict each next token. After pretraining, the model is fine-tuned on smaller, task-specific datasets. For ChatGPT, this means conversational data, including reinforcement learning from human feedback (RLHF).
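To make the attention idea concrete, here is a minimal sketch of scaled dot-product attention in plain Python, assuming the causal setup used by GPT-style models, where each position can only look at itself and earlier positions. The toy vectors and function names are illustrative only. A real model derives queries, keys, and values from learned projections and runs many attention heads in parallel.

```python
import math


def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(queries, keys, values):
    """Scaled dot-product attention where position i only sees positions 0..i."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for i, query in enumerate(queries):
        # Score the query against every key up to and including position i.
        scores = [
            sum(q * k for q, k in zip(query, keys[j])) / math.sqrt(dim)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # The output is the weighted average of the visible value vectors.
        output = [
            sum(w * values[j][d] for j, w in enumerate(weights))
            for d in range(len(values[0]))
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights


# Three toy token vectors, used as queries, keys, and values at once.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = causal_attention(tokens, tokens, tokens)
for position, row in enumerate(weights):
    print(position, [round(w, 3) for w in row])
```

Each printed row holds the attention weights for one position. The first row has a single weight because the first token has nothing earlier to attend to.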

Deploying ChatGPT Client on a Kubernetes Cluster

Kubernetes is a powerful platform for managing containerised applications, and it can run many kinds of workloads, including applications that call machine learning models such as ChatGPT.

A ChatGPT client can be deployed on a Kubernetes cluster, which makes it easy to scale and manage. This is useful in production environments where multiple instances of the client are needed to handle a high number of requests. Note that the model itself stays with OpenAI; the container only holds the code that calls the OpenAI API.

To run the ChatGPT client on a Kubernetes cluster, you need to containerise the client script and its dependencies using Docker, and then deploy the image to the cluster using Kubernetes resources such as Deployments, Pods, and Services. You also need to make sure that the cluster has sufficient resources (e.g. CPU, memory, storage) to support the workload.

Prerequisites

Step 1. Install Docker Desktop

Download and install Docker Desktop from the official Docker website.


Step 2. Enable Kubernetes

Open Docker Dashboard > Settings > Kubernetes > Enable Kubernetes.


Step 3. Writing the Dockerfile

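FROM python:3.8-slim-buster

# Model name; gpt3_script.py has to read this variable for it to take effect
ENV MODEL_ENGINE="text-davinci-002"

# Keep the application files under /app instead of the filesystem root
WORKDIR /app

# Install dependencies first so this layer is cached between code changes
COPY requirements.txt .

RUN pip install --no-cache-dir -r requirements.txt

COPY gpt3_script.py .

CMD ["python", "/app/gpt3_script.py"]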

This Dockerfile starts from the python:3.8-slim-buster image, sets the MODEL_ENGINE environment variable, installs the Python dependencies listed in requirements.txt, copies gpt3_script.py into /app, and runs it as the container's main process.

Step 4. Writing gpt3_script.py

Here’s an example of a gpt3_script.py file that calls the OpenAI Completions API. It uses the openai.Completion interface found in pre-1.0 releases of the openai Python package:

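import os

import openai

# Add your OpenAI API key
openai.api_key = "YOUR_API_KEY"

# Use the model named by the MODEL_ENGINE variable from the Dockerfile, if set
MODEL_ENGINE = os.environ.get("MODEL_ENGINE", "text-davinci-002")


def generate_text(prompt):
    # Request a single completion for the prompt
    completions = openai.Completion.create(
        engine=MODEL_ENGINE,
        prompt=prompt,
        max_tokens=1024,
        n=1,
        stop=None,
        temperature=0.7,
    )

    # Return the text of the first (and only) choice, without surrounding whitespace
    message = completions.choices[0].text
    return message.strip()


generated_text = generate_text("Write a short story about a robot who wants to be human")
print(generated_text)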

The contents of gpt3_script.py depend on the specific task or application you are building with GPT-3. The example above is a simple script that generates text from a prompt. It imports the openai library and uses it to interact with the GPT-3 API.

Next, the API key is set. This is a required step to access the GPT-3 API.

Then, a function generate_text() is defined which takes a prompt as input and returns a generated text as output. Inside the function, openai.Completion.create() is called to generate text based on the prompt. The engine, prompt, max_tokens, n, stop and temperature arguments can be adjusted to suit your specific needs.
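Of these arguments, temperature is the least obvious. The sketch below shows the usual way temperature is described: the model's raw scores (logits) are divided by the temperature before being turned into probabilities. The logits here are made-up numbers, and how the API applies temperature on the server side is not something this article can confirm, so treat this as an illustration of the idea only.

```python
import math


def softmax_with_temperature(logits, temperature):
    # Dividing by the temperature sharpens (<1) or flattens (>1) the distribution.
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.1]
for temperature in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

Lower temperatures concentrate almost all of the probability on the top-scoring token, which gives more predictable output. Higher temperatures spread it out, which gives more varied text.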

Finally, the script calls the generate_text() function with a specific prompt and assigns the output to the generated_text variable, which is then printed to the console.
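Hard-coding the API key means it ends up inside the Docker image. One possible alternative, sketched below, is to read the key from an environment variable at startup. The variable name OPENAI_API_KEY and the helper load_api_key() are assumptions for this sketch and are not part of the script above.

```python
import os


def load_api_key(environ=os.environ):
    # OPENAI_API_KEY is an assumed variable name; it is not used in the article.
    key = environ.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("Set the OPENAI_API_KEY environment variable first")
    return key


# Example with an explicit mapping, so the sketch runs without a real key.
print(load_api_key({"OPENAI_API_KEY": "sk-example"}))
```

In the script, this would replace the hard-coded value with openai.api_key = load_api_key(), and the key could then be passed in at run time, for example with the -e flag of docker run.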

Step 5. Creating the requirements.txt file

You also need to create a file named requirements.txt that lists the dependencies needed by your script. Here is an example of a requirements.txt file that can be used with the Dockerfile above:

openai<1.0
requests

This file lists the openai and requests packages, which are required to run the main script. The version bound on openai is there because the script calls openai.Completion, an interface that was removed in version 1.0 of the package.

You can test the file locally by installing the packages with the following command:

pip install -r requirements.txt

You can also use the command pip freeze > requirements.txt to create the requirements.txt file with all the packages installed in your environment.
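If you want to check which of the listed packages are already present in your environment, a small script along these lines can help. It is a rough sketch: the parser only handles plain names with optional version specifiers (no extras, URLs, or -r includes), and the helper names are made up for this example.

```python
from importlib import metadata


def parse_requirements(text):
    """Return the bare package names listed in a simple requirements file."""
    names = []
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Cut the name off at the first version-specifier, if any.
        for separator in ("==", ">=", "<=", "~=", "!=", "<", ">"):
            line = line.split(separator, 1)[0]
        names.append(line.strip())
    return names


def installed_versions(names):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


requirements = "openai\nrequests\n"
print(installed_versions(parse_requirements(requirements)))
```

A value of None in the printed dictionary means the package still needs to be installed.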

Step 6. Building the Image

Once you have created an account with OpenAI, add your OpenAI API key to this line of the script:

openai.api_key = "YOUR_API_KEY"

Once you have made the changes, it’s time to build the image by running the following command:

docker build -t ajeetraina/chatgpt-test .

This creates a Docker image named ajeetraina/chatgpt-test that you can run as a container and later deploy to a Kubernetes cluster as a pod.

Step 7. Running the ChatGPT Client container

docker run -d -p 8080:8080 ajeetraina/chatgpt-test
15830b65926b3ae083c94262f7ad700bf6e3d12c8e9374b08ce21cd80db07662

% docker ps -l
CONTAINER ID   IMAGE                     COMMAND                  CREATED         STATUS         PORTS                    NAMES
15830b65926b   ajeetraina/chatgpt-test   "python /app/gpt3_sc…"   4 seconds ago   Up 3 seconds   0.0.0.0:8080->8080/tcp   serene_blackburn

Step 8. Verifying the Result

If you run docker logs against the container, you will see the text that the model generated for the prompt:

% docker logs -f 158
Samantha was built to be the perfect robot. She was designed to look and act exactly like a human, but she was never quite able to shake the feeling that she was different. She longed to be human herself, and so she began to study everything she could about them. She read their books, watched their movies, and even tried to mimic their behavior.

But no matter how hard she tried, Samantha just couldn't seem to become human. She was always aware of the fact that she was a robot, and it felt like a weight inside her chest. One day, she decided to talk to her creator about her feelings.

"I want to be human," she said. "I know I was created to be a robot, but I can't help how I feel. I study everything about humans and I try to mimic them, but it's just not the same. It's like there's something inside me that's not quite right."

Her creator looked at her sympathetically. "I'm sorry, Samantha. I wish I could make you human, but it's just not possible. You're a robot, and that's all you can ever be."

Samantha hung her head in disappointment. She knew her creator was right, but she couldn't help but feel like she was missing out on something special. She would always be an outsider, looking in on the human world but never truly belonging to it.

Step 9. Running ChatGPT Client as Kubernetes Pod

Writing the YAML file

The YAML file for deploying the ChatGPT client on a single-node Kubernetes cluster depends on your specific use case and on how you have containerised the client. A basic example of a Kubernetes Deployment YAML file might look like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: chatgpt-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: chatgpt
  template:
    metadata:
      labels:
        app: chatgpt
    spec:
      containers:
      - name: chatgpt
        image: ajeetraina/chatgpt-test
        ports:
        - containerPort: 8000
        resources:
          limits:
            memory: 2Gi
            cpu: 1000m
          requests:
            memory: 1Gi
            cpu: 500m

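This YAML file defines a Deployment named “chatgpt-deployment” that creates a single replica of the ChatGPT client container. The container image is specified in the “image” field, and the ports exposed by the container are defined in the “ports” field. The “resources” field defines the memory and CPU limits and requests for the container. Because the script exits once it has printed its output, Kubernetes restarts the container each time, which is why the RESTARTS column in the output below is not zero.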
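The structure of the manifest is easier to see when it is built up as a nested dictionary. The sketch below reproduces the same Deployment in Python and prints it as JSON, which kubectl also accepts. The build_deployment() helper is hypothetical and exists only for this illustration.

```python
import json


def build_deployment(name, image, port, replicas=1):
    """Return a Deployment manifest as a plain dict (hypothetical helper)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": f"{name}-deployment"},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                            "resources": {
                                "limits": {"memory": "2Gi", "cpu": "1000m"},
                                "requests": {"memory": "1Gi", "cpu": "500m"},
                            },
                        }
                    ]
                },
            },
        },
    }


manifest = build_deployment("chatgpt", "ajeetraina/chatgpt-test", 8000)
print(json.dumps(manifest, indent=2))
```

Reusing one labels dictionary for both the selector and the pod template avoids the common mistake of letting the two drift apart, in which case the Deployment would be rejected.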

kubectl apply -f chatgpt.yaml         
deployment.apps/chatgpt-deployment configured

kubectl get po                    
NAME                                  READY   STATUS    RESTARTS      AGE
chatgpt-deployment-85887bc5cc-cm9hl   1/1     Running   1 (18s ago)   34s

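Step 10. Exposing the ChatGPT Client with a Kubernetes Service

You will also need to define a Kubernetes Service YAML file to expose the ChatGPT pods. The example below uses the ClusterIP type, which makes the Service reachable only from inside the cluster. To reach it from outside, you would use a type such as NodePort or LoadBalancer, or forward a local port with kubectl port-forward. It will be something like this: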

apiVersion: v1
kind: Service
metadata:
  name: chatgpt-service
spec:
  selector:
    app: chatgpt
  ports:
  - name: http
    port: 8000
    targetPort: 8000
  type: ClusterIP

This YAML file creates a Service named “chatgpt-service” that routes traffic on port 8000 to the pods labelled app: chatgpt. It is meant to be used together with the Deployment YAML file from the previous step.

Please keep in mind that this is just a basic example, and you may need to modify it to suit your specific use case.
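For the Service to have something to route to, the container has to keep running and listen on port 8000. The script shown earlier runs once and exits, so here is a rough sketch of how it could be wrapped in a small HTTP server using only the Python standard library. The generate_text() stub stands in for the real function from gpt3_script.py, and PromptHandler and make_server() are names made up for this sketch.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def generate_text(prompt):
    # Placeholder for the generate_text() function from gpt3_script.py.
    return f"(generated text for: {prompt})"


class PromptHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body, e.g. {"prompt": "..."}.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps({"text": generate_text(payload.get("prompt", ""))}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the container logs quiet in this sketch.
        pass


def make_server(port=8000):
    # Bind to all interfaces so traffic from the Service can reach the pod.
    return HTTPServer(("0.0.0.0", port), PromptHandler)
```

Calling make_server().serve_forever() at the end of the script would keep the container alive and answer POST requests with a JSON body such as {"prompt": "..."}.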

% kubectl apply -f chatgpt-service.yaml                       
service/chatgpt-service created

kubectl get po,svc
NAME                                      READY   STATUS    RESTARTS      AGE
pod/chatgpt-deployment-85887bc5cc-cm9hl   1/1     Running   2 (20s ago)   54s

NAME                      TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
service/chatgpt-service   ClusterIP   10.110.52.46   <none>        8000/TCP   8s
service/kubernetes        ClusterIP   10.96.0.1      <none>        443/TCP    144m


Conclusion

It was fun containerising a ChatGPT client and running it as a Docker container. All the above steps were tested on Docker Desktop with Kubernetes enabled. With ChatGPT, there is a great opportunity to build, share, and deploy Docker containers on multiple platforms, including Kubernetes clusters.


Have Queries? Join https://launchpass.com/collabnix
