Ajeet Raina Docker Captain, ARM Innovator & Docker Bangalore Community Leader.

How to Deploy a Stateful Application using Kubernetes? – KubeLabs Glossary

5 min read

StatefulSet is a Kubernetes controller that is used to manage and maintain one or more Pods. However, so do other controllers like ReplicaSets and, the more robust, Deployments. So what does Kubernetes use StatefulSets for? To answer this question, we need to discuss stateless versus stateful applications.

A stateless application is one that does not care which network it is using, and it does not need permanent storage. Examples of stateless apps may include web servers (Apache, NginX, or Tomcat).

On the other hand, we have stateful apps. Let’s say you have a Solr database cluster that is managed by several Zookeeper instances. For such an application to function correctly, each Solr instance must be aware of the Zookeeper instances that are controlling it. Similarly, the Zookeeper instances themselves establish connections between each other to elect a master node. Due to such a design, Solr clusters are an example of stateful applications. If you deploy Zookeeper on Kubernetes, you’ll need to ensure that pods can reach each other through a unique identity that does not change (hostnames, IPs…etc.). Other examples of stateful applications include MySQL clusters, Redis, Kafka, MongoDB, and others.

Given this difference, Deployment is more suited to work with stateless applications. As far as a Deployment is concerned, Pods are interchangeable. While a StatefulSet keeps a unique identity for each Pod it manages. It uses the same identity whenever it needs to reschedule those Pods.

Exposing a StatefulSet

A Kubernetes Service acts as an abstraction layer. In a stateless application like an Nginx web server, the client does not (and should not) care which pod receives a response to the request. The connection reaches the Service, and it routes it to any backend pod. This is not the case in stateful apps. In the above diagram, a Solr pod may need to reach the Zookeeper master, not any pod. For this reason, part of the Statefulset definition entails a Headless Service. A Headless Service does not contain a ClusterIP. Instead, it creates several Endpoints that are used to produce DNS records. Each DNS record is bound to a pod. All of this is done internally by Kubernetes, but it’s good to have an idea about how it does it.

Deploying a Stateful Application Using Kubernetes Statefulset

If you look at web_stateful.yaml file, you will find a snippet around how we are deploying a stateful application. For simplicity, are we using Nginx as the pod image. The deployment is made up of 2 Nginx web servers; both of them are connected to a persistent volume. For example, look at web_stateful.yaml file under the current location.

Before we start discussing the details of this definition, notice that the file actually contains two definitions: the storage class that the StatefulSet is using and the StatefulSet itself.

Storage Class

Storage classes are Kubernetes objects that let the users specify which type of storage they need from the cloud provider. Different storage classes represent various service quality, such as disk latency and throughput, and are selected depending on the scenario they are used for and the cloud provider’s support. Persistent Volumes and Persistent Volume Claims use Storage Classes.

Persistent Volumes and Persistent Volume Claims

Persistent volumes act as an abstraction layer to save the user from going into the details of how storage is managed and provisioned by each cloud provider (in this example, we are using Google GCE). By definition, StatefulSets are the most frequent users of Persistent Volumes since they need permanent storage for their pods.

A Persistent Volume Claim is a request to use a Persistent Volume. If we are to use the Pods and Nodes analogy, then consider Persistent Volumes as the “nodes” and Persistent Volume Claims as the “pods” that use the node resources. The resources we are talking about here are storage properties, such as storage size, latency, throughput, etc.

Under this tutorial, we will see example of NFS server.

Deploying NFS Server

NFS is a protocol that allows nodes to read/write data over a network. The protocol works by having a master node running the NFS daemon and stores the data. This master node makes certain directories available over the network.

Clients access the masters shared via drive mounts. From the viewpoint of applications, they are writing to the local disk. Under the covers, the NFS protocol writes it to the master.

In this scenario, and for demonstration and learning purposes, the role of the NFS Server is handled by a customised container. The container makes directories available via NFS and stores the data inside the container. In production, it is recommended to configure a dedicated NFS Server.

Start the NFS using the command:

docker run -d --net=host     --privileged --name nfs-server     katacoda/contained-nfs-server:centos7     /exports/data-0001 /exports/data-0002 

The NFS server exposes two directories, data-0001 and data-0002. In the next steps, this is used to store data.

Deploying Persistent Volume

For Kubernetes to understand the available NFS shares, it requires a PersistentVolume configuration. The PersistentVolume supports different protocols for storing data, such as AWS EBS volumes, GCE storage, OpenStack Cinder, Glusterfs and NFS. The configuration provides an abstraction between storage and API allowing for a consistent experience.

In the case of NFS, one PersistentVolume relates to one NFS directory. When a container has finished with the volume, the data can either be Retained for future use or the volume can be Recycled meaning all the data is deleted. The policy is defined by the persistentVolumeReclaimPolicy option.

Spec File:

apiVersion: v1 kind: PersistentVolume metadata:   name:  spec:   capacity:     storage: 1Gi   accessModes:     - ReadWriteOnce     - ReadWriteMany   persistentVolumeReclaimPolicy: Recycle   nfs:     server:      path:  

The spec defines additional metadata about the persistent volume, including how much space is available and if it has read/write access.

Task

Create two new PersistentVolume definitions to point at the two available NFS shares.

kubectl create -f nfs-0001.yaml 
kubectl create -f nfs-0002.yaml 

View the contents of the files using cat nfs-0001.yaml nfs-0002.yaml

Once created, view all PersistentVolumes in the cluster using kubectl get pv

Deploying Persistent Volume Claim

Once a Persistent Volume is available, applications can claim the volume for their use. The claim is designed to stop applications accidentally writing to the same volume and causing conflicts and data corruption.

The claim specifies the requirements for a volume. This includes read/write access and storage space required. An example is as follows:

kind: PersistentVolumeClaim apiVersion: v1 metadata:   name: claim-mysql spec:   accessModes:     - ReadWriteOnce   resources:     requests:       storage: 3Gi 

Task

Create two claims for two different applications. A MySQL Pod will use one claim, the other used by an HTTP server.

kubectl create -f pvc-mysql.yaml 
kubectl create -f pvc-http.yaml 

View the contents of the files using cat pvc-mysql.yaml pvc-http.yaml

Once created, view all PersistentVolumesClaims in the cluster using

kubectl get pvc. 

The claim will output which Volume the claim is mapped to claim.

Using Volume

When a deployment is defined, it can assign itself to a previous claim. The following snippet defines a volume mount for the directory /var/lib/mysql/data which is mapped to the storage mysql-persistent-storage. The storage called mysql-persistent-storage is mapped to the claim called claim-mysql.

 spec:       volumeMounts:         - name: mysql-persistent-storage           mountPath: /var/lib/mysql/data   volumes:     - name: mysql-persistent-storage       persistentVolumeClaim:         claimName: claim-mysql          

Task

Launch two new Pods with Persistent Volume Claims. Volumes are mapped to the correct directory when the Pods start allowing applications to read/write as if it was a local directory.

kubectl create -f pod-mysql.yaml 
kubectl create -f pod-www.yaml 

Use the command below to view the definition of the Pods.

cat pod-mysql.yaml pod-www.yaml 

You can see the status of the Pods starting using

kubectl get pods 

If a Persistent Volume Claim is not assigned to a Persistent Volume, then the Pod will be in Pending mode until it becomes available. In the next step, we’ll read/write data to the volume.

Read/Write Data

tore all database changes to the NFS Server while the HTTP Server will serve static from the NFS drive. When upgrading, restarting or moving containers to a different machine the data will still be accessible.

To test the HTTP server, write a ‘Hello World’ index.html homepage. In this scenario, we know the HTTP directory will be based on data-0001 as the volume definition hasn’t driven enough space to satisfy the MySQL size requirement.

docker exec -it nfs-server bash -c "echo 'Hello World' > /exports/data-0001/index.html" 

Based on the IP of the Pod, when accessing the Pod, it should return the expected response.

ip=$(kubectl get pod www -o yaml |grep podIP | awk '{split($0,a,":"); print a[2]}'); echo $ip 
curl $ip 

Update Data

When the data on the NFS share changes, then the Pod will read the newly updated data.

docker exec -it nfs-server bash -c "echo 'Hello NFS World' > /exports/data-0001/index.html" 
curl $ip 

Recreate Pod

Because a remote NFS server stores the data, if the Pod or the Host were to go down, then the data will still be available.

Task

Deleting a Pod will cause it to remove claims to any persistent volumes. New Pods can pick up and re-use the NFS share.

kubectl delete pod www 
kubectl create -f pod-www2.yaml 
ip=$(kubectl get pod www2 -o yaml |grep podIP | awk '{split($0,a,":"); print a[2]}'); curl $ip 

The applications now use a remote NFS for their data storage. Depending on requirements, this same approach works with other storage engines such as GlusterFS, AWS EBS, GCE storage or OpenStack Cinder.

Have Queries? Join https://launchpass.com/collabnix

Ajeet Raina Docker Captain, ARM Innovator & Docker Bangalore Community Leader.

What are Kubernetes Pods and Containers? – KubeLabs Glossary

What are Kubernetes Pods? Kubernetes pods are the foundational unit for all higher Kubernetes objects. A pod hosts one or more containers. It can...
Ajeet Raina
9 min read

What is Helm?

Ajeet Raina
11 min read

29 Replies to “How to Deploy a Stateful Application using Kubernetes? –…”

  1. Great post. I used to be checking continuously this weblog and I am inspired! Extremely useful info specially the last section 🙂 I handle such information a lot. I was looking for this particular information for a very long time. Thanks and best of luck.

  2. Woah! I’m really enjoying the template/theme of this site. It’s simple, yet effective. A lot of times it’s difficult to get that “perfect balance” between superb usability and visual appearance. I must say that you’ve done a great job with this. Also, the blog loads super quick for me on Firefox. Superb Blog!

  3. When I originally commented I clicked the “Notify me when new comments are added” checkbox and now each time a comment is added I get four emails with the same comment. Is there any way you can remove me from that service? Cheers!

  4. of course like your web-site however you need to check the spelling on quite a few of your posts. Several of them are rife with spelling issues and I find it very bothersome to tell the truth on the other hand I’ll definitely come back again.

  5. Hello there! This post couldn’t be written any better! Reading through this post reminds me of my old room mate! He always kept chatting about this. I will forward this post to him. Pretty sure he will have a good read. Thank you for sharing!

  6. Thanks for the sensible critique. Me & my neighbor were just preparing to do a little research about this. We got a grab a book from our local library but I think I learned more from this post. I am very glad to see such great information being shared freely out there.

  7. There are some attention-grabbing cut-off dates in this article however I don’t know if I see all of them center to heart. There’s some validity but I will take maintain opinion until I look into it further. Good article , thanks and we want extra! Added to FeedBurner as effectively

  8. Hello my family member! I want to say that this post is awesome, great written and come with approximately all important infos. I’d like to see more posts like this.

  9. Hi, Neat post. There’s a problem along with your website in internet explorer, might test thisK IE still is the marketplace chief and a large component of other people will leave out your wonderful writing due to this problem.

  10. I’d have to examine with you here. Which is not one thing I usually do! I take pleasure in reading a post that may make folks think. Additionally, thanks for permitting me to comment!

  11. of course like your website but you have to take a look at the spelling on several of your posts. Several of them are rife with spelling problems and I to find it very bothersome to inform the reality however I will certainly come back again.

  12. I¦ve been exploring for a bit for any high quality articles or weblog posts on this sort of area . Exploring in Yahoo I at last stumbled upon this site. Reading this information So i¦m happy to convey that I have a very excellent uncanny feeling I found out exactly what I needed. I most certainly will make sure to do not put out of your mind this web site and provides it a glance regularly.

  13. Howdy! This post couldn’t be written any better! Reading this post reminds me of my previous room mate! He always kept talking about this. I will forward this page to him. Pretty sure he will have a good read. Thank you for sharing!

Leave a Reply

Your email address will not be published.