After a pod is scheduled on a node, it immediately starts consuming compute resources. Left unchecked, a single pod can starve the other workloads running on the cluster of the resources they need, and can ultimately destabilize or even bring down the whole cluster.
Efficient resource management guarantees that applications obtain the required resources for proper functionality, all while optimizing cluster utilization and minimizing expenses.
In this article, you will learn how to use Kubernetes to manage resources effectively and efficiently. You will learn about the following topics:
- Resource requests and limits: how to specify the amount of CPU and memory that each container needs and the maximum amount that each container can use.
- Resource types: how to choose the appropriate type of resource for your workload, such as CPU, memory, storage, network, or ephemeral storage.
- Resource allocation: how Kubernetes allocates resources to pods and containers based on their requests and limits, and the available capacity of the nodes.
- Resource monitoring: how to monitor the resource usage and performance of your pods and containers using Kubernetes tools and metrics.
By the end of this article, you will have a better understanding of how Kubernetes works and how to use its features to manage resources well.
Resource requests and limits
One of the key aspects of resource management in Kubernetes is to specify the amount of CPU and memory that each container needs and the maximum amount that each container can use. These are called resource requests and resource limits, respectively.
Resource requests and limits serve two main purposes:
- They help Kubernetes schedule pods onto nodes that have enough resources to meet their needs. Pods with resource requests are guaranteed the amount of resources they request, while pods without resource requests are among the first to be evicted if the node runs out of resources.
- They help Kubernetes to ensure resource isolation and prevent resource starvation. Pods with resource limits are restricted to use the amount of resources they specify, while pods without resource limits may consume more resources than they need and affect other pods on the same node.
By setting resource requests and limits, you can improve the scheduling, stability, and quality of service of your pods and containers.
There are two ways to set resource requests and limits for pods and containers: using kubectl or using YAML files.
Using kubectl
Use the `kubectl` command-line tool to set resource requests and limits for existing pods and containers. For example, to set a resource request of 0.5 CPU and 256 MiB of memory, and a resource limit of 1 CPU and 512 MiB of memory, for a container named `web` in a pod named `my-pod`, use the following command:
$ kubectl set resources pod my-pod -c=web --requests='cpu=0.5,memory=256Mi' --limits='cpu=1,memory=512Mi'
Use the `kubectl edit` command to edit the resource requests and limits of a pod or a container interactively. For example, to edit the resource requests and limits of a pod named `my-pod`, use the following command:
$ kubectl edit pod my-pod
This will open a text editor where you can modify the resource requests and limits of the pod or its containers.
Using YAML files
You can also use YAML files to set resource requests and limits for pods and containers when you create them. For example, to create a pod named `my-pod` with two containers named `web` and `db`, each with a resource request of 0.5 CPU and 256 MiB of memory and a resource limit of 1 CPU and 512 MiB of memory, use the following YAML file:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: web
    image: nginx
    resources:
      requests:
        cpu: 0.5
        memory: 256Mi
      limits:
        cpu: 1
        memory: 512Mi
  - name: db
    image: mysql
    resources:
      requests:
        cpu: 0.5
        memory: 256Mi
      limits:
        cpu: 1
        memory: 512Mi
Use the `kubectl create` command to create the pod from the YAML file:
$ kubectl create -f my-pod.yaml
Use the `kubectl apply` command to update the resource requests and limits of an existing pod from a YAML file:
$ kubectl apply -f my-pod.yaml
Setting resource requests and limits is a necessary part of resource management in Kubernetes. It helps you optimize the performance, availability, and cost of your pods and containers, as well as the stability and security of your cluster. Set requests and limits that match your workload type, your desired QoS class, and your cluster capacity, then monitor actual resource usage and adjust them accordingly. Tools such as the Vertical Pod Autoscaler, the Horizontal Pod Autoscaler, the Cluster Autoscaler, and Node Auto Provisioning can automate and simplify these tasks.
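As an illustration of the autoscaling tools just mentioned, here is a minimal, hypothetical HorizontalPodAutoscaler manifest (the names `my-hpa` and `my-deployment` are placeholders, and the 70% CPU target is an arbitrary example) that scales a Deployment between 1 and 5 replicas based on average CPU utilization:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-hpa               # hypothetical name
spec:
  scaleTargetRef:            # the workload to scale; assumed to exist
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment      # hypothetical Deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70   # target average CPU, as a % of requests
```

Note that utilization is computed as a percentage of each pod's CPU request, so the HPA only works for pods that set resource requests.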
Resource types
CPU
CPU, measured in cores or millicores, signifies the processing power available to a container. One core equals one physical CPU core, while a millicore is a thousandth of a core (e.g., 0.5 CPU means half a core, and 500m CPU means 500 millicores).
Containers receive CPU allocation based on their resource requests and limits. Those with higher CPU requests reserve more CPU time and are scheduled to nodes with sufficient CPU capacity. Containers with lower CPU requests might be scheduled to nodes with less spare CPU but risk being throttled when the node's CPU is exhausted; because CPU is a compressible resource, containers are throttled rather than evicted for it. Containers with CPU limits are confined to the specified amount and are throttled if they try to exceed it.
CPU consumption by containers depends on their workload and the node’s available CPU resources. Containers may use more or less CPU than requested, influenced by CPU demand and the shares of other containers. While containers can utilize up to their CPU limits, they cannot surpass the node’s CPU capacity.
To define CPU specifications for pods and containers, use the `cpu` field in the `resources` section of the YAML file. For instance, to create a pod named `my-pod` with a container named `web` that requests 0.5 CPU and is limited to 1 CPU, use the following YAML file:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: web
    image: nginx
    resources:
      requests:
        cpu: 0.5
      limits:
        cpu: 1
Use the `kubectl set resources` command to set CPU requests and limits for existing pods and containers. For example, to set a CPU request of 0.5 and a CPU limit of 1 for a container named `web` in a pod named `my-pod`, use the following command:
$ kubectl set resources pod my-pod -c=web --requests='cpu=0.5' --limits='cpu=1'
Memory
Memory, measured in bytes or power-of-two units like KiB, MiB, GiB, etc., represents the RAM available to a container (e.g., 256 MiB equals 256 mebibytes, and 1 GiB equals 1 gibibyte).
Containers receive memory allocation based on their resource requests and limits. Those with higher memory requests gain priority and are scheduled to nodes with sufficient memory capacity. Containers with lower memory requests may be scheduled to nodes with less memory capacity but face termination or eviction if the node depletes memory resources. Containers adhering to memory limits are confined to the specified amount and may be terminated if they surpass these limits.
Memory consumption by containers is influenced by their workload and the node’s available memory resources. Containers may use more or less memory than requested, depending on the memory demand and pressure from other containers on the node. Containers can utilize memory up to their limits but cannot exceed the node’s memory capacity.
To designate memory specifications for pods and containers, use the `memory` field in the `resources` section of the YAML file. For instance, to create a pod named `my-pod` with a container named `web` that requests 256 MiB of memory and is limited to 512 MiB of memory, use the following YAML file:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: web
    image: nginx
    resources:
      requests:
        memory: 256Mi
      limits:
        memory: 512Mi
Use the `kubectl set resources` command to set memory requests and limits for existing pods and containers. For example, to set a memory request of 256 MiB and a memory limit of 512 MiB for a container named `web` in a pod named `my-pod`, you can use the following command:
$ kubectl set resources pod my-pod -c=web --requests='memory=256Mi' --limits='memory=512Mi'
Storage
Storage, measured in bytes or power-of-two units like KiB, MiB, GiB, etc., denotes the available disk space for a container (e.g., 1 GiB equals 1 gibibyte, and 10 GiB equals 10 gibibytes).
Containers get storage through volumes and persistent volume claims. A volume is storage that can be attached to pods and containers; a persistent volume claim (PVC) is a request for storage resources that is bound to a persistent volume (PV). Persistent volumes are provisioned statically by the cluster administrator or dynamically by a storage class. For instance, a persistent volume claim requesting 10 GiB can be bound to a persistent volume that provides at least that amount.
Storage consumption by containers depends on workload and node storage availability. Containers may use more or less storage than requested, influenced by demand and node capacity. Containers can utilize storage up to their limits but cannot exceed the node’s capacity.
To specify storage for pods and containers, the YAML file can include the `volumeMounts` and `volumes` fields. For example, to create a pod named `my-pod` with a container named `web` that mounts a volume named `my-volume` at the `/data` directory, use this YAML file:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: web
    image: nginx
    volumeMounts:
    - name: my-volume
      mountPath: /data
  volumes:
  - name: my-volume
    persistentVolumeClaim:
      claimName: my-pvc
Use the `kubectl create` command to create a persistent volume claim that requests 10 GiB of storage, and a storage class that dynamically provisions a persistent volume providing that storage. For example, to create a persistent volume claim named `my-pvc` and a storage class named `my-sc`, you can use the following commands:
$ kubectl create -f my-pvc.yaml
$ kubectl create -f my-sc.yaml
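The article does not show the contents of these files, so the following is a hedged sketch of what `my-pvc.yaml` and `my-sc.yaml` might contain; the `provisioner` value is environment-specific and must be replaced with the CSI driver available in your cluster:

```yaml
# my-pvc.yaml -- claim 10 GiB from the my-sc storage class
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: my-sc
  resources:
    requests:
      storage: 10Gi
---
# my-sc.yaml -- storage class for dynamic provisioning
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: my-sc
provisioner: ebs.csi.aws.com   # example only; depends on your cloud/CSI driver
volumeBindingMode: WaitForFirstConsumer
```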
Network
Network, measured in bits per second or power-of-ten units like Kbps, Mbps, Gbps, etc., represents the bandwidth available to a container (e.g., 1 Mbps equals 1 megabit per second, and 10 Gbps equals 10 gigabits per second).
Containers receive network access based on network policy and service specifications. A network policy defines rules for how pods and containers may communicate with each other and with external entities, while a service is a logical abstraction over a set of pods that provides a stable network endpoint. For instance, you can create a network policy that allows only specific pods to access a service exposing a web application.
Network consumption by containers relies on workload and node network resource availability. Containers may use more or less network than requested, influenced by demand and node network congestion. Containers can utilize networks up to their limits but cannot surpass the node’s network capacity.
Network behavior is configured through separate NetworkPolicy and Service objects rather than fields in the pod spec. For example, to create a network policy named `my-np` that permits only pods with the label `app=web` to reach the pods behind a service named `my-svc` on port 80, use this YAML file:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: my-np
spec:
  podSelector:
    matchLabels:
      app: web
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: web
    ports:
    - protocol: TCP
      port: 80
Use the `kubectl create` command to create a service that exposes port 80 and selects pods with the label `app=web`. For example, to create a service named `my-svc`, use the following command:
$ kubectl create service clusterip my-svc --tcp=80:80 --selector=app=web
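The same service can also be declared in YAML; this is a sketch that mirrors the `kubectl create service clusterip` command above:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-svc
spec:
  type: ClusterIP
  selector:
    app: web         # routes traffic to pods carrying this label
  ports:
  - protocol: TCP
    port: 80         # port the service exposes
    targetPort: 80   # port the pods listen on
```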
Ephemeral
Ephemeral storage is the amount of temporary disk space that a container can use, measured in bytes or power-of-two units such as KiB, MiB, or GiB. It holds data that does not need to persist across container restarts, such as logs, caches, or temporary files, and is allocated from the local disk of the node where the container runs. Each container can set a request and a limit for ephemeral storage, which the scheduler uses to place the container on a suitable node and to avoid overcommitting the node's resources. Ephemeral storage is configured using the `ephemeral-storage` resource name in the container spec.
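As a sketch, ephemeral storage is requested and limited just like CPU and memory; the pod and container names below reuse the article's examples, while the 1 GiB and 2 GiB amounts are arbitrary:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: web
    image: nginx
    resources:
      requests:
        ephemeral-storage: 1Gi   # scheduler reserves this on the node's local disk
      limits:
        ephemeral-storage: 2Gi   # exceeding this can get the pod evicted
```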
Resource allocation
The scheduler
The scheduler determines pod-to-node assignments, employing algorithms and rules based on pod resource requests, node capacity, and factors like node affinity, pod priority, taints, and tolerations.
Node affinity designates preferred or required nodes for pod placement, for instance requiring a node with a specific label like `zone=us-east-1`. Pod priority establishes relative pod importance; higher-priority pods are scheduled first and can preempt lower-priority ones. Taints mark nodes to repel pods, while tolerations let specific pods be scheduled onto tainted nodes; for example, only pods that tolerate the taint `key=value:NoSchedule` can be placed on a node carrying it.
To specify resource allocation, use the `affinity`, `priorityClassName`, and `tolerations` fields in the YAML file. For instance, to create a pod named `my-pod` with a `web` container requesting 0.5 CPU and 256 MiB of memory, node affinity for nodes labeled `zone=us-east-1`, a pod priority of `high-priority`, and a toleration for nodes with the taint `key=value:NoSchedule`, use this YAML file:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: web
    image: nginx
    resources:
      requests:
        cpu: 0.5
        memory: 256Mi
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: zone
            operator: In
            values:
            - us-east-1
  priorityClassName: high-priority
  tolerations:
  - key: key
    operator: Equal
    value: value
    effect: NoSchedule
You can also use the `kubectl create` command to create the pod from the YAML file:
$ kubectl create -f my-pod.yaml
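The `high-priority` class referenced above must exist before the pod can use it. A minimal, hypothetical PriorityClass definition (the `value` of 1000000 is an arbitrary example) might look like this:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000          # higher values are scheduled (and preempt) first
globalDefault: false    # do not apply to pods that omit priorityClassName
description: "For workloads that should be scheduled ahead of default-priority pods."
```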
The kubelet
The kubelet ensures resource isolation and prevents starvation on the node by using pod resource requests and limits. It employs cgroups to establish a hierarchy, restricting CPU, memory, and other container resources. Utilizing eviction, the kubelet reclaims resources during node resource pressure, guided by thresholds and signals like memory.available, nodefs.available, imagefs.available.
Cluster administrators configure kubelet’s resource allocation settings through command-line flags or a configuration file. For instance, setting a memory eviction threshold of 100 MiB, a nodefs eviction threshold of 10%, and an imagefs eviction threshold of 15% can be achieved with the following command-line flags:
kubelet --eviction-hard=memory.available<100Mi,nodefs.available<10%,imagefs.available<15%
or the following configuration file:
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  memory.available: 100Mi
  nodefs.available: "10%"
  imagefs.available: "15%"
The container runtime
The container runtime executes containers on the node, managing their lifecycle, resources, and interactions with the kubelet and scheduler. It creates, starts, stops, and deletes containers, communicating commands, status, and resource usage.
Using pod resource requests and limits, the container runtime allocates and limits CPU, memory, and other resources for each container. It adjusts resource shares and priorities based on these specifications and employs Quality of Service (QoS) to categorize pods into classes like Guaranteed, Burstable, and BestEffort. QoS guides resource allocation, limitation, and handling of resource contention and throttling on the node.
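To make the QoS classes concrete, here is a hedged sketch: a pod whose containers set limits exactly equal to requests for every resource is classed Guaranteed, requests lower than limits (or only some resources set) yield Burstable, and no requests or limits at all yield BestEffort. The pod name below is hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: qos-demo        # hypothetical name
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:         # requests == limits for every resource
        cpu: 500m       #   -> QoS class: Guaranteed
        memory: 256Mi
      limits:
        cpu: 500m
        memory: 256Mi
```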
Cluster administrators configure the container runtime's resource allocation settings using command-line flags or a configuration file. For instance, a 2-CPU limit corresponds to a CPU period of 100,000 microseconds and a CPU quota of 200,000 microseconds (the container may consume twice the period's worth of CPU time each period):
docker run --cpu-period=100000 --cpu-quota=200000 ...
or the following configuration file:
cpuPeriod: 100000
cpuQuota: 200000
The admission controller
The admission controller validates and adjusts pod and container specs before creation and scheduling, enforcing resource allocation policies like quotas, limits, and default requests.
Quotas and limits set maximum resource usage for pods, containers, or groups, ensuring compliance with CPU, memory, or other resource constraints. Default requests and limits establish preset resource amounts for pods or containers in the absence of user specifications, guaranteeing minimum or maximum CPU, memory, or other resource levels.
Resource allocation policies for pods and containers can be specified using `ResourceQuota` and `LimitRange` objects (the older `PodPreset` object served a similar defaulting role but was removed in Kubernetes 1.20). For instance, to create a resource quota named `my-rq` that restricts the total CPU and memory usage of all pods and containers in the namespace `my-ns`, use the following YAML file:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: my-rq
  namespace: my-ns
spec:
  hard:
    requests.cpu: 4
    requests.memory: 8Gi
    limits.cpu: 8
    limits.memory: 16Gi
Use the `kubectl create` command to create the resource quota from the YAML file:
$ kubectl create -f my-rq.yaml
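A LimitRange complements the quota by filling in defaults for containers that omit requests or limits. This is a sketch with arbitrary values; the name `my-lr` is hypothetical:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: my-lr
  namespace: my-ns
spec:
  limits:
  - type: Container
    defaultRequest:      # applied when a container sets no requests
      cpu: 250m
      memory: 128Mi
    default:             # applied when a container sets no limits
      cpu: 500m
      memory: 256Mi
```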
Resource monitoring
Resource monitoring involves different tools and metrics that Kubernetes provides to monitor resource usage and performance, such as kubectl, dashboard, node, pod, container, and custom metrics. Each of these tools and metrics has a different role and responsibility in resource monitoring.
kubectl
kubectl is the command-line tool for Kubernetes, enabling API server interaction and operations like pod and container management, scaling, and metrics visualization. Commands like `kubectl top`, `kubectl describe`, or `kubectl get` offer access to resource metrics.
`kubectl top` displays current CPU and memory usage for pods and containers (it requires the metrics-server to be installed in the cluster). To view usage in a namespace named `my-ns`, use the command:
$ kubectl top pod --containers --namespace=my-ns
Output:
NAME           CONTAINER   CPU(cores)   MEMORY(bytes)
my-pod         web         0m           10Mi
my-pod         db          1m           20Mi
my-other-pod   app         2m           30Mi
`kubectl describe` displays detailed information and status for your pods and containers, including their resource requests, limits, and QoS class. For example, to display the information and status of a pod named `my-pod` in a namespace named `my-ns`, you can use the following command:
$ kubectl describe pod my-pod --namespace=my-ns
Output:
Name:         my-pod
Namespace:    my-ns
Priority:     0
Node:         node-1/10.0.0.1
Start Time:   Mon, 22 Jan 2024 07:07:26 GMT+00:00
Labels:       app=web
Annotations:  <none>
Status:       Running
IP:           10.0.0.2
IPs:
  IP:  10.0.0.2
Containers:
  web:
    Container ID:   docker://1234567890abcdef
    Image:          nginx
    Image ID:       docker-pullable://nginx@sha256:abcdef1234567890
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Mon, 22 Jan 2024 07:07:28 GMT+00:00
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:     0.5
      memory:  256Mi
    Limits:
      cpu:     1
      memory:  512Mi
    Environment:  <none>
    Mounts:
      /data from my-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-xyz (ro)
  db:
    Container ID:   docker://abcdef1234567890
    Image:          mysql
    Image ID:       docker-pullable://mysql@sha256:1234567890abcdef
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Mon, 22 Jan 2024 07:07:29 GMT+00:00
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:     0.5
      memory:  256Mi
    Limits:
      cpu:     1
      memory:  512Mi
    Environment:  <none>
    Mounts:
      /data from my-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-xyz (ro)
Volumes:
  my-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  my-pvc
    ReadOnly:   false
  default-token-xyz:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-xyz
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:          <none>
`kubectl get` displays basic information and status for your pods, such as name, age, ready state, and restart count. For example, to display the status of all pods in a namespace named `my-ns`, use the following command:
$ kubectl get pod --namespace=my-ns
Output:
NAME           READY   STATUS    RESTARTS   AGE
my-pod         2/2     Running   0          10m
my-other-pod   1/1     Running   0          5m
Dashboard
Dashboard is a web-based interface for interacting with the Kubernetes API server, facilitating operations like pod and container management, scaling, and metrics visualization. You can access Dashboard via a web browser using the dashboard service URL (e.g., https://<master-ip>:<port>/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/). Alternatively, the `kubectl proxy` command creates a proxy server for accessing Dashboard through the Kubernetes API server. For example, use the following command to access the dashboard:
$ kubectl proxy
Output:
Starting to serve on 127.0.0.1:8001
Then, launch a web browser and navigate to http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/.
Dashboard displays the resource metrics for your pods and containers in various ways, such as graphs, charts, and tables.
Node, Pod, Container, and Custom Metrics
Kubernetes collects and exposes metrics, categorized into node, pod, container, and custom types, for resource monitoring. These numerical values reflect resource usage and performance, covering CPU, memory, disk, network, etc., and can be sourced from the node, pod, container, or the application.
Node Metrics
- Source: Gathered by the kubelet.
- Exposed Through: Metrics-server.
- Examples: Node CPU and memory.
- Access and Visualization: Tools like kubectl, dashboard, or Prometheus.
Pod Metrics
- Source: Collected by the kubelet.
- Exposed Through: Metrics-server.
- Examples: Whole-pod CPU and memory usage, aggregated across the pod's containers.
- Access and Visualization: Tools like kubectl, dashboard, or Prometheus.
Container Metrics
- Source: Collected by the kubelet.
- Exposed Through: Metrics-server.
- Examples: Per-container CPU and memory usage.
- Access and Visualization: Tools like kubectl, dashboard, or Prometheus.
Custom Metrics
- Reflecting: Application or service performance.
- Examples: Requests per second and latency.
- Collection and Exposure: By the application or service, leveraging the custom metrics API, extending the Kubernetes API.
- Access and Visualization: Tools like Prometheus or Grafana (Heapster, formerly used for this purpose, is deprecated).
Conclusion
In this article, you learned how to use Kubernetes to manage resources effectively and efficiently. You learned about the following topics:
- Resource requests and limits: how to specify the amount of CPU and memory that each container needs and the maximum amount that each container can use.
- Resource types: how to choose the appropriate type of resource for your workload, such as CPU, memory, storage, network, or ephemeral storage.
- Resource allocation: how Kubernetes allocates resources to pods and containers based on their requests and limits, and the available capacity of the nodes.
- Resource monitoring: how to monitor the resource usage and performance of your pods and containers using Kubernetes tools and metrics.
By applying the concepts and techniques from this article, you can optimize the performance, availability, and cost of your pods and containers, as well as the stability and security of your cluster.
Resources
- Collabnix.com
- Official Kubernetes documentation: The official documentation for Kubernetes.
- Microsoft Learn module: A free online learning module that teaches you how to optimize resource utilization in Kubernetes using Azure Kubernetes Service (AKS).
- How I learned to love Kubernetes with resource and cost optimization – Kubernetes Optimization