Debug Your Kubernetes Applications with Kubectl-debug Tool

Kubernetes is a popular container orchestration system that allows you to manage, deploy, and scale containerized applications. While Kubernetes provides many benefits, it can be challenging to debug applications running in a Kubernetes cluster. This is where kubectl-debug comes in. Kubectl-debug is a powerful tool for debugging applications running in a Kubernetes cluster. In this blog post, we will take a closer look at kubectl-debug and how it can help you debug your Kubernetes applications effectively.

What is kubectl-debug?

Kubectl-debug is a command-line tool for debugging Kubernetes pods. It launches a separate debug container, loaded with troubleshooting tools, that shares the pid, network, ipc and user namespaces of the container you want to inspect. It is an open-source project available on GitHub; the releases used in this post come from the JamesTGrant/kubectl-debug repository. Kubectl-debug is designed to work with any Kubernetes cluster, and it can be used to debug both stateless and stateful applications.

How does kubectl-debug work?

Kubectl-debug works by starting a debug container on the same node as the pod you want to debug and joining it to the target container's pid, network, ipc and user namespaces. From a shell in the debug container you can run tools such as strace or tcpdump against the application's processes and network interfaces in real time, without restarting the application container.

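The tools you get depend on the debug image. The default image, nicolaka/netshoot, bundles common troubleshooting utilities such as tcpdump, strace, iftop, and drill. If you need something else, such as GDB, you can point kubectl-debug at a different image with the --image option.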

Kubectl-debug also provides several other options, such as the ability to attach to a specific container in a pod, specify a command to run in the debugging container, and specify environment variables to pass to the debugging container.

How to use kubectl-debug?

Using kubectl-debug is straightforward. First, you need to install kubectl-debug on your local machine. Kubectl-debug is available as a binary or can be installed using the Krew plugin manager.

Once kubectl-debug is installed, you can use it to debug any Kubernetes pod by running the following command:

kubectl-debug <pod-name> --image <debugging-tool-image>

In this command, <pod-name> is the name of the pod you want to debug, and <debugging-tool-image> is the name of the debugging tool image you want to use.

For example, if you want to debug a pod named my-app, you can run the following command to start a debug container from the netshoot image:

kubectl-debug my-app --image nicolaka/netshoot:latest

Once the debug container is launched, you get a shell with the image's tools available. For example, if your chosen image includes GDB, you can attach to the application process and set breakpoints, inspect variables, and more.

Kubectl-debug also supports other options, such as attaching to a specific container in a pod, specifying a command to run in the debugging container, and specifying environment variables to pass to the debugging container. You can learn more about these options in the kubectl-debug documentation.

Benefits of using kubectl-debug

Kubectl-debug provides several benefits that make it a valuable tool for debugging Kubernetes applications. Some of these benefits include:

Real-time debugging: Kubectl-debug allows you to debug your application in real-time, without interrupting the application’s normal operation. This makes it easier to identify and fix issues quickly.
Multiple debugging tools: Because the debug image is configurable, you can use tools such as strace, tcpdump, or GDB, whichever best fits your needs.
Easy to use: Kubectl-debug is easy to use and can be installed as a binary or with the Krew plugin manager.
Works with any Kubernetes cluster: Kubectl-debug is designed to work with any Kubernetes cluster, whether it’s running on-premises or in the cloud. This makes it a versatile tool that can be used in any environment.
Saves time and effort: Debugging applications in a Kubernetes cluster can be time-consuming and challenging. Kubectl-debug simplifies the process by providing a straightforward way to launch a debugging container alongside the application, sharing its namespaces.
Can be used for both stateful and stateless applications: Kubectl-debug can be used to debug both stateful and stateless applications running in a Kubernetes cluster. This makes it a versatile tool that can be used for a wide range of applications.

Getting Started

First, you need to download the binary as shown in the following steps:

export RELEASE_VERSION=1.0.0
# linux x86_64
curl -Lo kubectl-debug https://github.com/JamesTGrant/kubectl-debug/releases/download/v${RELEASE_VERSION}/kubectl-debug

# make the binary executable
chmod +x kubectl-debug

# run the binary pointing at whatever cluster kubectl points at
./kubectl-debug --namespace NAMESPACE TARGET_POD_NAME -c TARGET_CONTAINER_NAME
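
The download step above can be wrapped in a small function so that a bad version number fails loudly instead of saving an HTML error page as the binary. This is only a sketch: the function names release_url and download_kubectl_debug are hypothetical, and the URL pattern is simply the one used in the command above.

```shell
# Hypothetical helper: build the release URL used in the step above.
release_url() {
  version="$1"
  printf 'https://github.com/JamesTGrant/kubectl-debug/releases/download/v%s/kubectl-debug\n' "${version}"
}

# Hypothetical helper: download the binary and make it executable.
download_kubectl_debug() {
  version="$1"
  dest="${2:-./kubectl-debug}"
  # -f makes curl exit non-zero on HTTP errors (for example a 404 for an
  # unknown version) rather than writing the error page to ${dest}.
  curl -fLo "${dest}" "$(release_url "${version}")" && chmod +x "${dest}"
}

# Example:
#   download_kubectl_debug 1.0.0
```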

Build from source

Clone the repo:

# clone the repo
git clone https://github.com/JamesTGrant/kubectl-debug

# to use this kubectl-debug utility, you only need to take the resultant kubectl-debug binary 
# file which is created by:
make kubectl-debug-binary

# to 'install' the kubectl-debug binary, make it executable and either call it directly, put 
# it in your PATH, or move it to a location which is already in your PATH:

chmod +x kubectl-debug
mv kubectl-debug /usr/local/bin



#####################
# Extra options
######################

# build 'debug-agent' binary only - you won't need this. This is the binary/executable that 
# the 'debug-agent container' contains. 
# The dockerfile of the debug-agent container refers to this binary.
make debug-agent-binary

# build 'debug-agent' binary, and the 'debug-agent docker image'
# a docker image `jamesgrantmediakind/debug-agent:latest` will be created locally
make debug-agent-docker-image

# make everything; kubectl-debug-binary, debug-agent-binary, and 'debug-agent-docker-image'
make

Usage instructions

# kubectl 1.12.0 or higher

# print the help
kubectl-debug -h

# start the debug container in the same namespace, and cgroup etc as container 'TARGET_CONTAINER_NAME' in
#  pod 'POD_NAME' in namespace 'NAMESPACE'
kubectl-debug --namespace NAMESPACE POD_NAME -c TARGET_CONTAINER_NAME

# if your pod is stuck in the `CrashLoopBackoff` state and cannot be connected to,
# you can fork a new pod and diagnose the problem in the forked pod
kubectl-debug --namespace NAMESPACE POD_NAME -c CONTAINER_NAME --fork

# In 'fork' mode, if you want the copied pod to retain the labels of the original pod, you can use 
# the --fork-pod-retain-labels parameter (comma separated, no spaces). If not set (default), this parameter 
# is empty and so any labels of the original pod are not retained, and the labels of the copied pods are empty.
# Example of fork mode:
kubectl-debug --namespace NAMESPACE POD_NAME -c CONTAINER_NAME --fork --fork-pod-retain-labels=<labelKeyA>,<labelKeyB>,<labelKeyC>

# port-forward mode is enabled by default so that you can reach the debug-agent pod on a node 
# which has no public IP or which you cannot access directly (because of a firewall, for example). 
# If you don't want port-forward mode, you can turn it off with --port-forward=false. 
# This is rarely needed, but the option is there.
kubectl-debug --port-forward=false --namespace NAMESPACE POD_NAME -c CONTAINER_NAME

# you can choose a different debug container image. By default, nicolaka/netshoot:latest will be 
# used but you can specify anything you like
kubectl-debug --namespace NAMESPACE POD_NAME -c CONTAINER_NAME --image nicolaka/netshoot:latest 

# you can set the debug-agent pod's resource limits/requests, for example:
# default is not set
kubectl-debug --namespace NAMESPACE POD_NAME -c CONTAINER_NAME --agent-pod-cpu-requests=250m --agent-pod-cpu-limits=500m --agent-pod-memory-requests=200Mi --agent-pod-memory-limits=500Mi

# to pull the image from a private docker registry, set a registry kubernetes secret
# the default registry-secret-name is kubectl-debug-registry-secret, the default namespace is default
# please set the secret data source as {Username: <username>, Password: <password>}
kubectl-debug --namespace NAMESPACE POD_NAME --image nicolaka/netshoot:latest --registry-secret-name <k8s_secret_name> --registry-secret-namespace <namespace>

# in addition to passing cli arguments, you can use a config file if you would like to 
# use non-default values for various things.
kubectl-debug --configfile /PATH/FILENAME --namespace NAMESPACE POD_NAME -c TARGET_CONTAINER_NAME
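
When several of these flags are used together, it can help to assemble the command line in one place. The following is a minimal sketch and not part of kubectl-debug: the function name build_debug_cmd and the DEBUG_* variable names are hypothetical, and the function only prints the command (using flags shown above) rather than running it.

```shell
# Hypothetical helper: print a kubectl-debug command line built from
# shell variables. Only flags shown in the usage instructions are used.
build_debug_cmd() {
  pod="$1"
  container="$2"
  cmd="kubectl-debug --namespace ${DEBUG_NAMESPACE:-default} ${pod} -c ${container}"
  # Optional: override the debug image (the default is nicolaka/netshoot:latest).
  if [ -n "${DEBUG_IMAGE:-}" ]; then
    cmd="${cmd} --image ${DEBUG_IMAGE}"
  fi
  # Optional: fork mode, with a comma-separated list of label keys to retain.
  if [ "${DEBUG_FORK:-false}" = "true" ]; then
    cmd="${cmd} --fork"
    if [ -n "${DEBUG_RETAIN_LABELS:-}" ]; then
      cmd="${cmd} --fork-pod-retain-labels=${DEBUG_RETAIN_LABELS}"
    fi
  fi
  printf '%s\n' "${cmd}"
}

# Example:
#   DEBUG_NAMESPACE=prod
#   DEBUG_FORK=true
#   build_debug_cmd demo-pod demo-container
```

Printing the command first makes it easy to review what will run before pasting it into a terminal.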

Debugging examples

This guide shows a few typical examples of debugging a target container.

Basic

When you run kubectl-debug, it creates a ‘debug container’ on the same node as the target container, running in the same pid, network, ipc and user namespaces. By default, kubectl-debug uses nicolaka/netshoot as the container image for the ‘debug container’. The netshoot project documentation provides excellent guides and examples for using various tools to troubleshoot your target container.

Here are a few examples to show netshoot working with kubectl-debug:

Connect to a running container ‘target-container’ in pod ‘target-pod’ in the default namespace:

➜  ~ kubectl-debug --namespace default target-pod -c target-container

Agent Pod info: [Name:debug-agent-pod-da46a000-8429-11e9-a40c-8c8590147766, Namespace:default, Image:jamesgrantmediakind/debug-agent:latest, HostPort:10027, ContainerPort:10027]
Waiting for pod debug-agent-pod-da46a000-8429-11e9-a40c-8c8590147766 to run...
pod target-pod pod IP: 10.233.111.78, agentPodIP 172.16.4.160
wait for forward port to debug agent ready...
Forwarding from 127.0.0.1:10027 -> 10027
Forwarding from [::1]:10027 -> 10027
Handling connection for 10027
                             pulling image nicolaka/netshoot:latest...
latest: Pulling from nicolaka/netshoot
Digest: sha256:5b1f5d66c4fa48a931ff54f2f34e5771eff2bc5e615fef441d5858e30e9bb921
Status: Image is up to date for nicolaka/netshoot:latest
starting debug container...
container created, open tty...

 [1] 🐳  → hostname
target-container

Navigating the filesystem of the target container:

The root filesystem of the target container is located at /proc/{pid}/root/, where the pid is typically ‘1’. You can chroot into it to navigate the target container's filesystem, or simply cd /proc/1/root, which works just as well (assuming PID ‘1’ is the correct PID).

root @ /
 [2] 🐳  → chroot /proc/1/root

 root @ /
 [3] 🐳  → cd /proc/1/root
 
root @ /
 [#] 🐳  → ls
 bin            entrypoint.sh  home           lib64          mnt            root           sbin           sys            tmp            var
 dev            etc            lib            media          proc           run            srv            usr
 (you can navigate the target container's filesystem and view/edit files)

root @ /
 [#] 🐳  → ./entrypoint.sh
 (you can attempt to run the target container's entrypoint.sh script and perhaps see what errors are produced)
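
The /proc/{pid}/root convention can be wrapped in a small helper for use inside the debug container. This is a sketch, assuming the debug container shares the target's pid namespace as described; target_path is a hypothetical name, and the PID defaults to 1, which may not be correct for every container.

```shell
# Hypothetical helper: map a path inside the target container to the
# location where it is visible from the debug container.
target_path() {
  path="$1"
  pid="${2:-1}"   # PID of a process in the target container; 1 is typical
  # Strip one leading slash so the result does not contain '//'.
  printf '/proc/%s/root/%s\n' "${pid}" "${path#/}"
}

# Example (inside the debug container):
#   cat "$(target_path /etc/hosts)"
#   ls "$(target_path /var/log 1)"
```

Unlike chroot, this keeps the debug image's own tools on your PATH while you read the target's files.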

Using iftop to inspect network traffic:

root @ /
 [4] 🐳  → iftop -i eth0
interface: eth0
IP address is: 10.233.111.78
MAC address is: 86:c3:ae:9d:46:2b
(CLI graph omitted)

Using drill to diagnose DNS:

root @ /
 [5] 🐳  → drill -V 5 demo-service
;; ->>HEADER<<- opcode: QUERY, rcode: NOERROR, id: 0
;; flags: rd ; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;; demo-service.	IN	A

;; ANSWER SECTION:

;; AUTHORITY SECTION:

;; ADDITIONAL SECTION:

;; Query time: 0 msec
;; WHEN: Sat Jun  1 05:05:39 2019
;; MSG SIZE  rcvd: 0
;; ->>HEADER<<- opcode: QUERY, rcode: NXDOMAIN, id: 62711
;; flags: qr rd ra ; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 0
;; QUESTION SECTION:
;; demo-service.	IN	A

;; ANSWER SECTION:

;; AUTHORITY SECTION:
.	30	IN	SOA	a.root-servers.net. nstld.verisign-grs.com. 2019053101 1800 900 604800 86400

;; ADDITIONAL SECTION:

;; Query time: 58 msec
;; SERVER: 10.233.0.10
;; WHEN: Sat Jun  1 05:05:39 2019
;; MSG SIZE  rcvd: 121

`proc` filesystem and FUSE

It is common to use tools like top and free to inspect system metrics such as CPU usage and memory. By default, these commands display the metrics of the host system, because they read them from the proc filesystem (/proc/*), which is mounted from the host. This can be extremely useful: you can still inspect the pod/container metrics as part of the host metrics.
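
When you want the limit that applies to the container rather than the host totals, the cgroup files are usually a better source than /proc. The sketch below rests on assumptions: it assumes a cgroup filesystem is mounted at /sys/fs/cgroup in the debug container and that it reflects the target container's cgroup, which depends on your runtime setup. It checks the cgroup v2 file (memory.max) first and then the cgroup v1 file (memory/memory.limit_in_bytes). The name cgroup_mem_limit is hypothetical.

```shell
# Hypothetical helper: print the memory limit from the cgroup filesystem.
# The optional argument overrides the cgroup mount point.
cgroup_mem_limit() {
  root="${1:-/sys/fs/cgroup}"
  if [ -r "${root}/memory.max" ]; then
    # cgroup v2: a byte count, or the literal string "max" for no limit
    cat "${root}/memory.max"
  elif [ -r "${root}/memory/memory.limit_in_bytes" ]; then
    # cgroup v1: a byte count
    cat "${root}/memory/memory.limit_in_bytes"
  else
    echo "unknown"
    return 1
  fi
}

# Example (inside the debug container):
#   cgroup_mem_limit
```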

Debug Pod in “CrashLoopBackoff”

Troubleshooting Kubernetes containers in the CrashLoopBackoff state can be tricky. Using kubectl-debug ‘normally’ probably won't help you, as the debug container's processes will be terminated and reaped once the target container (the process with pid 1) exits. To tackle this, kubectl-debug provides the --fork flag, which borrows the idea from the oc debug command: copy the currently crashing pod and (hopefully) the issue will reproduce in the forked Pod, with the added ability to debug via the debug container.

Under the hood, kubectl-debug --fork will copy the entire Pod spec and:

strip all the labels, so that no traffic will be routed from a service to this pod (see Readme.md for instructions on duplicating the labels);
modify the entry-point of the target container in order to hold the pid namespace and prevent the Pod from crashing again.

Here’s an example:

➜  ~ kubectl-debug demo-pod -c demo-container --fork
Agent Pod info: [Name:debug-agent-pod-dea9e7c8-8439-11e9-883a-8c8590147766, Namespace:default, Image:jamesgrantmediakind/debug-agent:latest, HostPort:10027, ContainerPort:10027]
Waiting for pod debug-agent-pod-dea9e7c8-8439-11e9-883a-8c8590147766 to run...
Waiting for pod demo-pod-e23c1b68-8439-11e9-883a-8c8590147766-debug to run...
pod demo-pod PodIP 10.233.111.90, agentPodIP 172.16.4.160
wait for forward port to debug agent ready...
Forwarding from 127.0.0.1:10027 -> 10027
Forwarding from [::1]:10027 -> 10027
Handling connection for 10027
                             pulling image nicolaka/netshoot:latest...
latest: Pulling from nicolaka/netshoot
Digest: sha256:5b1f5d66c4fa48a931ff54f2f34e5771eff2bc5e615fef441d5858e30e9bb921
Status: Image is up to date for nicolaka/netshoot:latest
starting debug container...
container created, open tty...

 [1] 🐳  → ps -ef
PID   USER     TIME  COMMAND
    1 root      0:00 sh -c -- while true; do sleep 30; done;
    6 root      0:00 sleep 30
    7 root      0:00 /bin/bash -l
   15 root      0:00 ps -ef

Debug init container

Just as with an ordinary container, you can debug an init container of a pod. You must specify the init container's name:

➜  ~ kubectl-debug demo-pod -c init-container
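
If you do not remember the init container's name, it can be read from the pod spec with kubectl's JSONPath output. A minimal sketch: init_container_names is a hypothetical wrapper name, and the pod and namespace in the example are placeholders.

```shell
# Hypothetical helper: print the names of a pod's init containers,
# separated by spaces.
init_container_names() {
  pod="$1"
  namespace="${2:-default}"
  kubectl get pod "${pod}" --namespace "${namespace}" \
    -o jsonpath='{.spec.initContainers[*].name}'
}

# Example:
#   init_container_names demo-pod default
```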

How it works?

kubectl-debug consists of 3 components:

the ‘kubectl-debug’ executable serves the kubectl-debug command and interfaces with the kube-api-server
the ‘debug-agent’ pod is a temporary pod that is started in the cluster by kubectl-debug. The ‘debug-agent’ container is responsible for starting and manipulating the ‘debug container’. The ‘debug-agent’ also acts as a websockets relay for the remote tty, joining the output of the ‘debug container’ to the terminal from which the kubectl-debug command was issued
the ‘debug container’ is the container that provides the debugging utilities and the shell in which the human user performs their debugging activity. kubectl-debug doesn’t provide this; it is an ‘off-the-shelf’ container image (nicolaka/netshoot:latest by default) which is invoked and configured by ‘debug-agent’.

The following occurs when the user runs the command: kubectl-debug --namespace <namespace> <target-pod> -c <container-name>

‘kubectl-debug’ gets the target-pod info from kube-api-server and extracts the host information for the target-pod within the namespace
‘kubectl-debug’ sends a ‘debug-agent’ pod specification to kube-api-server with a node-selector matching the host. By default the container image is docker.io/jamesgrantmediakind/debug-agent:latest
kube-api-server requests the creation of the ‘debug-agent’ pod. The ‘debug-agent’ pod is created in the default namespace (it doesn’t have to be in the same namespace as the target pod)
‘kubectl-debug’ sends an HTTP request to the ‘debug-agent’ pod running on the host, which includes a protocol upgrade from HTTP to SPDY
‘debug-agent’ checks whether the target container is actively running; if not, it writes an error to the client
‘debug-agent’ interfaces with containerd (or dockerd if applicable) on the host to request the creation of the ‘debug container’. The debug container is created with tty and stdin opened, and ‘debug-agent’ configures its pid, network, ipc and user namespaces to be those of the target container
‘debug-agent’ pipes the connection into the debug container using attach
The human user performs debugging/troubleshooting on the target container from within the debug container, with access to the target container's process/network/ipc namespaces and root filesystem
When debugging is complete, the user exits the debug container shell, which closes the SPDY connection
‘debug-agent’ closes the SPDY connection, then waits for the debug container to exit and performs the cleanup
‘debug-agent’ pod is deleted
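
The agent pod's creation and deletion in the steps above can be observed from a second terminal. A sketch, assuming agent pods keep the debug-agent-pod- name prefix seen in the example output and are created in the default namespace; agent_pods is a hypothetical helper name.

```shell
# Hypothetical helper: list debug-agent pods and their phase.
# Returns non-zero when no agent pod is present.
agent_pods() {
  namespace="${1:-default}"
  kubectl get pods --namespace "${namespace}" --no-headers \
    -o custom-columns=NAME:.metadata.name,PHASE:.status.phase \
    | grep '^debug-agent-pod-'
}

# Example: run once while a debug session is open, and again after
# exiting it, to confirm the agent pod has been cleaned up.
#   agent_pods default
```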

Kubectl-debug is a powerful tool for debugging Kubernetes applications. It starts a debug container that shares the target container's namespaces and brings its own debugging tools, making it easier to identify and fix issues in real time. Because the debug image is configurable and the tool works with any Kubernetes cluster, it can be used in a wide range of environments.
