The Kubernetes Troubleshooting Handbook

The Kubernetes Troubleshooting Handbook provides essential techniques and tools for debugging Kubernetes applications, focusing on pod lifecycle events, command usage, and monitoring resources. It emphasizes the importance of using commands like 'kubectl get' and 'kubectl describe' for analyzing pod statuses, as well as utilizing the Kubernetes Dashboard for managing clusters. Additionally, it covers log retrieval, interactive troubleshooting with 'kubectl exec', and the significance of audit logs for security and compliance.

Uploaded by

teguhteja

7/20/24, 4:06 PM The Kubernetes Troubleshooting Handbook | by Piotr | Jul, 2024 | ITNEXT

The Kubernetes Troubleshooting Handbook

Debugging Tips, Tools, and Techniques

Piotr
Published in ITNEXT · 16 min read · 1 day ago

Introduction

Debugging Kubernetes applications can feel like navigating a labyrinth. With its
distributed nature and myriad of components, identifying and resolving issues in
Kubernetes requires a robust set of tools and techniques.

In this blog we will explore various techniques and tools to help with troubleshooting
and debugging Kubernetes. Whether you’re an experienced Kubernetes user or just
getting started, this guide will provide valuable insights into efficient debugging
practices.

Although in this blog I do my best to compile useful advice based on my experience,
the authoritative source of truth should always be the official Kubernetes
documentation.

Analyzing Pod Lifecycle Events

Understanding the lifecycle of a pod is crucial for debugging and maintaining
applications running in Kubernetes. Each pod goes through several phases from
creation to termination, and analyzing these events can help you identify and resolve
issues.

Pod Lifecycle Phases

At any point in time, a pod in Kubernetes is in one of the following phases: Pending,
Running, Succeeded, Failed, or Unknown.

https://itnext.io/the-kubernetes-troubleshooting-handbook-7596a1fdf2ff 1/24

Pods Lifecycle Events

Using kubectl get and kubectl describe

To analyze the lifecycle events of a pod, you can use the kubectl get and kubectl
describe commands.

The kubectl get command provides a high-level overview of the status of pods:

kubectl get pods

Output:

NAME               READY   STATUS      RESTARTS   AGE
web-server-pod     1/1     Running     0          5m
db-server-pod      0/1     Pending     0          2m
cache-server-pod   0/1     Completed   1          10m

This output shows the current status of each pod, which can help you identify pods
that need further investigation.
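On a busy namespace it can help to filter that listing down to the pods that are worth a closer look. The following is a minimal sketch, not an official kubectl feature: unhealthy_pods is a hypothetical helper name, and it assumes the default column order shown above (NAME READY STATUS RESTARTS AGE).

```shell
# Print "name status" for pods whose STATUS column is neither Running
# nor Completed. Expects `kubectl get pods --no-headers` output on stdin.
# unhealthy_pods is a hypothetical helper, not part of kubectl.
unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Usage against a live cluster (not executed here):
#   kubectl get pods --no-headers | unhealthy_pods
```

Reading from stdin keeps the helper independent of the cluster, so the same filter works on saved output from a support bundle.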


The kubectl describe command provides detailed information about a pod,
including its lifecycle events:

kubectl describe pod <pod-name>

Output snippet:

Name:         web-server-pod
Namespace:    default
Node:         node-1/192.168.1.1
Start Time:   Mon, 01 Jan 2024 10:00:00 GMT
Labels:       app=web-server
Status:       Running
IP:           10.244.0.2
Containers:
  web-container:
    Container ID:   docker://abcdef123456
    Image:          nginx:latest
    State:          Running
      Started:      Mon, 01 Jan 2024 10:01:00 GMT
    Ready:          True
    Restart Count:  0
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  10m   default-scheduler  Successfully assigned default/web-
  Normal  Pulled     9m    kubelet, node-1    Container image "nginx:latest" alr
  Normal  Created    9m    kubelet, node-1    Created container web-container
  Normal  Started    9m    kubelet, node-1    Started container web-container

Analyzing Pod Events

The Events section in the kubectl describe output provides a chronological log of
significant events that have occurred for the pod. These events can help you
understand the lifecycle transitions and identify issues such as:

Scheduling Delays: Delays in scheduling the pod can indicate resource
constraints or issues with the scheduler.

Image Pull Errors: Failures in pulling container images can indicate network
issues or problems with the container registry.

Container Crashes: Repeated container crashes can be diagnosed by examining
the events leading up to the crash.

Kubernetes Events and Audit Logs

Kubernetes generates cluster-wide Event resources (kind: Event), which we can use
for a quick overview of what’s happening on the cluster.

Audit logs, configured through an audit policy (kind: Policy), are on the other hand
useful for ensuring compliance and security on the cluster. They can show login
attempts, pod privilege escalation, and more.

Kubernetes Events

Kubernetes events provide a timeline of significant occurrences within your cluster,
such as pod scheduling, container restarts, and errors. They are useful for
understanding the state transitions and identifying the root causes of issues.

Viewing Events

To view events in your cluster, use the kubectl get events command:

kubectl get events

Output example:

LAST SEEN   TYPE      REASON      OBJECT
12s         Normal    Scheduled   pod/web-server-pod
10s         Normal    Pulling     pod/web-server-pod
8s          Normal    Created     pod/web-server-pod
7s          Normal    Started     pod/web-server-pod
5s          Warning   BackOff     pod/db-server-pod

Filtering Events

You can filter events to focus on specific namespaces, resource types, or time
periods. For example, to view events related to a specific pod:

kubectl get events --field-selector involvedObject.name=web-server-pod
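Field selectors can be combined with commas, which makes it possible to narrow the list to the warnings for a single pod. The sketch below wraps that in a small function; pod_warnings is a hypothetical name chosen for illustration, while the flags themselves (--namespace, --field-selector, --sort-by) are standard kubectl options.

```shell
# Show only Warning events for one pod, oldest first.
# $1 = pod name, $2 = namespace (defaults to "default").
# pod_warnings is a hypothetical helper, not part of kubectl.
pod_warnings() {
  kubectl get events \
    --namespace "${2:-default}" \
    --field-selector "involvedObject.kind=Pod,involvedObject.name=$1,type=Warning" \
    --sort-by=.metadata.creationTimestamp
}

# Usage against a live cluster (not executed here):
#   pod_warnings db-server-pod
```

Sorting by creation timestamp is one reasonable choice here; without --sort-by the events are not guaranteed to come back in chronological order.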

Describing Resources

The kubectl describe command includes events in its output, providing detailed
information about a specific resource along with its event history:

kubectl describe pod web-server-pod


Output snippet:

Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  10m   default-scheduler  Successfully assigned default/web-
  Normal  Pulled     9m    kubelet, node-1    Container image "nginx:latest" alr
  Normal  Created    9m    kubelet, node-1    Created container web-container
  Normal  Started    9m    kubelet, node-1    Started container web-container

Kubernetes Audit Logs

Audit logs provide a detailed record of all API requests made to the Kubernetes API
server, including the user, the action performed, and the outcome. They are essential
for security auditing and compliance.

Enabling Audit Logging

To enable audit logging, configure the API server with the appropriate flags and audit
policy. Here’s an example of an audit policy configuration:

# audit-policy.yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: Metadata
    resources:
      - group: ""
        resources: ["pods"]
  - level: RequestResponse
    users: ["admin"]
    verbs: ["update", "patch"]
    resources:
      - group: ""
        resources: ["configmaps"]

Configuring the API Server

Specify the audit policy file and log file location when starting the API server:

kube-apiserver --audit-policy-file=/etc/kubernetes/audit-policy.yaml --audit-log

Viewing Audit Logs

Audit logs are typically written to a file. You can use standard log analysis tools to
view and filter the logs. Here’s an example of an audit log entry:


{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "auditID": "12345",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/default/pods",
  "verb": "create",
  "user": {
    "username": "admin",
    "groups": ["system:masters"]
  },
  "sourceIPs": ["192.168.1.1"],
  "objectRef": {
    "resource": "pods",
    "namespace": "default",
    "name": "web-server-pod"
  },
  "responseStatus": {
    "metadata": {},
    "code": 201
  },
  "requestReceivedTimestamp": "2024-01-01T12:00:00Z",
  "stageTimestamp": "2024-01-01T12:00:01Z"
}
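With the JSON log format, the API server writes one such entry per line, so jq is a natural fit for filtering. The following is a sketch under a few assumptions: audit_by_user is a hypothetical helper name, jq must be installed, and the log file path in the usage comment is a placeholder for wherever your API server writes its audit log.

```shell
# Print "timestamp user verb resource/namespace/name code" for every
# completed request made by one user. Reads audit log lines on stdin.
# $1 = username to match. audit_by_user is a hypothetical helper.
audit_by_user() {
  jq -r --arg user "$1" '
    select(.stage == "ResponseComplete" and .user.username == $user)
    | [ .requestReceivedTimestamp,
        .user.username,
        .verb,
        "\(.objectRef.resource)/\(.objectRef.namespace)/\(.objectRef.name)",
        (.responseStatus.code | tostring) ]
    | join(" ")'
}

# Usage (the log path is a placeholder, not executed here):
#   audit_by_user admin < /path/to/audit.log
```

Filtering on the ResponseComplete stage avoids counting the same request once per stage.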

Kubernetes Dashboard
The Kubernetes Dashboard is a web-based UI that provides an easy way to manage
and troubleshoot your Kubernetes cluster. It allows you to visualize cluster resources,
deploy applications, and perform various administrative tasks.

Installing the Kubernetes Dashboard

Please refer to the Kubernetes documentation for details on installing and accessing
the dashboard:

https://kubernetes.io/docs/tasks/access-application-cluster/web-ui-dashboard/

Using the Dashboard

The Dashboard provides various features to help manage and troubleshoot your
Kubernetes cluster:

1. Cluster Overview: View the overall status of your cluster, including nodes,
namespaces, and resource usage.

2. Workloads: Monitor and manage workloads, such as Deployments, ReplicaSets,
StatefulSets, and DaemonSets.

3. Services and Ingress: Manage services and ingress resources to control network
traffic.

4. Config and Storage: Manage ConfigMaps, Secrets, PersistentVolumeClaims,
and other storage resources.

5. Logs and Events: View logs and events for troubleshooting and auditing
purposes.

Monitoring Resource Usage


Monitoring resource usage helps you understand how your applications consume
resources and identify opportunities for optimization.

Tools for Monitoring

kubectl top: Provides real-time resource usage metrics.

Prometheus: Collects and stores metrics for detailed analysis.

Grafana: Visualizes metrics and provides dashboards for monitoring.

Using kubectl top

The kubectl top command shows the current CPU and memory usage of pods and
nodes.

kubectl top pods

kubectl top nodes

Example output:

NAME         CPU(cores)   MEMORY(bytes)
my-app-pod   100m         120Mi
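When hunting for a noisy neighbour, sorting the output is usually more useful than the default listing. Here is a small sketch that keeps only the top consumers; heaviest_pods is a hypothetical helper name, and like any kubectl top call it assumes the Metrics Server is running in the cluster.

```shell
# List the N pods using the most CPU or memory across all namespaces.
# $1 = "cpu" or "memory", $2 = number of rows to keep (default 5).
# heaviest_pods is a hypothetical helper, not part of kubectl.
heaviest_pods() {
  kubectl top pods --all-namespaces --sort-by="$1" --no-headers | head -n "${2:-5}"
}

# Usage against a live cluster (not executed here):
#   heaviest_pods memory 10
```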

Using kubectl logs

When it comes to debugging Kubernetes applications, kubectl logs is one of the
most essential tools in our arsenal. This command helps to retrieve logs from a
specific container in a pod, allowing you to diagnose and troubleshoot issues
effectively.

Basic Usage

The simplest way to retrieve logs from a pod is by using the kubectl logs command
followed by the pod name; add -n <namespace> if the pod is not in the default
namespace. Here’s a basic example for a pod running in the default namespace:

kubectl logs <pod-name>

This command fetches the logs from the first container in the specified pod. If your
pod has multiple containers, you need to specify the container name as well:


kubectl logs <pod-name> -c <container-name>

Real-time Logs with -f Flag

To stream logs in real-time, similar to tail -f in Linux, use the -f flag:

kubectl logs -f <pod-name>

This is particularly useful for monitoring logs as your application runs and observing
the output of live processes.

There are projects that enhance log tailing with additional capabilities, for
example stern.

Retrieving Previous Logs

If a pod has restarted, you can view the logs from the previous instance using the
--previous flag:

kubectl logs <pod-name> --previous

This helps in understanding what caused the pod to restart by examining the logs
before the failure.

Filtering Logs with Labels

You can also select pods by label. The following lists the names of the matching
pods using kubectl along with jq , which you can then pass to kubectl logs :

kubectl get pods -l <label-selector> -o json | jq -r '.items[] | .metadata.name'

Replace <label-selector> with your specific labels, such as app=myapp .
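kubectl logs also accepts a label selector directly, which saves the extra step of listing pod names first. The sketch below is one way to wrap it; logs_by_label is a hypothetical helper name, and the 10-minute default window is an arbitrary choice for illustration.

```shell
# Fetch recent logs from every container of every pod matching a label.
# $1 = label selector (e.g. app=myapp), $2 = lookback window (default 10m).
# --prefix tags each line with its pod and container name.
# --tail=-1 lifts the 10-line default that applies when a selector is used.
# logs_by_label is a hypothetical helper, not part of kubectl.
logs_by_label() {
  kubectl logs --selector="$1" --all-containers --prefix --tail=-1 --since="${2:-10m}"
}

# Usage against a live cluster (not executed here):
#   logs_by_label app=myapp 1h | grep "Error"
```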

Combining with Other Tools

You can combine kubectl logs with other Linux commands to enhance your
debugging process. For example, to search for a specific error message in the logs,
you can use grep :


kubectl logs web-server-pod | grep "Error"

For a continuous search in real-time logs:

kubectl logs -f web-server-pod | grep --line-buffered "Error"

Practical Tips

Log Rotation and Retention: Ensure that your application handles log rotation to
prevent the logs from consuming excessive disk space.

Structured Logging: Use structured logging (e.g., JSON format) to make it easier to
parse and analyze logs using tools like jq .

Centralized Logging: Consider setting up a centralized logging system (e.g., the
EFK stack: Elasticsearch, Fluentd, and Kibana) to aggregate and search logs from
all your Kubernetes pods.

Using kubectl exec for Interactive Troubleshooting

kubectl exec allows us to execute commands directly inside a running container.
This is particularly useful for interactive troubleshooting: you can inspect the
container’s environment, run diagnostic commands, and perform real-time fixes.

Basic Usage

The basic syntax for kubectl exec is as follows:

kubectl exec <pod-name> -- <command>

To execute a command in a specific container within a pod, use the -c flag. Note
that this runs the command and returns immediately; it does not leave a session
open in the container.

kubectl exec <pod-name> -c <container-name> -- <command>

Running an Interactive Shell

One of the most common uses of kubectl exec is to open an interactive shell
session within a container. This allows you to run multiple commands interactively.
Here’s how to do it:


kubectl exec -it <pod-name> -- /bin/bash

For containers using sh instead of bash :

kubectl exec -it <pod-name> -- /bin/sh

Example: Inspecting Environment Variables

To check the environment variables inside a container, you can use
the env command:

kubectl exec <pod-name> -- env

If you need to check environment variables in a specific container:

kubectl exec <pod-name> -c <container-name> -- env

Example: Checking Configuration Files

Suppose you need to inspect a configuration file inside the container. You can
use cat or any text editor available inside the container:

kubectl exec <pod-name> -- cat /path/to/config/file

For a specific container:

kubectl exec <pod-name> -c <container-name> -- cat /path/to/config/file

Copying Files to and from Containers

If a file or binary you need is missing inside a container, you can copy files to
and from containers using kubectl cp . For example, to copy a file from your local
machine to a container:


kubectl cp /local/path/to/file <pod-name>:/container/path/to/file

To copy a file from a container to your local machine:

kubectl cp <pod-name>:/container/path/to/file /local/path/to/file

Practical Tips

Use the -i and -t Flags: The -i flag makes the session interactive, and the
-t flag allocates a pseudo-TTY. Together, -it enables a fully interactive session.

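Run as a Specific User: kubectl exec has no option for choosing the user inside the
container; commands run as the user the container was started with (set by the image
or by securityContext.runAsUser in the pod spec). Note that the global --user flag
selects a kubeconfig user, not a user inside the container. To check which user your
commands run as:

kubectl exec <pod-name> -- id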

Security Considerations: Be cautious when running kubectl exec with elevated
privileges. Ensure you have appropriate RBAC (Role-Based Access Control) policies
in place to prevent unauthorized access.

Node-Level Debugging with kubectl debug

Most debugging techniques focus on the application level; however, it’s also
possible to debug a specific Kubernetes node using the kubectl debug node command.

Node-level debugging is crucial for diagnosing issues that affect the Kubernetes
nodes themselves, such as resource exhaustion, misconfigurations, or hardware
failures.

The debugging Pod that kubectl debug creates can access the root filesystem of the
Node, mounted at /host in the Pod.

Create a Debugging Session:

Use the kubectl debug command to start a debugging session on a node. This
command creates a pod running a debug container on the specified node.


kubectl debug node/<node-name> -it --image=busybox

Replace <node-name> with the name of the node you want to debug. The -it flag
opens an interactive terminal, and --image=busybox specifies the image to use for
the debug container.

For more details, refer to the official Kubernetes documentation on node-level
debugging.

Application-Level Debugging with Debug Containers

For more complex issues, consider using a debug container with pre-installed tools.
There are many good Docker images with tooling and scripts for debugging; one
that stands out to me is https://github.com/nicolaka/netshoot. A pod running it can
quickly be created using:

kubectl run tmp-shell --rm -i --tty --image nicolaka/netshoot

Example: Using the debug container as a sidecar

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-netshoot
  labels:
    app: nginx-netshoot
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-netshoot
  template:
    metadata:
      labels:
        app: nginx-netshoot
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
          ports:
            - containerPort: 80
        - name: netshoot
          image: nicolaka/netshoot
          command: ["/bin/bash"]
          args: ["-c", "while true; do ping localhost; sleep 60;done"]

Apply the configuration:


kubectl apply -f debug-pod.yaml

Practical Tips

Set Restart Policies: Ensure that your pod specifications have appropriate restart
policies to handle different failure scenarios.

Automated Monitoring: Set up automated monitoring and alerting for critical issues
such as CrashLoopBackOff using Prometheus and Alertmanager.

Ephemeral Containers for Debugging

Ephemeral containers are temporary and created specifically for debugging purposes.
They are useful for running diagnostic tools and commands without affecting the
running application. This chapter will explore how to create and use ephemeral pods
for interactive troubleshooting in Kubernetes.

Why Use Ephemeral Pods?

Isolation: Debugging in an isolated environment prevents accidental changes to
running applications.

Tool Availability: Allows the use of specialized tools that may not be present in
the application container.

Temporary Nature: These pods can be easily created and destroyed as needed,
without leaving residual impact on the cluster.

Creating Ephemeral Pods

There are several ways to start a temporary debugging environment in Kubernetes. One
common method is the kubectl debug command, which adds an ephemeral container to
a running pod.

Example: Creating an Ephemeral Pod

Using kubectl debug :

kubectl debug mypod -it --image=nicolaka/netshoot

This command adds an ephemeral debug container running the netshoot image to the
pod mypod and opens an interactive shell.
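By default the debug container shares the pod's network namespace but cannot see the processes of the application container. The --target flag of kubectl debug asks for the process namespace of one container to be shared as well, provided the container runtime supports it. A minimal sketch, where debug_container is a hypothetical helper name:

```shell
# Attach a netshoot ephemeral container to a pod and share the process
# namespace of one of its containers, so tools like ps can see the app.
# $1 = pod name, $2 = name of the container to target.
# debug_container is a hypothetical helper, not part of kubectl.
debug_container() {
  kubectl debug "$1" -it --image=nicolaka/netshoot --target="$2"
}

# Usage against a live cluster (not executed here):
#   debug_container mypod web-container
```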

Practical Tips for Using Ephemeral Pods

Tool Availability: Ensure the debug container image includes all necessary tools for
troubleshooting, such as curl , netcat , nslookup , df , top , and others.


Security Considerations: Be mindful of security when creating ephemeral pods.
Ensure they have limited access and are used by authorized personnel only.

Example: Advanced Debugging with Custom Debug Container

Let’s walk through an example of using a custom debug container for advanced
debugging tasks.

Create an Ephemeral Pod with Custom Debug Container:

kubectl debug -it redis5 --image=nicolaka/netshoot

Defaulting debug container name to debugger-v4hfv.

If you don't see a command prompt, try pressing enter.
dP dP dP
88 88 88
88d888b. .d8888b. d8888P .d8888b. 88d888b. .d8888b. .d8888b. d8888P
88' `88 88ooood8 88 Y8ooooo. 88' `88 88' `88 88' `88 88
88 88 88. ... 88 88 88 88 88. .88 88. .88 88
dP dP `88888P' dP `88888P' dP dP `88888P' `88888P' dP

Welcome to Netshoot! (github.com/nicolaka/netshoot)

Version: 0.13

redis5 ~

Run Diagnostic Commands:

Inside the debug container we can run various commands.

# Check DNS resolution

nslookup kubernetes.default.svc.cluster.local

Server: 10.96.0.10
Address: 10.96.0.10#53

Name: kubernetes.default.svc.cluster.local
Address: 10.96.0.1

# Test network connectivity

curl http://my-service:8080/health

By using ephemeral pods, you can effectively d

Handling DNS and Network Issues


Now we will go through two common troubleshooting scenarios: DNS issues and
stateful pod debugging. Let’s see what we have learned in action.

Common Network Issues

DNS Resolution Failures: Issues resolving service names to IP addresses.

Service Unreachable: Services are not accessible within the cluster.

Pod Communication Issues: Pods cannot communicate with each other.

Network Policy Misconfigurations: Incorrect network policies blocking traffic.

Tools and Commands for Troubleshooting

kubectl exec: Run commands in a container to diagnose network issues.

nslookup: Check DNS resolution.

ping: Test connectivity between pods and services.

curl: Verify HTTP connectivity and responses.

traceroute: Trace the path packets take to reach a destination.

Example: Diagnosing a DNS Resolution Issue

Let’s walk through an example of diagnosing a DNS resolution issue for a pod
named my-app-pod trying to reach a service my-db-service .

Check DNS Resolution:

kubectl exec -it my-app-pod -- nslookup my-db-service

Alternatively, we can use a debug pod or an ephemeral container.

Output indicating a problem:

Server:    10.96.0.10
Address:   10.96.0.10#53

** server can't find my-db-service: NXDOMAIN

Check CoreDNS Logs:

Inspect the logs of CoreDNS pods to identify any DNS resolution issues.

kubectl logs -l k8s-app=kube-dns -n kube-system

Look for errors or warnings indicating DNS resolution failures.


Verify Service and Endpoints:

Ensure that the service and endpoints exist and are correctly configured.

kubectl get svc my-db-service

kubectl get endpoints my-db-service

NAME            TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
my-db-service   ClusterIP   10.96.0.11   <none>        5432/TCP   1h

NAME            ENDPOINTS         AGE
my-db-service   10.244.0.5:5432   1h

Restart CoreDNS Pods:

Restart CoreDNS pods to resolve potential transient issues.

kubectl rollout restart deployment coredns -n kube-system

Verify DNS Resolution Again:

After resolving the issue, verify DNS resolution again:

kubectl exec -it my-app-pod -- nslookup my-db-service

Expected output:

Server:    10.96.0.10
Address:   10.96.0.10#53

Name:      my-db-service.default.svc.cluster.local
Address:   10.96.0.11
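The checks in this walkthrough can be chained into one helper that only gathers the extra context when the lookup fails. This is a sketch rather than a complete diagnostic: dns_triage is a hypothetical name, it assumes the pod image ships nslookup, and it deliberately leaves the CoreDNS restart as a manual decision.

```shell
# Try to resolve a service from inside a pod. If that fails, print the
# service, its endpoints and recent CoreDNS logs, then return non-zero.
# $1 = pod to test from, $2 = service name to resolve.
# dns_triage is a hypothetical helper, not part of kubectl.
dns_triage() {
  if kubectl exec "$1" -- nslookup "$2"; then
    return 0
  fi
  echo "lookup of $2 failed; collecting service, endpoints and CoreDNS logs" >&2
  kubectl get svc "$2"
  kubectl get endpoints "$2"
  kubectl logs -l k8s-app=kube-dns -n kube-system --tail=20
  return 1
}

# Usage against a live cluster (not executed here):
#   dns_triage my-app-pod my-db-service
```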

Practical Tips

Use Network Debug Containers: Use network debug containers
like nicolaka/netshoot for comprehensive network troubleshooting.


kubectl run netshoot --rm -it --image nicolaka/netshoot -- /bin/bash

Monitor Network Metrics: Use Prometheus and Grafana to monitor network
metrics and set up alerts for network issues.

Implement Redundancy: Configure redundant DNS servers and failover
mechanisms to enhance network reliability.

Debugging Stateful Applications

Stateful applications in Kubernetes require special considerations for debugging due
to their reliance on persistent storage and consistent state across restarts. This section
will explore techniques for handling and debugging issues specific to stateful
applications.

What are Stateful Applications?

Stateful applications maintain state information across sessions and restarts, often
using persistent storage. Examples include databases, message queues, and other
applications that require data persistence.

Common Issues in Stateful Applications

Persistent Storage Issues: Problems with PVCs or PVs can lead to data loss or
unavailability.

Pod Start-up Failures: Errors during pod initialization due to state
dependencies.

Network Partitioning: Network issues affecting communication between
stateful pods.

Data Consistency Problems: Inconsistent data across replicas or restarts.

Example: Debugging a MySQL StatefulSet

Let’s walk through an example of debugging a MySQL StatefulSet named my-mysql .

Inspect the StatefulSet:

kubectl describe statefulset my-mysql

Output snippet:

Name:       my-mysql
Namespace:  default
Selector:   app=my-mysql
Replicas:   3 desired | 3 total
...
Events:
  Type    Reason            Age   From                    Message
  ----    ------            ----  ----                    -------
  Normal  SuccessfulCreate  1m    statefulset-controller  create Pod my-mysql-0
  Normal  SuccessfulCreate  1m    statefulset-controller  create Pod my-mysql-1
  Normal  SuccessfulCreate  1m    statefulset-controller  create Pod my-mysql-2

Check Persistent Volume Claims:

kubectl get pvc

kubectl describe pvc data-my-mysql-0

Output snippet:

Name: data-my-mysql-0
Namespace: default
Status: Bound
Volume: pvc-1234abcd-56ef-78gh-90ij-klmnopqrstuv
...
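With several replicas, each with its own claim, it is quicker to list only the claims that are not Bound than to describe them one by one. A small sketch, assuming the default column order of kubectl get pvc (NAME first, STATUS second); unbound_pvcs is a hypothetical helper name.

```shell
# Print "name status" for PersistentVolumeClaims that are not Bound.
# Expects `kubectl get pvc --no-headers` output on stdin.
# unbound_pvcs is a hypothetical helper, not part of kubectl.
unbound_pvcs() {
  awk '$2 != "Bound" { print $1, $2 }'
}

# Usage against a live cluster (not executed here):
#   kubectl get pvc --no-headers | unbound_pvcs
```

A claim stuck in Pending is then a good candidate for kubectl describe pvc , whose events usually explain why no volume could be provisioned.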

Check Pod Logs:

kubectl logs my-mysql-0

Output snippet:

2024-01-01T00:00:00.000000Z 0 [Note] mysqld (mysqld 8.0.23) starting as process

2024-01-01T00:00:00.000000Z 1 [ERROR] InnoDB: Unable to lock ./ibdata1 error: 11

Execute Commands in Pods:

kubectl exec -it my-mysql-0 -- /bin/sh

Inside the pod:


# Check mounted volumes

df -h
# Verify MySQL data directory
ls -l /var/lib/mysql
# Check MySQL status
mysqladmin -u root -p status

Check Network Connectivity:

kubectl exec -it my-mysql-0 -- ping my-mysql-1.my-mysql.default.svc.cluster.loca

Output snippet:

PING my-mysql-1.my-mysql.default.svc.cluster.local (10.244.0.6): 56 data bytes

64 bytes from 10.244.0.6: icmp_seq=0 ttl=64 time=0.047 ms

Advanced Debugging Techniques

Advanced debugging techniques in Kubernetes involve using specialized tools and
strategies to diagnose and resolve complex issues. This chapter will cover tracing
instrumentation and remote debugging.

Profiling with Jaeger

Jaeger is an open-source, end-to-end distributed tracing tool that helps monitor and
troubleshoot transactions in complex distributed systems. Profiling with Jaeger can
provide insights into the performance of your microservices and help identify latency
issues.

You can install Jaeger in your Kubernetes cluster using the Jaeger Operator or Helm.

helm repo add jaegertracing https://jaegertracing.github.io/helm-charts

helm repo update
helm install jaeger jaegertracing/jaeger

Instrument Your Application:


Ensure your application is instrumented to send tracing data to Jaeger. This typically
involves adding Jaeger client libraries to your application code and configuring them
to report to the Jaeger backend.

Example in a Go application:

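import (
	"io"
	"log"

	"github.com/opentracing/opentracing-go"
	"github.com/uber/jaeger-client-go/config"
)

// initJaeger builds a tracer that samples every request, reports spans to
// the Jaeger agent, and registers itself as the global tracer.
// The caller should defer closer.Close() to flush buffered spans on exit.
func initJaeger(service string) (opentracing.Tracer, io.Closer) {
	cfg := config.Configuration{
		ServiceName: service,
		Sampler: &config.SamplerConfig{
			Type:  "const", // constant sampling decision
			Param: 1,       // 1 = sample every trace
		},
		Reporter: &config.ReporterConfig{
			LogSpans:           true,
			LocalAgentHostPort: "jaeger-agent.default.svc.cluster.local:6831",
		},
	}
	tracer, closer, err := cfg.NewTracer()
	if err != nil {
		log.Fatalf("cannot initialize Jaeger tracer: %v", err)
	}
	opentracing.SetGlobalTracer(tracer)
	return tracer, closer
}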

Access the Jaeger UI to view and analyze traces.

kubectl port-forward svc/jaeger-query 16686:16686

Open http://localhost:16686 in your browser.

Remote Debugging with mirrord

Mirrord is an open-source tool that enables remote debugging of Kubernetes services
by running local processes in the context of your Kubernetes cluster and remote
infrastructure.

Setting Up mirrord

curl -fsSL https://raw.githubusercontent.com/metalbear-co/mirrord/main/scripts/i

Connect to Your Cluster:

Start a mirrord session to connect your local environment to your Kubernetes cluster.

mirrord connect

Swap Deployment:

Use mirrord to swap a deployment in your cluster with your local service.

mirrord exec --target-namespace devops-team --target deployment/foo-app-deployme

This command redirects traffic, environment variables, and file operations from your
Kubernetes cluster to your local machine, allowing you to debug the service as if it
were running locally.

Once the mirrord session is set up, you can use your favorite debugging tools and
IDEs to debug the service running on your local machine.

Set Breakpoints: Use your IDE to set breakpoints and step through the code.

Inspect Variables: Inspect variables and application state to identify issues.

Make Changes: Make code changes and immediately see the effects without
redeploying to the cluster.

For a detailed example and more information on using mirrord for debugging,
read this blog post.

Additional Tools
In addition to the core Kubernetes commands and open-source tools, there are
several other tools available that can enhance your troubleshooting capabilities
across various categories. Here are a few noteworthy tools:


Closing Thoughts
Debugging Kubernetes applications can be a complex and challenging task, but with
the right tools and techniques, it becomes much more manageable.

Remember, effective debugging is not just about resolving issues as they arise but
also about proactive monitoring, efficient resource management, and a deep
understanding of your application’s architecture and dependencies.

By implementing the strategies and best practices outlined in this guide, you can
build a robust debugging framework that empowers you to quickly identify,
diagnose, and resolve issues, ensuring the smooth operation of your Kubernetes
deployments.

Thanks for taking the time to read this post. I hope you found it interesting and
informative.

🔗 Connect with me on LinkedIn

🌐 Visit my Website

📺 Subscribe to my YouTube Channel

7 pages
k8s Scenario Based Questions With The Expected Answers-1
No ratings yet
k8s Scenario Based Questions With The Expected Answers-1
11 pages
Kubernetes Notes
No ratings yet
Kubernetes Notes
41 pages
50 Kubernetes Errors & Solutions
No ratings yet
50 Kubernetes Errors & Solutions
15 pages
50 Common Kubernetes Errors & Fixes
No ratings yet
50 Common Kubernetes Errors & Fixes
10 pages
55+ K8s Issues and Remediations You Should Be Aware of
No ratings yet
55+ K8s Issues and Remediations You Should Be Aware of
21 pages
Kubernetes Commands Cheat Sheet
No ratings yet
Kubernetes Commands Cheat Sheet
1 page
Kubernetes Notessss
No ratings yet
Kubernetes Notessss
15 pages
The Kubernetes Guidebook - Mastering Cloud-Native Orchestration From Fundamentals To Production
No ratings yet
The Kubernetes Guidebook - Mastering Cloud-Native Orchestration From Fundamentals To Production
176 pages
Kubernetes Minikube Tutorial Guide
No ratings yet
Kubernetes Minikube Tutorial Guide
175 pages
Kubernetes Cheat Sheet Overview
No ratings yet
Kubernetes Cheat Sheet Overview
20 pages
Kubernetes Outages
No ratings yet
Kubernetes Outages
44 pages
Kubernetes Command Cheat Sheet
No ratings yet
Kubernetes Command Cheat Sheet
12 pages
Kubernetes Cheat Sheet r1v1
100% (1)
Kubernetes Cheat Sheet r1v1
10 pages
Kubernetes
No ratings yet
Kubernetes
66 pages
52
No ratings yet
52
5 pages
Kubernetes Beginner S Guide 1735499097
No ratings yet
Kubernetes Beginner S Guide 1735499097
8 pages
SRE K8s Scenario Based Questions and Answers
No ratings yet
SRE K8s Scenario Based Questions and Answers
6 pages
Kubernetes Troubleshooting Guide
No ratings yet
Kubernetes Troubleshooting Guide
7 pages
Workload Management - Kubernetes
No ratings yet
Workload Management - Kubernetes
73 pages
Kubernetes
No ratings yet
Kubernetes
7 pages
Kubernetes For World PDF
No ratings yet
Kubernetes For World PDF
9 pages
Tutorials - Kubernetes
100% (2)
Tutorials - Kubernetes
155 pages
Kubernetes Error Troubleshooting Guide
No ratings yet
Kubernetes Error Troubleshooting Guide
3 pages
Kubernetes Commands for Interviews
No ratings yet
Kubernetes Commands for Interviews
5 pages
Kubernetes-Personal Notes-Naresh Kumar Chityala
No ratings yet
Kubernetes-Personal Notes-Naresh Kumar Chityala
9 pages
Kubernetes Interview Command Guide
No ratings yet
Kubernetes Interview Command Guide
3 pages
Kubernetes Deployment User Guide
No ratings yet
Kubernetes Deployment User Guide
20 pages
Kubernetes Troubleshooting - Dec24
No ratings yet
Kubernetes Troubleshooting - Dec24
20 pages
Start: Pods Are Running Correctly
No ratings yet
Start: Pods Are Running Correctly
1 page
Kubernetes - Sunil
No ratings yet
Kubernetes - Sunil
6 pages
Kubernetes Command Guide
No ratings yet
Kubernetes Command Guide
42 pages
Kubernetes Essentials Guide
100% (1)
Kubernetes Essentials Guide
239 pages
Docker vs Kubernetes: Key Differences
No ratings yet
Docker vs Kubernetes: Key Differences
34 pages
Kubernetes Setup Guide on Windows
No ratings yet
Kubernetes Setup Guide on Windows
10 pages
Tutorial Membuat RAG AI ChatBot API Dengan Python FastAPI Dan Open Source LLMs
No ratings yet
Tutorial Membuat RAG AI ChatBot API Dengan Python FastAPI Dan Open Source LLMs
41 pages
Understanding Large Language Models
No ratings yet
Understanding Large Language Models
1 page
Demystifying The Secrets of Docker by Empowering Containerization With Ease.X
No ratings yet
Demystifying The Secrets of Docker by Empowering Containerization With Ease.X
191 pages
Csi Data Pillar ZT
No ratings yet
Csi Data Pillar ZT
15 pages
Avoiding Critical Mistakes in Requirement Gathering Practical Steps To Ensure Accurate, Effective, and Outcome Aligned Requirements.1718500158522
No ratings yet
Avoiding Critical Mistakes in Requirement Gathering Practical Steps To Ensure Accurate, Effective, and Outcome Aligned Requirements.1718500158522
6 pages
How To Create A Generative AI App Using Gemini and Flutter
No ratings yet
How To Create A Generative AI App Using Gemini and Flutter
28 pages
Dividing Fractions: Student Name: - Score
No ratings yet
Dividing Fractions: Student Name: - Score
2 pages
Basic IP2 Win Tutorial
No ratings yet
Basic IP2 Win Tutorial
32 pages
Anwana
No ratings yet
Anwana
16 pages
Emb Vlsi
No ratings yet
Emb Vlsi
4 pages
Explain ANSYS Ncode
100% (2)
Explain ANSYS Ncode
34 pages
Model 435 437 Datasheet Sensor de Presion
No ratings yet
Model 435 437 Datasheet Sensor de Presion
7 pages
Qualitative Inorganic Analysis - New
No ratings yet
Qualitative Inorganic Analysis - New
10 pages
Math 5 DLP 9 - Finding The Greatest Common Factor of Two or More Numbers PDF
No ratings yet
Math 5 DLP 9 - Finding The Greatest Common Factor of Two or More Numbers PDF
8 pages
Catalogo Enxerto Pediatrico Gore
No ratings yet
Catalogo Enxerto Pediatrico Gore
12 pages
Groq LPU: Fast AI Inference Analysis
No ratings yet
Groq LPU: Fast AI Inference Analysis
8 pages
A Mathematical Simulation Model of A CH-47B Helicopter
No ratings yet
A Mathematical Simulation Model of A CH-47B Helicopter
136 pages
Diesel Engine Tuning Specifications 75071
No ratings yet
Diesel Engine Tuning Specifications 75071
7 pages
ITwin Technology
No ratings yet
ITwin Technology
20 pages
ETL Testing Goals and Strategies
No ratings yet
ETL Testing Goals and Strategies
3 pages
Understanding Abelian Groups in Mathematics
No ratings yet
Understanding Abelian Groups in Mathematics
13 pages
Line Segment Division Ratios
No ratings yet
Line Segment Division Ratios
12 pages
Technical Terms About Reinforcement
100% (1)
Technical Terms About Reinforcement
13 pages
12th Physics 1mrks
No ratings yet
12th Physics 1mrks
151 pages
Java Loops
No ratings yet
Java Loops
2 pages
Prometheus Part 12 Internals Storage & Security
No ratings yet
Prometheus Part 12 Internals Storage & Security
27 pages
Fog Computing for Smart Farming Solutions
No ratings yet
Fog Computing for Smart Farming Solutions
6 pages
Physics Continuity Equation Guide
No ratings yet
Physics Continuity Equation Guide
5 pages
SR en ISO 22232-1-2020 Examinări Nedistructive. Caracterizarea Şi Verificarea Echipamentului Pentru Examinare Cu Ultrasunete. Defectoscoape PDF
100% (2)
SR en ISO 22232-1-2020 Examinări Nedistructive. Caracterizarea Şi Verificarea Echipamentului Pentru Examinare Cu Ultrasunete. Defectoscoape PDF
52 pages
Mercedes-Benz WDB2110411B324751 AllSystemDTC 20250722141937
No ratings yet
Mercedes-Benz WDB2110411B324751 AllSystemDTC 20250722141937
6 pages
Understanding LSD Load Tables
No ratings yet
Understanding LSD Load Tables
4 pages
A6-Governing Scheme Write Up
No ratings yet
A6-Governing Scheme Write Up
4 pages
Seismic Isolation in Building Design
No ratings yet
Seismic Isolation in Building Design
311 pages
Golang Mysql Tutorial
No ratings yet
Golang Mysql Tutorial
3 pages
Data Communication: A Historical Overview
0% (1)
Data Communication: A Historical Overview
33 pages
SAP BODS Circulum PDF
No ratings yet
SAP BODS Circulum PDF
39 pages