Kubernetes Resource Troubleshooting: CPU, Memory & OOMKilled Issues
Primary commands for identifying pods that consume excessive CPU and memory.
To troubleshoot high resource use effectively, you must distinguish between live usage (what a pod is consuming right now) and resource reservations (what a pod has requested from the scheduler). Understanding the kubectl top pod output columns, CPU(cores) and MEMORY(bytes), is equally essential and is covered in detail below.
1. Monitoring Live Usage (kubectl top)
The standard method for checking real-time resource consumption is kubectl top. This command queries the Metrics API (served by the Metrics Server add-on) to retrieve CPU and memory usage data.
Basic usage: To view the resource usage of pods in the default namespace:
```bash
kubectl top pods
```

This displays the current CPU (in cores/millicores) and Memory (in bytes) consumption.
Sorting by resource consumption: To quickly identify the heaviest consumers, you can sort the output. This is essential in namespaces with many pods.
```bash
# Find the pods using the most CPU
kubectl top pods --sort-by=cpu

# Find the pods using the most Memory
kubectl top pods --sort-by=memory
```

Drilling down to containers: If a pod contains multiple containers (e.g., a service mesh sidecar or a log forwarder), the pod-level metric can hide which specific container is the culprit. You can inspect individual containers within a specific pod:
```bash
kubectl top pod <pod-name> --containers
```

Note: This command requires the Metrics Server to be installed and running in your cluster.
Checking Swap Usage: If your nodes and kubelet are configured to allow swap memory, you can view swap usage statistics to see if a pod is utilizing swap space:
```bash
kubectl top pods --show-swap
```

This outputs a SWAP(bytes) column alongside CPU and Memory.
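If you only have captured kubectl top output (for example, from a log or a support bundle) and cannot re-run it with --sort-by, you can sort it locally. A minimal sketch, assuming a three-column pod listing with memory in Mi/Gi; the sample rows are hypothetical:

```shell
# Normalize the memory column to Mi (1Gi = 1024Mi), sort descending, then
# drop the helper column. awk's $3+0 extracts the leading number from "512Mi".
cat <<'EOF' | awk '{ n = $3 + 0; if ($3 ~ /Gi$/) n *= 1024; print n, $0 }' | sort -k1,1 -n -r | cut -d' ' -f2-
nginx-abc12    50m    128Mi
redis-ghi56    974m   512Mi
postgres-jkl78 200m   1Gi
EOF
```

This prints the rows ordered from heaviest (postgres-jkl78, 1Gi) to lightest (nginx-abc12, 128Mi). Normalizing units in awk is more predictable across environments than relying on sort's human-numeric mode with two-letter suffixes.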
2. Auditing Resource Allocations (kubectl describe)
Sometimes "high resource use" isn't about active consumption, but about requests blocking other pods from scheduling. A pod requesting 10GB of RAM blocks that capacity even if it is currently using only 100MB.
To find pods that are reserving high amounts of resources on a specific node:
```bash
kubectl describe node <node-name>
```

Look for the "Non-terminated Pods" section in the output. It lists every pod on that node along with its CPU and Memory Requests and Limits, and the percentage of the node's total capacity each reserves.
Example Output Analysis:
```text
Non-terminated Pods:
  Namespace    Name          CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------    ----          ------------  ----------  ---------------  -------------
  default      high-req-pod  500m (25%)    500m (25%)  128Mi (6%)       128Mi (6%)
```

This allows you to identify pods that are "greedy" with their reservations, potentially causing cluster inefficiency.
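To total the reservations on a node without reading row by row, you can paste the pod rows into a small awk filter. A sketch assuming captured "Non-terminated Pods" rows where column 3 is the CPU request in millicores; the sample rows are hypothetical:

```shell
# Strip the trailing "m" and sum column 3 across all pods on the node.
# (Whole-core requests such as "1" would need converting to 1000m first.)
cat <<'EOF' | awk '{ sub(/m$/, "", $3); total += $3 } END { printf "total CPU requests: %dm\n", total }'
default      high-req-pod   500m (25%)   500m (25%)   128Mi (6%)   128Mi (6%)
kube-system  coredns-abc    100m (5%)    0 (0%)       70Mi (3%)    170Mi (8%)
EOF
# → total CPU requests: 600m
```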
3. Advanced Filtering with JSONPath
If you need to audit all pods across the entire cluster for specific limit configurations (e.g., finding every pod with a memory limit of 1Gi or more), you can use kubectl get with custom formatting to list the configured limits, then filter the result.
List pods with their resource limits:
```bash
kubectl get pods --all-namespaces -o custom-columns="POD:.metadata.name,NAMESPACE:.metadata.namespace,CPU_LIMIT:.spec.containers[*].resources.limits.cpu,MEM_LIMIT:.spec.containers[*].resources.limits.memory"
```

This provides a static view of the configuration (limits) rather than live usage, which is critical for capacity planning audits.
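Because custom-columns only lists values, the filtering happens in a second step. A sketch over hypothetical captured output, keeping the header row plus any pod whose MEM_LIMIT is expressed in Gi (i.e., 1Gi or larger):

```shell
# NR==1 keeps the header; $4 ~ /Gi$/ matches memory limits in gibibytes.
cat <<'EOF' | awk 'NR==1 || $4 ~ /Gi$/'
POD       NAMESPACE   CPU_LIMIT   MEM_LIMIT
web-1     default     500m        512Mi
db-1      default     2           4Gi
cache-1   default     1           1Gi
EOF
```

Only db-1 and cache-1 survive the filter; web-1 (512Mi) is dropped.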
4. Understanding kubectl top Output Columns
When you execute kubectl top pod, the Metrics API returns the usage data as a table with three primary columns:
1. NAME
This column simply displays the name of the Pod. Because Pods are ephemeral entities, their names are often generated dynamically by controllers (like a Deployment or ReplicaSet) and include a random hash.
2. CPU(cores)
This column reports the average CPU core usage across all containers within the Pod.
- How it is calculated: The value is derived by calculating the rate of change over a cumulative CPU time counter provided by the underlying Linux or Windows kernel. It represents the average consumption over a specific, short time window (typically 30 seconds) rather than an instantaneous spike.
- What it means: If the output shows that your Pod is using 1 CPU, it means the Pod is fully consuming the equivalent of 1 physical CPU core, 1 cloud provider vCPU, or 1 hardware hyper-thread.
- Unit (millicores, m): CPU is always expressed as an absolute quantity. To allow for fractional CPU usage, Kubernetes uses the suffix m. For example, 100m means 0.1 CPU, or 10% of a single core. Precision finer than 1m is not allowed.
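The millicore arithmetic can be checked directly; a small sketch converting top-style CPU strings (hypothetical sample values) to fractional cores:

```shell
# "974m" = 974/1000 = 0.974 of one core; "2000m" = 2 full cores.
printf '50m\n974m\n2000m\n' | awk '{ v = $1; sub(/m$/, "", v); printf "%s -> %.3f cores\n", $1, v / 1000 }'
# → 50m -> 0.050 cores
# → 974m -> 0.974 cores
# → 2000m -> 2.000 cores
```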
3. MEMORY(bytes)
This column reports the memory usage of the Pod. However, it does not simply report all allocated memory; it specifically reports the working set memory at the instant the metric was scraped.
- How it is calculated: The kubelet reads memory statistics directly from the cgroupfs hierarchy.
- What working set means: The working set is the amount of memory currently in-use that the kernel cannot easily free up under memory pressure. It includes anonymous memory (memory actively used by the application's processes) as well as some cached, file-backed memory.
- Unit (mebibytes, Mi): Memory is measured in bytes, but is typically displayed using power-of-two suffixes such as Ki (kibibytes), Mi (mebibytes), or Gi (gibibytes). For example, 400Mi represents 400 mebibytes.
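These power-of-two units compound by 1024 (Ki = 1024 bytes, Mi = 1024 Ki, Gi = 1024 Mi); a quick arithmetic check using plain shell arithmetic, nothing Kubernetes-specific:

```shell
echo $(( 400 * 1024 * 1024 ))    # bytes in 400Mi → 419430400
echo $(( 1024 * 1024 * 1024 ))   # bytes in 1Gi   → 1073741824
```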
Example Output:
```text
NAME                                CPU(cores)   MEMORY(bytes)
nginx-deployment-7d64c9f5b4-abc12   50m          128Mi
nginx-deployment-7d64c9f5b4-def34   45m          132Mi
redis-cache-6f8b9d7c5a-ghi56        974m         512Mi
postgres-db-5c7d8e9f6b-jkl78        200m         1Gi
```

5. Advanced Filtering Techniques
Beyond the basic sorting commands, you can use advanced filters to find specific resource consumption patterns.
Filter Pods by Node
To check resource usage for pods running on a specific node:
```bash
# Get pods on specific node
kubectl get pods --all-namespaces --field-selector spec.nodeName=worker-1 -o name

# Check their metrics (grep -f expects one pattern per line, so emit one name per line)
kubectl top pod -A | grep -f <(kubectl get pods -A --field-selector spec.nodeName=worker-1 -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}')
```

Find Pods Exceeding Thresholds
```bash
# Find pods using more than 500m CPU
kubectl top pod --all-namespaces | awk '$3 ~ /[0-9]+m/ && $3+0 > 500 {print}'

# Find pods using 1Gi or more of memory
kubectl top pod -A | awk '$4 ~ /Gi/ {print}'

# Find pods using more than 500Mi (or anything measured in Gi)
kubectl top pod -A | awk 'NR==1 || ($4 ~ /Mi/ && $4+0 > 500) || $4 ~ /Gi/ {print}'
```

Filter by Label Selector
```bash
# Get pods with specific label
kubectl top pod -l app=nginx

# Multiple labels
kubectl top pod -l app=nginx,env=production

# Exclude certain labels
kubectl top pod -l app=nginx,env!=staging
```

Export to CSV for Analysis
```bash
# Export to CSV format
kubectl top pod -A | tr -s ' ' ',' > pod-metrics.csv

# With custom headers
echo "NAMESPACE,NAME,CPU,MEMORY" > pod-metrics.csv
kubectl top pod -A | tail -n +2 | tr -s ' ' ',' >> pod-metrics.csv
```

6. Real-Time Monitoring Workflows
Watch Resource Usage Over Time
```bash
# Refresh every 2 seconds
watch -n 2 "kubectl top pod -l app=myapp"

# Monitor for memory leaks
while true; do
  echo "=== $(date) ==="
  kubectl top pod -n production | grep myapp
  sleep 300  # Check every 5 minutes
done
```

Compare Usage vs Limits
```bash
# Show configured limits
kubectl get pod -o custom-columns=NAME:.metadata.name,CPU_LIMIT:.spec.containers[*].resources.limits.cpu,MEM_LIMIT:.spec.containers[*].resources.limits.memory

# Compare with actual usage
kubectl top pod
```

7. Interpreting High CPU vs. CPU Throttling
When a container's CPU usage approaches its limit, the Linux kernel enforces this boundary using the Completely Fair Scheduler (CFS) quota mechanism.
Because CPU is a "compressible" resource, the kernel does not kill the application when it hits the limit. Instead, it physically restricts the process's access to the CPU during each scheduling time slice, a process known as CPU throttling.
Practical Scenario: CPU Throttling

Suppose you deploy a container with a CPU request of 500m (0.5 CPU) and a strict limit of 1 CPU. You configure the application to run a stress test designed to consume 2 CPUs. If you run kubectl top pod, you will see output like this:
```text
NAME       CPU(cores)   MEMORY(bytes)
cpu-demo   974m         128Mi
```

Notice that the CPU usage is 974m (approaching 1 core), not the 2000m the application is actually trying to use. This is the hallmark signature of CPU throttling. The application is starved for CPU cycles because the kernel is strictly enforcing the 1 CPU (1000m) limit. The process will experience high latency and slow execution, but it will remain in a Running state.
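You can corroborate throttling from inside the container (or on the node) by reading the cgroup v2 cpu.stat file, which counts throttled scheduling periods. A sketch parsing a captured cpu.stat; the numbers are hypothetical:

```shell
# nr_throttled  = scheduling periods in which the CFS quota ran out;
# throttled_usec = total microseconds the cgroup spent throttled.
cat <<'EOF' | awk '$1 == "nr_throttled" || $1 == "throttled_usec"'
usage_usec 84213991
user_usec 61023110
system_usec 23190881
nr_periods 5021
nr_throttled 3877
throttled_usec 19822345
EOF
```

On a cgroup v2 node, the live file is typically visible as /sys/fs/cgroup/cpu.stat from inside the container. A steadily rising nr_throttled confirms the kernel is enforcing the limit.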
8. Interpreting High Memory vs. OOMKilled
Unlike CPU, memory is an "incompressible" resource. If a container tries to allocate memory beyond its configured limit, the system cannot simply slow the application down.
When the actual memory usage (the working set reported by kubectl top) hits the hard memory limit configured in the cgroup, the Linux kernel's Out-Of-Memory (OOM) subsystem intervenes to protect the node. It selects a victim process inside the cgroup (typically the main application process) and forcefully terminates it.
Practical Scenario: Memory Leak

Suppose you configure a container with a memory request of 50Mi and a limit of 100Mi. Due to a memory leak, the application attempts to allocate 250Mi of RAM. If you repeatedly run kubectl top pod, you will observe the MEMORY(bytes) column rapidly climbing toward 104857600 bytes (100Mi). The moment it breaches that limit, the kubectl top output for that Pod will temporarily vanish or reset as the container dies.
To confirm why the usage dropped, you must transition from kubectl top to kubectl describe pod:
```text
State:          Terminated
  Reason:       OOMKilled
  Exit Code:    137
```

The OOMKilled reason definitively proves that the container's working set memory exceeded its configured limit: the kernel killed the process, and the kubelet then restarted the container according to its restart policy.
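Exit code 137 is not arbitrary: container runtimes report 128 + the signal number, and the OOM killer delivers SIGKILL (signal 9). A quick check:

```shell
echo $(( 137 - 128 ))   # → 9, the number of SIGKILL
```

To read the termination reason programmatically rather than scanning describe output, kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}' prints OOMKilled for the affected container.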
9. Interpreting Storage Latency with PSI (Pressure Stall Information)
While CPU and Memory are easily tracked via kubectl top, I/O bottlenecks have historically been difficult to identify. Standard metrics usually track I/O throughput (IOPS or bytes transferred), which does not reveal whether an application is suffering latency because it is blocked waiting for the disk subsystem to respond.
Pressure Stall Information (PSI) exposes these hidden bottlenecks by measuring the time processes spend stalled waiting for resources.
PSI is a feature of the Linux kernel (leveraging cgroup v2) exposed via the kubelet's cAdvisor endpoint:
- container_pressure_io_stalled_seconds_total: the total time processes in the container were completely stalled (unable to run) because they were waiting for I/O operations to complete.
- container_pressure_io_waiting_seconds_total: the time processes spent in a wait state due to I/O contention.
Practical Scenario: Hidden Latency

Suppose an application reports low CPU usage and moderate memory consumption but has extremely high API request latency.
- Without PSI: An operator might mistakenly assume the application code is inefficient or waiting on an external network call.
- With PSI: If you query the metrics and see container_pressure_io_stalled_seconds_total rapidly increasing, it definitively proves the delay is caused by the storage subsystem. The application is literally "stalled" waiting for data to be written to or read from the disk. This confirms that the underlying StorageClass (AWS EBS, GCP PD) lacks the IOPS or throughput required by the workload.
PSI Requirements
To leverage PSI metrics in Kubernetes, your worker nodes must run Linux kernel 4.20+, be configured to use cgroup v2, and have the KubeletPSI feature gate enabled in the kubelet configuration.
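The raw PSI data behind those counters uses a fixed text format: a "some" line (at least one task stalled), a "full" line (all tasks stalled), and a total= field giving cumulative stall time in microseconds. A sketch parsing a captured io.pressure sample; the values are hypothetical:

```shell
# Extract the cumulative full-stall time from a PSI io.pressure snapshot.
cat <<'EOF' | awk -F'total=' '/^full/ { printf "fully stalled: %d us\n", $2 }'
some avg10=1.23 avg60=0.85 avg300=0.40 total=5523412
full avg10=0.45 avg60=0.22 avg300=0.10 total=1298330
EOF
# → fully stalled: 1298330 us
```

On a cgroup v2 node, the live data sits in /proc/pressure/io (node-wide) or in the io.pressure file of the container's cgroup.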
Summary of Commands
| Goal | Command | Data Source |
|---|---|---|
| Find active CPU hogs | `kubectl top pods --sort-by=cpu` | Metrics API |
| Find active Memory hogs | `kubectl top pods --sort-by=memory` | Metrics API |
| Inspect specific pod usage | `kubectl top pod <name> --containers` | Metrics API |
| Check Swap usage | `kubectl top pods --show-swap` | Kubelet / Summary API |
| Filter by node | `kubectl top pod -A \| grep -f <(kubectl get pods -A --field-selector spec.nodeName=worker-1 -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}')` | Metrics API |
| Filter by threshold | `kubectl top pod -A \| awk '$3+0 > 500'` | Metrics API |
| Filter by labels | `kubectl top pod -l app=nginx` | Metrics API |
| Find high Reservations | `kubectl describe node <node-name>` | API Server / Scheduler |