How do I trace container logs to their source on the node filesystem?
Here is a breakdown of logging and monitoring in a Kubernetes environment, focusing on where data lives and how to access it using Linux primitives.
1. Where Container Logs Live
Kubernetes separates "system component" logs from "application" logs. Application logs (Standard Output/Error) are captured by the container runtime and stored on the node's filesystem.
- `/var/log/pods/`: This is the primary storage location for container logs managed by the kubelet.
  - Structure: The directory structure follows the pattern `<namespace>_<pod-name>_<pod-uid>/<container-name>/<restart-count>.log` (so `0.log` holds the first run and `1.log` appears after one restart).
  - Mechanism: The kubelet instructs the container runtime (via CRI) to write logs to this location. When you run `kubectl logs`, the kubelet reads directly from these files on the node.
- `/var/log/containers/`: This directory typically contains symbolic links (symlinks) pointing to the actual log files in `/var/log/pods/`.
  - Purpose: These symlinks exist primarily for backward compatibility with older log collection agents that expect logs in a flat list format like `<pod-name>_<namespace>_<container-name>-<container-id>.log`.
- Container Runtime Logs: The container runtime (e.g., `containerd`, `CRI-O`) and the `kubelet` itself do not write to `/var/log/pods`. Instead, they write to the host operating system's native logger. On systemd-based systems, this is the systemd journal.
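The relationship between the two directories can be sketched in a throwaway temp directory; the pod and container names below are illustrative, and the paths mirror the real `/var/log` layout:

```bash
# Recreate the kubelet's layout in a temp dir (all names are illustrative).
root=$(mktemp -d)

# /var/log/pods/<namespace>_<pod-name>_<pod-uid>/<container>/<restart>.log
mkdir -p "$root/pods/default_web_11111111-2222-3333-4444-555555555555/web"
echo '2024-01-01T00:00:00Z stdout F hello from the app' \
  > "$root/pods/default_web_11111111-2222-3333-4444-555555555555/web/0.log"

# /var/log/containers/ holds flat symlinks into /var/log/pods/
mkdir -p "$root/containers"
ln -s "$root/pods/default_web_11111111-2222-3333-4444-555555555555/web/0.log" \
      "$root/containers/web_default_web-abc123.log"

# Following the symlink resolves back to the per-pod file:
readlink -f "$root/containers/web_default_web-abc123.log"
cat "$root/containers/web_default_web-abc123.log"
```

The log line above mimics the CRI logging format (`timestamp stream tag message`) that the runtime writes to these files.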
2. Systemd Journal Integration
System components that run as native system services (daemons), such as the kubelet and the container runtime, log to journald rather than files.
- Accessing Logs: You use `journalctl` to view these logs on the node.
  - Kubelet: `journalctl -u kubelet`
  - Runtime: `journalctl -u containerd` or `journalctl -u crio`
- Log Rotation: On systemd systems, log rotation is handled by `journald` configuration. If systemd is not present, these components may write to `.log` files in `/var/log/`, which requires external log rotation (e.g., via `logrotate`) to prevent disk exhaustion.
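A few `journalctl` invocations that tend to be useful on a node; this is a sketch, guarded so it is a no-op on hosts without systemd-journald:

```bash
kubelet_recent=""
if command -v journalctl >/dev/null 2>&1; then
  # Last 50 kubelet entries from the past hour, without the pager:
  kubelet_recent=$(journalctl -u kubelet --since "1 hour ago" --no-pager 2>/dev/null | tail -n 50) || true
  # Error-priority entries only for the runtime:
  journalctl -u containerd -p err --no-pager 2>/dev/null | tail -n 20 || true
  # Disk space the journal currently occupies:
  journalctl --disk-usage 2>/dev/null || true
fi
```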
How do I use dmesg to find kernel-level OOM kills?
The dmesg command prints the message buffer of the kernel. In a Kubernetes context, this is critical for diagnosing issues that happen at the operating system level, below the container runtime.
- Usage: You can access these logs by running `dmesg` on the node, or by reading `/var/log/kern.log` (if available on the distro).
- What to look for:
- Hardware faults: Disk I/O errors or network card failures.
- Security denials: AppArmor or SELinux denials often log verbose details here regarding what action was blocked.
- OOM Kills: The kernel's decision to kill a process due to memory starvation.
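For OOM hunting specifically, a grep over the ring buffer plus a quick parse of the kill line is usually enough. A sketch (`dmesg` may require root, and the sample log line is illustrative):

```bash
# Scan the ring buffer for OOM activity (guarded so the snippet is harmless
# where dmesg is unavailable or unreadable):
if command -v dmesg >/dev/null 2>&1; then
  dmesg -T 2>/dev/null | grep -iE 'oom|out of memory' || true
fi

# Pulling the victim PID and command name out of a typical OOM kill line:
line='Memory cgroup out of memory: Kill process 12345 (stress) score 1000 or sacrifice child'
echo "$line" | sed -E 's/.*Kill process ([0-9]+) \(([^)]+)\).*/pid=\1 comm=\2/'
# -> pid=12345 comm=stress
```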
Reading Kernel Logs for OOM Kills
When a container exceeds its memory limit, the Linux kernel (not Kubernetes directly) kills the process. This is known as an OOM (Out of Memory) Kill.
- The Mechanism:
- Memory Limit Enforcement: Kubelet configures cgroups limits. If a container tries to allocate more memory than its limit, the kernel invokes the OOM Killer.
  - OOM Score: The kubelet assigns an `oom_score_adj` to containers based on their QoS class (Guaranteed, Burstable, BestEffort) to influence which process the kernel kills first during system-wide pressure.
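You can inspect these adjustments directly on the node; a sketch (the PID substitution and the per-class values are stated as the kubelet's typical assignments):

```bash
# Every Linux process exposes its adjustment under /proc; for a container
# process, substitute its PID (find it with ps or crictl on the node):
cat /proc/self/oom_score_adj    # this shell's own value, typically 0

# Values the kubelet assigns by QoS class (Burstable is scaled by the
# container's memory request relative to node memory):
#   Guaranteed  -> -997       (killed last)
#   Burstable   ->  2..999
#   BestEffort  ->  1000      (killed first)
```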
- Identifying an OOM Kill:
  - Pod Status: `kubectl get pods` shows the status `OOMKilled`.
  - Kernel Log Message: Running `dmesg` or checking `/var/log/kern.log` on the node will show a message similar to: `Memory cgroup out of memory: Kill process <PID> (stress) score <SCORE> or sacrifice child`.
  - This confirms that the container hit its specific cgroup limit, rather than the node running out of total physical memory.
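From the Kubernetes side, the previous container instance's termination reason records the kill. A sketch; the pod name `mypod` is illustrative, and the guard makes it a no-op where `kubectl` or the pod is absent:

```bash
reason=""
if command -v kubectl >/dev/null 2>&1; then
  # The lastState of the container status holds the prior termination reason:
  reason=$(kubectl get pod mypod \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}' \
    2>/dev/null) || true
  echo "last terminated reason: $reason"   # "OOMKilled" after an OOM kill
  # The describe output tells the same story under "Last State":
  kubectl describe pod mypod 2>/dev/null | grep -A3 'Last State' || true
fi
```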
How can I run node-level monitoring tools inside a cluster?
To use standard Linux monitoring tools on a Kubernetes node, you typically use a debugging container because nodes often run minimal OS images that lack these tools.
Command to access a node shell:

```bash
kubectl debug node/<node-name> -it --image=ubuntu
```

Once inside the debug pod, you may need to install these tools (e.g., `apt update && apt install sysstat net-tools`).
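Inside that debug container, `kubectl debug node` mounts the node's root filesystem at `/host`, so you can read node logs or `chroot` into the node without installing anything. A sketch, guarded so it is a no-op outside a debug pod:

```bash
# The node's filesystem is mounted at /host inside the debug container:
node_root="/host"
if [ -d "$node_root/var/log/pods" ]; then
  ls "$node_root/var/log/pods" | head -n 5             # per-pod log directories
  # chroot into the node to use its own binaries, e.g. journalctl:
  chroot "$node_root" journalctl -u kubelet --no-pager | tail -n 20
fi
```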
A. Process Monitoring (top, htop)
- Purpose: Real-time view of running processes, CPU usage, and memory consumption.
- K8s Context: While `kubectl top pods` gives high-level metrics, running `top` on the node shows raw per-PID usage. You can see if a "zombie" process is consuming resources or if a non-containerized system daemon is starving the kubelet.
B. System Statistics (iostat, vmstat)
- `iostat`: Reports CPU statistics and input/output statistics for devices and partitions.
  - Use case: Diagnosing "DiskPressure" node conditions. High I/O wait (`iowait`) indicates storage is too slow for the workload.
- `vmstat` (Virtual Memory Statistics): Reports information about processes, memory, paging, block IO, traps, and CPU activity.
  - Use case: Analyzing swap usage (if enabled/supported) and system-wide memory bottlenecks.
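Typical invocations for both tools (a sketch; note the first sample each prints is an average since boot, so the second sample is the one to read):

```bash
# Extended per-device stats: 2 samples, 1 second apart (iostat is in sysstat):
io_out=$(command -v iostat >/dev/null 2>&1 && iostat -x 1 2 || echo "iostat unavailable")
echo "$io_out"

# Memory, swap, and CPU summary: 2 samples, 1 second apart (vmstat is in procps):
vm_out=$(command -v vmstat >/dev/null 2>&1 && vmstat 1 2 || echo "vmstat unavailable")
echo "$vm_out"
```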
C. Network Connections (netstat, ss)
- `ss` (Socket Statistics): The modern replacement for `netstat`.
  - Use case: `ss -tuln` displays all open TCP/UDP listening ports and sockets. This helps verify whether a `NodePort` is actually listening or whether there are port conflicts on the host network.
- `netstat`: Legacy tool for network connections, routing tables, and interface statistics.
  - Use case: `netstat -rn` displays the kernel routing table, useful for debugging CNI (Container Network Interface) routing issues between pods and nodes.
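A sketch of the `NodePort` check; port 30080 is a hypothetical example, and the sample `ss` line shows how to pull the local port out of its output:

```bash
# Listening sockets; add -p (as root) to also see the owning process:
if command -v ss >/dev/null 2>&1; then
  ss -tuln || true
  # Is anything bound to the hypothetical NodePort 30080?
  ss -tuln 'sport = :30080' || true
fi

# Extracting the local port from an ss output line (column 5 is Local Address:Port):
sample='tcp   LISTEN 0      4096         0.0.0.0:30080      0.0.0.0:*'
echo "$sample" | awk '{n = split($5, a, ":"); print a[n]}'
# -> 30080
```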