How do I trace container logs to their source on the node filesystem?
Here is a breakdown of logging and monitoring in a Kubernetes environment, focusing on where data lives and how to access it using Linux primitives.
1. Where Container Logs Live
Kubernetes separates "system component" logs from "application" logs. Application logs (Standard Output/Error) are captured by the container runtime and stored on the node's filesystem.
- `/var/log/pods/`: This is the primary storage location for container logs managed by the kubelet.
  - Structure: The directory structure follows the pattern `<namespace>_<pod-name>_<pod-uid>/<container-name>/<restart-count>.log` (so `0.log` holds the first run and `1.log` appears after one restart).
  - Mechanism: The kubelet instructs the container runtime (via CRI) to write logs to this location. When you run `kubectl logs`, the kubelet reads directly from these files on the node.
- `/var/log/containers/`: This directory typically contains symbolic links (symlinks) pointing to the actual log files in `/var/log/pods/`.
  - Purpose: These symlinks exist primarily for backward compatibility with older log collection agents that expect logs in a flat list format like `<pod-name>_<namespace>_<container-name>-<container-id>.log`.
- Container Runtime Logs: The container runtime (e.g., `containerd`, `CRI-O`) and the `kubelet` itself do not write to `/var/log/pods`. Instead, they write to the host operating system's native logger. On systemd-based systems, this is the systemd journal.
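The relationship between the two directories can be sketched in a throwaway temp directory; the pod and container names below are illustrative, and the paths mirror the real `/var/log` layout:

```bash
# Recreate the kubelet's layout in a temp dir (all names are illustrative).
root=$(mktemp -d)

# /var/log/pods/<namespace>_<pod-name>_<pod-uid>/<container>/<restart>.log
mkdir -p "$root/pods/default_web_11111111-2222-3333-4444-555555555555/web"
echo '2024-01-01T00:00:00Z stdout F hello from the app' \
  > "$root/pods/default_web_11111111-2222-3333-4444-555555555555/web/0.log"

# /var/log/containers/ holds flat symlinks into /var/log/pods/
mkdir -p "$root/containers"
ln -s "$root/pods/default_web_11111111-2222-3333-4444-555555555555/web/0.log" \
      "$root/containers/web_default_web-abc123.log"

# Following the symlink resolves back to the per-pod file:
readlink -f "$root/containers/web_default_web-abc123.log"
cat "$root/containers/web_default_web-abc123.log"
```

The log line above mimics the CRI logging format (`timestamp stream tag message`) that the runtime writes to these files.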
2. Systemd Journal Integration
System components that run as native system services (daemons), such as the kubelet and the container runtime, log to journald rather than files.
- Accessing Logs: You use `journalctl` to view these logs on the node.
  - Kubelet: `journalctl -u kubelet`
  - Runtime: `journalctl -u containerd` or `journalctl -u crio`
- Log Rotation: On systemd systems, log rotation is handled by `journald` configuration. If systemd is not present, these components may write to `.log` files in `/var/log/`, which requires external log rotation (e.g., via `logrotate`) to prevent disk exhaustion.
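A few `journalctl` invocations that tend to be useful on a node; this is a sketch, guarded so it is a no-op on hosts without systemd-journald:

```bash
kubelet_recent=""
if command -v journalctl >/dev/null 2>&1; then
  # Last 50 kubelet entries from the past hour, without the pager:
  kubelet_recent=$(journalctl -u kubelet --since "1 hour ago" --no-pager 2>/dev/null | tail -n 50) || true
  # Error-priority entries only for the runtime:
  journalctl -u containerd -p err --no-pager 2>/dev/null | tail -n 20 || true
  # Disk space the journal currently occupies:
  journalctl --disk-usage 2>/dev/null || true
fi
```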
How do I use dmesg to find kernel-level OOM kills?
The dmesg command prints the message buffer of the kernel. In a Kubernetes context, this is critical for diagnosing issues that happen at the operating system level, below the container runtime.
- Usage: You can access these logs by running `dmesg` on the node, or by reading `/var/log/kern.log` (if available on the distro).
- What to look for:
- Hardware faults: Disk I/O errors or network card failures.
- Security denials: AppArmor or SELinux denials often log verbose details here regarding what action was blocked.
- OOM Kills: The kernel's decision to kill a process due to memory starvation.
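For OOM hunting specifically, a grep over the ring buffer plus a quick parse of the kill line is usually enough. A sketch (`dmesg` may require root, and the sample log line is illustrative):

```bash
# Scan the ring buffer for OOM activity (guarded so the snippet is harmless
# where dmesg is unavailable or unreadable):
if command -v dmesg >/dev/null 2>&1; then
  dmesg -T 2>/dev/null | grep -iE 'oom|out of memory' || true
fi

# Pulling the victim PID and command name out of a typical OOM kill line:
line='Memory cgroup out of memory: Kill process 12345 (stress) score 1000 or sacrifice child'
echo "$line" | sed -E 's/.*Kill process ([0-9]+) \(([^)]+)\).*/pid=\1 comm=\2/'
# -> pid=12345 comm=stress
```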
Reading Kernel Logs for OOM Kills
When a container exceeds its memory limit, the Linux kernel (not Kubernetes directly) kills the process. This is known as an OOM (Out of Memory) Kill.
- The Mechanism:
- Memory Limit Enforcement: Kubelet configures cgroups limits. If a container tries to allocate more memory than its limit, the kernel invokes the OOM Killer.
  - OOM Score: The kubelet assigns an `oom_score_adj` to containers based on their QoS class (Guaranteed, Burstable, BestEffort) to influence which process the kernel kills first during system-wide pressure.
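You can inspect these adjustments directly on the node; a sketch (the PID substitution and the per-class values are stated as the kubelet's typical assignments):

```bash
# Every Linux process exposes its adjustment under /proc; for a container
# process, substitute its PID (find it with ps or crictl on the node):
cat /proc/self/oom_score_adj    # this shell's own value, typically 0

# Values the kubelet assigns by QoS class (Burstable is scaled by the
# container's memory request relative to node memory):
#   Guaranteed  -> -997       (killed last)
#   Burstable   ->  2..999
#   BestEffort  ->  1000      (killed first)
```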
- Identifying an OOM Kill:
  - Pod Status: `kubectl get pods` shows the status `OOMKilled`.
  - Kernel Log Message: Running `dmesg` or checking `/var/log/kern.log` on the node will show a message similar to: `Memory cgroup out of memory: Kill process <PID> (stress) score <SCORE> or sacrifice child`.
  - This confirms that the container hit its specific cgroup limit, rather than the node running out of total physical memory.
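From the Kubernetes side, the previous container instance's termination reason records the kill. A sketch; the pod name `mypod` is illustrative, and the guard makes it a no-op where `kubectl` or the pod is absent:

```bash
reason=""
if command -v kubectl >/dev/null 2>&1; then
  # The lastState of the container status holds the prior termination reason:
  reason=$(kubectl get pod mypod \
    -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}' \
    2>/dev/null) || true
  echo "last terminated reason: $reason"   # "OOMKilled" after an OOM kill
  # The describe output tells the same story under "Last State":
  kubectl describe pod mypod 2>/dev/null | grep -A3 'Last State' || true
fi
```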
How can I run node-level monitoring tools inside a cluster?
To use standard Linux monitoring tools on a Kubernetes node, you typically use a debugging container because nodes often run minimal OS images that lack these tools.
Command to access a node shell:

```bash
kubectl debug node/<node-name> -it --image=ubuntu
```

Once inside the debug pod, you may need to install these tools (e.g., `apt update && apt install sysstat net-tools`).
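Inside that debug container, `kubectl debug node` mounts the node's root filesystem at `/host`, so you can read node logs or `chroot` into the node without installing anything. A sketch, guarded so it is a no-op outside a debug pod:

```bash
# The node's filesystem is mounted at /host inside the debug container:
node_root="/host"
if [ -d "$node_root/var/log/pods" ]; then
  ls "$node_root/var/log/pods" | head -n 5             # per-pod log directories
  # chroot into the node to use its own binaries, e.g. journalctl:
  chroot "$node_root" journalctl -u kubelet --no-pager | tail -n 20
fi
```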
A. Process Monitoring (top, htop)
- Purpose: Real-time view of running processes, CPU usage, and memory consumption.
- K8s Context: While `kubectl top pods` gives high-level metrics, running `top` on the node shows raw per-PID usage. You can see if a "zombie" process is consuming resources or if a non-containerized system daemon is starving the kubelet.
B. System Statistics (iostat, vmstat)
- `iostat`: Reports CPU statistics and input/output statistics for devices and partitions.
  - Use case: Diagnosing "DiskPressure" node conditions. High I/O wait (`iowait`) indicates storage is too slow for the workload.
- `vmstat` (Virtual Memory Statistics): Reports information about processes, memory, paging, block IO, traps, and CPU activity.
  - Use case: Analyzing swap usage (if enabled/supported) and system-wide memory bottlenecks.
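Typical invocations for both tools (a sketch; note the first sample each prints is an average since boot, so the second sample is the one to read):

```bash
# Extended per-device stats: 2 samples, 1 second apart (iostat is in sysstat):
io_out=$(command -v iostat >/dev/null 2>&1 && iostat -x 1 2 || echo "iostat unavailable")
echo "$io_out"

# Memory, swap, and CPU summary: 2 samples, 1 second apart (vmstat is in procps):
vm_out=$(command -v vmstat >/dev/null 2>&1 && vmstat 1 2 || echo "vmstat unavailable")
echo "$vm_out"
```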
C. Network Connections (netstat, ss)
- `ss` (Socket Statistics): The modern replacement for `netstat`.
  - Use case: `ss -tuln` displays all open TCP/UDP listening ports and sockets. This helps verify whether a `NodePort` is actually listening or whether there are port conflicts on the host network.
- `netstat`: Legacy tool for network connections, routing tables, and interface statistics.
  - Use case: `netstat -rn` displays the kernel routing table, useful for debugging CNI (Container Network Interface) routing issues between pods and nodes.
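A sketch of the `NodePort` check; port 30080 is a hypothetical example, and the sample `ss` line shows how to pull the local port out of its output:

```bash
# Listening sockets; add -p (as root) to also see the owning process:
if command -v ss >/dev/null 2>&1; then
  ss -tuln || true
  # Is anything bound to the hypothetical NodePort 30080?
  ss -tuln 'sport = :30080' || true
fi

# Extracting the local port from an ss output line (column 5 is Local Address:Port):
sample='tcp   LISTEN 0      4096         0.0.0.0:30080      0.0.0.0:*'
echo "$sample" | awk '{n = split($5, a, ":"); print a[n]}'
# -> 30080
```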