Kubernetes Node Boot Process: systemd & kubelet Initialization
How does the kubelet start, and how do you troubleshoot node initialization failures?
While the general Linux kernel boot sequence (BIOS/UEFI → Bootloader → Kernel) brings the server online, the operating system's init system—specifically systemd—is responsible for launching and supervising the Kubernetes components once userspace is reached.
1. The Kubelet as a Systemd Daemon
The kubelet is a daemon that must run on every worker and control plane node. Because it is a long-running process that governs container lifecycles, it requires an init system to maintain it. On most Linux distributions installed via DEB or RPM packages, systemd takes this role.
- Base Service (`kubelet.service`): The core configuration sets `Restart=always` and specifies the `ExecStart` command to run the binary.
- Kubeadm Drop-ins: When using `kubeadm`, the configuration is augmented by a drop-in file at `/usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf`. This file dynamically injects locations for certificates, config files (`/var/lib/kubelet/config.yaml`), and environment variables required for the cluster to form.
- Startup Sequence: When the node boots, `systemd` reads these unit files. If the service is enabled (`systemctl enable kubelet`), systemd launches the `kubelet` process automatically.
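For orientation, a simplified sketch of what the base unit file looks like (exact paths and options vary by distribution and package version; the file installed on your node is authoritative):

```ini
# /usr/lib/systemd/system/kubelet.service (simplified sketch)
[Unit]
Description=kubelet: The Kubernetes Node Agent
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/bin/kubelet
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

The kubeadm drop-in then overrides `ExecStart` to add the cluster-specific flags and environment files, without modifying the base unit.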
2. Systemd Target Dependencies
Dependencies in systemd ensure that the kubelet starts only when the system is actually ready.
- Multi-User Target: The service is configured with `WantedBy=multi-user.target`, which ensures it starts only after the normal boot sequence brings networking and essential host services online.
- Swap Dependencies: Because the `kubelet` typically requires swap to be disabled (or explicitly tolerated), systemd unit ordering can be used to guarantee swap has been handled before the `kubelet` attempts to launch.
- Slice Configuration: To prevent system daemons from starving workload resources, it is best practice to place the `kubelet` and container runtime under dedicated systemd slices (e.g., `runtime.slice` or `system.slice`). Configuring `io.latency` there protects critical Kubernetes daemons from I/O starvation.
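As an illustration, a drop-in can pin the kubelet into a dedicated slice, and the slice itself can carry an I/O latency target (the slice name, device, and latency value below are illustrative assumptions, not defaults):

```ini
# /etc/systemd/system/kubelet.service.d/20-slice.conf (illustrative)
[Service]
Slice=runtime.slice

# /etc/systemd/system/runtime.slice (illustrative)
[Slice]
# io.latency target for the backing device; adjust device and value to your host
IODeviceLatencyTargetSec=/dev/sda 75ms
```

Run `systemctl daemon-reload` after adding drop-ins, then verify placement with `systemctl show kubelet -p Slice`.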
3. Understanding Initialization Failures
If the kubelet fails to start at boot, it is almost always due to misconfiguration of the underlying Linux resources or security credentials.
Swap Memory Constraint
By default, the kubelet will fail to start if it detects that swap is enabled on the host node. To fix this, you must disable swap globally (`swapoff -a`) or configure the kubelet with `failSwapOn: false`.
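Note that `swapoff -a` only lasts until the next reboot; to make the fix permanent, swap entries must also be commented out of `/etc/fstab`. A sketch of that edit, run here against a demo copy of the file rather than the real `/etc/fstab`:

```shell
#!/bin/sh
# Persistently disable swap: on a real node, run `swapoff -a` first,
# then point FSTAB at /etc/fstab. This demo edits a local copy.
FSTAB=./fstab.demo

cat > "$FSTAB" <<'EOF'
UUID=abcd-1234 /     ext4 defaults 0 1
/dev/sda2      none  swap sw       0 0
EOF

# Comment out any line whose filesystem type field is "swap"
sed -i.bak '/[[:space:]]swap[[:space:]]/s/^/#/' "$FSTAB"

cat "$FSTAB"
```

After the edit, the swap line is commented out while the root filesystem entry is untouched.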
Cgroup Driver Mismatch
A critical cluster failure occurs if the kubelet and the container runtime (e.g., containerd or CRI-O) use different cgroup drivers.
- Scenario: If systemd is the init system, it acts as the primary cgroup manager. If the `kubelet` is configured to use `cgroupfs` instead of `systemd`, the node ends up with two competing cgroup managers. This causes severe instability, eviction loops, and resource management failures.
- Diagnostic: The `kubelet` process will crash repeatedly, logging errors about "cgroup hierarchy" or "driver incompatibility".
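A quick way to spot the mismatch is to compare the two settings directly. The live files are usually `/var/lib/kubelet/config.yaml` and `/etc/containerd/config.toml`; this sketch creates sample copies so it can run anywhere:

```shell
#!/bin/sh
# Consistency check: do the kubelet and containerd agree on the cgroup driver?
KUBELET_CFG=./config.yaml        # normally /var/lib/kubelet/config.yaml
CONTAINERD_CFG=./config.toml     # normally /etc/containerd/config.toml

cat > "$KUBELET_CFG" <<'EOF'
kind: KubeletConfiguration
cgroupDriver: systemd
EOF

cat > "$CONTAINERD_CFG" <<'EOF'
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true
EOF

kubelet_driver=$(awk '/^cgroupDriver:/ {print $2}' "$KUBELET_CFG")
runtime_systemd=$(grep -c 'SystemdCgroup = true' "$CONTAINERD_CFG")

if [ "$kubelet_driver" = "systemd" ] && [ "$runtime_systemd" -ge 1 ]; then
  echo "cgroup drivers aligned on systemd"
else
  echo "MISMATCH: kubelet=$kubelet_driver, containerd SystemdCgroup count=$runtime_systemd"
fi
```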
Other Failure Modes:
- Certificate Expiration: The `kubelet` uses certificates in `/var/lib/kubelet/pki` to authenticate with the API server. If these expire, the `kubelet` crashes, and its logs will show `x509: certificate has expired`.
- Initial CrashLoop: During initial cluster bootstrapping, it is entirely normal for the `kubelet` to restart rapidly (crash loop) while waiting for `kubeadm` to finish generating the foundational configuration files.
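Certificate expiry is easy to check proactively with `openssl` (assuming it is installed). The live certificate is normally `/var/lib/kubelet/pki/kubelet-client-current.pem`; this sketch generates a short-lived self-signed certificate so the commands can be demonstrated anywhere:

```shell
#!/bin/sh
# Check whether a kubelet client certificate is close to expiry.
CERT=./kubelet-demo.pem   # normally /var/lib/kubelet/pki/kubelet-client-current.pem

# Demo only: create a self-signed cert valid for 2 days
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=demo" \
  -days 2 -keyout /dev/null -out "$CERT" 2>/dev/null

# Print the expiry date
openssl x509 -noout -enddate -in "$CERT"

# -checkend N exits non-zero if the cert expires within N seconds
if openssl x509 -noout -checkend 86400 -in "$CERT" >/dev/null; then
  echo "certificate valid for at least one more day"
else
  echo "certificate expires within a day - rotate it"
fi
```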
4. Recovery Procedures
To recover a worker node where the kubelet continually fails to boot:
1. Logging and Diagnostic Status:

```bash
systemctl status kubelet
journalctl -xeu kubelet
```

2. Fixing a Cgroup Driver Mismatch: If your logs show cgroup errors, you must realign the runtime and the kubelet to both use `systemd`.
- Drain the node (from another machine): `kubectl drain <node-name> --ignore-daemonsets`
- Stop the failing service: `systemctl stop kubelet`
- Update the container runtime: edit containerd's `config.toml` to set `SystemdCgroup = true` (for CRI-O, set `cgroup_manager = "systemd"` in its configuration).
- Update the kubelet: edit `/var/lib/kubelet/config.yaml` to set `cgroupDriver: systemd`.
- Restart: `systemctl restart containerd && systemctl start kubelet`
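The two config edits above can be scripted with `sed`. The real files are `/etc/containerd/config.toml` and `/var/lib/kubelet/config.yaml`; this sketch works on local copies so it is safe to run as a dry run:

```shell
#!/bin/sh
# Sketch: realign both configs on the systemd cgroup driver.
CONTAINERD_CFG=./config.toml     # normally /etc/containerd/config.toml
KUBELET_CFG=./config.yaml        # normally /var/lib/kubelet/config.yaml

# Demo only: recreate the mismatched state
cat > "$CONTAINERD_CFG" <<'EOF'
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = false
EOF

cat > "$KUBELET_CFG" <<'EOF'
kind: KubeletConfiguration
cgroupDriver: cgroupfs
EOF

# Flip both settings to systemd (keeps .bak backups)
sed -i.bak 's/SystemdCgroup = false/SystemdCgroup = true/' "$CONTAINERD_CFG"
sed -i.bak 's/^cgroupDriver: .*/cgroupDriver: systemd/' "$KUBELET_CFG"

grep SystemdCgroup "$CONTAINERD_CFG"
grep cgroupDriver "$KUBELET_CFG"
# On a real node, finish with: systemctl restart containerd && systemctl start kubelet
```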
3. Recovering from Expired Certificates: If automated client certificate rotation fails on the node:
- Back up and delete the old configuration (`/etc/kubernetes/kubelet.conf`) and PKI files (`/var/lib/kubelet/pki/kubelet-client*`).
- On the control plane, generate a new configuration file using `kubeadm kubeconfig user --client-name=system:node:<nodeName>`.
- Copy the new config to the broken node at `/etc/kubernetes/kubelet.conf`.
- Restart the `kubelet` service.