Kubernetes Node Boot Process: systemd & kubelet Initialization
How does the kubelet start, and how do you troubleshoot node initialization failures?
While the general Linux kernel boot sequence (BIOS/UEFI → Bootloader → Kernel) brings the server online, the operating system's init system—specifically systemd—is responsible for launching and supervising the Kubernetes components once userspace is reached.
1. The Kubelet as a Systemd Daemon
The kubelet is a daemon that must run on every worker and control plane node. Because it is a long-running process that governs container lifecycles, it requires an init system to maintain it. On most Linux distributions installed via DEB or RPM packages, systemd takes this role.
- Base Service (`kubelet.service`): The core configuration sets `Restart=always` and specifies the `ExecStart` command to run the binary.
- Kubeadm Drop-ins: When using `kubeadm`, the configuration is augmented by a drop-in file at `/usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf`. This file dynamically injects locations for certificates, config files (`/var/lib/kubelet/config.yaml`), and environment variables required for the cluster to form.
- Startup Sequence: When the node boots, `systemd` reads these unit files. If the service is enabled (`systemctl enable kubelet`), systemd launches the `kubelet` process automatically.
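For orientation, a simplified sketch of what the base unit file looks like (exact paths and options vary by distribution and package version; the file installed on your node is authoritative):

```ini
# /usr/lib/systemd/system/kubelet.service (simplified sketch)
[Unit]
Description=kubelet: The Kubernetes Node Agent
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/bin/kubelet
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

The kubeadm drop-in then overrides `ExecStart` to add the cluster-specific flags and environment files, without modifying the base unit.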
2. Systemd Target Dependencies
Dependencies in systemd ensure that the kubelet starts only when the system is actually ready.
- Multi-User Target: The service is configured with `WantedBy=multi-user.target`, which ensures it starts only after the normal boot sequence brings networking and essential host services online.
- Swap Dependencies: Because the `kubelet` typically requires swap to be disabled (or explicitly tolerated), systemd unit ordering can be used to guarantee swap has been handled before the `kubelet` attempts to launch.
- Slice Configuration: To prevent system daemons from starving workload resources, it is best practice to place the `kubelet` and container runtime under dedicated systemd slices (e.g., `runtime.slice` or `system.slice`). Configuring `io.latency` there protects critical Kubernetes daemons from I/O starvation.
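As an illustration, a drop-in can pin the kubelet into a dedicated slice, and the slice itself can carry an I/O latency target (the slice name, device, and latency value below are illustrative assumptions, not defaults):

```ini
# /etc/systemd/system/kubelet.service.d/20-slice.conf (illustrative)
[Service]
Slice=runtime.slice

# /etc/systemd/system/runtime.slice (illustrative)
[Slice]
# io.latency target for the backing device; adjust device and value to your host
IODeviceLatencyTargetSec=/dev/sda 75ms
```

Run `systemctl daemon-reload` after adding drop-ins, then verify placement with `systemctl show kubelet -p Slice`.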
3. Understanding Initialization Failures
If the kubelet fails to start at boot, it is almost always due to misconfiguration of the underlying Linux resources or security credentials.
Swap Memory Constraint
By default, the kubelet will fail to start if it detects that swap is enabled on the host node. To fix this, you must disable swap globally (`swapoff -a`) or configure the kubelet with `failSwapOn: false`.
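Note that `swapoff -a` only lasts until the next reboot; to make the fix permanent, swap entries must also be commented out of `/etc/fstab`. A sketch of that edit, run here against a demo copy of the file rather than the real `/etc/fstab`:

```shell
#!/bin/sh
# Persistently disable swap: on a real node, run `swapoff -a` first,
# then point FSTAB at /etc/fstab. This demo edits a local copy.
FSTAB=./fstab.demo

cat > "$FSTAB" <<'EOF'
UUID=abcd-1234 /     ext4 defaults 0 1
/dev/sda2      none  swap sw       0 0
EOF

# Comment out any line whose filesystem type field is "swap"
sed -i.bak '/[[:space:]]swap[[:space:]]/s/^/#/' "$FSTAB"

cat "$FSTAB"
```

After the edit, the swap line is commented out while the root filesystem entry is untouched.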
Cgroup Driver Mismatch
A critical cluster failure occurs if the kubelet and the container runtime (e.g., containerd or CRI-O) use different cgroup drivers.
- Scenario: If systemd is the init system, it acts as the primary cgroup manager. If the `kubelet` is configured to use `cgroupfs` instead of `systemd`, the node ends up with two competing cgroup managers. This causes severe instability, eviction loops, and resource management failures.
- Diagnostic: The `kubelet` process will crash repeatedly, logging errors about "cgroup hierarchy" or "driver incompatibility".
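A quick way to spot the mismatch is to compare the two settings directly. The live files are usually `/var/lib/kubelet/config.yaml` and `/etc/containerd/config.toml`; this sketch creates sample copies so it can run anywhere:

```shell
#!/bin/sh
# Consistency check: do the kubelet and containerd agree on the cgroup driver?
KUBELET_CFG=./config.yaml        # normally /var/lib/kubelet/config.yaml
CONTAINERD_CFG=./config.toml     # normally /etc/containerd/config.toml

cat > "$KUBELET_CFG" <<'EOF'
kind: KubeletConfiguration
cgroupDriver: systemd
EOF

cat > "$CONTAINERD_CFG" <<'EOF'
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true
EOF

kubelet_driver=$(awk '/^cgroupDriver:/ {print $2}' "$KUBELET_CFG")
runtime_systemd=$(grep -c 'SystemdCgroup = true' "$CONTAINERD_CFG")

if [ "$kubelet_driver" = "systemd" ] && [ "$runtime_systemd" -ge 1 ]; then
  echo "cgroup drivers aligned on systemd"
else
  echo "MISMATCH: kubelet=$kubelet_driver, containerd SystemdCgroup count=$runtime_systemd"
fi
```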
Other Failure Modes:
- Certificate Expiration: The `kubelet` uses certificates in `/var/lib/kubelet/pki` to authenticate with the API server. If these expire, the `kubelet` crashes, and its logs will show `x509: certificate has expired`.
- Initial CrashLoop: During initial cluster bootstrapping, it is entirely normal for the `kubelet` to restart rapidly (crash loop) while waiting for `kubeadm` to finish generating the foundational configuration files.
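Certificate expiry is easy to check proactively with `openssl` (assuming it is installed). The live certificate is normally `/var/lib/kubelet/pki/kubelet-client-current.pem`; this sketch generates a short-lived self-signed certificate so the commands can be demonstrated anywhere:

```shell
#!/bin/sh
# Check whether a kubelet client certificate is close to expiry.
CERT=./kubelet-demo.pem   # normally /var/lib/kubelet/pki/kubelet-client-current.pem

# Demo only: create a self-signed cert valid for 2 days
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=demo" \
  -days 2 -keyout /dev/null -out "$CERT" 2>/dev/null

# Print the expiry date
openssl x509 -noout -enddate -in "$CERT"

# -checkend N exits non-zero if the cert expires within N seconds
if openssl x509 -noout -checkend 86400 -in "$CERT" >/dev/null; then
  echo "certificate valid for at least one more day"
else
  echo "certificate expires within a day - rotate it"
fi
```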
4. Recovery Procedures
To recover a worker node where the kubelet continually fails to boot:
1. Logging and Diagnostic Status:

```bash
systemctl status kubelet
journalctl -xeu kubelet
```

2. Fixing a Cgroup Driver Mismatch: If your logs show cgroup errors, you must realign the runtime and the kubelet to both use `systemd`.
- Drain the node (from another machine): `kubectl drain <node-name> --ignore-daemonsets`
- Stop the failing service: `systemctl stop kubelet`
- Update the container runtime: edit containerd's `config.toml` to set `SystemdCgroup = true` (for CRI-O, set `cgroup_manager = "systemd"` in its configuration).
- Update the kubelet: edit `/var/lib/kubelet/config.yaml` to set `cgroupDriver: systemd`.
- Restart: `systemctl restart containerd && systemctl start kubelet`
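The two config edits above can be scripted with `sed`. The real files are `/etc/containerd/config.toml` and `/var/lib/kubelet/config.yaml`; this sketch works on local copies so it is safe to run as a dry run:

```shell
#!/bin/sh
# Sketch: realign both configs on the systemd cgroup driver.
CONTAINERD_CFG=./config.toml     # normally /etc/containerd/config.toml
KUBELET_CFG=./config.yaml        # normally /var/lib/kubelet/config.yaml

# Demo only: recreate the mismatched state
cat > "$CONTAINERD_CFG" <<'EOF'
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = false
EOF

cat > "$KUBELET_CFG" <<'EOF'
kind: KubeletConfiguration
cgroupDriver: cgroupfs
EOF

# Flip both settings to systemd (keeps .bak backups)
sed -i.bak 's/SystemdCgroup = false/SystemdCgroup = true/' "$CONTAINERD_CFG"
sed -i.bak 's/^cgroupDriver: .*/cgroupDriver: systemd/' "$KUBELET_CFG"

grep SystemdCgroup "$CONTAINERD_CFG"
grep cgroupDriver "$KUBELET_CFG"
# On a real node, finish with: systemctl restart containerd && systemctl start kubelet
```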
3. Recovering from Expired Certificates: If automated client certificate rotation fails on the node:
- Back up and delete the old configuration (`/etc/kubernetes/kubelet.conf`) and PKI files (`/var/lib/kubelet/pki/kubelet-client*`).
- On the control plane, generate a new configuration file using `kubeadm kubeconfig user --client-name=system:node:<nodeName>`.
- Copy the new config to the broken node at `/etc/kubernetes/kubelet.conf`.
- Restart the `kubelet` service.