CNI Fundamentals
What is a CNI plugin in Kubernetes and how does it actually work?
The Container Network Interface (CNI) is a standardized specification and set of libraries designed to configure network interfaces in Linux containers. In the context of Kubernetes, the CNI is the foundational architectural layer that brings the cluster's network to life.
Kubernetes is fundamentally about sharing machines among distributed applications. To do this seamlessly, Kubernetes enforces a strict, flat network model:
- Every Pod must receive its own unique, cluster-wide IP address.
- All Pods must be able to communicate with each other directly across the cluster without the use of Network Address Translation (NAT).
However, Kubernetes itself does not contain the low-level code required to manipulate the Linux kernel's networking stack, create virtual network bridges, or establish routing protocols. Instead, Kubernetes defines the networking APIs and completely delegates the actual implementation of this network model to external components known as CNI plugins.
This strict delegation ensures that Kubernetes remains platform-agnostic, allowing cluster operators to choose a network fabric—ranging from simple local bridges to complex, eBPF-powered service meshes—that perfectly matches their specific infrastructure, performance requirements, and security postures.
1. The Architectural Delegation Model
The CNI specification dictates that plugins should be executed as short-lived binaries rather than long-running background daemons.
Historically, the Kubernetes kubelet directly managed CNI plugins using command-line flags. However, modern Kubernetes relies on a strict chain of command: the kubelet delegates network setup entirely to the node's Container Runtime (such as containerd or CRI-O) via the Container Runtime Interface (CRI).
When the container runtime receives a CRI request from the kubelet to create a Pod, the runtime is responsible for loading the CNI configuration and executing the specified CNI binaries.
2. Key CNI Directory Locations
The container runtime discovers how to configure the network by looking in specific default directories on the worker node's filesystem:
- `/etc/cni/net.d/` (The Configuration Directory): This directory contains the JSON configuration files (often ending in `.conf` or `.conflist`) that define the CNI network topology, IPAM settings, and the specific chain of plugins the runtime needs to execute.
- `/opt/cni/bin/` (The Binary Directory): This is where the actual executable CNI plugin binaries (such as `bridge`, `loopback`, `portmap`, and `host-local`) are installed.
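As an illustration, a minimal `.conflist` of the kind found in `/etc/cni/net.d/` might look like the sketch below. The network name, subnet, and plugin choices are placeholder values, not defaults your distribution will necessarily ship:

```json
{
  "cniVersion": "1.0.0",
  "name": "examplenet",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "cni0",
      "isGateway": true,
      "ipMasq": true,
      "ipam": {
        "type": "host-local",
        "subnet": "10.244.1.0/24",
        "routes": [ { "dst": "0.0.0.0/0" } ]
      }
    },
    {
      "type": "portmap",
      "capabilities": { "portMappings": true }
    }
  ]
}
```

The `plugins` array is the "chain": the runtime executes `bridge` (which itself delegates to `host-local` for IPAM), then `portmap` for hostPort support.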
When the container runtime needs to execute a network operation, it reads the JSON configuration, locates the corresponding binary in /opt/cni/bin/, and executes it. It passes the configuration data via standard input (stdin) and execution context (like the Pod's network namespace path and container ID) via environment variables. The plugin executes the requested operation (like adding an interface), prints the resulting IP address and routing details to standard output (stdout), and then safely terminates.
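The runtime-to-plugin protocol described above can be sketched by hand. The environment variable names (`CNI_COMMAND`, `CNI_NETNS`, and so on) come from the CNI specification; the container ID, netns path, and config values here are placeholders for illustration:

```shell
# The runtime passes execution context via environment variables defined by
# the CNI spec, and the JSON network config via stdin.
export CNI_COMMAND=ADD                       # operation: ADD, DEL, CHECK, or VERSION
export CNI_CONTAINERID=0123456789abcdef      # placeholder container ID
export CNI_NETNS=/var/run/netns/example-pod  # placeholder netns path
export CNI_IFNAME=eth0                       # interface to create inside the Pod
export CNI_PATH=/opt/cni/bin                 # where delegated plugins are found

config='{"cniVersion":"1.0.0","name":"examplenet","type":"bridge","ipam":{"type":"host-local","subnet":"10.244.1.0/24"}}'

if [ -x /opt/cni/bin/bridge ]; then
  # A real runtime checks the exit code and parses the JSON result on stdout.
  echo "$config" | /opt/cni/bin/bridge
else
  echo "bridge binary not installed; would pipe config to /opt/cni/bin/bridge with CNI_COMMAND=$CNI_COMMAND"
fi
```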
3. The Lifecycle of Pod Network Creation
To truly understand how a CNI plugin works, we must trace the exact sequence of events that wires a Pod into the network from the moment the scheduler assigns it to a worker node.
- Admission and Sandbox Creation: Once a Pod is scheduled to a node, the `kubelet` issues a request via the CRI to the container runtime to create a "Pod Sandbox". At this moment, the container runtime creates an isolated Linux network namespace for the Pod, but it is completely empty: it has no interfaces and no IP address. Because the network is not yet configured, the `kubelet` sets the Pod's `PodReadyToStartContainers` condition to `False`.
- The Loopback Interface: Kubernetes strictly requires that every Pod sandbox is provided with a local loopback interface (`lo`) so containers within the same Pod can communicate over `localhost`. The container runtime achieves this by executing the CNI `loopback` plugin, which enters the Pod's network namespace and brings up the `lo` interface.
- Physical Wiring (`veth` Pair Creation): Next, the runtime parses the primary configuration in `/etc/cni/net.d/` and invokes the primary CNI plugin. For a standard bridge setup, the plugin asks the Linux kernel to create a virtual ethernet (`veth`) pair (think of a `veth` pair as a virtual patch cable). The plugin leaves one end of this virtual cable in the host's root network namespace and attaches it to a virtual network bridge (like `cni0`). It moves the other end of the cable directly into the Pod's isolated network namespace, typically renaming it to `eth0`.
- IP Address Management (IPAM): The primary CNI plugin must now assign an IP address. It delegates this specific task to an IPAM plugin (such as `host-local`). The IPAM plugin calculates a unique IP, assigns it to the `eth0` interface inside the Pod, and configures the default routing rules to push all outbound traffic through the `veth` cable to the host's bridge.
- Container Startup: The CNI binary exits with a success code. The container runtime relays this success to the `kubelet`, which updates the `PodReadyToStartContainers` condition to `True`. Only then does the `kubelet` instruct the runtime to pull the application images and start the workload containers, which inherit the configured `eth0` interface and IP address.
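The wiring steps above can be reproduced by hand with `ip` commands. This is a rough sketch of the kernel operations a bridge-style plugin performs; it requires root, and the names (`demo-pod`, `veth-host`) and IP address are invented for the demo:

```shell
# Re-create lifecycle steps 1-4 manually. If we lack root or netns support,
# skip rather than fail.
if [ "$(id -u)" -ne 0 ] || ! ip netns add demo-pod 2>/dev/null; then
  result="skipped: needs root and network-namespace support"
else
  ip netns exec demo-pod ip link set lo up               # step 2: loopback
  ip link add veth-host type veth peer name veth-pod     # step 3: patch cable
  ip link set veth-pod netns demo-pod                    #   one end into the Pod
  ip netns exec demo-pod ip link set veth-pod name eth0  #   renamed to eth0
  ip netns exec demo-pod ip addr add 10.244.1.5/24 dev eth0  # step 4: IPAM
  ip netns exec demo-pod ip link set eth0 up
  result="$(ip netns exec demo-pod ip -brief addr show eth0)"
  ip link del veth-host                                  # clean up (also removes the peer)
  ip netns del demo-pod
fi
echo "$result"
```

A real plugin also attaches `veth-host` to the `cni0` bridge and installs the default route; those steps are omitted here for brevity.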
4. IP Address Management (IPAM) in Depth
Assigning IP addresses dynamically in a distributed cluster containing thousands of rapidly churning Pods requires a highly scalable architecture. If every node had to query a central database for every new Pod, the latency would cripple the cluster.
Kubernetes solves this using a two-tiered delegation model:
- Global Delegation (Control Plane): The cluster administrator configures a large global CIDR for the entire cluster network (e.g., `10.244.0.0/16`). The Kubernetes Node Controller (running in the control plane) slices this large network into smaller subnets (e.g., a `/24` yielding 254 usable IPs) and assigns one exclusive subnet to every worker node in the cluster, recorded in the Node object's `.spec.podCIDR` field.
- Local Assignment (CNI IPAM): When a Pod is created on a specific node, the local CNI plugin invokes the `host-local` IPAM plugin. This plugin reads the specific subnet assigned to its node, dynamically selects an unused IP address from that local subnet, and writes a lease to a local file on the node's disk to prevent double-allocation.
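You can observe this lease mechanism directly on a node. `host-local` persists its state as plain files, by default under `/var/lib/cni/networks/<network-name>` (your CNI config may override the data directory):

```shell
# Each lease file is named after the allocated IP and stores the owning
# container ID, so the plugin never hands out the same address twice.
dir=/var/lib/cni/networks
if [ -d "$dir" ]; then
  leases="$(find "$dir" -type f 2>/dev/null)"
  echo "$leases"
else
  leases=""
  echo "no host-local state under $dir (different IPAM plugin or data dir)"
fi
```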
This ensures rapid, O(1) IP allocation with no network calls to the control plane; and because each node owns a disjoint subnet, no two nodes can ever issue overlapping Pod IP addresses.
5. Overlay vs. Underlay Routing Strategies
When choosing a CNI plugin, cluster architects must decide between two fundamental data plane routing strategies: Overlay and Underlay.
Overlay Networks
An overlay network creates a virtual network on top of the existing physical network infrastructure. It takes packets originating from a Pod, encapsulates them within an outer packet header (using protocols like VXLAN or IP-in-IP), and sends them across the physical network.
The physical routers only see traffic moving between the IP addresses of the Kubernetes worker nodes; they are entirely unaware of the Pod IP addresses hidden inside. This makes overlay networks incredibly easy to deploy because they work on almost any cloud or bare-metal infrastructure without requiring physical router reconfiguration. The tradeoff is a slight performance penalty due to the CPU overhead of packet encapsulation.
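One way to see whether a node is running an overlay is to look for encapsulation devices. As an assumption for this sketch, flannel's default VXLAN device is named `flannel.1`; other overlay plugins use different device names:

```shell
# List any VXLAN devices the CNI has created on this node. The detailed view
# shows the VNI, the local VTEP address, and the UDP port used for tunneling.
vxlan_devs="$(ip -details link show type vxlan 2>/dev/null || true)"
if [ -n "$vxlan_devs" ]; then
  echo "$vxlan_devs"
else
  echo "no VXLAN devices found: overlay not installed, or this CNI uses native routing"
fi
```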
Underlay (Native) Networks
An underlay network does not use encapsulation. Instead, it exposes the Pod IP addresses directly to the underlying physical network. CNI plugins achieve this by utilizing routing protocols like BGP (Border Gateway Protocol) to announce Pod subnets directly to the physical top-of-rack switches, or by integrating directly with cloud provider APIs (like AWS VPC CNI).
This provides bare-metal network performance and lower latency, but requires deep integration with the physical or cloud infrastructure.
6. The Crucial Role of CNI in NetworkPolicies
Understanding the relationship between the Kubernetes API and the CNI plugin is absolutely critical for cluster security. Kubernetes provides the NetworkPolicy API object, which allows you to declaratively state how Pods are allowed to communicate with each other (e.g., establishing a "default deny" posture for a namespace).
However, Kubernetes itself does not enforce NetworkPolicies.
The API server simply stores the policy as a record of intent. It is the absolute responsibility of the CNI network plugin to read that intent and implement the actual packet filtering rules (using mechanisms like iptables, ipsets, or eBPF).
If you use a CNI plugin that does not support NetworkPolicy (such as standard Flannel), you can successfully create and view NetworkPolicy objects in the Kubernetes API, but the cluster will silently ignore them. All traffic will continue to flow unrestricted, creating a severe, silent security vulnerability. Therefore, selecting an advanced CNI like Calico or Cilium is mandatory if your security posture requires network segmentation.
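A minimal "default deny" policy of the kind described above looks like this; the namespace name is an example, and, as stressed above, the object has no effect unless the installed CNI enforces it:

```yaml
# Deny all ingress traffic to every Pod in the "demo" namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: demo
spec:
  podSelector: {}     # empty selector matches all Pods in the namespace
  policyTypes:
    - Ingress         # no ingress rules are listed, so all ingress is denied
```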
7. Popular CNI Plugins
The Kubernetes ecosystem offers a rich variety of plugins optimized for different operational paradigms.
- Flannel: One of the oldest and simplest plugins. It focuses purely on providing a highly reliable overlay network (typically using VXLAN). It is exceptionally easy to install, but strictly handles basic networking—it does not include support for enforcing NetworkPolicies.
- Calico: An industry-standard networking and security provider. It is highly flexible, allowing operators to run it as a pure, high-performance underlay network using BGP, or as an overlay network. Calico's standout feature is its powerful routing and policy engine, capable of enforcing complex NetworkPolicies using `ipsets` across hosts and Pods.
- Cilium: Cilium represents the modern, high-performance edge of Kubernetes networking. Instead of relying on traditional Linux `iptables` rules for routing and security, Cilium utilizes an eBPF (extended Berkeley Packet Filter) data plane. By attaching eBPF programs directly into the Linux kernel, Cilium performs highly efficient hash-table lookups for routing, completely bypasses `kube-proxy`, and provides advanced, identity-based security policies.
- Weave Net: Weave provides a resilient overlay network mesh that requires almost zero configuration and includes an integrated Network Policy Controller to enforce security rules via `iptables`.
8. Operational Commands and Troubleshooting
If you bootstrap a Kubernetes cluster (e.g., using kubeadm) but fail to install a CNI plugin, the kubelet will repeatedly fail to construct Pod sandboxes. Critical system Pods (like CoreDNS) will remain stuck indefinitely in a Pending or ContainerCreating state until a CNI DaemonSet is successfully deployed.
As an operator, you must know how to inspect the network state using CLI tools.
Inspect CNI DaemonSets: Most CNI plugins deploy their agents as DaemonSets in the system namespace.

```bash
kubectl get pods -n kube-system -o wide
```

Verify CNI Readiness via CoreDNS: If the DNS Pods are running, your CNI has successfully initialized the network overlay or routing tables.

```bash
kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
```

Discover CNI Version Compatibility Errors: Execute a binary directly to check its supported specification versions:

```bash
/opt/cni/bin/bridge --version
```

If your runtime logs show incompatible CNI versions, it means the JSON configuration file in `/etc/cni/net.d/` specifies a `cniVersion` that is newer than what the binaries installed in `/opt/cni/bin/` support.
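To cross-check the version the runtime will request, you can pull the `cniVersion` field out of the active config file. This sketch assumes the common default paths used earlier in this section:

```shell
# Find the first CNI config on the node and print its declared cniVersion.
conf_dir=/etc/cni/net.d
first_conf="$(ls "$conf_dir"/*.conflist "$conf_dir"/*.conf 2>/dev/null | head -n 1)"
if [ -n "$first_conf" ]; then
  grep -o '"cniVersion"[^,}]*' "$first_conf"   # version the runtime will request
else
  echo "no CNI config found in $conf_dir"
fi
```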
Inspect Node Network Interfaces: To verify the physical or virtual network interfaces created by the CNI (such as the `cni0` bridge or host-side `veth` interfaces), use standard Linux networking commands on the node:

```bash
ip link
ip addr show
```

Inspect the Node Routing Table: This command reveals how the CNI has configured the host's routing table, showing how traffic destined for specific Pod CIDRs is directed to the correct virtual bridges or overlay tunnel interfaces.

```bash
ip route
```