
IPAM Delegation

This page explains how IP Address Management (IPAM) delegates CIDR blocks to Nodes, which then assign individual Pod IPs locally.

In a Kubernetes architecture, ensuring that every Pod receives a unique, routable IP address without overlapping with any other Pod in the cluster is a critical networking requirement.

To achieve this at scale without the latency and complexity of querying a centralized control-plane database for every single Pod creation, Kubernetes utilizes a highly efficient delegation model.

Instead of assigning individual Pod IPs from the API server, Kubernetes carves up a global address space into smaller subnets (CIDRs) and delegates exclusive ownership of these subnets to individual worker nodes.

Here is the detailed architectural workflow of how IP Address Management (IPAM) delegates IP blocks while mathematically guaranteeing zero overlap.

1. Defining the Global Address Space

The foundation of this delegation begins at the cluster configuration level. The cluster administrator defines a large, global IP range for the entire cluster's Pod network using the --cluster-cidr flag on the kube-controller-manager (or cloud-controller-manager).

The administrator also configures the size of the subnet that will be handed out to each individual node using the --node-cidr-mask-size-ipv4 and --node-cidr-mask-size-ipv6 flags.

  • By default, Kubernetes uses a /24 mask size for IPv4 (yielding 254 usable IPs per node) and a /64 mask size for IPv6.
  • Example: If the --cluster-cidr is 10.244.0.0/16 and the mask size is /24, the control plane has $2^8$ (256) non-overlapping /24 subnets (blocks of 254 IPs) available to distribute across the cluster's worker nodes.
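The subnet arithmetic in the example above can be verified with a short sketch using Python's standard ipaddress module (the CIDR and mask values are the example values from this section, not anything read from a live cluster):

```python
import ipaddress

# The global Pod network defined via --cluster-cidr.
cluster_cidr = ipaddress.ip_network("10.244.0.0/16")

# Carve it into per-node blocks of the size set by --node-cidr-mask-size-ipv4.
node_blocks = list(cluster_cidr.subnets(new_prefix=24))

print(len(node_blocks))                  # 2^(24-16) = 256 non-overlapping blocks
print(node_blocks[0])                    # 10.244.0.0/24
print(node_blocks[0].num_addresses - 2)  # 254 usable Pod IPs per block
```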

2. Centralized Delegation (Node IPAM Controller)

When a new worker node joins the cluster, it registers itself with the API server. At this point, the node does not yet know which IP addresses it may assign to its Pods.

The responsibility of assigning an exclusive subnet falls to the Node IPAM Controller (often operating as part of the broader Node Controller within the kube-controller-manager).

  1. The controller detects the newly registered, uninitialized worker node.
  2. It references its internal, centralized map of the global --cluster-cidr to find the next contiguous, unallocated block of IP addresses.
  3. It assigns this exclusive block to the node, writing the allocation directly into the Node API object's .spec.podCIDR field (and .spec.podCIDRs in dual-stack clusters).

How this ensures zero overlap: Because the Node IPAM controller acts as the strict, single source of truth for the global cluster CIDR, it mathematically guarantees that no two worker nodes are ever delegated the same IP block.
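The controller's bookkeeping can be sketched as a simple single-source-of-truth allocator that hands out the next free block and records its owner. This is an illustrative model, not the actual controller code; the class and method names are hypothetical:

```python
import ipaddress

class NodeCIDRAllocator:
    """Illustrative sketch of centralized per-node CIDR delegation."""

    def __init__(self, cluster_cidr: str, node_mask_size: int):
        # Pre-compute every non-overlapping block of the configured size.
        network = ipaddress.ip_network(cluster_cidr)
        self._free = list(network.subnets(new_prefix=node_mask_size))
        self._assigned = {}  # node name -> exclusive block

    def allocate(self, node_name: str) -> str:
        # Idempotent: a node that re-registers keeps its existing block.
        if node_name in self._assigned:
            return str(self._assigned[node_name])
        block = self._free.pop(0)  # next contiguous unallocated block
        self._assigned[node_name] = block
        return str(block)  # in Kubernetes, this lands in .spec.podCIDR

alloc = NodeCIDRAllocator("10.244.0.0/16", 24)
print(alloc.allocate("node-a"))  # 10.244.0.0/24
print(alloc.allocate("node-b"))  # 10.244.1.0/24
```

Because every block is popped from a single free list exactly once, two nodes can never receive overlapping ranges, which is the zero-overlap property described above.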

3. Local Authorization (host-local IPAM)

Once the node has been delegated its exclusive CIDR block, the control plane completely steps back. The actual assignment of specific IP addresses to individual Pods is handled entirely locally on the worker node.

When the Kubernetes scheduler assigns a Pod to the node, the kubelet uses the Container Runtime Interface (CRI) to create the container sandbox, which in turn invokes a Container Network Interface (CNI) plugin to wire up the network.

  • Many CNI plugins (like Flannel or the standard bridge plugin) delegate address assignment to a localized IPAM plugin, most commonly the host-local IPAM binary; others (like Calico) ship their own IPAM that follows the same local-allocation model.
  • The host-local plugin is configured, via the CNI network configuration on the node, with the .spec.podCIDR subnet that was delegated strictly to its host node.
  • When a Pod starts, the host-local plugin automatically selects an unused IP address from this localized /24 block, assigns it to the Pod's virtual ethernet interface (eth0), and writes the IP lease to a local text file on the node's disk.

By writing this lease directly to the node's disk, the host-local plugin prevents local double-allocation.
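The local allocation step can be sketched as follows. The function name is hypothetical, and a plain set stands in for host-local's on-disk lease files (by default, one file per allocated IP under a directory on the node such as /var/lib/cni/networks/); the real plugin also reserves addresses like the gateway, which this sketch omits:

```python
import ipaddress

def host_local_assign(pod_cidr: str, leases: set) -> str:
    """Pick the first unused host address in this node's delegated block."""
    for ip in ipaddress.ip_network(pod_cidr).hosts():
        addr = str(ip)
        if addr not in leases:
            leases.add(addr)  # recording the lease prevents double-allocation
            return addr
    raise RuntimeError(f"pod CIDR {pod_cidr} exhausted")

leases = set()
print(host_local_assign("10.244.1.0/24", leases))  # 10.244.1.1
print(host_local_assign("10.244.1.0/24", leases))  # 10.244.1.2
```

Since the loop only ever walks the node's own delegated block, a Pod on this node can never be handed an address from another node's subnet.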

4. Summary of the Zero-Overlap Guarantee

This two-tier architecture provides a robust, zero-overlap guarantee with immense performance scalability:

  1. Macro-level Isolation (The Control Plane): The central control plane ensures that Node A receives 10.244.1.0/24 and Node B receives 10.244.2.0/24.
  2. Micro-level Isolation (The Node CNI): The localized CNI IPAM plugin on Node A only issues IPs from 10.244.1.0/24. A Pod on Node A can never accidentally receive an IP belonging to Node B's subnet.

By pushing individual Pod IP assignment down to the nodes while locking subnet allocation at the control plane, Kubernetes avoids network collisions entirely and keeps Pod IP assignment fast: the control plane is consulted once per node rather than once per Pod, so Pod creations across the cluster assign their IPs in parallel as purely local operations.

Based on Kubernetes v1.35 (Timbernetes).