
Kube-Proxy Modes

Explain the packet traversal difference between kube-proxy iptables mode and IPVS mode, and when, mathematically, iptables begins to degrade.

kube-proxy is a network daemon that runs on every node in a Kubernetes cluster, managing Service Virtual IPs (VIPs) and ensuring traffic reaches the correct backend Pods.

The exact packet traversal and underlying Linux kernel architecture differ fundamentally depending on whether kube-proxy is configured to use iptables or IPVS mode.


1. Packet Traversal in iptables Mode

In iptables mode, kube-proxy relies on the standard Linux iptables packet filtering framework to intercept and route traffic. The traversal follows a strict, localized chain of rules evaluated linearly by the kernel.

When a packet destined for a Service arrives at a worker node, it follows this exact path:

  1. The Entry Point (KUBE-SERVICES): The packet is first intercepted by the KUBE-SERVICES chain. For each port of each Service in the entire cluster, kube-proxy creates exactly one rule in this chain.
  2. Service Resolution (KUBE-SVC-*): The matching entry rule acts as a pointer, instructing the packet to jump to a specific KUBE-SVC-<hash> chain dedicated entirely to that specific Service.
  3. Endpoint Selection (Load Balancing): The KUBE-SVC-<hash> chain acts as the software load balancer. For each backend Pod endpoint associated with the Service, there is a small set of rules inside this chain. These rules use a specialized statistic module with a random mode and a mathematically calculated --probability to evenly distribute traffic among the available endpoints.
  4. Destination NAT (KUBE-SEP-*): Once a probability rule matches and selects an endpoint, the packet jumps to a final KUBE-SEP-<hash> (Service Endpoint) chain. This chain contains a few rules that execute Destination NAT (DNAT), rewriting the packet's destination IP address from the virtual Service IP to the actual IP address of the chosen backend Pod.
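The probability arithmetic behind step 3 can be sketched in Python (an illustration of the math, not kube-proxy source code): assigning the i-th of N endpoint rules a `--probability` of 1/(N − i) yields an even 1/N share per endpoint once fall-through is accounted for.

```python
# Sketch of the --probability math used in KUBE-SVC-* chains (illustrative,
# not actual kube-proxy code). The i-th rule only sees packets that fell
# through the earlier rules, so per-rule probabilities of 1/(N - i)
# produce a uniform 1/N split overall.

def rule_probabilities(n_endpoints: int) -> list[float]:
    """--probability value written into the i-th endpoint rule."""
    return [1.0 / (n_endpoints - i) for i in range(n_endpoints)]

def effective_shares(n_endpoints: int) -> list[float]:
    """Fraction of total traffic each endpoint actually receives."""
    shares = []
    remaining = 1.0  # fraction of packets not yet matched by a rule
    for p in rule_probabilities(n_endpoints):
        shares.append(remaining * p)
        remaining *= 1.0 - p
    return shares

# Three endpoints: the rules carry probabilities 1/3, 1/2, 1.0,
# and every endpoint ends up with an equal 1/3 share of connections.
```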

2. Packet Traversal in IPVS Mode

In ipvs (IP Virtual Server) mode, kube-proxy completely abandons sequential firewall rules in favor of a dedicated Layer-4 load-balancing facility built deep into the Linux kernel.

  1. Virtual Servers: For each port of each Service—including NodePorts, external IPs, and load-balancer IPs—kube-proxy creates an IPVS "virtual server".
  2. Real Servers: For each backend Pod endpoint, kube-proxy creates a corresponding "real server" attached down to that specific virtual server.
  3. Direct Mapping: When a packet arrives, IPVS directly maps the destination IP and port to the virtual server and seamlessly routes it to one of the real servers using highly optimized load-balancing algorithms (such as round-robin, least connection, or destination hashing).
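The virtual-server/real-server mapping above can be modeled with a small Python sketch. The IPs, the dict-based table, and the `route` function are illustrative assumptions, not kernel internals:

```python
import itertools

# Illustrative stand-in for the in-kernel IPVS tables (example IPs are
# made up). A hash table keyed by (VIP, port, protocol) maps each
# virtual server to its rotating pool of real servers.
virtual_servers = {
    ("10.96.0.10", 80, "TCP"): itertools.cycle(
        ["10.244.1.5:80", "10.244.2.7:80", "10.244.3.9:80"]
    ),
}

def route(dst_ip: str, dst_port: int, proto: str) -> str:
    # One hash lookup finds the virtual server, then a scheduler
    # decision (round-robin here) picks the real server: O(1) per packet.
    pool = virtual_servers[(dst_ip, dst_port, proto)]
    return next(pool)
```

Swapping the round-robin iterator for a different selection function mirrors how IPVS swaps schedulers (least connection, destination hashing) without changing the lookup path.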

3. Mathematical Complexity & Degradation Thresholds

To understand why IPVS exists, you must understand the mathematical limits of iptables.

The Linear Trap: O(N) Degradation

iptables was designed as a strict, linear firewall filter, not as a highly scalable software load balancer. Because iptables evaluates rules sequentially from top to bottom, the algorithmic complexity for routing a single packet is O(N), where $N$ is the total number of rules.

Because kube-proxy generates multiple rules for every Service and for every endpoint attached to every Service, the rule set grows rapidly (roughly in proportion to Services × endpoints) as the cluster expands.

The Degradation Threshold: Performance begins to degrade noticeably around 5,000 Services (which roughly translates to ~50,000 iptables rules). At that scale, the Linux kernel must traverse tens of thousands of rules sequentially just to route a single new HTTP connection. This linear traversal adds significant network latency and drives up CPU utilization on worker nodes.
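A back-of-envelope check of that threshold, assuming an average of roughly ten rules per Service (an assumed figure chosen to match the numbers above, not a fixed kube-proxy constant):

```python
# Rough arithmetic behind the degradation threshold cited above.
services = 5_000
rules_per_service = 10          # assumed average: entry + KUBE-SVC + KUBE-SEP rules
total_rules = services * rules_per_service   # ~50,000 rules in the table
expected_scan_depth = total_rules / 2        # average rules walked per lookup
# ~50,000 rules total, ~25,000 evaluated sequentially per new connection
```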

Furthermore, iptables does not support incremental updates. Adding or removing a single backend Pod requires kube-proxy to regenerate and reload the entire rule table (tens of thousands of rules) into the kernel while holding a lock, causing significant delays in propagating Service changes.
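The contrast between the two update models can be sketched as follows (the function names and state layout are hypothetical, for illustration only):

```python
# Sketch of the two update models (illustrative, not kube-proxy internals).

def iptables_update(desired_rules: list[str]) -> list[str]:
    # iptables model: any change means regenerating the full rule set and
    # loading it atomically, much like `iptables-restore` replacing a table.
    return list(desired_rules)

def ipvs_update(state: dict[str, set[str]], vip: str,
                add: set[str] = frozenset(),
                remove: set[str] = frozenset()) -> None:
    # IPVS model: touch only the changed real servers, in place.
    backends = state.setdefault(vip, set())
    backends |= add
    backends -= remove

# Replacing one backend touches one entry, not the whole table:
state = {"10.96.0.10:80": {"10.244.1.5", "10.244.2.7"}}
ipvs_update(state, "10.96.0.10:80",
            add={"10.244.3.9"}, remove={"10.244.1.5"})
```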

The Hash Table Solution: O(1) Optimization

IPVS sidesteps this degradation entirely by storing its virtual-server and real-server mappings in a highly optimized in-kernel hash table.

Mathematically, hash table lookups have an algorithmic complexity of O(1) (constant time). Regardless of whether your cluster has 1,000 Services or 100,000 Services, IPVS can locate the correct backend endpoint in effectively constant time. Furthermore, IPVS supports incremental updates to backend endpoints without locking the host network stack.

Based on Kubernetes v1.35 (Timbernetes). Changelog.