How does Kubernetes handle storage, and what are PersistentVolumes, PersistentVolumeClaims, and StorageClasses?
Handling storage in Kubernetes requires a shift in mindset from traditional server management. In a containerized environment, a container's on-disk files are ephemeral: if the container crashes and restarts, or its Pod is rescheduled to a different node, the data is lost by default.
To solve this, Kubernetes abstracts the details of how storage is provided from how it is consumed. This decoupling is achieved through three API resources: PersistentVolumes (PV), PersistentVolumeClaims (PVC), and StorageClasses (SC).
The Architecture: Decoupling Provider from Consumer
The goal of this system is to allow developers to request storage without needing to know the specific details of the underlying infrastructure (like whether it is an AWS EBS volume, an NFS share, or an iSCSI target).
1. PersistentVolume (PV)
The Representation of Physical Storage
A PersistentVolume is a cluster-wide resource that represents a piece of actual storage available to your cluster.
- Independent Lifecycle: Crucially, a PV has a lifecycle independent of any individual Pod that uses it. If the Pod is deleted, the PV (and its data) remains until explicitly reclaimed.
- Implementation Details: The PV object captures the specific details of the storage implementation, such as the NFS server address or the cloud provider's volume ID.
- Scope: PVs are cluster resources, meaning they are not namespaced; they are available globally across the cluster, similar to Nodes.
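To make these points concrete, here is a minimal sketch of a PV manifest built as a plain Python dictionary and printed as JSON (kubectl accepts JSON manifests as well as YAML). The PV name, NFS server address, and export path are hypothetical placeholders, not values from a real cluster.

```python
import json

# Hypothetical values: the name, NFS server, and path are illustrative only.
persistent_volume = {
    "apiVersion": "v1",
    "kind": "PersistentVolume",
    # No "namespace" key: PVs are cluster-scoped objects.
    "metadata": {"name": "nfs-pv-example"},
    "spec": {
        "capacity": {"storage": "10Gi"},
        "accessModes": ["ReadWriteMany"],
        "persistentVolumeReclaimPolicy": "Retain",
        # Implementation details of the backing storage live in the PV itself.
        "nfs": {"server": "nfs.example.internal", "path": "/exports/data"},
    },
}

print(json.dumps(persistent_volume, indent=2))
```

Saved to a file, output like this could be submitted with kubectl apply -f, assuming the NFS export actually exists.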
2. PersistentVolumeClaim (PVC)
The User's Request
A PersistentVolumeClaim is a request for storage by a user or a workload.
- Abstraction: Just as a Pod requests specific compute resources (CPU and Memory), a PVC requests specific storage resources (e.g., "I need 10 GiB of ReadWriteOnce storage").
- Scope: PVCs are namespaced objects. A Pod in a specific namespace references a PVC in the same namespace to access the storage.
- The Claim Check: The PVC acts as a "claim check" to the resource. When a user creates a PVC, the Kubernetes control plane looks for a PV that satisfies the request and binds them together.
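As a rough illustration, the "10 GiB of ReadWriteOnce storage" request could be expressed as the PVC manifest below, again as a Python dictionary serialized to JSON. The claim name and namespace are hypothetical.

```python
import json

# Hypothetical values: the claim name and namespace are illustrative only.
persistent_volume_claim = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    # PVCs are namespaced, unlike PVs.
    "metadata": {"name": "data-claim", "namespace": "demo"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "10Gi"}},
    },
}

print(json.dumps(persistent_volume_claim, indent=2))
```

Note that the claim says nothing about NFS servers or cloud volume IDs; it only states size and access mode.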
3. StorageClass (SC)
The Template for Dynamic Provisioning
A StorageClass provides a way for administrators to describe the "classes" or "profiles" of storage they offer (e.g., "fast-ssd", "standard-hdd", or "replication-high").
- Dynamic Provisioning: Without StorageClasses, administrators must manually create (provision) PVs ahead of time (Static Provisioning). StorageClasses enable Dynamic Provisioning: when a user creates a PVC asking for a specific StorageClass, the cluster automatically calls the storage provider API to create the volume and the corresponding PV object on demand.
- Parameters: It allows admins to define parameters like IOPS, replication factors, or backup policies without exposing these complexities to the user.
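A StorageClass along these lines might look like the sketch below. The class name is taken from the examples above; the parameter keys are driver-specific, and the ones shown (type, iops) are an assumption based on the AWS EBS CSI driver, so check your own provisioner's documentation before reusing them.

```python
import json

# Assumption: parameters follow the AWS EBS CSI driver; other provisioners
# use different keys. All parameter values must be strings.
storage_class = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {"name": "fast-ssd"},  # cluster-scoped, like a PV
    "provisioner": "ebs.csi.aws.com",
    "parameters": {"type": "gp3", "iops": "4000"},
    "reclaimPolicy": "Delete",
    "allowVolumeExpansion": True,
}

print(json.dumps(storage_class, indent=2))
```

A user then only needs to name the class in a PVC; the IOPS setting stays hidden behind it.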
How It Works: The Provisioning Workflow
The interaction between these components typically follows one of two workflows:
Workflow A: Dynamic Provisioning (The Standard)
This is the preferred method for most modern clusters.
- Admin creates a StorageClass (e.g., standard) defining the provisioner (e.g., ebs.csi.aws.com).
- User creates a PVC requesting storageClassName: standard and 10Gi of space.
- Kubernetes observes the PVC, uses the StorageClass to talk to the storage provider, creates the physical volume, and creates a PV object representing it.
- Kubernetes binds the new PV to the User's PVC.
- User creates a Pod that references the PVC in its volumes section. The cluster mounts the bound PV into the Pod.
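The user-facing half of this workflow can be sketched as two manifests: a PVC that names the StorageClass, and a Pod that references the PVC. The object names, namespace, container image, and mount path below are hypothetical; only storageClassName: standard and the 10Gi request come from the steps above.

```python
import json

# Hypothetical names: "app-data", "demo", "app", and the mount path.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "app-data", "namespace": "demo"},
    "spec": {
        "storageClassName": "standard",  # triggers dynamic provisioning
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "10Gi"}},
    },
}

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    # The Pod must live in the same namespace as the PVC it references.
    "metadata": {"name": "app", "namespace": "demo"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "nginx:stable",
                # Mount the volume declared below by its name.
                "volumeMounts": [{"name": "data", "mountPath": "/var/lib/app"}],
            }
        ],
        "volumes": [
            {"name": "data", "persistentVolumeClaim": {"claimName": "app-data"}}
        ],
    },
}

for manifest in (pvc, pod):
    print(json.dumps(manifest, indent=2))
```

The Pod never mentions the PV or the provisioner; it only knows the claim name.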
Workflow B: Static Provisioning
- Admin manually creates a storage volume in the backend (e.g., an NFS export).
- Admin creates a PV object in Kubernetes containing the details of that storage.
- User creates a PVC requesting storage size and access modes.
- Kubernetes finds the existing PV that matches the PVC's requirements and binds them.
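The matching step can be modeled with a small, simplified sketch. This is not the controller's actual code: it assumes a volume matches when it is unbound, has the same storage class, offers every requested access mode, and is at least as large as the request, and it prefers the smallest fitting volume. All class and function names here are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_quantity(quantity: str) -> int:
    """Convert a binary-suffixed quantity such as '10Gi' to bytes (simplified)."""
    for suffix, factor in _UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain byte count


@dataclass
class PV:
    name: str
    capacity: str
    access_modes: FrozenSet[str]
    storage_class: str = ""
    claim: Optional[str] = None  # name of the bound PVC, if any


@dataclass
class PVC:
    name: str
    request: str
    access_modes: FrozenSet[str]
    storage_class: str = ""


def find_match(pvc: PVC, volumes: List[PV]) -> Optional[PV]:
    """Return the smallest unbound PV that satisfies the claim, or None."""
    candidates = [
        pv
        for pv in volumes
        if pv.claim is None
        and pv.storage_class == pvc.storage_class
        and pvc.access_modes <= pv.access_modes
        and parse_quantity(pv.capacity) >= parse_quantity(pvc.request)
    ]
    return min(candidates, key=lambda pv: parse_quantity(pv.capacity), default=None)


volumes = [
    PV("pv-small", "5Gi", frozenset({"ReadWriteOnce"})),
    PV("pv-large", "100Gi", frozenset({"ReadWriteOnce", "ReadOnlyMany"})),
    PV("pv-medium", "20Gi", frozenset({"ReadWriteOnce"})),
]
claim = PVC("data-claim", "10Gi", frozenset({"ReadWriteOnce"}))

match = find_match(claim, volumes)
if match is not None:
    match.claim = claim.name  # bind: the whole PV now belongs to this claim
print(match.name if match else "no match; the claim stays Pending")
```

In this model the 10Gi claim binds to the 20Gi volume and receives all of it, which illustrates why static provisioning can waste capacity when the pre-created sizes do not line up with requests.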
Critical Configuration Concepts
When defining PVs and PVCs, you must manage two key configurations to ensure the storage behaves as expected:
1. Access Modes
These define how the volume can be mounted:
- ReadWriteOnce (RWO): Mounted as read-write by a single node.
- ReadOnlyMany (ROX): Mounted as read-only by many nodes.
- ReadWriteMany (RWX): Mounted as read-write by many nodes (requires a storage backend that supports this, like NFS).
- ReadWriteOncePod (RWOP): Mounted as read-write by a single Pod (supported by CSI volumes).
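The difference between the modes is easiest to see as a rule about who may share a volume. The function below is a deliberately simplified model of the four definitions above, not how Kubernetes enforces them; it ignores the read-only versus read-write distinction, and the function name and the (pod, node) tuples are hypothetical.

```python
from typing import List, Tuple

PodOnNode = Tuple[str, str]  # (pod name, node name)


def can_mount(mode: str, existing: List[PodOnNode], new: PodOnNode) -> bool:
    """Simplified check: may `new` mount a volume already used by `existing`?"""
    if mode == "ReadWriteOncePod":
        # Only a single Pod in the whole cluster may use the volume.
        return len(existing) == 0
    if mode == "ReadWriteOnce":
        # Single node: additional Pods are fine if they share that node.
        return all(node == new[1] for _, node in existing)
    if mode in ("ReadOnlyMany", "ReadWriteMany"):
        # Many nodes may mount the volume at once.
        return True
    raise ValueError(f"unknown access mode: {mode}")


in_use = [("pod-a", "node-1")]
print(can_mount("ReadWriteOnce", in_use, ("pod-b", "node-1")))     # same node
print(can_mount("ReadWriteOnce", in_use, ("pod-b", "node-2")))     # other node
print(can_mount("ReadWriteOncePod", in_use, ("pod-b", "node-1")))  # second Pod
print(can_mount("ReadWriteMany", in_use, ("pod-b", "node-2")))     # other node
```

The first two calls show the detail that often surprises people: ReadWriteOnce restricts the volume to one node, not to one Pod.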
2. Reclaim Policy
This dictates what happens to the underlying storage when the user deletes the PVC:
- Retain: The PV is released but the data remains. An admin must manually reclaim the space.
- Delete: (Default for dynamic provisioning) Deleting the PVC automatically deletes the PV and the underlying storage infrastructure.
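The two outcomes can be summarized as a tiny state model. This is only an illustration of the policies described above; the function name and the keys of the returned dictionary are hypothetical, not Kubernetes API fields.

```python
def after_claim_deleted(reclaim_policy: str) -> dict:
    """Describe the PV and backend volume once its bound PVC is deleted."""
    if reclaim_policy == "Retain":
        # The PV object stays, marked Released, and the data is untouched
        # until an admin cleans it up manually.
        return {"pv_exists": True, "pv_phase": "Released", "data_kept": True}
    if reclaim_policy == "Delete":
        # Both the PV object and the underlying storage are removed.
        return {"pv_exists": False, "pv_phase": None, "data_kept": False}
    raise ValueError(f"unsupported reclaim policy: {reclaim_policy}")


for policy in ("Retain", "Delete"):
    print(policy, after_claim_deleted(policy))
```

For data you cannot afford to lose, the model makes the trade-off plain: Retain costs manual cleanup, while Delete trades safety for convenience.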
In summary, PVs are the inventory of storage, PVCs are the requests for that inventory, and StorageClasses are the automated factories that create inventory on demand.