Overview

Containers are designed around statelessness, which makes them easy to spin up or down. Because a stateless workload has no data to save or migrate, the Kubernetes cluster does not have to deal with the nuances of storage. However, most real-world workloads are stateful, so providing resilient storage is critical.

As Kubernetes adoption within an enterprise grows, developers will increasingly expect it to support transactional workloads such as PostgreSQL, as well as high-performance workloads.


Storage

Kubernetes natively provides several ways to manage storage: ephemeral volumes, and persistent storage through Persistent Volumes, Persistent Volume Claims, Storage Classes, and StatefulSets.
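
As a point of contrast with the persistent options covered in the rest of this page, the following is a minimal sketch of an ephemeral volume. The pod name, image tag, and mount path are illustrative assumptions and do not come from this document.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo            # hypothetical name
spec:
  containers:
    - name: app
      image: busybox:1.36       # assumed image, any small image works
      command: ["sh", "-c", "echo hello > /scratch/out.txt && sleep 3600"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch   # assumed mount path
  volumes:
    - name: scratch
      emptyDir: {}              # ephemeral: the data is deleted together with the pod
```

Anything written to /scratch is lost once the pod is removed, which is why stateful workloads need the persistent options instead.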

PV and PVCs

  • Persistent Volumes (PV): These are storage units that have been provisioned by an administrator or dynamically through a Storage Class. They are independent of any single pod, which frees them from the ephemeral life cycle of pods.

  • Persistent Volume Claims (PVC): These are requests for storage (i.e. for PVs). A PVC binds to a PV that satisfies the request, and any pod that references the claim can then mount that storage.
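
A minimal sketch of a statically provisioned PV and a PVC that can bind to it is shown here. All names, the size, the `manual` class label, and the hostPath location are illustrative assumptions; hostPath in particular is only suitable for single-node test clusters.

```yaml
# PersistentVolume created ahead of time by an administrator.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-demo                  # hypothetical name
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual       # assumed label used only to pair the PV with the PVC
  hostPath:
    path: /mnt/data              # assumed path, test clusters only
---
# PersistentVolumeClaim: a request that the PV above can satisfy.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-demo                 # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: manual       # must match the PV's class for the two to bind
  resources:
    requests:
      storage: 5Gi               # must not exceed the PV's capacity
```

The claim binds because its class, access mode, and requested size are all satisfied by the volume. A pod would then reference `pvc-demo` under `spec.volumes[].persistentVolumeClaim.claimName`.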


Volumes

Kubernetes uses control plane interfaces to interact with external storage via volume plugins. These plugins abstract away the storage backend and decouple it from workloads, which makes storage portable.

Important

Volume plugins can be deployed on a cluster at any time, allowing users to dynamically add support for their preferred storage provider.


Dynamic Provisioning

With dynamic provisioning, the cluster admin only has to create a set of storage profiles. When a developer creates a PVC, a volume is created from the profile that matches the requirements of the request, at the time of the request, and is attached to the pod.
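
The following sketch shows one such storage profile and a claim that asks for it. The class name `fast` and the provisioner name are hypothetical placeholders; a real cluster would use the provisioner name of whichever storage driver is installed.

```yaml
# Storage profile defined once by the cluster admin.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast                                       # hypothetical profile name
provisioner: example.com/hypothetical-provisioner  # placeholder, not a real driver
reclaimPolicy: Delete                              # remove the volume when the claim is deleted
volumeBindingMode: Immediate                       # provision as soon as the claim is created
---
# Claim created by a developer; no PV has to exist beforehand.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                                   # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast                           # selects the profile above
  resources:
    requests:
      storage: 10Gi
```

The developer only names the profile and the size; the provisioner behind the class is responsible for creating a matching volume.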


Lifecycle of Dynamic Storage

Dynamic provisioning of volumes can free applications from worrying about storage models altogether. Instead, they can leverage the existing Kubernetes framework to manage the volume lifecycle.

The general sequence that is followed for the lifecycle of dynamic storage is as follows:

  • The user creates a pod that references a PersistentVolumeClaim (PVC).
  • The user creates the PVC, which starts out unbound.
  • If the StorageClass's volume binding mode is WaitForFirstConsumer, provisioning and binding are deferred until a pod that uses the claim is scheduled.
  • Otherwise, the Kubernetes volume controller checks the claim's StorageClass and, if it is known, a volume is created.
  • A PersistentVolume now exists and is bound to the PVC.
  • The pod is scheduled to run on a node, and the kubelet on that node mounts the volume when the container starts.
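
The objects involved in this sequence can be sketched as follows. The names, image, and provisioner are hypothetical; `volumeBindingMode` is the StorageClass field that selects between the deferred and immediate paths described above.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: delayed                                    # hypothetical name
provisioner: example.com/hypothetical-provisioner  # placeholder, not a real driver
volumeBindingMode: WaitForFirstConsumer            # defer provisioning until a consuming pod is scheduled
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: web-data                                   # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: delayed
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: web                                        # hypothetical name
spec:
  containers:
    - name: web
      image: nginx:1.25                            # assumed image
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: web-data                        # the pod references the claim, never the PV directly
```

With this binding mode the claim is expected to stay unbound until the pod is created; switching the class to `Immediate` would instead trigger provisioning as soon as the claim exists.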

Storage Classes

Storage Classes are a prerequisite for dynamic provisioning, and there are several options in the StorageClass API definition that allow you to communicate preferred storage semantics in a uniform manner to a Kubernetes cluster.

A few things to consider with respect to storage classes:

  1. A StorageClass is an interface for defining the storage requirements of a pod, not an implementation.
  2. StorageClasses are declarative, whereas PersistentVolumes are imperative.
  3. PersistentVolumeClaims can be fulfilled without a StorageClass.
  4. StorageClasses require a provisioner that understands them.
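
As an illustration of a claim that is fulfilled without a StorageClass, the sketch below opts out of dynamic provisioning by setting an empty class name and binds to a pre-created volume. The claim name and the volume name `pv-static` are hypothetical, and the volume is assumed to exist already with no class of its own.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: static-claim        # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""      # empty string: no class, so no dynamic provisioning
  volumeName: pv-static     # hypothetical pre-created PV to bind to
  resources:
    requests:
      storage: 5Gi
```

Leaving `storageClassName` out entirely is different: the cluster's default StorageClass, if one is configured, would then apply.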

Integrations

Controller-provisioned upstream Kubernetes clusters give cluster administrators turnkey storage integrations, so they can easily make multiple StorageClasses available to applications.

Local Storage

The controller provides a turnkey integration with OpenEBS's Dynamic Persistent Volume (PV) provisioner for Kubernetes local volumes. Each of these volumes is accessible only from the node on which it resides.

Ideal workload types are:

  • Replicated databases like MongoDB, Cassandra etc.
  • Stateful workloads that can be configured with their own high-availability configuration, like Elastic and MinIO.
  • Edge workloads that typically run on a single node or in single-node Kubernetes clusters.
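
A replicated database of this kind is typically deployed as a StatefulSet whose replicas each claim their own node-local volume. The sketch below assumes a StorageClass named `local-storage`; that name is hypothetical, and the class actually created by the integration may be named differently. The manifest also leaves out the database's own replication setup.

```yaml
# Headless Service that gives each replica a stable network identity.
apiVersion: v1
kind: Service
metadata:
  name: db                                # hypothetical name
spec:
  clusterIP: None
  selector:
    app: db
  ports:
    - port: 27017
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: mongod
          image: mongo:6.0                # assumed image tag
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: data
              mountPath: /data/db
  volumeClaimTemplates:                   # one PVC, and so one local volume, per replica
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: local-storage   # hypothetical class name
        resources:
          requests:
            storage: 10Gi
```

Because each volume lives on a single node, a replica is tied to the node that holds its data; availability comes from the database replicating across the three pods rather than from the storage layer.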

Distributed Storage

The controller provides a turnkey integration with GlusterFS for distributed, networked storage.

Ideal workload types are stateful workloads that need access to the underlying PV from all nodes in the cluster. In other words, Kubernetes should be able to schedule the workload's pods on any node in the cluster without constraints.
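
A minimal sketch of such a workload is a claim with the ReadWriteMany access mode shared by several replicas. The class name `distributed-storage` is a hypothetical placeholder for whatever StorageClass the integration exposes; the other names and the image are likewise assumptions.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data                       # hypothetical name
spec:
  accessModes:
    - ReadWriteMany                       # mountable read-write from many nodes at once
  storageClassName: distributed-storage   # hypothetical class name
  resources:
    requests:
      storage: 20Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                               # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25               # assumed image
          volumeMounts:
            - name: shared
              mountPath: /usr/share/nginx/html
      volumes:
        - name: shared
          persistentVolumeClaim:
            claimName: shared-data        # every replica mounts the same claim
```

Since the volume is reachable over the network, the scheduler is free to place the three replicas on any nodes.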