The Problem
By default, data inside a container is lost when the container restarts:
```
Pod starts → Container writes data
        ↓
Pod crashes → Container data lost
        ↓
Kubernetes restarts Pod → Data gone!
```
To persist data across Pod restarts, we use Volumes.
Volume Types
Ephemeral Volumes (Temporary)
emptyDir
Temporary directory shared by containers in a Pod, deleted when Pod is deleted.
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cache-pod
spec:
  containers:
  - name: app
    image: myapp:latest
    volumeMounts:
    - name: cache
      mountPath: /var/cache
  - name: sidecar
    image: cache-sidecar:latest
    volumeMounts:
    - name: cache
      mountPath: /cache
  volumes:
  - name: cache
    emptyDir: {}
```

Use cases:
- Temporary scratch space
- Caching between containers
- Shared state in multi-container Pods
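An emptyDir can also be backed by RAM or capped in size via the standard `medium` and `sizeLimit` fields; a minimal sketch (note that memory-backed emptyDir counts against the container's memory limit):

```yaml
volumes:
- name: cache
  emptyDir:
    medium: Memory    # tmpfs (RAM-backed) instead of node disk
    sizeLimit: 1Gi    # Pod is evicted if usage exceeds this
```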
Persistent Volumes (Long-term)
1. PersistentVolume (PV)
Represents actual storage in the cluster.
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-database
spec:
  capacity:
    storage: 100Gi
  accessModes:
  - ReadWriteOnce          # a single node can mount read/write
  storageClassName: fast-ssd
  awsElasticBlockStore:    # in-tree EBS plugin (deprecated; prefer the EBS CSI driver)
    volumeID: vol-12345678
    fsType: ext4
```

2. PersistentVolumeClaim (PVC)
A request for storage (like filling out a purchase order).
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-claim
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 50Gi   # need 50 GiB
```

Kubernetes binds the PVC to a PV with sufficient capacity, a matching storage class, and a compatible access mode.
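Binding can be confirmed with kubectl; once a matching PV is found, the claim's STATUS column shows Bound:

```shell
kubectl get pvc database-claim
kubectl get pv
```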
3. Pod Using PVC
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: db-pod
spec:
  containers:
  - name: postgres
    image: postgres:14
    volumeMounts:
    - name: data
      mountPath: /var/lib/postgresql/data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: database-claim
```

StorageClass (Dynamic Provisioning)
Instead of manually creating PVs, use StorageClass to automatically provision storage:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com   # AWS EBS CSI provisioner
parameters:
  type: gp3
  iops: "3000"
  throughput: "125"
  fstype: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-storage
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: fast-ssd   # reference the StorageClass
  resources:
    requests:
      storage: 100Gi
# Kubernetes automatically provisions an EBS volume!
```

Built-in Storage Classes

```shell
kubectl get storageclass
```

Common provisioners:
- AWS EBS: `ebs.csi.aws.com`
- Google Cloud Persistent Disk: `pd.csi.storage.gke.io`
- Azure Disk: `disk.csi.azure.com`
- NFS: `nfs.csi.k8s.io`
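Two StorageClass fields worth setting (a sketch, reusing the fast-ssd example): `volumeBindingMode: WaitForFirstConsumer` delays provisioning until a Pod is scheduled, so the volume is created in the right availability zone, and `allowVolumeExpansion` lets PVCs be resized later.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com
volumeBindingMode: WaitForFirstConsumer  # provision when a Pod is scheduled
allowVolumeExpansion: true               # allow PVCs to be grown in place
parameters:
  type: gp3
```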
Access Modes
ReadWriteOnce (RWO)
The volume can be mounted read/write by a single node (Pods on other nodes cannot use it). The most commonly used mode.

```yaml
accessModes:
- ReadWriteOnce
```

ReadOnlyMany (ROX)
Data can be read by multiple Pods but not written.
```yaml
accessModes:
- ReadOnlyMany
```

Use case: shared read-only configuration
ReadWriteMany (RWX)
Data can be read/written by multiple Pods simultaneously.
```yaml
accessModes:
- ReadWriteMany
```

Use case: shared storage for multiple workers (NFS, EFS)
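ReadWriteMany needs a backend that supports it, such as NFS or AWS EFS; a sketch assuming an EFS-backed StorageClass named `efs-sc` exists in the cluster (the class name is an assumption):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-workspace
spec:
  accessModes:
  - ReadWriteMany
  storageClassName: efs-sc   # assumed EFS CSI StorageClass
  resources:
    requests:
      storage: 20Gi
```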
Practical Examples
Database with Persistent Storage
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pvc
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: standard
  resources:
    requests:
      storage: 100Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
spec:
  replicas: 1
  selector:
    matchLabels:
      app: database
  template:
    metadata:
      labels:
        app: database
    spec:
      containers:
      - name: postgres
        image: postgres:14
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-secret
              key: password
        ports:
        - containerPort: 5432
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: postgres-pvc
```

Shared Config Volume (ReadOnlyMany)
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  config.yaml: |
    server:
      port: 8080
      workers: 4
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: app
        image: myapp:latest
        volumeMounts:
        - name: config
          mountPath: /etc/config
          readOnly: true
      volumes:
      - name: config
        configMap:
          name: app-config
```

Volume Lifecycle
PV States
```
Available
    ↓  (PVC created)
Bound (attached to PVC)
    ↓  (PVC deleted)
Released (no longer bound, data retained)
    ↓
Delete/Recycle (based on reclaim policy)
```
Reclaim Policies
What happens to a PV when a PVC is deleted:
```yaml
spec:
  persistentVolumeReclaimPolicy: Delete   # delete the storage (default for dynamically provisioned volumes)
  # or
  persistentVolumeReclaimPolicy: Retain   # keep the storage (manual cleanup)
  # or
  persistentVolumeReclaimPolicy: Recycle  # wipe data, make available for reuse (deprecated)
```

Snapshots
Create point-in-time backups of volumes:
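Snapshots require the CSI snapshot CRDs and controller to be installed, plus a VolumeSnapshotClass matching the `volumeSnapshotClassName` referenced by the snapshot. A minimal sketch, assuming the AWS EBS CSI driver:

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: csi-snapshotter
driver: ebs.csi.aws.com   # assumed CSI driver
deletionPolicy: Delete    # remove the backing snapshot when this object is deleted
```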
```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: database-snapshot
spec:
  volumeSnapshotClassName: csi-snapshotter
  source:
    persistentVolumeClaimName: postgres-pvc
---
# Restore from snapshot
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-restored
spec:
  dataSource:
    name: database-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
```

Best Practices
✅ Use StorageClass for automatic provisioning
- Easier than manual PV creation
- Scales automatically
✅ Set resource requests for storage
```yaml
resources:
  requests:
    storage: 100Gi
```

✅ Monitor storage usage

```shell
kubectl get pvc
kubectl describe pvc database-claim
```

✅ Backup important data: use volume snapshots or external backup solutions
✅ Use appropriate reclaim policy
- Retain for important databases (manual cleanup)
- Delete for temporary storage
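The reclaim policy of an existing PV can be changed in place with `kubectl patch`, e.g. to protect a dynamically provisioned database volume before deleting its claim (`pv-database` is a placeholder name):

```shell
kubectl patch pv pv-database \
  -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
```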
❌ Don't lose track of orphaned PVs. Regularly check:

```shell
kubectl get pv
kubectl get pvc -A
```

❌ Don't exceed PV capacity. Monitor and resize as needed.
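If the StorageClass sets `allowVolumeExpansion: true`, a PVC can be grown by raising its storage request (shrinking is not supported):

```shell
kubectl patch pvc postgres-pvc \
  -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'
```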