GuideDevOps
Lesson 11 of 17

StatefulSets & DaemonSets

Part of the Kubernetes tutorial series.

StatefulSets

The Problem with Deployments

Deployments are great for stateless applications where all Pods are identical:

web-deployment
├─ pod-abc123 (can die)
├─ pod-def456 (can die)
└─ pod-ghi789 (can die)  
# All identical, any can handle traffic

But for stateful applications like databases, Pods are NOT interchangeable:

MySQL Cluster
├─ mysql-0 (Primary - main database)
├─ mysql-1 (Replica)
└─ mysql-2 (Replica)
# Each has a different role!

What StatefulSets Provide

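  1. Stable Identity: Pods get predictable names with an ordinal suffix (mysql-0, mysql-1, mysql-2) that survive rescheduling
  2. Stable Storage: Each Pod gets its own PersistentVolumeClaim, which is reattached when the Pod is recreated
  3. Ordered Deployment: Pods are created in ordinal order (0, 1, 2) and terminated in reverse order
  4. Headless Service: Paired with a headless Service that you create, each Pod gets its own DNS name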

StatefulSet Example

apiVersion: v1
kind: Service
metadata:
  name: mysql
spec:
  clusterIP: None       # Headless service
  selector:
    app: mysql
  ports:
  - port: 3306
 
---
 
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: mysql    # Must match Service name
  replicas: 3
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-secret
              key: root-password
        ports:
        - containerPort: 3306
        volumeMounts:
        - name: data
          mountPath: /var/lib/mysql
  
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: fast-ssd
      resources:
        requests:
          storage: 100Gi

What this creates (assuming the default namespace and the default cluster.local cluster domain):

mysql-0
  ├─ PersistentVolumeClaim: data-mysql-0 (100Gi)
  ├─ DNS: mysql-0.mysql.default.svc.cluster.local
  └─ Container: mysql

mysql-1
  ├─ PersistentVolumeClaim: data-mysql-1 (100Gi)
  ├─ DNS: mysql-1.mysql.default.svc.cluster.local
  └─ Container: mysql

mysql-2
  ├─ PersistentVolumeClaim: data-mysql-2 (100Gi)
  ├─ DNS: mysql-2.mysql.default.svc.cluster.local
  └─ Container: mysql
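
The naming scheme can be sketched as a small shell function. This is an illustration only: statefulset_names is a made-up helper, not a Kubernetes tool, and it assumes the default cluster domain cluster.local. It relies on two conventions: Pods are named <statefulset>-<ordinal>, and claims created from volumeClaimTemplates are named <template name>-<pod name>.

```shell
# Print the Pod name, PVC name, and DNS name for each replica of a StatefulSet.
# statefulset_names is a hypothetical helper for illustration only.
# Arguments: StatefulSet name, headless Service name, namespace,
#            volumeClaimTemplate name, replica count.
statefulset_names() (
  sts="$1"; svc="$2"; ns="$3"; claim="$4"; replicas="$5"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    pod="${sts}-${i}"
    # Assumes the default cluster domain, cluster.local.
    printf '%s %s %s\n' "$pod" "${claim}-${pod}" "${pod}.${svc}.${ns}.svc.cluster.local"
    i=$((i + 1))
  done
)

# Values taken from the manifest in this lesson.
statefulset_names mysql mysql default data 3
```

For the manifest in this lesson, the first line printed is "mysql-0 data-mysql-0 mysql-0.mysql.default.svc.cluster.local". On a real cluster, kubectl get pvc should list claims with these names.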

Accessing StatefulSet Pods

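# Access a specific Pod by its ordinal name
kubectl exec -it mysql-0 -- mysql -uroot -p
kubectl exec -it mysql-1 -- mysql -uroot -p
 
# Start a temporary client Pod in the same namespace (removed on exit)
kubectl run -it --rm mysqlclient --image=mysql:8.0 --restart=Never -- sh
 
# Inside that Pod, reach each Pod through the headless Service
mysql -h mysql-0.mysql -uroot -p    # Connect to mysql-0 (the primary, once replication is configured)
mysql -h mysql-1.mysql -uroot -p    # Connect to mysql-1 (a replica)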

StatefulSet Lifecycle

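Scaling up (with the default podManagementPolicy, OrderedReady, Pods are created one at a time in ordinal order):

Desired: 3 → 5
   ↓
mysql-0: Running
mysql-1: Running
mysql-2: Running
mysql-3: Pending → Running (created first)
mysql-4: Pending → Running (created only after mysql-3 is Running and Ready)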

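Scaling down (Pods are removed one at a time, highest ordinal first):

Desired: 5 → 3
   ↓
mysql-4: Terminating → Deleted (highest ordinal, removed first)
mysql-3: Terminating → Deleted (removed after mysql-4 is gone)
mysql-2: Running
mysql-1: Running
mysql-0: Running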
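
The ordering rules for scaling can be sketched as a tiny simulation. The function scale_plan below is hypothetical: it only prints the order in which a StatefulSet with the default OrderedReady policy would create or delete Pods, and it does not talk to a cluster.

```shell
# Print the sequence of Pod operations for a StatefulSet scale change.
# scale_plan is a hypothetical helper for illustration only.
# Arguments: current replicas, desired replicas, StatefulSet name (default: mysql).
scale_plan() (
  current="$1"; desired="$2"; name="${3:-mysql}"
  if [ "$desired" -gt "$current" ]; then
    # Scale up: lowest new ordinal first.
    i="$current"
    while [ "$i" -lt "$desired" ]; do
      echo "create ${name}-${i}"
      i=$((i + 1))
    done
  else
    # Scale down: highest ordinal first.
    i=$((current - 1))
    while [ "$i" -ge "$desired" ]; do
      echo "delete ${name}-${i}"
      i=$((i - 1))
    done
  fi
)

scale_plan 3 5   # create mysql-3, then create mysql-4
scale_plan 5 3   # delete mysql-4, then delete mysql-3
```

On a real cluster the change itself would be made with a command such as kubectl scale statefulset mysql --replicas=5.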

Update Strategy

Rolling Update (default):

spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
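      partition: 2  # Only Pods with ordinal >= 2 are updated (here, mysql-2); the default is 0

A partition is useful for canary deployments: only Pods at or above the partition receive the new template, so you can verify mysql-2 before lowering the partition to roll the change out to the rest. Without a partition, Pods are updated one at a time, from the highest ordinal to the lowest.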
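
To make the partition rule concrete, here is a minimal sketch in shell. Both functions are hypothetical helpers written for this lesson: pods_updated lists which Pods a rolling update would touch for a given partition, and partition_patch builds a patch body that could be passed to kubectl patch. Nothing here contacts a cluster.

```shell
# Hypothetical helpers for illustration; they are not part of kubectl.

# Build the patch body that sets the rolling-update partition to $1.
partition_patch() {
  printf '{"spec":{"updateStrategy":{"type":"RollingUpdate","rollingUpdate":{"partition":%d}}}}\n' "$1"
}

# List the Pods a rolling update would touch, in update order
# (highest ordinal first, stopping at the partition).
# Arguments: StatefulSet name, replica count, partition.
pods_updated() (
  name="$1"; replicas="$2"; partition="$3"
  i=$((replicas - 1))
  while [ "$i" -ge "$partition" ]; do
    echo "${name}-${i}"
    i=$((i - 1))
  done
)

pods_updated mysql 3 2   # canary: only mysql-2
pods_updated mysql 3 0   # full rollout: mysql-2, mysql-1, mysql-0

# Against a live cluster (not run here), the patch could be applied with:
#   kubectl patch statefulset mysql -p "$(partition_patch 2)"
```

A staged rollout under these assumptions would start with the partition at 2, check mysql-2, then lower the partition step by step to 0.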

On Delete:

spec:
  updateStrategy:
    type: OnDelete

With OnDelete, the controller does not replace Pods automatically. Each Pod picks up the new template only after you delete it manually.


DaemonSets

The Purpose

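A DaemonSet ensures that a copy of a Pod runs on every eligible node in the cluster: all nodes, or only those that match a node selector, subject to taints. When a node joins the cluster, a Pod is added to it; when a node is removed, its Pod is cleaned up.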

Use Cases

  1. Log Collection (Fluentd, Filebeat)
    • Every node needs to collect its logs
  2. Monitoring (Node Exporter, Datadog Agent)
    • Monitor every machine in the cluster
  3. Networking (Weave, Calico)
    • Network plugin on every node
  4. Security (Falco, Trivy)
    • Detect runtime threats and scan every host for vulnerabilities

DaemonSet Example

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      # Run on all nodes, including control-plane nodes
      tolerations:
      - operator: Exists    # Tolerate all taints
      
      hostNetwork: true     # Use host's network namespace
      hostPID: true         # Access host processes
      
      containers:
      - name: exporter
        image: prom/node-exporter:latest
        args:
        - --path.procfs=/host/proc
        - --path.sysfs=/host/sys
        - --path.rootfs=/rootfs
        ports:
        - containerPort: 9100
        volumeMounts:
        - name: proc
          mountPath: /host/proc
        - name: sys
          mountPath: /host/sys
        - name: root
          mountPath: /rootfs
          readOnly: true
      
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: sys
        hostPath:
          path: /sys
      - name: root
        hostPath:
          path: /

Result:

┌─────────────────┐
│   Master Node   │
│ node-exporter-x │
└─────────────────┘

┌─────────────────┐
│  Worker Node 1  │
│ node-exporter-y │
└─────────────────┘

┌─────────────────┐
│  Worker Node 2  │
│ node-exporter-z │
└─────────────────┘

# Every node has exactly one Pod

Taints and Tolerations

DaemonSets often need to run on tainted nodes, such as control-plane nodes. Add a toleration that matches the taint to the Pod template:

# Toleration for the control-plane taint
tolerations:
- key: node-role.kubernetes.io/control-plane
  operator: Exists
  effect: NoSchedule
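
How an Exists toleration matches a taint can be sketched as a short shell function. This is a simplified model written for this lesson, not the scheduler's code: tolerates_exists is a hypothetical name, and it covers only operator: Exists, where an empty key matches every taint key and an empty effect matches every effect.

```shell
# Return success (0) if a toleration with operator "Exists" matches a taint.
# tolerates_exists is a hypothetical helper; it models only operator: Exists.
# Arguments: toleration key, toleration effect, taint key, taint effect.
tolerates_exists() {
  tol_key="$1"; tol_effect="$2"; taint_key="$3"; taint_effect="$4"
  # An empty key with operator Exists matches every taint key.
  [ -z "$tol_key" ] || [ "$tol_key" = "$taint_key" ] || return 1
  # An empty effect matches every taint effect.
  [ -z "$tol_effect" ] || [ "$tol_effect" = "$taint_effect" ] || return 1
  return 0
}

# "operator: Exists" with no key or effect tolerates the control-plane taint.
tolerates_exists "" "" node-role.kubernetes.io/control-plane NoSchedule \
  && echo "tolerated"

# A toleration for the control-plane key does not cover a different taint.
tolerates_exists node-role.kubernetes.io/control-plane NoSchedule \
  node.kubernetes.io/disk-pressure NoSchedule \
  || echo "not tolerated"
```

The first call corresponds to a bare operator: Exists toleration, which tolerates every taint. The second corresponds to a toleration scoped to the control-plane key, which leaves other taints untolerated.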

DaemonSet Update Strategy

Rolling Update (default):

spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1  # At most 1 DaemonSet Pod may be unavailable at a time

Comparison

Feature  | Deployment                 | StatefulSet              | DaemonSet
---------|----------------------------|--------------------------|--------------------
Pods     | Identical, interchangeable | Unique, ordered identity | One per node
Storage  | Shared or no storage       | Each Pod has own volume  | Node-local
DNS      | Random names               | Predictable names        | Random names
Scaling  | Scale replicas             | Scale replicas           | Automatic per node
Use case | Web apps, APIs             | Databases, caches        | Logging, monitoring

Best Practices

Use StatefulSets for:

  • Databases (PostgreSQL, MySQL, MongoDB)
  • Caches (Redis, Memcached)
  • Message queues (RabbitMQ, Kafka)
  • Any app that needs stable identity/storage

Use DaemonSets for:

  • Node-level agents (monitoring, logging)
  • Network plugins
  • Security scanning
  • Anything that needs to run on every host

Label nodes for targeting:

kubectl label nodes node-1 disktype=ssd

# Then require that label in the Pod template spec:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: disktype
          operator: In
          values: [ssd]
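
The In operator of a match expression can be sketched as follows. This is a simplified model, not scheduler code: node_matches_in is a hypothetical helper, and passing a node's labels as a space-separated list of key=value pairs is an input format chosen just for this example.

```shell
# Return success (0) if the node carries the label key with one of the allowed values.
# node_matches_in is a hypothetical helper for illustration only.
# Arguments: node labels as space-separated key=value pairs, label key, allowed values...
node_matches_in() (
  labels="$1"; key="$2"; shift 2
  # $labels is left unquoted on purpose so it splits into key=value words.
  for label in $labels; do
    for value in "$@"; do
      [ "$label" = "${key}=${value}" ] && exit 0
    done
  done
  exit 1
)

# node-1 was labeled disktype=ssd, so it satisfies "disktype In [ssd]".
node_matches_in "kubernetes.io/os=linux disktype=ssd" disktype ssd \
  && echo "node-1 is eligible"

# A node labeled disktype=hdd does not.
node_matches_in "kubernetes.io/os=linux disktype=hdd" disktype ssd \
  || echo "node-2 is not eligible"
```

On a real cluster, kubectl get nodes -l disktype=ssd lists the nodes that carry the label.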

Monitor Pod creation order:

kubectl get pods -l app=mysql -w

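Don't modify PVCs or PVs created by a StatefulSet by hand

  • Manual changes can break the stable mapping between a Pod and its volume
  • volumeClaimTemplates cannot be edited on an existing StatefulSet; to change the template, delete and recreate the StatefulSet. Existing PVCs are kept by default and are reused by Pods with the same names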