Lesson 2 of 17

Kubernetes Architecture

Part of the Kubernetes tutorial series.

High-Level Architecture

Every Kubernetes cluster consists of two types of resources:

┌─────────────────────────────────────────────┐
│         CONTROL PLANE (Master)              │
│  (Manages cluster, schedules workloads)     │
│                                             │
│  ┌──────────────┐  ┌──────────────────┐   │
│  │  API Server  │  │      etcd        │   │
  │  │ (front door) │  │  (cluster DB)    │   │
│  └──────────────┘  └──────────────────┘   │
│                                             │
│  ┌──────────────┐  ┌──────────────────┐   │
│  │  Scheduler   │  │   Controller     │   │
│  │ (assign pods)│  │   Manager        │   │
│  └──────────────┘  └──────────────────┘   │
└─────────────────────────────────────────────┘
              │
              │ (communicates with)
              ▼
┌─────────────────────────────────────────────┐
│        WORKER NODES (Machines)              │
│     (Run your actual containers)            │
│                                             │
│  ┌─────────────────┐  ┌─────────────────┐ │
│  │    Node 1       │  │    Node 2       │ │
│  │                 │  │                 │ │
│  │  ┌───┐ ┌───┐   │  │  ┌───┐ ┌───┐   │ │
│  │  │Pod│ │Pod│   │  │  │Pod│ │Pod│   │ │
│  │  └───┘ └───┘   │  │  └───┘ └───┘   │ │
│  │                 │  │                 │ │
│  │ kubelet         │  │ kubelet         │ │
│  │ kube-proxy      │  │ kube-proxy      │ │
  │  │ runtime (CRI)   │  │ runtime (CRI)   │ │
│  └─────────────────┘  └─────────────────┘ │
└─────────────────────────────────────────────┘

The Control Plane (The Brain)

The Control Plane manages the cluster. It consists of:

1. API Server (kube-apiserver)

  • Role: The front door to Kubernetes. All communication goes through here.
  • Function:
    • Receives requests from kubectl and other clients
    • Validates requests
    • Stores desired state in etcd
    • Returns responses with cluster information
  • Access: Exposes a RESTful API (on port 6443 by default)
  • Example: When you run kubectl apply -f deployment.yaml, you're talking to the API server
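
The deployment.yaml in that example might look like this minimal sketch (the name, labels, and image are illustrative placeholders, not from this tutorial):

```yaml
# deployment.yaml -- a minimal Deployment; name and image are examples
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3              # desired state: three Pod replicas
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27   # example image
```

When you run kubectl apply -f deployment.yaml, the API server validates this manifest and persists it in etcd as the desired state; the controllers and scheduler then do the rest.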

2. State Store (etcd)

  • Role: The cluster's database. The single source of truth.
  • Function:
    • Stores all cluster data (desired state)
    • Stores configuration, secrets, and runtime data
    • Is the only stateful component in the control plane
    • Uses the Raft consensus algorithm for high availability
  • Warning: Losing etcd data = losing the cluster state. Always back it up!

3. Scheduler (kube-scheduler)

  • Role: Decides which node each Pod should run on.
  • Decision Factors:
    • Available CPU/memory on nodes
    • Node selectors and affinity rules
    • Pod resource requests
    • Pod tolerations for node taints
    • Custom scoring functions
  • Example: When you deploy a Pod, the scheduler finds the best node to place it
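
Several of these factors are expressed directly in the Pod spec. A hedged sketch (the label key disktype and the taint key gpu are illustrative, not standard):

```yaml
# Pod with scheduling constraints; label and taint keys are examples
apiVersion: v1
kind: Pod
metadata:
  name: gpu-task
spec:
  nodeSelector:
    disktype: ssd            # only nodes labeled disktype=ssd qualify
  tolerations:
    - key: "gpu"             # allows scheduling onto nodes tainted gpu=true:NoSchedule
      operator: "Equal"
      value: "true"
      effect: "NoSchedule"
  containers:
    - name: main
      image: busybox:1.36
      command: ["sleep", "3600"]
      resources:
        requests:
          cpu: 100m          # the scheduler only considers nodes with this much free
          memory: 128Mi
```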

4. Controller Manager (kube-controller-manager)

  • Role: Runs various "controllers" that handle background tasks.
  • Key Controllers:
    • ReplicaSet Controller: Ensures desired number of Pod replicas are running
    • Node Controller: Notices when a node goes down and evicts Pods
    • Service Account Controller: Creates default service accounts
    • Endpoints Controller: Keeps track of Pods for Services
  • Function: Constantly watches for desired state mismatches and corrects them

5. Cloud Controller Manager (cloud-controller-manager)

  • Role: Integrates with cloud providers (AWS, Google Cloud, Azure, etc.)
  • Responsibilities:
    • Create load balancers
    • Manage storage volumes
    • Handle node lifecycle
  • Note: Only used in cloud deployments, not in on-premises or local clusters

Worker Nodes (The Muscles)

Worker nodes are the machines that actually run your applications.

1. Kubelet

  • Role: The node's agent. Ensures containers are running in Pods.
  • Responsibility:
    • Registers the node with the API server
    • Reports node resource availability (CPU, memory, disk)
    • Watches for Pod specs assigned to this node
    • Starts/stops containers via the container runtime
    • Reports container health and status
  • Important: The kubelet keeps running (and keeps existing containers alive) even if the control plane is unreachable

2. Kube-proxy

  • Role: Manages networking on the node.
  • Responsibility:
    • Maintains network rules on the node
    • Implements Services (load balancing across Pods)
    • Routes traffic to the correct Pod IP
    • Can run in different modes: iptables (the common default), IPVS, or the legacy userspace mode
  • Port Range: NodePort Services are exposed on every node on a port in the 30000-32767 range
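
As a sketch, a NodePort Service might look like this (the name, labels, and port numbers are illustrative):

```yaml
# NodePort Service sketch; names and ports are examples
apiVersion: v1
kind: Service
metadata:
  name: web-nodeport
spec:
  type: NodePort
  selector:
    app: web                # traffic goes to Pods carrying this label
  ports:
    - port: 80              # Service port inside the cluster
      targetPort: 8080      # container port on the Pods
      nodePort: 30080       # must fall within 30000-32767
```

kube-proxy programs the node's iptables/IPVS rules so that traffic arriving on any node's port 30080 is load-balanced across the matching Pod IPs.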

3. Container Runtime

  • Role: Actually runs the containers.
  • Options:
    • containerd (most common; originally spun out of Docker)
    • CRI-O (a lightweight runtime, used by OpenShift among others)
    • Docker Engine (dockershim was removed in Kubernetes 1.24; now requires cri-dockerd)
    • Kata Containers (VM-based isolation for stronger workload separation)
  • Interface: Kubelet communicates via the Container Runtime Interface (CRI)

4. Node Status

Each node's kubelet reports:

  • CPU available: The amount of CPU the node can allocate
  • Memory available: The amount of RAM the node can allocate
  • Disk space: Space for logs, container image layers, etc.
  • Network connectivity: Whether the node can reach other nodes
  • Node conditions: Ready, MemoryPressure, DiskPressure, PIDPressure, NetworkUnavailable, etc.

Control Plane High Availability

In production, the control plane should be replicated for redundancy:

┌────────────────────────────────────────────┐
│     LOAD BALANCER (6443)                   │
└────────────────┬───────────────────────────┘
                 │
    ┌────────────┼────────────┐
    ▼            ▼            ▼
┌────────┐  ┌────────┐  ┌────────┐
│ Master │  │ Master │  │ Master │
│  API   │  │  API   │  │  API   │
│ Server │  │ Server │  │ Server │
└────────┘  └────────┘  └────────┘
    │            │            │
    └────────────┼────────────┘
                 │
            ┌────▼──────┐
            │   etcd    │
            │ (cluster) │
            └───────────┘

  • Multiple API Servers handle requests
  • Scheduler and Controller Manager run in active-passive mode
  • etcd is replicated across 3 or 5 nodes for fault tolerance
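
One common way to stand up such a topology is kubeadm. A hedged sketch of the relevant configuration, assuming a kubeadm-based cluster (the endpoint DNS name is a placeholder for your load balancer's address):

```yaml
# kubeadm ClusterConfiguration sketch for an HA control plane
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controlPlaneEndpoint: "k8s-api.example.com:6443"  # points at the load balancer
etcd:
  local:
    dataDir: /var/lib/etcd
```

Every node and every kubectl client then talks to the load balancer address rather than to any single API server instance.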

Communication Flow: Deploying a Pod

Here's what happens when you deploy a Pod:

1. kubectl apply -f pod.yaml
   │
   ▼
2. API Server validates and stores in etcd
   │
   ▼
3. Scheduler watches for unscheduled Pods
   │
   ▼
4. Scheduler finds best node, updates Pod with nodeName
   │
   ▼
5. API Server updates Pod spec in etcd
   │
   ▼
6. Kubelet on target node notices new Pod
   │
   ▼
7. Kubelet contacts container runtime
   │
   ▼
8. Container runtime creates and starts container
   │
   ▼
9. Kubelet updates Pod status back to API Server
   │
   ▼
10. kubectl get pod returns "Running"
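
The pod.yaml from step 1 can be as small as the following sketch (the name and image are illustrative):

```yaml
# pod.yaml -- minimal Pod; after step 4 the scheduler fills in spec.nodeName
apiVersion: v1
kind: Pod
metadata:
  name: hello
spec:
  containers:
    - name: hello
      image: nginx:1.27     # example image
```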

Network Architecture

Pod-to-Pod Communication

  • Every Pod gets a unique IP address (across the whole cluster, not just the node)
  • Pods can communicate with other Pods on other nodes without translation
  • Uses a Container Network Interface (CNI) plugin (Flannel, Weave, Calico, etc.)

Pod-to-Service Communication

  • Services provide a stable DNS name and IP
  • Traffic is load-balanced across Pods
  • Service discovery happens via DNS (e.g., http://my-service:8080)
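
A sketch of the Service behind a name like my-service (the namespace, label, and ports are illustrative):

```yaml
# ClusterIP Service sketch; names, labels, and ports are examples
apiVersion: v1
kind: Service
metadata:
  name: my-service
  namespace: default
spec:
  selector:
    app: my-app             # load-balances across Pods with this label
  ports:
    - port: 8080            # reachable as http://my-service:8080
      targetPort: 80        # forwarded to this container port
```

Within the same namespace, Pods can reach it as my-service; the fully qualified DNS name would be my-service.default.svc.cluster.local.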

External Communication

  • Services with type LoadBalancer get an external IP
  • Services with type NodePort are accessible on ports 30000-32767 on every node
  • Ingress resources provide HTTP/HTTPS routing
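
An Ingress routing HTTP traffic by host and path might be sketched like this (the host and backend service name are placeholders):

```yaml
# Ingress sketch; host and service name are examples
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-service   # routes to this Service
                port:
                  number: 8080
```

Note that an Ingress only takes effect if an Ingress controller (e.g. ingress-nginx) is running in the cluster.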

Resource Management

Pod Resources

Every Pod consumes:

  • CPU: Measured in millicores (1000m = 1 CPU core)
  • Memory: Measured in bytes (1Gi = 1 gibibyte, i.e. 2^30 bytes)
  • Requests: Minimum guaranteed resources
  • Limits: Maximum resources a Pod can use
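
In a Pod spec, requests and limits are set per container; a sketch with illustrative numbers:

```yaml
# Container resources; the numbers are examples
apiVersion: v1
kind: Pod
metadata:
  name: sized-pod
spec:
  containers:
    - name: app
      image: nginx:1.27
      resources:
        requests:
          cpu: 250m          # scheduler guarantees a quarter of a core
          memory: 256Mi
        limits:
          cpu: 500m          # CPU is throttled above half a core
          memory: 512Mi      # the container is OOM-killed above this
```

Requests drive scheduling decisions; limits are enforced at runtime on the node.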

Node Capacity

Every node has:

  • Allocatable CPU: Total CPU minus what is reserved for the OS and system daemons
  • Allocatable Memory: Total memory minus what is reserved for the OS and system daemons
  • Max Pods: Typically 110 (configurable)

The Reconciliation Loop

Kubernetes operates on a constant reconciliation loop:

Desired State (in etcd)
        │
        ▼
Compare with Current State
        │
     Same? ──── Yes ───▶ Wait for next check
        │
        No
        │
        ▼
Execute Actions
        │
        ▼
Update Current State

This happens continuously, ensuring the cluster always matches your desired configuration.


Common Deployment Topologies

Single Master (Development)

┌─────────┐
│ Master  │
└─────────┘
    │
    ├───────┬───────┬───────┐
    ▼       ▼       ▼       ▼
  Node1   Node2   Node3   Node4

Multi-Master (Production)

    ┌─────────────────────┐
    │  Load Balancer      │
    └──────────┬──────────┘
               │
    ┌──────────┼──────────┐
    ▼          ▼          ▼
 Master1    Master2    Master3 (with etcd cluster)
    │          │          │
    └──────────┼──────────┘
               │
    ┌──────────┴──────────┬───────────┐
    ▼          ▼          ▼           ▼
  Node1      Node2      Node3      Node4+