Kubernetes Orchestration
Kubernetes architecture: pods, services, deployments, StatefulSets, ConfigMaps, autoscaling, and the control plane. What problems K8s solves.
Why Kubernetes?
Running containers in production means answering: Which node has enough memory? How do I restart a crashed container? How do I roll out a new version without downtime? How do I scale under load? Kubernetes (K8s) is an open-source container orchestrator that answers all of these. It treats your cluster of machines as a single pool of compute and automatically places, heals, scales, and connects your containers.
Control Plane vs Data Plane
| Component | Location | Role |
|---|---|---|
| kube-apiserver | Control plane | The REST API gateway — all kubectl and internal calls go through it |
| etcd | Control plane | Distributed key-value store that holds all cluster state |
| kube-scheduler | Control plane | Assigns unscheduled Pods to nodes based on resources and constraints |
| kube-controller-manager | Control plane | Runs reconciliation loops (ReplicaSet, Node, Endpoint controllers) |
| kubelet | Each worker node | Ensures containers declared for a node are running and healthy |
| kube-proxy | Each worker node | Maintains iptables/IPVS rules for Service virtual IP routing |
Core Workload Resources
Pod is the atomic unit — one or more containers sharing a network namespace and storage. You almost never create Pods directly. Instead you use higher-level controllers:
- Deployment — Manages stateless Pods. Handles rolling updates, rollbacks, and replica scaling. Use for web servers, APIs, workers.
- StatefulSet — Provides stable network identities (`pod-0`, `pod-1`) and per-Pod persistent volumes. Use for stateful systems such as databases and brokers (Cassandra, Kafka, Redis).
- DaemonSet — Ensures exactly one Pod runs on every (or selected) node. Use for node-level agents: log shippers, metric collectors, CNI plugins.
- Job / CronJob — Run-to-completion workloads. CronJob schedules them on a cron expression.
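As a quick illustration of the last item, a CronJob that runs a cleanup task nightly might look like this (the name, image, and schedule here are hypothetical):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-cleanup               # hypothetical name
spec:
  schedule: "0 2 * * *"               # cron expression: every day at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure    # rerun the container if the task fails
          containers:
            - name: cleanup
              image: myregistry/cleanup:v1.0.0   # hypothetical image
```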
A typical production Deployment manifest:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1           # one extra Pod during rollout
      maxUnavailable: 0     # never drop below desired count
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api
          image: myregistry/api:v2.1.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "250m"
              memory: "256Mi"
            limits:
              cpu: "1000m"
              memory: "512Mi"
          livenessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            periodSeconds: 5
```

Services and Ingress
A Service provides a stable virtual IP (ClusterIP) and DNS name in front of a dynamic set of Pods selected by labels. Kubernetes automatically updates the endpoint list as Pods come and go. Service types:
| Service Type | Accessibility | Use Case |
|---|---|---|
| ClusterIP (default) | Internal cluster only | Service-to-service communication |
| NodePort | External via node IP + port | Development, bare-metal clusters |
| LoadBalancer | External via cloud load balancer | Production external traffic |
| ExternalName | DNS alias to external service | Integrating external databases |
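For example, a ClusterIP Service fronting an `api-server` Deployment could be sketched as (port numbers are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: api-server
  namespace: production
spec:
  type: ClusterIP         # the default; shown explicitly for clarity
  selector:
    app: api-server       # matches the Pod labels set by the Deployment
  ports:
    - port: 80            # port exposed on the Service's virtual IP
      targetPort: 8080    # containerPort on the selected Pods
```

In-cluster clients reach this Service by DNS name, e.g. `api-server.production.svc.cluster.local`.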
An Ingress resource provides HTTP/HTTPS routing (path-based, host-based) in front of multiple Services via an Ingress controller (NGINX, AWS ALB, Traefik). It consolidates external access instead of provisioning one cloud load balancer per Service.
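A minimal Ingress routing a host to that Service might look like the following sketch (the host name is illustrative, and `ingressClassName` depends on which controller is installed):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  namespace: production
spec:
  ingressClassName: nginx        # assumes the NGINX Ingress controller
  rules:
    - host: api.example.com      # illustrative host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-server # the Service to route to
                port:
                  number: 80
```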
Configuration and Secrets
ConfigMaps hold non-sensitive configuration (feature flags, service URLs). Secrets hold sensitive data (passwords, API keys) — stored in etcd, base64-encoded (not encrypted by default; use etcd encryption at rest or an external vault like HashiCorp Vault). Both are injected into Pods as environment variables or mounted as files.
Secrets are not truly secret without extra steps
Base64 encoding is not encryption. Enable etcd encryption at rest, use RBAC to restrict Secret access, and consider tools like sealed-secrets or HashiCorp Vault with the vault-agent-injector for production-grade secret management.
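A sketch of both resources side by side (names and values here are placeholders):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: api-config
data:
  LOG_LEVEL: "info"         # non-sensitive configuration
  FEATURE_X: "enabled"
---
apiVersion: v1
kind: Secret
metadata:
  name: api-secrets
type: Opaque
stringData:                 # plain text here; the API server base64-encodes it on write
  DB_PASSWORD: "changeme"   # placeholder value
```

A container can then pull both in at once with `envFrom`, referencing `configMapRef: api-config` and `secretRef: api-secrets`, or mount either object as a volume of files.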
Autoscaling
Kubernetes provides three autoscaling dimensions. Horizontal Pod Autoscaler (HPA) scales the replica count of a Deployment based on CPU utilization or custom metrics from Prometheus. Vertical Pod Autoscaler (VPA) adjusts resource requests/limits based on actual usage. Cluster Autoscaler adds or removes worker nodes based on pending Pods and idle nodes — it works with cloud provider APIs (AWS, GCP, Azure).
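An HPA targeting the `api-server` Deployment at 70% average CPU could be sketched as (the replica bounds and threshold are illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-server
  namespace: production
spec:
  scaleTargetRef:             # the workload this HPA scales
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when avg CPU vs. requests exceeds 70%
```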
Interview Tip
Interviewers often ask how you'd handle a traffic spike. Walk through the autoscaling chain: HPA detects high CPU → scales Pods → if no node capacity, Cluster Autoscaler provisions a new node → new Pods schedule and become ready → traffic is balanced. Mention that HPA lag (scrape interval + scale cooldown) means pre-warming or KEDA (event-driven scaling) may be needed for sudden spikes.