
Containerization with Docker

How containers work: images, layers, registries, networking, storage, resource limits, and the difference between containers and VMs.

15 min read · High interview weight

Containers vs Virtual Machines

A virtual machine (VM) bundles an entire OS kernel, system libraries, and your application. A container shares the host OS kernel and isolates only the application and its dependencies using two Linux primitives: namespaces (process, network, filesystem, IPC isolation) and cgroups (CPU, memory, I/O limits). The result: containers start in milliseconds instead of minutes and consume megabytes of RAM rather than gigabytes.
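These primitives are visible directly on a Linux host. A quick sketch (Linux-only; the `unshare` demo is an illustration and needs root):

```shell
# Every process belongs to a set of namespaces, exposed as symlinks under
# /proc/<pid>/ns. Two processes sharing an inode number share that namespace.
ls -l /proc/self/ns 2>/dev/null || echo "requires a Linux /proc"

# A container runtime creates fresh namespaces; util-linux can too:
#   sudo unshare --pid --fork --mount-proc ps aux   # sees only itself
```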

| Property | Virtual Machine | Container |
| --- | --- | --- |
| Boot time | 30–120 seconds | < 1 second |
| Size | Gigabytes (full OS) | Megabytes (app + libs) |
| OS kernel | Each VM has its own | Shared with host |
| Isolation | Strong (hypervisor) | Process-level (namespaces) |
| Portability | Medium (hypervisor-specific) | High (any Docker host) |
| Density | Tens per host | Hundreds per host |
ℹ️

When VMs still win

Use VMs when you need strong multi-tenant isolation (e.g., running untrusted customer code), OS-level customization, or Windows workloads on a Linux host. A container is not as strong a security boundary as a hypervisor.

Docker Image Layers

Every `Dockerfile` instruction (`FROM`, `RUN`, `COPY`, `ADD`) creates an immutable layer. Docker stacks these layers using a Union File System (typically `overlayfs`). Layers are content-addressed by SHA256 hash and cached — if you change a late instruction, Docker only rebuilds from that point down. This makes iterative builds fast.

[Diagram: Docker image layer stack. The container adds a thin writable layer on top of read-only image layers.]
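You can inspect a built image's layer stack with `docker history`. This sketch assumes a local Docker daemon and that the tag (here `node:20-alpine`, just an example) has been pulled:

```shell
# Each row is one Dockerfile instruction and the size of the layer it produced.
docker history node:20-alpine 2>/dev/null || echo "docker daemon not available"

# The content-addressed layer digests live in the image metadata:
#   docker image inspect --format '{{json .RootFS.Layers}}' node:20-alpine
```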

Writing Efficient Dockerfiles

Layer ordering matters. Put instructions that change least often (OS packages, dependency installation) early and app code late so cache invalidation is minimal. Use `.dockerignore` to exclude `node_modules`, `.git`, and build artifacts from the build context.
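A minimal `.dockerignore` for a Node project might look like this (entries are illustrative; adjust to your build):

```
node_modules
.git
dist
*.log
.env
```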

```dockerfile
# ── Bad: invalidates the dependency cache on every code change ──
FROM node:20-alpine
COPY . .
RUN npm ci
CMD ["node", "src/index.js"]

# ── Good: cache npm install separately from app code ──
FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev   # --omit=dev replaces the deprecated --only=production

FROM node:20-alpine AS runner
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY src/ ./src/
EXPOSE 3000
USER node
CMD ["node", "src/index.js"]
```

The multi-stage build above separates the dependency installation stage from the final runtime image. The `--from=deps` flag copies only `node_modules`, keeping the runner stage clean and small. Running as `USER node` (non-root) is a security best practice.

Container Networking

Docker creates a virtual bridge network (`docker0`) by default. Each container gets its own network namespace with a virtual ethernet pair. Docker's embedded DNS resolver lets containers address each other by container or service name, but only on user-defined bridge networks and in Docker Compose, not on the default bridge.

| Network Mode | Use Case | Isolation |
| --- | --- | --- |
| bridge (default) | Multi-container on same host | Container-level |
| host | Performance-critical, low latency | None (shares host network) |
| overlay | Multi-host (Docker Swarm) | Cross-host |
| none | No network access needed | Full |
| macvlan | Container needs its own MAC/IP | Appears as physical device |
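A guarded sketch of name-based DNS on a user-defined bridge (assumes a local Docker daemon; `app-net` and `db` are hypothetical names):

```shell
# List networks; 'bridge' is docker0. User-defined networks get name-based DNS.
docker network ls 2>/dev/null || echo "docker daemon not available"

# Typical flow (names are illustrative):
#   docker network create app-net
#   docker run -d --network app-net --name db postgres:16
#   docker run --rm --network app-net alpine ping -c1 db   # embedded DNS resolves 'db'
```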

Storage: Volumes vs Bind Mounts

Volumes are managed by Docker (stored under `/var/lib/docker/volumes/` on Linux) and are the recommended way to persist data. They survive container removal and can be shared between containers. Bind mounts map a host path directly into a container, which is useful for development (hot reload) but fragile in production. `tmpfs` mounts store data in host memory only, which is useful for sensitive temporary data.
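The three mount types side by side, as a guarded sketch (assumes a local Docker daemon; `pgdata`, `my-dev-image`, and `my-service` are hypothetical names):

```shell
# Named volume: Docker manages the storage location.
#   docker volume create pgdata
#   docker run -d -v pgdata:/var/lib/postgresql/data postgres:16
# Bind mount: maps a host path (dev-time hot reload).
#   docker run -v "$PWD/src:/app/src" my-dev-image
# tmpfs: memory-only, gone when the container stops.
#   docker run --tmpfs /run/secrets:rw,size=16m my-service
docker volume ls 2>/dev/null || echo "docker daemon not available"
```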

Resource Limits with cgroups

Without limits, a runaway container can starve the host. Docker exposes cgroup controls via flags on `docker run`:

```bash
# Limit to 512 MB RAM and 1.5 CPU cores
docker run --memory=512m --cpus=1.5 my-service
```

```yaml
# Kubernetes equivalent in a Pod spec
resources:
  requests:
    memory: "256Mi"
    cpu: "250m"
  limits:
    memory: "512Mi"
    cpu: "1500m"
```
⚠️

OOMKilled: the silent failure

If a container exceeds its memory limit, the kernel OOM killer terminates it — often with no application-level log. Always set memory limits in production and monitor for `OOMKilled` exit codes in your orchestrator.
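You can check for an OOM kill after the fact via the container's recorded state (assumes a Docker daemon; `my-service` is a hypothetical container name):

```shell
# Exit code 137 = 128 + 9 (SIGKILL), the usual signature of an OOM kill.
docker inspect --format '{{.State.ExitCode}} {{.State.OOMKilled}}' my-service \
  2>/dev/null || echo "container 'my-service' not found"

# In Kubernetes, look for the terminated reason on the pod:
#   kubectl get pod my-pod -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
```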

Container Registries

A registry stores and distributes Docker images. Docker Hub is the public default. Production teams run private registries — AWS ECR, Google Artifact Registry, GitHub Container Registry — to avoid pull-rate limits, improve latency, and control access. Images are referenced as `registry/namespace/name:tag` (e.g., `gcr.io/my-project/api:v1.2.3`).
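The tag-and-push flow in full, as a sketch (registry path and tags are the hypothetical ones from the example above; pushing requires a prior `docker login`):

```shell
# Retag a local image with its fully qualified registry reference, then push.
#   docker tag api:latest gcr.io/my-project/api:v1.2.3
#   docker push gcr.io/my-project/api:v1.2.3
# Any authorized host can then pull by the same reference:
#   docker pull gcr.io/my-project/api:v1.2.3
docker --version 2>/dev/null || echo "docker CLI not installed"
```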

💡

Interview Tip

When an interviewer asks 'how would you containerize this service?' walk through: (1) base image choice (distroless or alpine for small attack surface), (2) multi-stage build to separate build and runtime, (3) layer ordering for cache efficiency, (4) non-root user, (5) resource limits in the orchestrator. This signals production-readiness awareness.
