DNS & Domain Resolution
How DNS works end-to-end: recursive resolvers, authoritative servers, caching, TTLs, and DNS-based load balancing strategies.
What Is DNS?
The Domain Name System (DNS) is the internet's distributed phone book. It translates human-readable hostnames like `api.example.com` into IP addresses like `93.184.216.34` that routers can actually use. Without DNS, every user would need to memorize numeric IP addresses — and those addresses would change silently whenever a company migrated servers.
DNS is a hierarchical, globally distributed system. No single server holds all domain-to-IP mappings. Instead, resolution is delegated through a tree of authoritative servers, with aggressive caching at every level to keep the system fast.
The DNS Resolution Process
When your browser navigates to `www.example.com`, a chain of lookups occurs. The entire process typically completes in 20-120ms on a cold cache, and in near-zero time when served from cache.
- Browser cache: The browser checks its own DNS cache first (Chrome has historically kept entries for about 1 minute regardless of TTL).
- OS cache / hosts file: If not in the browser cache, the OS resolver checks its own cache and `/etc/hosts`.
- Recursive resolver (ISP or 8.8.8.8): The OS forwards the query to a configured recursive resolver (often your ISP's or Google's `8.8.8.8`). This resolver does the heavy lifting.
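- Root name servers: If the recursive resolver has no cache entry, it asks one of the 13 root server identities (`a.root-servers.net` through `m.root-servers.net`, run by 12 operators including Verisign and ICANN, each anycast across many physical sites) for the TLD nameserver.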
- TLD name servers: The root refers the resolver to the `.com` TLD servers, which know which authoritative server is responsible for `example.com`.
- Authoritative name servers: The resolver finally reaches the authoritative server for `example.com`, which returns the actual A record (IPv4) or AAAA record (IPv6).
- Caching and returning: The recursive resolver caches each response it received along the way for that record's TTL, then returns the final IP to the OS, which hands it to the browser.
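The referral chain above can be sketched as a toy simulation. This is a minimal illustration rather than a real resolver: the server names (`root`, `tld-com`, `auth-example`) and the in-memory zone data are hypothetical stand-ins for network queries, and caching is left out.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory "servers". Each maps a name suffix to either a
# referral (the next server to ask) or a final answer.
SERVERS: Dict[str, Dict[str, Tuple[str, str]]] = {
    "root": {"com.": ("referral", "tld-com")},
    "tld-com": {"example.com.": ("referral", "auth-example")},
    "auth-example": {"www.example.com.": ("answer", "93.184.216.34")},
}


def resolve(name: str, start: str = "root") -> Tuple[str, List[str]]:
    """Follow referrals from the root until a server returns an answer.

    Returns the address and the list of servers that were asked.
    """
    server = start
    path = [server]
    for _ in range(10):  # guard against referral loops
        zone = SERVERS[server]
        # Keep only the suffixes of `name` that this server knows about.
        matches = [s for s in zone if name == s or name.endswith("." + s)]
        if not matches:
            raise LookupError(f"{server} has no data for {name}")
        kind, value = zone[max(matches, key=len)]  # most specific match wins
        if kind == "answer":
            return value, path
        server = value  # referral: ask the next server down the tree
        path.append(server)
    raise LookupError("too many referrals")


address, asked = resolve("www.example.com.")
print(address, asked)
```

A real recursive resolver does the same walk over UDP or TCP and stores every referral and answer it receives, which is why most lookups never reach the root.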
Key DNS Record Types
| Record | Purpose | Example |
|---|---|---|
| A | Maps hostname to IPv4 address | `example.com → 93.184.216.34` |
| AAAA | Maps hostname to IPv6 address | `example.com → 2606:2800::1` |
| CNAME | Alias from one hostname to another | `www.example.com → example.com` |
| MX | Mail server for the domain, with a preference value (lower is tried first) | `example.com → 10 mail.example.com` |
| TXT | Arbitrary text (SPF, DKIM, verification) | `v=spf1 include:sendgrid.net ~all` |
| NS | Nameservers authoritative for the domain | `example.com → ns1.example.com` |
| SOA | Start of Authority — zone metadata | Serial, refresh, retry intervals |
| SRV | Service location (priority, weight, port, target hostname) | `_grpc._tcp.example.com → 10 5 443 svc.example.com` |
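To show where a record type appears on the wire, here is a minimal sketch that builds a DNS query message in the classic RFC 1035 layout. The function names and the default query ID are illustrative choices; the numeric type codes are the standard IANA assignments. It only constructs bytes and does not send anything.

```python
import struct

# Standard record type codes from the IANA DNS parameters registry.
QTYPE = {"A": 1, "NS": 2, "CNAME": 5, "SOA": 6,
         "MX": 15, "TXT": 16, "AAAA": 28, "SRV": 33}


def encode_name(hostname: str) -> bytes:
    """Encode a hostname as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in hostname.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError(f"invalid label: {label!r}")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(hostname: str, rtype: str, query_id: int = 0x1234) -> bytes:
    """Build a single-question query with the 'recursion desired' flag set."""
    # Header fields: ID, flags (0x0100 = RD), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question: encoded name, QTYPE, QCLASS (1 = IN, the Internet class).
    question = encode_name(hostname) + struct.pack("!HH", QTYPE[rtype], 1)
    return header + question


message = build_query("example.com", "AAAA")
print(len(message), message.hex())
```

Changing `rtype` changes only the two QTYPE bytes near the end of the message; the server decides which record set to return based on that field.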
TTL and Caching
Every DNS record has a Time To Live (TTL) measured in seconds. Resolvers and browsers are expected to discard cached records once the TTL expires, although some clamp very low or very high values to their own limits. Choosing the right TTL is an engineering trade-off:
| TTL Setting | Trade-offs | Use Case |
|---|---|---|
| Low TTL (30–300s) | Fresh data, but more DNS queries and higher load on authoritative servers | During deployments, migrations, or failovers where you need fast propagation |
| High TTL (3600–86400s) | Fewer queries, better performance, but slow to propagate changes | Stable production services with infrequent IP changes |
| Very low TTL (under 30s) | Near-real-time updates in principle, but significant resolver load, and some resolvers enforce a minimum cache time | Canary deployments, blue/green switches |
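The expiry behaviour behind these trade-offs can be sketched as a small TTL cache. The class name and the injectable clock are assumptions made for illustration; real resolvers also handle negative caching, TTL clamping, and eviction, which this sketch omits.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """Minimal name-to-address cache that expires entries after their TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so tests can control time
        self._entries: Dict[str, Tuple[str, float]] = {}

    def put(self, name: str, address: str, ttl_seconds: float) -> None:
        # Store the absolute expiry time, not the TTL itself.
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name: str) -> Optional[str]:
        entry = self._entries.get(name)
        if entry is None:
            return None  # never cached: caller must query upstream
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # expired: force a fresh lookup
            return None
        return address


cache = TtlCache()
cache.put("api.example.com", "93.184.216.34", ttl_seconds=300)
print(cache.get("api.example.com"))
```

A low TTL makes `get` return `None` more often, which means more upstream queries; a high TTL means the cached address keeps being served even after the record has changed.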
TTL is not instant propagation
When you lower a TTL or change a record, resolvers that already cached the record keep serving it until that cached entry expires, and the entry's lifetime was set by the previous (higher) TTL. Lower the TTL at least one full old-TTL period before a planned migration.
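The timing can be worked out with simple arithmetic. The sketch below is a planning aid under the assumption that all caches honor TTLs; the function name and the example dates are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Tuple


def plan_cutover(ttl_lowered_at: datetime, old_ttl: int,
                 new_ttl: int) -> Tuple[datetime, timedelta]:
    """Return (earliest safe cutover time, worst-case staleness after cutover)."""
    # A resolver that cached the record just before the TTL was lowered
    # keeps it for the full old TTL.
    earliest_cutover = ttl_lowered_at + timedelta(seconds=old_ttl)
    # From then on, every TTL-honoring cache holds the record for at most new_ttl.
    return earliest_cutover, timedelta(seconds=new_ttl)


lowered = datetime(2024, 1, 1, 12, 0)
cutover, staleness = plan_cutover(lowered, old_ttl=86400, new_ttl=60)
print(cutover, staleness)
```

With an old TTL of one day and a new TTL of 60 seconds, changing the record before the computed cutover time could leave some clients on the old address for up to a day.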
DNS-Based Load Balancing
DNS can be used as a crude load balancing mechanism by returning multiple A records for a single hostname. DNS clients typically try addresses in order or randomly, distributing traffic across servers.
- Round-robin DNS: Multiple A records are returned; clients cycle through them. Simple but has no health awareness: a dead server stays in responses until someone removes its record, and cached copies persist until the TTL expires.
- Weighted records: Some DNS providers allow assigning weights to records for traffic splitting (e.g., Route 53 weighted routing). Useful for canary releases.
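- Geo DNS / latency-based routing: The authoritative DNS server returns different IPs based on the client's apparent location, inferred from the recursive resolver's IP address or from the EDNS Client Subnet option when the resolver sends it. AWS Route 53 and Cloudflare do this natively.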
- Failover DNS: Primary record is served normally; if health checks fail, DNS automatically switches to a secondary IP. How quickly clients move over depends on the health-check detection time plus the TTL.
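The weighted and health-aware strategies above can be sketched as a selection function that an authoritative server might run per query. This is an illustrative model only: the `Backend` type, the addresses, and the weights are assumptions, and it does not reflect how any particular provider implements routing.

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Backend:
    address: str
    weight: int
    healthy: bool = True  # would be updated by an external health checker


def pick_address(backends: List[Backend],
                 rng: Optional[random.Random] = None) -> str:
    """Pick one healthy backend at random, proportionally to its weight."""
    rng = rng or random.Random()
    candidates = [b for b in backends if b.healthy and b.weight > 0]
    if not candidates:
        raise RuntimeError("no healthy backends")
    weights = [b.weight for b in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0].address


pool = [
    Backend("10.0.0.1", weight=90),                 # stable fleet
    Backend("10.0.0.2", weight=10),                 # canary
    Backend("10.0.0.3", weight=50, healthy=False),  # failed health check
]
print(pick_address(pool))
```

Plain round-robin corresponds to equal weights with the health filter removed, which is why it keeps handing out dead addresses. Even with the filter, clients that cached an earlier answer stay on it until the TTL expires.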
Interview Tip
In interviews, DNS often comes up when discussing global traffic routing or failover. Mention that DNS-based load balancing is coarse-grained and TTL-dependent: it cannot respond to real-time server health the way an L7 load balancer can. Combining a low TTL with health-check-aware routing (e.g., Route 53 health checks) shortens failover to roughly the health-check detection time plus the TTL, which is typically tens of seconds rather than sub-second. Sub-second failover needs a mechanism in the data path, such as a load balancer. Pure DNS round-robin has no health awareness at all.
DNSSEC and DNS over HTTPS
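Traditional DNS is unauthenticated and unencrypted. DNSSEC adds cryptographic signatures to DNS records, allowing resolvers to verify that responses haven't been tampered with (preventing cache poisoning attacks); it provides authenticity, not confidentiality. DNS over HTTPS (DoH) and DNS over TLS (DoT) encrypt queries between the client and its recursive resolver, preventing eavesdropping by ISPs or on-path attackers on that hop. The resolver itself still sees every query, and its upstream lookups to authoritative servers are usually not encrypted.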
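As a concrete view of DoH, the sketch below shows how a wire-format DNS query is packed into a GET URL following the RFC 8484 convention (base64url encoding with padding removed, carried in the `dns` parameter). The function name is hypothetical and the default endpoint is an example; the sample bytes are a query for the A record of `www.example.com` with ID 0. Nothing is sent over the network.

```python
import base64


def doh_get_url(dns_message: bytes,
                endpoint: str = "https://cloudflare-dns.com/dns-query") -> str:
    """Build a DoH GET URL for an already-encoded DNS query message."""
    # RFC 8484 uses base64url and strips the trailing '=' padding.
    encoded = base64.urlsafe_b64encode(dns_message).rstrip(b"=").decode("ascii")
    return f"{endpoint}?dns={encoded}"


# Wire-format query: 12-byte header (ID 0, RD flag, one question),
# then the name www.example.com, QTYPE A (1), QCLASS IN (1).
query = (
    bytes.fromhex("000001000001000000000000")
    + b"\x03www\x07example\x03com\x00"
    + b"\x00\x01\x00\x01"
)
print(doh_get_url(query))
# → https://cloudflare-dns.com/dns-query?dns=AAABAAABAAAAAAAAA3d3dwdleGFtcGxlA2NvbQAAAQAB
```

A client would send this URL over HTTPS with an `Accept: application/dns-message` header, so an on-path observer sees only TLS traffic to the resolver rather than the hostname being looked up.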
DNS in Production
Public resolvers such as Cloudflare (1.1.1.1) and Google (8.8.8.8) support both DoH and DoT. AWS Route 53, by contrast, is primarily an authoritative DNS service rather than a public resolver. For internal microservices, many teams use service discovery tools like Consul or CoreDNS instead of public DNS, enabling health-check-aware resolution with very short TTLs.