# Database Sharding & Partitioning
Horizontal vs vertical partitioning, shard key selection, range vs hash partitioning, hot spots, and rebalancing strategies.
## Why Sharding Exists
A single database node has hard limits: CPU, RAM, disk I/O, and network throughput. Vertical scaling (bigger machine) eventually hits diminishing returns and cost ceilings. Sharding — also called horizontal partitioning — distributes data across multiple nodes (shards), each owning a subset of the dataset. The total capacity grows linearly with the number of shards.
## Vertical vs Horizontal Partitioning
| Strategy | What Moves | Best For | Limitation |
|---|---|---|---|
| Vertical Partitioning | Columns split across tables (e.g., separate user_profiles from user_preferences) | Reducing table width, separating hot from cold columns | Still on one node; doesn't help with row volume |
| Horizontal Partitioning (Sharding) | Rows split across multiple nodes by shard key | Scaling beyond a single node's capacity | Cross-shard queries become expensive or require scatter-gather |
## Sharding Strategies
### Range-Based Sharding
Rows are assigned to shards based on a contiguous range of the shard key. Example: shard 1 holds `user_id` 1–1,000,000, shard 2 holds 1,000,001–2,000,000, and so on. Range queries are efficient because they only touch one or a few consecutive shards.
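Range routing is a sorted-boundary lookup. A minimal sketch, assuming 0-indexed shards and the hypothetical 1,000,000-IDs-per-shard boundaries from the example above:

```python
import bisect

# Hypothetical boundaries: each entry is the first user_id owned by
# the NEXT shard. Shard 0 holds 1..1,000,000, shard 1 holds
# 1,000,001..2,000,000, and so on.
SHARD_BOUNDARIES = [1_000_001, 2_000_001, 3_000_001]

def get_range_shard(user_id: int) -> int:
    """Return the index of the shard whose range contains user_id."""
    return bisect.bisect_right(SHARD_BOUNDARIES, user_id)
```

Because consecutive IDs map to the same shard, a query like `WHERE user_id BETWEEN 5 AND 500` touches exactly one shard.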
#### Hot Spot Problem
Range sharding creates hot spots when writes are skewed. If user IDs are monotonically increasing (auto-increment), all new users land on the last shard. One shard becomes a bottleneck while others sit idle. This is a classic interview question: 'What happens if you shard by user_id range and your app is growing rapidly?'
### Hash-Based Sharding
Apply a hash function to the shard key and assign `hash(key) % num_shards` to determine the target shard. This distributes data uniformly regardless of key distribution, eliminating hot spots from sequential keys. The downside: range queries require scatter-gather across all shards because adjacent keys hash to different shards.
```python
import hashlib

def get_shard(user_id: int, num_shards: int) -> int:
    # Use a stable digest rather than Python's built-in hash():
    # hash() of an int is the int itself (which would defeat the
    # purpose), and string hashing is salted per process.
    digest = hashlib.md5(str(user_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Adjacent user_ids scatter across shards — no hot spots, but no range
# locality either: a range scan must touch every shard.
```

### Directory-Based Sharding
A lookup service (the shard directory) maintains a mapping from key to shard. This provides maximum flexibility — you can move individual keys between shards without rehashing. The cost is an extra network hop and a single point of failure if the directory is unavailable.
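A minimal in-memory sketch of the idea (a production directory would be a replicated service, e.g. backed by ZooKeeper or etcd; the class and method names here are illustrative):

```python
class ShardDirectory:
    """Explicit key-to-shard mapping with per-key relocation."""

    def __init__(self, default_shard: int = 0):
        self._mapping: dict[int, int] = {}
        self._default = default_shard

    def lookup(self, key: int) -> int:
        # Every query pays this extra lookup before reaching a shard.
        return self._mapping.get(key, self._default)

    def move(self, key: int, shard: int) -> None:
        # A single key can be relocated without rehashing anything else.
        self._mapping[key] = shard
```

The `move` operation is what hash and range schemes cannot offer: fine-grained placement, at the price of the directory being on every request path.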
## Choosing a Shard Key
The shard key is the most consequential architectural decision when sharding. A bad shard key causes hot spots, expensive cross-shard queries, or both. A good shard key has three properties:
- High cardinality: Many distinct values so data distributes across shards evenly (not `country_code` with only 200 values).
- Low correlation with write patterns: Avoid monotonically increasing keys (auto-increment IDs, timestamps) as a range shard key, since all new writes land in the newest range; hashing such keys eliminates the temporal hot spot.
- Query alignment: The shard key should match your most frequent access pattern. If 90% of queries filter by `tenant_id`, shard by `tenant_id`.
| Shard Key | Hot Spot Risk | Range Query | Notes |
|---|---|---|---|
| user_id (hash) | Low | Scatter-gather | Good default for user-centric systems |
| created_at (range) | High (recent writes) | Excellent | Bad — all writes hit latest shard |
| tenant_id (hash) | Medium (whale tenants) | Per-tenant | Good for multi-tenant SaaS |
| random UUID (hash) | None | Impossible | Maximum write distribution, no locality |
## Rebalancing Shards
When you add shards, existing data must be redistributed — a process called rebalancing. Naive hash sharding (`hash % n`) is catastrophic to rebalance: changing `n` from 4 to 5 remaps ~80% of all keys, requiring a massive data migration.
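The remap fraction is easy to check empirically. A quick sketch using the first 100,000 integers as a stand-in key population:

```python
# Count keys whose shard assignment changes when naive modulo
# sharding grows from 4 to 5 shards.
keys = range(100_000)
moved = sum(1 for k in keys if k % 4 != k % 5)
fraction = moved / len(keys)
# Only keys with k % 20 in {0, 1, 2, 3} keep the same shard
# (4 residues out of 20), so exactly 80% of keys move.
```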
Consistent hashing solves this. Keys and nodes are mapped onto a ring. When a node is added, only the keys between the new node and its predecessor move — typically `1/n` of the total data. This is how DynamoDB, Cassandra, and Riak handle rebalancing.
### Virtual Nodes (vnodes)
To improve distribution uniformity, each physical node is represented by multiple virtual nodes on the ring. Cassandra defaults to 256 vnodes per node. This ensures that when a node fails or is added, load redistributes evenly across all remaining nodes rather than landing entirely on the adjacent node.
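The ring with vnodes can be sketched in a few lines. This is a simplified illustration, not how Cassandra or DynamoDB implement it internally; the class name, vnode count, and `node#i` labeling scheme are assumptions:

```python
import bisect
import hashlib

def _hash(value: str) -> int:
    # Stable 64-bit hash for placing both keys and vnodes on the ring.
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")

class ConsistentHashRing:
    """Consistent hashing with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes: list[str], vnodes: int = 64):
        # Each physical node contributes `vnodes` points on the ring.
        self._ring: list[tuple[int, str]] = sorted(
            (_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [p for p, _ in self._ring]

    def get_node(self, key: str) -> str:
        # First ring point clockwise of the key's hash; wrap past the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]
```

Adding a fourth node to a three-node ring should move roughly a quarter of the keys, not the ~80% that naive modulo remapping would.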
## Cross-Shard Queries and Joins
The biggest operational pain with sharding is cross-shard queries. A query like `SELECT * FROM orders WHERE status = 'pending'` must be broadcast to all shards (scatter), results collected (gather), and merged. This is slow, expensive, and complex. Design your schema to avoid cross-shard queries on hot paths by keeping related data on the same shard.
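The scatter-gather pattern itself is simple to sketch. Here `query_shard` is a hypothetical stand-in for a per-shard database call; a real implementation would also handle per-shard timeouts, partial failures, and result ordering:

```python
from concurrent.futures import ThreadPoolExecutor

def query_shard(shard_id: int, status: str) -> list[dict]:
    # Stand-in for a network round trip to one shard's database.
    return [{"shard": shard_id, "status": status}]

def scatter_gather(num_shards: int, status: str) -> list[dict]:
    # Scatter: fan the query out to every shard in parallel.
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        futures = [
            pool.submit(query_shard, s, status) for s in range(num_shards)
        ]
        # Gather: the coordinator blocks on the slowest shard, then
        # merges partial results — latency is max(), not avg().
        results: list[dict] = []
        for f in futures:
            results.extend(f.result())
        return results
```

The comment on latency is the practical takeaway: one slow shard delays the whole query, which is why hot paths should resolve to a single shard.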
## Interview Tip
When asked 'How would you scale the database for X?', structure your answer: (1) Add read replicas first — they handle 80% of load in most apps. (2) Introduce a cache (Redis) to reduce DB load. (3) Only then consider sharding. Sharding adds enormous operational complexity. Interviewers respect candidates who treat it as a last resort, not a first instinct.