| Users / Data Size | 100 Users | 10K Users | 1M Users | 100M Users |
|---|---|---|---|---|
| Data Distribution | Single shard or a few shards, simple key | Multiple shards, risk of uneven load if the key is poor | Many shards, key must distribute data evenly | Hundreds or thousands of shards, key must avoid hotspots |
| Query Performance | Direct queries, no cross-shard | Some cross-shard queries if key poorly chosen | Cross-shard queries costly, key must minimize them | Cross-shard queries expensive, key critical for locality |
| Scalability | Vertical scaling enough | Horizontal scaling starts, key affects balance | Horizontal scaling essential, key affects shard size | Massive horizontal scaling, key must support re-sharding |
| Maintenance | Simple backups | Shard rebalancing rare | Shard rebalancing needed if key causes hotspots | Frequent re-sharding, key must support smooth migration |
Shard key selection in HLD - Scalability & System Analysis
When the shard key is poorly chosen, some shards receive far more data or traffic than others. This creates hotspots: a few servers are overloaded while the rest sit mostly idle. This imbalance is the first thing to break performance and scalability, even when total hardware capacity is sufficient.
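A small illustration of how a low-cardinality key produces a hotspot. The traffic mix and the one-shard-per-country mapping are hypothetical values chosen for this sketch, not figures from a real system.

```python
from collections import Counter

# Hypothetical traffic sample: 1,000 requests tagged with the user's country.
requests = ["US"] * 700 + ["IN"] * 200 + ["DE"] * 70 + ["BR"] * 30

# Hypothetical mapping that uses country as the shard key (one shard per country).
country_to_shard = {"US": 0, "IN": 1, "DE": 2, "BR": 3}


def loads_by_shard(requests, mapping):
    """Count how many requests land on each shard."""
    counts = Counter(mapping[country] for country in requests)
    return [counts.get(shard, 0) for shard in sorted(set(mapping.values()))]


def skew(loads):
    """Ratio of the busiest shard to the average shard; 1.0 means perfectly balanced."""
    return max(loads) / (sum(loads) / len(loads))


loads = loads_by_shard(requests, country_to_shard)
print(loads)                  # [700, 200, 70, 30]
print(round(skew(loads), 2))  # 2.8
```

Here shard 0 carries 2.8 times the average load while shard 3 is nearly idle, even though the four shards together have enough capacity for all 1,000 requests.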
- Choose a high-cardinality key: Use a key with many unique values to spread data evenly.
- Use composite keys: Combine multiple fields to improve distribution.
- Hash-based sharding: Hash the key to randomize distribution and avoid hotspots.
- Range-based sharding with careful ranges: Define ranges that balance load.
- Re-sharding: Migrate data to new shards when imbalance occurs.
- Caching and read replicas: Reduce load on hot shards by caching frequent queries and serving reads from replicas.
- Monitoring: Continuously monitor shard load to detect imbalance early.
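Several of the techniques above (hashing, composite keys, monitoring) can be sketched together. This is a minimal sketch under stated assumptions: the helper names `shard_for`, `shard_loads`, and `hot_shards`, the 1.5x alert threshold, and the workload with one oversized tenant are all hypothetical.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 8  # assumed shard count for this sketch


def shard_for(key_parts, num_shards):
    """Map a (possibly composite) key to a shard using a stable hash."""
    raw = "|".join(str(part) for part in key_parts).encode("utf-8")
    digest = hashlib.sha256(raw).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shard_loads(keys, num_shards):
    """Count how many keys land on each shard."""
    counts = Counter(shard_for(key, num_shards) for key in keys)
    return [counts.get(shard, 0) for shard in range(num_shards)]


def hot_shards(loads, threshold=1.5):
    """Return shards whose load exceeds threshold times the mean load."""
    mean = sum(loads) / len(loads)
    return [shard for shard, load in enumerate(loads) if load > threshold * mean]


# Hypothetical workload: one large tenant produces 60% of all rows.
rows = [("tenant_big", f"user_{i}") for i in range(6000)]
rows += [(f"tenant_{i % 400}", f"user_{i}") for i in range(4000)]

# Key on tenant only: every row of tenant_big hashes to the same shard.
by_tenant = shard_loads([(tenant,) for tenant, _ in rows], NUM_SHARDS)

# Composite key (tenant, user): the large tenant is spread across shards.
by_tenant_and_user = shard_loads(rows, NUM_SHARDS)

print("tenant key hot shards:   ", hot_shards(by_tenant))
print("composite key hot shards:", hot_shards(by_tenant_and_user))
```

The composite key removes the hotspot, but a query for all rows of one tenant now has to touch several shards, which is the cross-shard cost noted in the table above. Which trade-off is acceptable depends on the access pattern.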
Assuming 1M users, each making 1 request per second:
- Total requests per second: ~1,000,000 QPS
- Single shard DB handles ~5,000 QPS → Need ~200 shards
- Storage per user: 1MB → Total 1TB data
- Network bandwidth per shard: 5,000 QPS * 1KB/request ≈ 5MB/s (40Mbps)
- Shard key must spread the load evenly across the ~200 shards: about 5GB of data (1TB / 200) and about 5,000 QPS per shard
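The estimate above can be reproduced with a few lines of arithmetic. The variable names are illustrative, and decimal units (1TB = 1,000,000MB) are assumed to match the round numbers in the list.

```python
import math

# Inputs taken from the estimate above.
users = 1_000_000
requests_per_user_per_sec = 1
qps_per_shard = 5_000
storage_per_user_mb = 1
request_size_kb = 1

total_qps = users * requests_per_user_per_sec
shards_needed = math.ceil(total_qps / qps_per_shard)

total_storage_tb = users * storage_per_user_mb / 1_000_000
storage_per_shard_gb = users * storage_per_user_mb / 1_000 / shards_needed

bandwidth_mb_per_sec = qps_per_shard * request_size_kb / 1_000
bandwidth_mbit_per_sec = bandwidth_mb_per_sec * 8  # 8 bits per byte

print(f"Total QPS:           {total_qps:,}")
print(f"Shards needed:       {shards_needed}")
print(f"Total storage:       {total_storage_tb} TB")
print(f"Storage per shard:   {storage_per_shard_gb} GB")
print(f"Bandwidth per shard: {bandwidth_mb_per_sec} MB/s ({bandwidth_mbit_per_sec} Mbps)")
```

These figures only hold if the shard key spreads load evenly. A skewed key means some shards exceed 5,000 QPS long before the cluster as a whole reaches 1,000,000 QPS.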
Start by explaining what a shard key is and why it matters. Then discuss how poor key choice leads to hotspots and bottlenecks. Next, describe how to pick a good key (high cardinality, hashing, composite keys). Finally, mention monitoring and re-sharding as ongoing solutions. Use examples and relate to real data distribution challenges.
Your database handles 1,000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?
Answer: First check whether the shard key distributes load evenly across shards. If it does not, introduce sharding or improve it with a better key or a hashed key so that traffic spreads across more shards. Also consider adding read replicas and caching to reduce load on the database.