| Users / Data Size | 100 Users | 10K Users | 1M Users | 100M Users |
|---|---|---|---|---|
| Data Volume | Small, fits on single DB | Medium, single DB still possible | Large, single DB slow, storage limits | Very large, single DB impossible |
| Traffic (QPS) | Low (~10-100 QPS) | Moderate (~1K QPS) | High (~50K QPS) | Very High (~5M QPS) |
| DB Load | Single instance handles load | Near max capacity, some slow queries | DB CPU/memory bottleneck | DB storage and CPU overwhelmed |
| Latency | Low latency | Latency starts increasing | Latency high without sharding | Latency very high without sharding |
| Sharding Need | Not needed | Optional, for growth | Required for performance | Mandatory, complex sharding |
Database sharding strategies in HLD - Scalability & System Analysis
The database server is the first bottleneck as user count and data grow. It hits CPU, memory, and storage limits. Single-instance databases struggle with high query rates and large data volumes. This causes slow queries and increased latency.
- Horizontal Sharding: Split data across multiple database servers by a shard key (e.g., user ID range, geographic region). Each shard handles a subset of data and queries.
- Vertical Sharding: Separate tables or features into different databases to reduce load per DB.
- Directory-Based Sharding: Use a lookup service to map keys to shards, allowing flexible shard assignment.
- Consistent Hashing: Distribute data evenly and allow easy addition/removal of shards with minimal data movement.
- Caching: Use in-memory caches (e.g., Redis) to reduce DB reads.
- Read Replicas: Offload read queries to replicas to reduce master DB load.
- Connection Pooling: Efficiently manage DB connections to avoid overload.
- At 1M users, expect ~50K QPS. A single DB handles ~5K QPS max, so need ~10 shards.
- Storage per user assumed 1MB -> 1M users = ~1TB data, split across shards.
- Network bandwidth per shard: 5K QPS * 1KB/request ≈ 5MB/s, manageable on 1Gbps links.
- Adding shards increases operational complexity and cost (servers, monitoring).
Start by explaining the current system limits. Identify the database as the bottleneck. Discuss sharding types and trade-offs. Mention shard key choice importance. Explain how caching and replicas help. Show awareness of operational complexity and data consistency challenges.
Your database handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?
Answer: Introduce horizontal sharding to split load across multiple database servers. This reduces CPU and memory pressure on each DB instance and maintains low latency.