| Users / Traffic | Data Size | Query Load | System Changes |
|---|---|---|---|
| 100 users | Small (MBs) | Low (few QPS) | Single DB instance, simple indexing |
| 10,000 users | Medium (GBs) | Moderate (hundreds QPS) | Read replicas, connection pooling, optimized queries |
| 1,000,000 users | Large (TBs) | High (thousands QPS) | Sharding, caching layers, partitioning, load balancing |
| 100,000,000 users | Very Large (PBs) | Very High (tens of thousands QPS) | Multi-region clusters, advanced sharding, distributed transactions, strong caching/CDN |
Relational database strengths in HLD - Scalability & System Analysis
At small to medium scale, the database CPU and disk I/O become the first bottleneck because relational databases rely on complex joins and ACID transactions that consume CPU and disk resources.
As traffic grows, connection limits and query latency increase, causing slow responses.
- Vertical scaling: Upgrade server CPU, RAM, and faster disks to handle more queries.
- Read replicas: Offload read queries to replicas to reduce load on primary DB.
- Connection pooling: Reuse DB connections to reduce overhead.
- Caching: Use in-memory caches (e.g., Redis) for frequent queries to reduce DB hits.
- Sharding: Split data horizontally across multiple DB instances by key ranges or user IDs.
- Partitioning: Divide large tables into smaller parts to improve query performance.
- Load balancing: Distribute traffic evenly across app servers and DB replicas.
- Multi-region deployment: Place DBs closer to users to reduce latency.
Assuming 1 million users with 1 request per second each:
- Requests per second: ~1,000,000 QPS (too high for single DB)
- Single PostgreSQL instance handles ~5,000 QPS → need ~200 DB instances or heavy caching
- Storage: 1 TB data growing at 10 GB/day requires scalable storage solutions
- Network bandwidth: 1 Gbps = 125 MB/s, so multiple network interfaces or cloud bandwidth needed
Start by explaining the strengths of relational databases: strong consistency, ACID transactions, and complex queries.
Then discuss how these strengths impact scaling: joins and transactions add CPU and I/O load.
Outline bottlenecks clearly and propose solutions step-by-step: vertical scaling, read replicas, caching, sharding.
Use real numbers to show understanding of limits and trade-offs.
Question: Your database handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first and why?
Answer: Add read replicas to offload read queries and implement connection pooling to reduce overhead. This addresses the immediate CPU and connection bottlenecks without major redesign.