| Users | Infrastructure Focus | Key Changes |
|---|---|---|
| 100 users | Basic setup | Single server, simple network, minimal monitoring |
| 10,000 users | Scaling basics | Load balancers, database replicas, caching introduced |
| 1,000,000 users | Advanced scaling | Multiple data centers, sharding, CDN, automated failover |
| 100,000,000 users | Global infrastructure | Multi-region deployment, microservices, edge computing, extensive monitoring |

Why infrastructure design underpins everything in HLD - Scalability Evidence

At small scale, a simple infrastructure can handle all requests. As the user count grows, the first bottlenecks are usually network and server capacity: without a design that anticipates growth, servers become overwhelmed, causing slow responses or downtime. A poorly designed infrastructure also makes it hard to add servers or balance load, so the system breaks under higher traffic.

Common ways to address these bottlenecks:
- Horizontal scaling: Add more servers behind load balancers to share traffic.
- Vertical scaling: Upgrade server CPU, RAM, and network capacity.
- Caching: Use in-memory caches to reduce database load and speed responses.
- Sharding: Split databases by user or data type to distribute load.
- Content Delivery Network (CDN): Serve static content closer to users to reduce latency and bandwidth use.
- Automation and monitoring: Detect and respond to failures quickly to maintain uptime.
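
Two of the techniques in the list above, sharding and caching, can be sketched in a few lines. This is a minimal illustration rather than a production design: `shard_for`, `CacheAside`, and `load_from_db` are hypothetical names chosen for this example, and the cache has no eviction or invalidation.

```python
import hashlib


def shard_for(user_id: int, shard_count: int) -> int:
    """Map a user ID to a shard index in [0, shard_count)."""
    # A stable hash gives the same mapping on every server and spreads
    # sequential IDs across shards.
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class CacheAside:
    """Cache-aside lookup: check memory first, fall back to the database."""

    def __init__(self, load_from_db):
        # load_from_db is a hypothetical function standing in for a slow query.
        self._load_from_db = load_from_db
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        # Cache miss: query the database once, then remember the result.
        self.misses += 1
        value = self._load_from_db(key)
        self._cache[key] = value
        return value
```

Note that simple modulo sharding remaps most keys when `shard_count` changes, so a real system that expects to add shards would likely use a scheme such as consistent hashing instead.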

For 1 million users, assuming 10% are active concurrently (100,000 users):
- Requests per second (RPS): ~100,000 (assuming 1 request per second per active user)
- Servers needed: ~20 (assuming each server handles ~5,000 concurrent connections)
- Database QPS: up to ~100,000 if every request reaches the database (may require sharding or replicas)
- Bandwidth: a 1 Gbps link carries ~125 MB/s, which at 100,000 RPS allows only ~1.25 KB per request, so larger responses need multiple 1 Gbps links or a 10 Gbps link
- Storage: depends on data size, but expect terabytes for logs, user data, and backups
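
The arithmetic behind these estimates can be reproduced with a short script. This is a back-of-envelope sketch: the function name and parameters are made up for this example, the 2 KB average response size is an assumption not stated in the text, and it assumes one database query per request.

```python
import math


def estimate_capacity(
    total_users: int,
    concurrent_percent: int = 10,
    requests_per_user_per_sec: int = 1,
    connections_per_server: int = 5_000,
    avg_response_bytes: int = 2_000,  # assumption: not given in the text
    link_mbps: int = 1_000,  # one 1 Gbps link
) -> dict:
    """Rough capacity numbers for a given user count."""
    active_users = total_users * concurrent_percent // 100
    rps = active_users * requests_per_user_per_sec
    servers = math.ceil(active_users / connections_per_server)
    # Outbound traffic in bytes per second, and what one link can carry.
    bytes_per_sec = rps * avg_response_bytes
    link_bytes_per_sec = link_mbps * 1_000_000 // 8
    links = math.ceil(bytes_per_sec / link_bytes_per_sec)
    return {
        "active_users": active_users,
        "rps": rps,
        "db_qps": rps,  # assumption: one database query per request
        "servers": servers,
        "bandwidth_mb_per_sec": bytes_per_sec / 1_000_000,
        "links_needed": links,
    }


estimate = estimate_capacity(1_000_000)
```

With the defaults, `estimate` reports 100,000 RPS, 20 servers, 200 MB/s of outbound traffic, and 2 links of 1 Gbps each. Changing `avg_response_bytes` shows how quickly response size dominates the bandwidth figure.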

Start by describing the current infrastructure and its limits. Then explain what breaks first as users grow. Next, propose specific scaling solutions tied to those bottlenecks. Finally, discuss trade-offs and costs. Use clear examples and relate them to real systems to show understanding.

Your database handles 1,000 queries per second (QPS). Traffic grows 10x to 10,000 QPS. What do you do first?

Answer: Add read replicas and implement caching to reduce direct database load before considering sharding or hardware upgrades. This assumes a read-heavy workload, since replicas and caches offload reads but not writes.
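
To see why caching and replicas come first, it helps to estimate how much load is left on each database node afterwards. The sketch below uses illustrative numbers (a 90% read share and an 80% cache hit rate) that are assumptions, not figures from the question. It also assumes replicas serve all uncached reads and ignores replication lag.

```python
def database_load(total_qps, read_percent, cache_hit_percent, read_replicas):
    """Return (primary_qps, per_replica_qps) after caching and replicas."""
    reads = total_qps * read_percent / 100
    writes = total_qps - reads
    # Only reads that miss the cache reach a database node.
    cache_misses = reads * (100 - cache_hit_percent) / 100
    if read_replicas > 0:
        # Writes stay on the primary; uncached reads spread over replicas.
        primary_qps = writes
        per_replica_qps = cache_misses / read_replicas
    else:
        primary_qps = writes + cache_misses
        per_replica_qps = 0.0
    return primary_qps, per_replica_qps


before = database_load(10_000, read_percent=90, cache_hit_percent=0, read_replicas=0)
after = database_load(10_000, read_percent=90, cache_hit_percent=80, read_replicas=2)
```

Under these assumptions the primary drops from 10,000 QPS to 1,000 QPS (writes only), and each of the two replicas serves 900 QPS, all within the original 1,000 QPS capacity. If the workload were write-heavy instead, the primary would stay overloaded, which is when sharding or hardware upgrades become necessary.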
