| Users | Service Count | Data Volume | Traffic Pattern | Infrastructure |
|---|---|---|---|---|
| 100 users | Few (5-10) | Low (MBs) | Low, simple calls | Single server or small cluster |
| 10K users | 10-20 | GBs | Moderate, some spikes | Multiple servers, basic load balancing |
| 1M users | 20-50 | TBs | High, unpredictable spikes | Multiple clusters, service discovery, caching |
| 100M users | 50+ | PBs | Very high, global distribution | Multi-region clusters, advanced orchestration, CDNs |
Why case studies illustrate practical decisions in Microservices - Scalability Evidence
As a microservices system grows, the first bottleneck is usually communication between services. A single user request fans out into many network calls, so latency accumulates and keeping data consistent across services gets harder. The result is slower response times and more difficult debugging.
- Service Mesh: Adds a dedicated layer to handle service communication, retries, and security.
- API Gateway: Centralizes requests to reduce complexity for clients.
- Event-Driven Architecture: Uses asynchronous messaging to decouple services and improve scalability.
- Database Sharding: Splits data across multiple databases to reduce load.
- Horizontal Scaling: Adds more instances of each service behind load balancers.
- Caching: Uses distributed caches to reduce database hits.
- Monitoring and Tracing: Tracks requests across services for debugging and optimization.
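As one illustration of the database sharding item above, here is a minimal sketch of hash-based shard routing. The shard names and the `shard_for` helper are hypothetical, and modulo placement is just one simple strategy, not a recommendation for any particular database.

```python
import hashlib

# Hypothetical shard names, for illustration only.
SHARDS = ["users-db-0", "users-db-1", "users-db-2", "users-db-3"]

def shard_for(key: str, shards=SHARDS) -> str:
    """Map a key to a shard with a stable hash.

    Python's built-in hash() is salted per process for strings,
    so a cryptographic digest is used to keep routing stable across restarts.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]

# The same key always routes to the same shard.
print(shard_for("user:42") == shard_for("user:42"))
```

The trade-off with plain modulo placement is that changing the number of shards remaps most keys. Consistent hashing is the usual way to limit that movement.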
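At 1M users, assuming 10 requests per user per minute, the load is about 167,000 requests per second (RPS). If each microservice instance handles roughly 1,000-5,000 RPS, that is about 35 to 170 instances for the front-line load alone, and more once inter-service fan-out, headroom for spikes, and redundancy are included.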
Data storage grows to terabytes, requiring distributed databases and sharding.
Network bandwidth must support high inter-service communication; 1 Gbps links may saturate quickly, requiring multiple network interfaces or cloud bandwidth scaling.
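The estimate above can be checked with a short back-of-the-envelope sketch. The user count, request rate, and per-instance capacity come from the text. The 2 KB average payload is an assumption added here only to make the bandwidth figure concrete.

```python
import math

def required_instances(users, req_per_user_per_min, rps_per_instance):
    """Return (total RPS, instance count) for a given per-instance capacity."""
    total_rps = users * req_per_user_per_min / 60
    return total_rps, math.ceil(total_rps / rps_per_instance)

# Figures from the text: 1M users, 10 requests per user per minute.
total_rps, worst_case = required_instances(1_000_000, 10, 1000)
_, best_case = required_instances(1_000_000, 10, 5000)

# Assumption for illustration only: 2 KB average payload per request.
AVG_PAYLOAD_BYTES = 2 * 1024
gbps = total_rps * AVG_PAYLOAD_BYTES * 8 / 1e9

print(f"{total_rps:,.0f} RPS -> {best_case} to {worst_case} instances")
print(f"~{gbps:.1f} Gbps at {AVG_PAYLOAD_BYTES} bytes per request")
```

Under these assumptions the traffic comes to roughly 2.7 Gbps before any inter-service fan-out, which already exceeds a single 1 Gbps link.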
Start by describing the system at a small scale. Then explain what changes as users grow. Identify the first bottleneck clearly. Propose targeted solutions matching the bottleneck. Use real examples or case studies to show practical decisions. Finally, discuss trade-offs and costs.
Your database handles 1000 QPS. Traffic grows 10x. What do you do first?
Answer: Add read replicas and implement caching to reduce load on the primary database before considering sharding or more complex solutions.
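A minimal cache-aside sketch shows why caching is a cheap first step. `TTLCache` and `load_user_from_replica` are hypothetical names. The in-process dictionary stands in for a distributed cache, and the loader stands in for a query against a read replica.

```python
import time

class TTLCache:
    """Tiny cache-aside helper; a stand-in for a distributed cache."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = time.monotonic()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # cache hit: no database call
        value = loader(key)          # miss or expired: go to the database
        self._store[key] = (now + self.ttl, value)
        return value

calls = {"db": 0}  # counts how many reads reach the database

def load_user_from_replica(user_id):
    """Hypothetical read that would be served by a read replica."""
    calls["db"] += 1
    return {"id": user_id, "name": f"user-{user_id}"}

cache = TTLCache(ttl_seconds=30)
for _ in range(10_000):
    cache.get_or_load(7, load_user_from_replica)

print(calls["db"])  # 1: only the first read reached the database
```

The cost is staleness: a cached value can lag the database by up to the TTL, so the TTL should reflect how much stale data each read path can tolerate.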