| Users | System State | Changes |
|---|---|---|
| 100 users | Legacy monolith handles all requests | Minimal load, no new microservices yet |
| 10,000 users | Initial microservices start replacing parts of monolith | Routing layer added, some services extracted, moderate traffic |
| 1,000,000 users | Most features served by microservices, monolith shrinks | API gateway scales horizontally, microservices scale independently |
| 100,000,000 users | Monolith fully replaced, large microservices ecosystem | Advanced service mesh, global load balancing, data sharding |
Strangler fig pattern in Microservices - Scalability & System Analysis
At small scale, the legacy monolith is the bottleneck because it handles all requests and is hard to scale horizontally.
As traffic grows, the routing layer and API gateway can become bottlenecks if not scaled properly.
Eventually, the database shared by the monolith and the microservices can limit throughput, since both compete for the same read and write capacity.
- Incremental migration: Gradually replace monolith features with microservices to reduce risk.
- API Gateway: Use a scalable gateway to route requests to microservices or legacy system.
- Horizontal scaling: Add more instances of microservices and gateway to handle load.
- Caching: Cache common responses to reduce load on backend services.
- Database strategies: Use read replicas, sharding, or separate databases per microservice.
- Service mesh: Manage microservice communication efficiently at large scale.
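The routing idea behind the incremental migration can be sketched as a small facade: requests for features that have already been extracted go to the new microservice, and everything else falls through to the monolith. The handler names and routing table below are illustrative, not a specific framework's API.

```python
# Minimal sketch of a strangler-fig routing facade (hypothetical names).

def legacy_monolith(path):
    # Stand-in for the legacy system, which can handle any request.
    return f"monolith handled {path}"

def orders_service(path):
    # Stand-in for a feature already extracted into a microservice.
    return f"orders service handled {path}"

# The routing table grows as more features are strangled out of the monolith.
EXTRACTED_ROUTES = {
    "/orders": orders_service,
}

def gateway(path):
    # Route to a microservice if the feature has been extracted,
    # otherwise fall back to the monolith.
    for prefix, handler in EXTRACTED_ROUTES.items():
        if path.startswith(prefix):
            return handler(path)
    return legacy_monolith(path)

print(gateway("/orders/42"))   # served by the new microservice
print(gateway("/billing/7"))   # still served by the monolith
```

Migrating a feature is then just adding an entry to the routing table, which keeps the cutover reversible: removing the entry sends traffic back to the monolith.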
Back-of-envelope capacity estimate, assuming 1M users generate ~10,000 requests per second in total:
- Requests per second: ~10,000 QPS
- API Gateway instances: 4-5 servers (each handling ~2,000-3,000 QPS)
- Microservices: scaled independently, each 1-3 servers depending on load
- Database: read replicas to handle ~10,000 QPS, write capacity scaled or sharded
- Bandwidth: 10,000 QPS * 1 KB/request = ~10 MB/s (well within 1 Gbps network)
- Storage: depends on data retention, but microservices can use separate DBs to optimize
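The arithmetic behind these estimates can be checked directly. The per-instance gateway capacity and average request size below are the assumed figures from the list above, not measured values.

```python
import math

# Assumed inputs from the estimate above.
qps = 10_000             # total requests per second
gateway_capacity = 2_500 # assumed QPS one gateway instance can handle
request_size_kb = 1      # assumed average payload per request

# Gateway instances needed at the assumed per-instance capacity.
gateway_instances = math.ceil(qps / gateway_capacity)

# Bandwidth: 10,000 req/s * 1 KB/req = 10,000 KB/s = ~10 MB/s.
bandwidth_mb_s = qps * request_size_kb / 1000

print(gateway_instances)  # 4
print(bandwidth_mb_s)     # 10.0
```

At ~10 MB/s (~80 Mbps) the network is nowhere near saturating a 1 Gbps link, which is why the estimate treats bandwidth as a non-issue and focuses on gateway and database capacity.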
- Start by explaining the legacy monolith and its limitations.
- Describe how the strangler fig pattern incrementally replaces parts of it with microservices.
- Discuss bottlenecks at each stage and how to scale routing, services, and data.
- Highlight the trade-offs between risk, complexity, and scalability.
Your database handles 1000 QPS. Traffic grows 10x. What do you do first?
Answer: Add read replicas and implement caching to reduce database load before scaling writes or sharding.
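The caching half of that answer is typically the cache-aside pattern: serve repeated reads from a cache and only hit the database (or a read replica) on a miss. The sketch below uses an in-memory dict as the cache and a stub for the replica query; both names are illustrative stand-ins, not a real client library.

```python
# Sketch of the cache-aside pattern in front of read replicas
# (hypothetical names; the replica query is a stub).

cache = {}

def query_read_replica(key):
    # Stand-in for a SELECT against one of the read replicas.
    return f"value-for-{key}"

def read(key):
    # 1. Serve from cache when possible; this absorbs repeated reads
    #    and keeps them off the database entirely.
    if key in cache:
        return cache[key]
    # 2. On a miss, read from a replica and populate the cache so the
    #    next read for this key is a cache hit.
    value = query_read_replica(key)
    cache[key] = value
    return value

read("user:1")   # miss: goes to a replica, fills the cache
read("user:1")   # hit: served from the cache
```

Because both replicas and caching scale the read path without touching write semantics, they are lower-risk first steps than sharding, which changes the data layout and every query that touches it.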