| Users / Requests | Synchronous Communication | Asynchronous Communication |
|---|---|---|
| 100 users | Direct request-response calls work well; low latency; simple error handling | Message queues lightly used; delays minimal; easy to manage |
| 10,000 users | Increased latency; some request timeouts; servers start to block waiting for responses | Message queues handle bursts; decoupling improves resilience; some message backlog possible |
| 1 million users | High latency; many blocked threads; servers overwhelmed; cascading failures possible | Queues scale with partitions; consumers scale horizontally; eventual consistency accepted; better fault tolerance |
| 100 million users | System likely fails; synchronous calls cause bottlenecks; scaling very costly | Distributed queues with sharding; multiple consumer groups; complex monitoring; high throughput achievable |
## Synchronous vs. Asynchronous Communication in Microservices: Scaling Approaches Compared
In synchronous communication, the first bottleneck is the application server itself: threads sit blocked in the pool waiting on remote calls, tying up CPU and memory and driving up latency.
In asynchronous communication, the bottleneck shifts to the message broker, whose throughput and storage capacity must absorb the full message volume reliably.
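The contrast can be seen in a toy sketch below. The function name `slow_remote_call` and the payloads are hypothetical stand-ins; a real system would use an HTTP client and a broker such as Kafka or RabbitMQ rather than an in-process queue.

```python
import queue
import threading
import time

def slow_remote_call(payload: str) -> str:
    """Stand-in for a downstream service call (hypothetical)."""
    time.sleep(0.05)  # simulated network latency
    return payload.upper()

# --- Synchronous: the caller's thread blocks for the full call ---
def handle_sync(payload: str) -> str:
    return slow_remote_call(payload)  # thread is tied up until the reply arrives

# --- Asynchronous: the caller enqueues and returns immediately ---
work_queue = queue.Queue()
results = []

def consumer() -> None:
    while True:
        item = work_queue.get()
        if item is None:  # sentinel: shut down the worker
            break
        results.append(slow_remote_call(item))

worker = threading.Thread(target=consumer, daemon=True)
worker.start()

start = time.time()
work_queue.put("order-1")  # returns immediately; no blocking on the remote call
enqueue_latency = time.time() - start

work_queue.put(None)
worker.join()
print(f"enqueue took {enqueue_latency * 1000:.2f} ms; results: {results}")
```

The synchronous handler pays the full 50 ms per call on the caller's thread, while the asynchronous producer only pays the cost of an enqueue; the latency moves to the consumer, which can be scaled out independently.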
- Synchronous: Use load balancers and horizontal scaling of services to increase concurrent handling; implement timeouts and retries; introduce caching to reduce calls.
- Asynchronous: Scale message brokers horizontally with partitioning and replication; add more consumers to process queues in parallel; use backpressure and rate limiting; implement dead-letter queues for failures.
- For both, use circuit breakers to prevent cascading failures and improve system resilience.
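A circuit breaker can be sketched in a few lines. This is a minimal illustrative version (thresholds and names are arbitrary); production systems typically use a library such as resilience4j or Polly rather than hand-rolling one.

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: opens after N consecutive failures,
    fails fast while open, and allows a trial call after a cooldown."""

    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # timestamp when the circuit opened

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.time() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.time()  # trip the breaker
            raise
        self.failures = 0  # success resets the failure counter
        return result

# Demo: two failures trip the breaker; the third call is rejected
# without ever touching the downstream service.
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=60.0)

def flaky_downstream():
    raise ConnectionError("downstream unavailable")

for _ in range(2):
    try:
        breaker.call(flaky_downstream)
    except ConnectionError:
        pass

try:
    breaker.call(flaky_downstream)
    tripped = False
except RuntimeError:
    tripped = True
```

Failing fast like this is what stops one slow dependency from exhausting every caller's thread pool and cascading the failure upstream.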
Assuming 1 million active users, each generating 10 requests per second:
- Total load: 10 million requests/sec.
- Synchronous servers: at ~3,000 concurrent requests per server, 10,000,000 / 3,000 ≈ 3,333 servers are needed.
- Message broker: Needs to handle 10 million messages/sec; a single Kafka cluster can handle ~1 million messages/sec, so at least 10 clusters or partitions needed.
- Storage: 24-hour retention at 1 KB per message = 10,000,000 msg/sec × 1 KB × 86,400 sec ≈ 864 TB of storage.
- Network bandwidth: 10 million requests/sec * 1 KB = ~10 GB/s (~80 Gbps), requiring high network capacity.
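The estimates above can be checked with a short back-of-envelope script. The per-server capacity and per-cluster Kafka throughput are the same assumptions used in the bullets, not measured figures.

```python
# Back-of-envelope check of the capacity estimates above.
# Assumptions (from the text): 1 KB/message, ~3,000 concurrent
# requests per server, ~1M messages/sec per Kafka cluster.
USERS = 1_000_000
REQ_PER_USER_PER_SEC = 10
MSG_SIZE_BYTES = 1_000       # 1 KB per message
SECONDS_PER_DAY = 86_400

total_rps = USERS * REQ_PER_USER_PER_SEC             # 10,000,000 req/sec
servers = total_rps // 3_000                         # sync servers needed
kafka_clusters = total_rps // 1_000_000              # clusters (or partitions)
storage_tb = total_rps * MSG_SIZE_BYTES * SECONDS_PER_DAY / 1e12  # 24h retention
bandwidth_gbps = total_rps * MSG_SIZE_BYTES * 8 / 1e9             # sustained

print(f"{total_rps=:,} {servers=:,} {kafka_clusters=} "
      f"{storage_tb=:.0f} {bandwidth_gbps=:.0f}")
```

Running this confirms ~3,333 servers, ~10 clusters, ~864 TB of retention, and ~80 Gbps of sustained bandwidth.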
Start by defining synchronous and asynchronous communication clearly. Discuss pros and cons related to latency, coupling, and fault tolerance. Identify bottlenecks at different scales. Propose scaling strategies specific to each mode. Use real numbers to justify your approach. Show understanding of trade-offs and system resilience.
Your database handles 1000 QPS. Traffic grows 10x. What do you do first?
Answer: First, introduce caching and read replicas to offload reads from the primary database. If write throughput is still the bottleneck, shard the database to distribute data and queries across multiple instances.
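The first step, caching, usually follows the cache-aside pattern: check the cache, fall back to the database on a miss, then populate the cache. A minimal in-process sketch is below; the class and field names are illustrative, and a real deployment would use Redis or Memcached instead of a Python dict.

```python
import time

class CacheAsideReader:
    """Cache-aside read path: serve from cache when possible,
    read the database only on a miss, then cache the result with a TTL."""

    def __init__(self, db: dict, ttl_seconds: float = 60.0):
        self.db = db          # stand-in for the primary database
        self.ttl = ttl_seconds
        self.cache = {}       # key -> (value, expires_at)
        self.db_reads = 0     # counts how often we actually hit the database

    def get(self, key):
        entry = self.cache.get(key)
        if entry is not None and entry[1] > time.time():
            return entry[0]   # cache hit: database untouched
        self.db_reads += 1    # cache miss: one database read
        value = self.db[key]
        self.cache[key] = (value, time.time() + self.ttl)
        return value

reader = CacheAsideReader({"user:1": "Alice"})
first = reader.get("user:1")   # miss -> reads the database
second = reader.get("user:1")  # hit  -> served from cache
```

With a reasonable TTL and a skewed access pattern, repeated reads for hot keys never reach the database, which is exactly the load reduction the answer relies on before resorting to sharding.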