Request-Response vs. Event-Driven in Microservices: Scaling Approaches Compared

| Users/Traffic | Request-response | Event-driven |
|---|---|---|
| 100 users | A single server handles synchronous calls easily; low latency | A single event broker handles events; simple queues |
| 10K users | Load balancers and DB connection pooling needed; some latency increase | Broker scales with partitions; async processing smooths spikes |
| 1M users | DB becomes the bottleneck; sync calls time out; scaling app servers is costly | Brokers partitioned; microservices consume events independently; better throughput |
| 100M users | Sync calls cause severe latency; DB sharding required; complex retry logic | Multiple broker clusters; event storage and replay; eventual consistency accepted |
In request-response, the database and synchronous waiting create the first bottleneck as user count grows: the caller blocks until it gets a reply, so a slow downstream service cascades into slowdowns and timeouts.
In event-driven, the event broker (message queue) can become the bottleneck if it is not partitioned or scaled, but asynchronous processing reduces direct load on the database and services.
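A minimal in-process sketch of this difference, assuming toy names (`handle_request`, `publish`, `drain_events` are illustrative stand-ins, not a real broker API). The key point it shows: the producer's latency in the event-driven path is just an enqueue, independent of how long processing takes.

```python
import queue
import time

# --- Request-response: the caller blocks until the handler returns ---
def handle_request(payload):
    time.sleep(0.05)          # simulate downstream work (DB write, etc.)
    return {"status": "done", "payload": payload}

def sync_call(payload):
    # The caller's observed latency includes the full handler time.
    return handle_request(payload)

# --- Event-driven: the producer enqueues and returns immediately ---
events = queue.Queue()

def publish(event):
    events.put(event)         # enqueue only; producer never waits on processing

def drain_events():
    # A consumer processes events at its own pace, decoupled from producers.
    processed = []
    while not events.empty():
        processed.append(handle_request(events.get()))
    return processed

start = time.perf_counter()
sync_call({"order": 1})
sync_latency = time.perf_counter() - start

start = time.perf_counter()
publish({"order": 2})
publish_latency = time.perf_counter() - start

results = drain_events()
```

The trade-off is visible even in this toy: the event is processed later, so the producer cannot assume the work is done when `publish` returns.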
- Request-response: use load balancers, scale app servers horizontally, add DB read replicas and sharding, and use caching layers to reduce DB hits.
- Event-driven: partition event brokers (e.g., Kafka partitions), scale consumers horizontally, use durable event storage, implement backpressure and retry mechanisms, and accept eventual consistency.
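The backpressure and retry items above can be sketched in a few lines. This is an in-memory illustration under assumed names (`publish`, `process_with_retry`, `dead_letters` are hypothetical), not a broker client: a bounded queue makes producers feel consumer lag, and exponential backoff with a dead-letter list handles transient consumer failures.

```python
import queue
import time

# Bounded queue acts as backpressure: when consumers lag, producers block
# briefly on put() and can then shed load instead of overwhelming the system.
events = queue.Queue(maxsize=100)
dead_letters = []   # events that exhausted their retries

def publish(event, timeout=0.1):
    try:
        events.put(event, timeout=timeout)   # blocks while the queue is full
        return True
    except queue.Full:
        return False   # caller decides: drop, buffer elsewhere, or retry later

def process_with_retry(event, handler, max_attempts=3, base_delay=0.01):
    # Retry transient failures with exponential backoff; after max_attempts,
    # park the event in the dead-letter list for later inspection.
    for attempt in range(max_attempts):
        try:
            return handler(event)
        except Exception:
            time.sleep(base_delay * 2 ** attempt)
    dead_letters.append(event)
    return None
```

Real brokers expose the same ideas as configuration (bounded buffers, retry policies, dead-letter topics); the mechanics are what matters here.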
Assuming 1M users each generating 10 requests per second:
- Request-response: 10M QPS total; a single DB handles ~10K QPS, so ~1000 DB replicas or shards are needed; app servers scale to thousands of instances; network bandwidth is high because every call is synchronous.
- Event-driven: 10M events/sec; a Kafka cluster can handle roughly 1M events/sec, so ~10 clusters (or heavy partitioning) are needed; consumers scale horizontally; network load is spread over time.
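The back-of-envelope math above, written out (the per-node throughput figures are the rough assumptions stated in the text, not measured values):

```python
import math

# Capacity assumptions from the scenario above (rough, order-of-magnitude).
users = 1_000_000
req_per_user_per_sec = 10
total_qps = users * req_per_user_per_sec        # 10,000,000 requests/sec

# Request-response: assume ~10K QPS per DB node.
db_node_qps = 10_000
db_nodes = math.ceil(total_qps / db_node_qps)   # ~1000 replicas/shards

# Event-driven: assume ~1M events/sec per broker cluster.
cluster_eps = 1_000_000
clusters = math.ceil(total_qps / cluster_eps)   # ~10 clusters
```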
How to structure the answer: explain the core difference (request-response is synchronous; event-driven is asynchronous), discuss the bottlenecks each model hits as traffic grows, then propose scaling strategies matching those bottlenecks. Highlight trade-offs such as latency vs. throughput and the consistency model.
Question: Your database handles 1,000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?
Answer: Add read replicas and caching to reduce DB load, then consider sharding or moving to async/event-driven patterns to handle higher throughput.