
Request aggregation in Microservices - Scalability & System Analysis

Scalability Analysis - Request aggregation
Growth Table: Request Aggregation at Different Scales
| Users | Request Volume | Aggregation Load | Latency Impact | Infrastructure Changes |
|---|---|---|---|---|
| 100 | ~200 requests/sec | Single aggregator instance handles requests | Low latency, simple aggregation | Basic load balancer, 1 aggregator service |
| 10,000 | ~20,000 requests/sec | Aggregator CPU/memory starts to strain | Latency may increase due to queuing | Horizontal scaling of aggregator, caching introduced |
| 1,000,000 | ~2,000,000 requests/sec | Aggregator becomes a bottleneck; network saturation possible | Higher latency, possible timeouts | Sharded aggregation, distributed caches, async processing |
| 100,000,000 | ~200,000,000 requests/sec | Multiple aggregator clusters needed; data partitioning essential | Latency-sensitive; requires advanced load balancing | Global load balancing, CDN for static data, event-driven aggregation |
First Bottleneck

The aggregator service CPU and memory become the first bottleneck as it must combine multiple microservice responses per user request. At around 10,000 users, a single aggregator struggles to handle the volume due to CPU saturation and increased latency from queuing.
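The fan-out/fan-in pattern that makes the aggregator CPU-bound can be sketched as follows; the service names, payloads, and latencies are illustrative assumptions, not a real API:

```python
import asyncio

# Hypothetical downstream services; real calls would be HTTP/gRPC clients.
async def fetch_profile(user_id: str) -> dict:
    await asyncio.sleep(0.01)  # simulate network latency
    return {"user_id": user_id, "name": "example"}

async def fetch_orders(user_id: str) -> dict:
    await asyncio.sleep(0.01)
    return {"orders": []}

async def aggregate(user_id: str) -> dict:
    # Fan out to both services concurrently, then merge their responses
    # into one payload -- this merge/assembly work is what consumes
    # aggregator CPU and memory as request volume grows.
    profile, orders = await asyncio.gather(
        fetch_profile(user_id), fetch_orders(user_id)
    )
    return {**profile, **orders}

result = asyncio.run(aggregate("u-42"))
```

Because the downstream calls run concurrently, per-request latency is roughly the slowest dependency, but every in-flight request still holds buffers and does merge work on the aggregator itself.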

Scaling Solutions
  • Horizontal Scaling: Add more aggregator instances behind a load balancer to distribute request load.
  • Caching: Cache aggregated responses for repeated queries to reduce load on both the microservices and the aggregator.
  • Sharding: Partition aggregation by user segment or request type to parallelize processing.
  • Asynchronous Processing: Use event-driven patterns or message queues to decouple aggregation and smooth out latency spikes.
  • CDN: For static or semi-static aggregated data, use a CDN to offload traffic from the aggregator.
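The caching strategy above can be sketched with a minimal in-process TTL cache; a production system would typically use a shared store like Redis, and the key format and TTL here are assumptions:

```python
import time

class TTLCache:
    """Minimal in-process TTL cache for aggregated responses (illustrative)."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expiry_time, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expiry, value = entry
        if time.monotonic() > expiry:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)

# Cache an aggregated response so repeat requests skip the fan-out entirely.
cache = TTLCache(ttl_seconds=30.0)
cache.set("user:42:dashboard", {"name": "example", "orders": []})
hit = cache.get("user:42:dashboard")
```

The TTL is the freshness trade-off mentioned in the Interview Tip: a longer TTL cuts more aggregator load but serves staler data.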
Back-of-Envelope Cost Analysis
  • At 10,000 users: ~20,000 requests/sec (assuming 2 requests per user per second)
  • Aggregator CPU: Each instance handles ~5,000 concurrent requests; need ~4 instances minimum
  • Memory: Aggregation buffers and response assembly require sufficient RAM (e.g., 4-8GB per instance)
  • Network Bandwidth: 1 Gbps (~125 MB/s) per server; ensure aggregator instances have enough bandwidth for combined microservice responses
  • Storage: Minimal for aggregation itself; caching layer may require fast in-memory stores like Redis
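The arithmetic behind these estimates can be checked directly; the 5 KB average response size is an added assumption, while the other figures come from the bullets above:

```python
import math

# Back-of-envelope capacity check at 10,000 users.
users = 10_000
requests_per_user_per_sec = 2
total_qps = users * requests_per_user_per_sec  # 20,000 requests/sec

# Each aggregator instance handles ~5,000 concurrent requests.
per_instance_capacity = 5_000
instances_needed = math.ceil(total_qps / per_instance_capacity)  # 4 instances

# Assumed ~5 KB average aggregated response; compare against a
# 1 Gbps (~125 MB/s) per-server network limit.
avg_response_kb = 5
bandwidth_mb_per_sec = total_qps * avg_response_kb / 1024  # ~98 MB/s
```

Under these assumptions a single instance's NIC would already be near saturation at this response size, which is why bandwidth headroom belongs in the estimate alongside CPU.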
Interview Tip

Start by explaining the aggregation flow and identifying the component that combines multiple microservice responses. Discuss how load increases with users and which resource (CPU, memory, network) will saturate first. Then propose scaling strategies step by step, justifying each based on the bottleneck. Finally, mention trade-offs such as latency versus consistency and cache freshness.

Self Check Question

Your aggregator service handles 1,000 queries per second. Traffic grows 10x to 10,000 QPS. What is your first action, and why?

Key Result
The aggregator service CPU and memory become the first bottleneck as request volume grows; horizontal scaling and caching are the primary solutions to maintain low latency and handle increased load.