| Users | Request Volume | Aggregation Load | Latency Impact | Infrastructure Changes |
|---|---|---|---|---|
| 100 users | ~200 requests/sec | Single aggregator instance handles requests | Low latency, simple aggregation | Basic load balancer, 1 aggregator service |
| 10,000 users | ~20,000 requests/sec | Aggregator CPU/memory starts to strain | Latency may increase due to queuing | Horizontal scaling of aggregator, caching introduced |
| 1,000,000 users | ~2,000,000 requests/sec | Aggregator becomes bottleneck; network saturation possible | Higher latency, possible timeouts | Sharded aggregation, distributed caches, async processing |
| 100,000,000 users | ~200,000,000 requests/sec | Multiple aggregator clusters needed; data partitioning essential | Latency sensitive; requires advanced load balancing | Global load balancing, CDN for static data, event-driven aggregation |
Request aggregation in Microservices - Scalability & System Analysis
The aggregator service CPU and memory become the first bottleneck as it must combine multiple microservice responses per user request. At around 10,000 users, a single aggregator struggles to handle the volume due to CPU saturation and increased latency from queuing.
- Horizontal Scaling: Add more aggregator instances behind a load balancer to distribute request load.
- Caching: Cache aggregated responses for repeated queries to reduce load on microservices and aggregator.
- Sharding: Partition aggregation by user segments or request types to parallelize processing.
- Asynchronous Processing: Use event-driven or message queues to decouple aggregation and reduce latency spikes.
- CDN: For static or semi-static aggregated data, use CDN to offload traffic from aggregator.
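The caching strategy in the list above can be sketched as a small time-to-live (TTL) cache wrapped around the aggregation call. This is a minimal illustration, not a production design: `createAggregateCache`, `getOrCompute`, and the cache key format are hypothetical names, and the cache lives in one process's memory, so a shared store would be needed once several aggregator instances run behind the load balancer.

```javascript
// Minimal in-memory TTL cache for aggregated responses (illustrative only).
// `now` is injectable so the expiry logic can be exercised without real waiting.
function createAggregateCache(ttlMs, now = () => Date.now()) {
  const entries = new Map(); // key -> { value, expiresAt }

  return async function getOrCompute(key, compute) {
    const hit = entries.get(key);
    if (hit && hit.expiresAt > now()) {
      return hit.value; // fresh entry: skip the downstream microservice calls
    }
    const value = await compute(); // miss or expired: run the real aggregation
    entries.set(key, { value, expiresAt: now() + ttlMs });
    return value;
  };
}
```

A caller would wrap its aggregation function, for example `getOrCompute("dashboard:" + userId, () => aggregate(userId))`. The TTL is the freshness trade-off: a longer TTL offloads more traffic but serves older data. Note that this sketch does not deduplicate concurrent misses for the same key.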
- At 10,000 users: ~20,000 requests/sec (assuming 2 requests per user per second)
- Aggregator CPU: Assuming each instance sustains ~5,000 requests/sec, 20,000 / 5,000 = 4 instances minimum (more with headroom for spikes)
- Memory: Aggregation buffers and response assembly require sufficient RAM (e.g., 4-8GB per instance)
- Network Bandwidth: 1 Gbps (~125 MB/s) per server; ensure aggregator instances have enough bandwidth for combined microservice responses
- Storage: Minimal for aggregation itself; caching layer may require fast in-memory stores like Redis
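The arithmetic behind the instance count can be written out as a back-of-envelope helper. The function name `estimateAggregatorInstances` and the `headroom` parameter are assumptions added for illustration; the input figures (2 requests per user per second, ~5,000 requests/sec per instance) come from the list above.

```javascript
// Back-of-envelope capacity estimate.
// headroom is an optional safety margin, e.g. 0.3 for 30% spare capacity.
function estimateAggregatorInstances(users, requestsPerUserPerSec, perInstanceRps, headroom = 0) {
  const totalRps = users * requestsPerUserPerSec;
  // Round up: a fractional instance still means one more machine.
  const instances = Math.ceil((totalRps * (1 + headroom)) / perInstanceRps);
  return { totalRps, instances };
}

const estimate = estimateAggregatorInstances(10000, 2, 5000);
console.log(estimate); // { totalRps: 20000, instances: 4 }
```

With no headroom this reproduces the 4-instance minimum. In practice the per-instance figure should come from load testing rather than an assumed constant.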
Start by explaining the aggregation flow and identify the component that combines multiple microservice responses. Discuss how load increases with users and which resource (CPU, memory, network) will saturate first. Then propose scaling strategies step-by-step, justifying each based on the bottleneck. Finally, mention trade-offs like latency vs consistency and caching freshness.
Your aggregator service handles 1000 queries per second. Traffic grows 10x to 10,000 QPS. What is your first action and why?
Practice
Solution
Step 1: Understand request aggregation concept
Request aggregation means collecting data from multiple microservices to form one combined response.
Step 2: Identify the main goal
The goal is to reduce multiple client calls into one, improving efficiency and user experience.
Final Answer:
To combine data from multiple microservices into a single response -> Option D
Quick Check:
Request aggregation = combine multiple responses [OK]
- Confusing aggregation with service splitting
- Thinking it only caches data
- Mixing aggregation with transaction management
Solution
Step 1: Review aggregator call patterns
Efficient aggregators call multiple services in parallel to reduce total wait time.
Step 2: Identify correct implementation
Parallel asynchronous calls improve performance and user experience compared to sequential calls.
Final Answer:
Make parallel calls to all required microservices and aggregate responses asynchronously -> Option A
Quick Check:
Parallel async calls = best aggregator practice [OK]
- Using sequential calls causing slow responses
- Ignoring some microservices in aggregation
- Trying to use database triggers for aggregation
async function aggregate() {
  const user = await getUser();
  const orders = await getOrders(user.id);
  const payments = await getPayments(user.id);
  return { user, orders, payments };
}
What is the main problem with this code?
Solution
Step 1: Analyze call sequence
The code waits for getUser, then calls getOrders and waits, then calls getPayments and waits, all sequentially.
Step 2: Identify inefficiency
Calling getOrders and getPayments one after another increases total wait time unnecessarily.
Final Answer:
It calls getOrders and getPayments sequentially, increasing total response time -> Option B
Quick Check:
Sequential calls = slower aggregation [OK]
- Assuming error handling is missing
- Thinking return format is incorrect
- Believing getUser is called multiple times
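One possible fix for the snippet in this question keeps `getUser` first, because the other two calls need `user.id`, and then runs `getOrders` and `getPayments` concurrently with `Promise.all`. The three service functions below are hypothetical stand-ins that resolve after a short delay; real clients would make network calls.

```javascript
// Hypothetical stand-ins for the service clients; each resolves after a delay.
const delay = (ms, value) => new Promise((resolve) => setTimeout(() => resolve(value), ms));
const getUser = () => delay(20, { id: 1, name: "Ada" });
const getOrders = (userId) => delay(50, [{ orderId: 10, userId }]);
const getPayments = (userId) => delay(50, [{ paymentId: 7, userId }]);

async function aggregate() {
  // getUser must finish first because the other calls depend on user.id.
  const user = await getUser();
  // Orders and payments are independent of each other, so start both
  // before awaiting either one.
  const [orders, payments] = await Promise.all([
    getOrders(user.id),
    getPayments(user.id),
  ]);
  return { user, orders, payments };
}
```

With these delays the sequential version would wait roughly 20 + 50 + 50 ms, while this version waits roughly 20 + 50 ms. `Promise.all` rejects as soon as either call fails, so it suits cases where every part of the response is required.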
Solution
Step 1: Understand error impact in aggregation
If one service fails, the aggregator should still return available data to avoid full failure.
Step 2: Choose error handling strategy
Returning partial data with error info improves user experience and system resilience.
Final Answer:
Ignore errors and return partial data with error info for failed services -> Option C
Quick Check:
Partial data + error info = robust aggregation [OK]
- Retrying endlessly causing delays
- Stopping all calls on one failure
- Caching errors permanently causing stale data
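Returning partial data with error info can be sketched with `Promise.allSettled`, which waits for every call and reports each outcome instead of rejecting on the first failure. The function name `aggregateWithPartialData` and the `{ data, errors }` response shape are assumptions chosen for illustration.

```javascript
// calls: an object mapping a service name to a function that returns a promise.
async function aggregateWithPartialData(calls) {
  const names = Object.keys(calls);
  // The async wrapper turns a synchronous throw into a rejection as well.
  const results = await Promise.allSettled(names.map(async (name) => calls[name]()));

  const data = {};
  const errors = {};
  results.forEach((result, i) => {
    if (result.status === "fulfilled") {
      data[names[i]] = result.value;
    } else {
      const reason = result.reason;
      errors[names[i]] = reason instanceof Error ? reason.message : String(reason);
    }
  });
  return { data, errors }; // clients can render data and flag the failed parts
}
```

A client receiving this response can show the sections that succeeded and display a placeholder for any name listed under `errors`.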
Solution
Step 1: Consider scalability needs
Parallel async calls reduce latency and improve throughput under load.
Step 2: Add timeout and fallback
Timeouts prevent long waits; fallback data keeps user experience smooth if a service is slow or down.
Step 3: Evaluate other options
Sequential calls and long caching reduce freshness and responsiveness; monolith loses microservices benefits; synchronous blocking hurts scalability.
Final Answer:
Use asynchronous parallel calls with timeout and fallback data for each microservice -> Option A
Quick Check:
Async parallel + timeout + fallback = scalable aggregator [OK]
- Using sequential calls causing slow response
- Relying on stale cached data too long
- Ignoring microservices benefits by monolith design
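The combination of parallel calls, per-service timeouts, and fallback data might look like the sketch below. `withTimeoutAndFallback`, `aggregateDashboard`, and the `{ call, fallback }` service description are hypothetical names; the timeout is built from `Promise.race` and `setTimeout`, and it only stops waiting for a slow call rather than cancelling it.

```javascript
// Resolve with the call's result, or with `fallback` if the call fails
// or does not finish within timeoutMs.
function withTimeoutAndFallback(call, timeoutMs, fallback) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), timeoutMs);
  });
  const attempt = Promise.resolve()
    .then(call)
    .catch(() => fallback); // a failed service degrades to its fallback
  return Promise.race([attempt, timeout]).finally(() => clearTimeout(timer));
}

// services: { name: { call: () => Promise, fallback: any } }
async function aggregateDashboard(services, timeoutMs) {
  const names = Object.keys(services);
  // All calls start together; total latency is bounded by timeoutMs.
  const values = await Promise.all(
    names.map((n) => withTimeoutAndFallback(services[n].call, timeoutMs, services[n].fallback))
  );
  return Object.fromEntries(names.map((n, i) => [n, values[i]]));
}
```

Because every wrapped call resolves rather than rejects, `Promise.all` never fails here, and the aggregator always answers within roughly `timeoutMs`. Truly cancelling a slow request would need something like an `AbortController` passed to the underlying client, which this sketch leaves out.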
