| Scale | Users/Clients | Messages per Second | System Changes |
|---|---|---|---|
| Small | 100 users | 100-500 msg/s | Single broker, simple topic management, direct message delivery |
| Medium | 10,000 users | 10,000-50,000 msg/s | Multiple brokers, partitioned topics, message persistence, basic load balancing |
| Large | 1,000,000 users | 1M+ msg/s | Clustered brokers, topic sharding, advanced load balancing, message replication, durable storage |
| Very Large | 100,000,000 users | 100M+ msg/s | Geo-distributed clusters, multi-region replication, CDN integration, hierarchical topic routing |
Pub/sub pattern in HLD - Scalability & System Analysis
At small to medium scale, the message broker is the first bottleneck. It handles all message routing and delivery. As users and message rates grow, a single broker's CPU, memory, and network limits are reached quickly.
Network bandwidth between publishers, brokers, and subscribers becomes the next bottleneck as message volume increases, because every published message is delivered once per subscriber.
- Horizontal scaling: Add more broker instances and distribute topics among them (partitioning/sharding).
- Load balancing: Use load balancers to distribute client connections evenly across brokers.
- Caching: Use subscriber-side caching or edge caches for frequently requested messages.
- Message persistence: Store messages durably so subscribers can replay them and brokers can buffer traffic spikes instead of dropping messages.
- Geo-distribution: Deploy brokers in multiple regions to reduce latency and network load.
- CDN integration: For static or large messages, use CDNs to offload delivery.
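To make the horizontal scaling item concrete, here is a minimal sketch of distributing topics among brokers by hashing the topic name. The helper `broker_for_topic` and the broker names are hypothetical and used only for illustration. It assumes a fixed broker list; plain modulo hashing remaps most topics whenever the broker count changes, so a real deployment would more likely use consistent hashing or a managed partition assignment.

```python
import hashlib


def broker_for_topic(topic: str, brokers: list[str]) -> str:
    """Map a topic to one broker with a stable hash (hypothetical helper)."""
    if not brokers:
        raise ValueError("at least one broker is required")
    # hashlib produces the same digest in every process, unlike the
    # built-in hash(), which is randomized per interpreter run for strings.
    digest = hashlib.sha256(topic.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(brokers)
    return brokers[index]


# Assumed broker names, for illustration only.
brokers = ["broker-0", "broker-1", "broker-2"]
for topic in ["orders", "payments", "chat.room.42"]:
    print(topic, "->", broker_for_topic(topic, brokers))
```

Publishers and subscribers that share the same broker list resolve a topic to the same broker, so no central lookup is needed for routing.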
Assuming 10,000 messages per second at medium scale:
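- Each message ~1 KB → 10 MB/s of inbound (publish) bandwidth needed.
- Broker CPU: Assuming 1 server can handle ~5,000 msg/s → need at least 2 brokers, plus headroom for failover.
- Storage: For message persistence, 10,000 msg/s × 1 KB × 3600 s = ~36 GB/hour, before any replication.
- Network: A 1 Gbps link (~125 MB/s) is sufficient for inbound traffic at this scale. Outbound traffic is multiplied by the number of subscribers per message, so a fan-out above roughly 12 subscribers per message would exceed a single link.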
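The estimate above can be reproduced with a short calculation. This is a sketch under the same assumptions as the list: 1 KB messages, ~5,000 msg/s per broker, a 1 Gbps (~125 MB/s) link, and decimal units (1 MB = 1,000 KB). The function name `estimate_capacity` and its parameters are made up for this example.

```python
import math


def estimate_capacity(
    msgs_per_sec: int,
    msg_size_kb: float = 1.0,          # assumed average message size
    broker_msgs_per_sec: int = 5_000,  # assumed per-broker throughput
    link_mb_per_sec: float = 125.0,    # 1 Gbps expressed in MB/s
) -> dict:
    """Back-of-the-envelope sizing for publish (inbound) traffic only."""
    bandwidth_mb_s = msgs_per_sec * msg_size_kb / 1_000
    return {
        "bandwidth_mb_per_sec": bandwidth_mb_s,
        "min_brokers": math.ceil(msgs_per_sec / broker_msgs_per_sec),
        "storage_gb_per_hour": bandwidth_mb_s * 3_600 / 1_000,
        "link_utilization": bandwidth_mb_s / link_mb_per_sec,
    }


estimate = estimate_capacity(10_000)
print(estimate)
```

For 10,000 msg/s this gives 10 MB/s, 2 brokers, and 36 GB/hour, using about 8% of the link. Changing `msgs_per_sec` or `msg_size_kb` shows how quickly each resource grows at the larger scales in the table.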
Start by explaining the pub/sub components: publishers, subscribers, and brokers.
Discuss how message volume and user count affect broker load and network.
Identify the bottleneck (broker capacity) and propose scaling with partitioning and horizontal scaling.
Mention persistence and geo-distribution for reliability and latency.
Your message broker handles 1,000 messages per second. Traffic grows 10x to 10,000 msg/s. What do you do first?
Answer: Add more broker instances and partition topics across them to distribute load horizontally, because 10,000 msg/s exceeds what a single broker can handle (~5,000 msg/s in the estimate above).