Pub/sub pattern in HLD - Scalability & System Analysis

Scalability Analysis - Pub/sub pattern
Growth Table: Pub/Sub Pattern Scaling
| Scale | Users/Clients | Messages per Second | System Changes |
|---|---|---|---|
| Small | 100 users | 100-500 msg/s | Single broker, simple topic management, direct message delivery |
| Medium | 10,000 users | 10,000-50,000 msg/s | Multiple brokers, partitioned topics, message persistence, basic load balancing |
| Large | 1,000,000 users | 1M+ msg/s | Clustered brokers, topic sharding, advanced load balancing, message replication, durable storage |
| Very Large | 100,000,000 users | 100M+ msg/s | Geo-distributed clusters, multi-region replication, CDN integration, hierarchical topic routing |
First Bottleneck

At small to medium scale, the message broker is the first bottleneck. It handles all message routing and delivery. As users and message rates grow, a single broker's CPU, memory, and network limits are reached quickly.

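Network bandwidth between publishers, brokers, and subscribers is the next constraint. Each published message is delivered to every subscriber of its topic, so outbound traffic grows with both the message rate and the number of subscribers.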
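To make the broker's role concrete, here is a minimal in-memory sketch in Python. It is a toy, not a real broker: the class name `InMemoryBroker`, its methods, and the `deliveries` counter are all hypothetical names chosen for illustration. It shows that every publish costs one delivery per subscriber, which is why a single broker saturates as topics gain subscribers.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str], None]


class InMemoryBroker:
    """Toy single-process broker (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        # topic name -> list of subscriber callbacks
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)
        self.deliveries = 0  # total handler invocations so far

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Deliver the message to every subscriber of the topic.

        Returns the fan-out, i.e. how many deliveries this publish caused.
        """
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        self.deliveries += len(handlers)
        return len(handlers)


broker = InMemoryBroker()
inbox_a: List[str] = []
inbox_b: List[str] = []
broker.subscribe("orders", inbox_a.append)
broker.subscribe("orders", inbox_b.append)

# One published message turns into two deliveries.
fan_out = broker.publish("orders", "order-1 created")
```

The work done in `publish` is proportional to the subscriber count, so the broker's CPU and outbound network load scale with message rate multiplied by fan-out, not with message rate alone.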

Scaling Solutions
  • Horizontal scaling: Add more broker instances and distribute topics among them (partitioning/sharding).
  • Load balancing: Use load balancers to distribute client connections evenly across brokers.
  • Caching: Use subscriber-side caching or edge caches for frequently requested messages.
  • Message persistence: Store messages durably so subscribers can replay them and slow consumers can catch up after load spikes.
  • Geo-distribution: Deploy brokers in multiple regions to reduce latency and network load.
  • CDN integration: For static or large messages, use CDNs to offload delivery.
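The partitioning idea in the first bullet can be sketched as a stable hash from topic name to broker. This is a minimal illustration under assumptions: the function `broker_for_topic` and the broker names are hypothetical, and real brokers typically use their own partition assignment schemes.

```python
import hashlib
from typing import Dict, List


def broker_for_topic(topic: str, brokers: List[str]) -> str:
    """Pick a broker using a stable hash of the topic name.

    hashlib is used instead of the built-in hash(), which is randomized
    per process for strings and would give different results on each run.
    """
    digest = hashlib.sha256(topic.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(brokers)
    return brokers[index]


# Hypothetical broker names, for illustration only.
brokers = ["broker-0", "broker-1", "broker-2"]
topics = ["orders", "payments", "alerts", "metrics"]

# Every publisher and subscriber computes the same mapping independently.
assignment: Dict[str, str] = {t: broker_for_topic(t, brokers) for t in topics}
```

One caveat with plain modulo hashing: changing the number of brokers remaps most topics. Consistent hashing is a common alternative when brokers are added or removed often.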
Back-of-Envelope Cost Analysis

Assuming 10,000 messages per second at medium scale:

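  • Each message ~1 KB → 10 MB/s of inbound (publisher-to-broker) bandwidth. Outbound bandwidth is this figure multiplied by the average number of subscribers per message.
  • Broker CPU: Assuming 1 server can handle ~5,000 msg/s → need at least 2 brokers, plus headroom for spikes and failover.
  • Storage: For message persistence, 10,000 msg/s × 1 KB × 3,600 s = ~36 GB/hour, before replication.
  • Network: 1 Gbps link (~125 MB/s) is sufficient for the 10 MB/s of inbound traffic at this scale.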
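The arithmetic above can be captured in a small calculator so the inputs are easy to vary. This is a sketch under stated assumptions: the function and field names are hypothetical, 1 KB is treated as 1,000 bytes, and the `fan_out` parameter (average subscribers per message) is an added assumption that defaults to 1.

```python
import math
from dataclasses import dataclass


@dataclass
class Estimate:
    ingress_mb_per_s: float     # publisher-to-broker traffic
    egress_mb_per_s: float      # broker-to-subscriber traffic
    brokers_needed: int         # minimum brokers, no headroom
    storage_gb_per_hour: float  # persisted bytes per hour, unreplicated
    link_utilization: float     # egress as a fraction of link capacity


def estimate_capacity(
    msgs_per_sec: int,
    msg_bytes: int = 1_000,            # ~1 KB per message
    broker_msgs_per_sec: int = 5_000,  # assumed single-broker throughput
    fan_out: int = 1,                  # assumed subscribers per message
    link_mb_per_s: float = 125.0,      # 1 Gbps is about 125 MB/s
) -> Estimate:
    ingress = msgs_per_sec * msg_bytes / 1e6
    egress = ingress * fan_out
    brokers = math.ceil(msgs_per_sec / broker_msgs_per_sec)
    storage = msgs_per_sec * msg_bytes * 3600 / 1e9
    return Estimate(ingress, egress, brokers, storage, egress / link_mb_per_s)


medium = estimate_capacity(10_000)
```

With the default inputs, `medium` reproduces the figures in the list: 10 MB/s inbound, 2 brokers, and 36 GB/hour. Raising `fan_out` shows how quickly outbound traffic approaches the link capacity.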
Interview Tip

Start by explaining the pub/sub components: publishers, subscribers, and brokers.

Discuss how message volume and user count affect broker load and network.

Identify the bottleneck (broker capacity) and propose scaling with partitioning and horizontal scaling.

Mention persistence and geo-distribution for reliability and latency.

Self Check

Your message broker handles 1,000 messages per second. Traffic grows 10x to 10,000 msg/s. What do you do first?

Answer: Add more broker instances and partition topics across them to distribute load horizontally, because 10x the traffic exceeds what a single broker sized for 1,000 msg/s can be expected to handle.

Key Result
The message broker is the first bottleneck in pub/sub systems; horizontal scaling with partitioned brokers and load balancing is key to handling growth from thousands to millions of messages per second.