
Long polling and Server-Sent Events in HLD - Scalability & System Analysis

Scalability Analysis - Long polling and Server-Sent Events
Growth Table: Long Polling and Server-Sent Events
| Users | Connections | Server Load | Network Usage | Latency |
|---|---|---|---|---|
| 100 | 100 concurrent | Low CPU and memory | Low bandwidth | Low; near real-time |
| 10,000 | 10,000 concurrent | High CPU and memory on a few servers | Moderate bandwidth | Some delay due to server load |
| 1,000,000 | 1,000,000 concurrent | Too many for a few servers; memory and CPU bottleneck | High bandwidth; risk of network saturation | Increased latency; possible dropped connections |
| 100,000,000 | 100,000,000 concurrent | Beyond a single data center; requires global distribution | Very high bandwidth; CDNs and edge servers needed | Depends on geo-distribution; complex load balancing |
First Bottleneck

The first bottleneck is the server's ability to maintain many concurrent open connections. Both long polling and Server-Sent Events (SSE) hold connections open, and each open connection consumes memory, a file descriptor, and some CPU. A typical tuned server hits its resource limits at roughly 10,000 to 50,000 concurrent connections. Network bandwidth is also a concern, since every connection either re-polls periodically or streams updates continuously.
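To see why the limit lands in the tens of thousands, a rough memory-bound capacity estimate helps. The figures below (16 GB of RAM, ~100 KB of buffers and state per open connection, 50% headroom left for the rest of the process) are illustrative assumptions, not measurements:

```python
def max_connections(ram_bytes: int, per_conn_bytes: int, headroom: float = 0.5) -> int:
    """Rough upper bound: the fraction of RAM usable for connection state,
    divided by the memory cost of one open connection."""
    return int(ram_bytes * headroom // per_conn_bytes)

# Assumed: 16 GB server, ~100 KB per open connection (socket buffers, TLS, app state)
print(max_connections(16 * 2**30, 100 * 2**10))  # → 83886, i.e. ~84k connections
```

In practice CPU, file-descriptor limits, and kernel tuning usually bite before this memory ceiling, which is why real-world numbers cluster lower, around 10,000 to 50,000.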

Scaling Solutions
  • Horizontal scaling: Add more servers behind a load balancer to distribute connections.
  • Connection multiplexing: Use protocols like HTTP/2 or WebSockets to reduce overhead per connection.
  • Reverse proxies: Use Nginx or Envoy to efficiently manage many open connections.
  • Offload static content: Use CDNs to reduce server bandwidth for static assets.
  • Sharding users: Distribute users across multiple servers or regions to reduce load per server.
  • Switch to WebSockets: For very high scale, WebSockets can be more efficient than long polling or SSE.
  • Optimize message frequency: Reduce how often servers send updates to reduce bandwidth and CPU.
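To make the connection-management discussion concrete, here is a minimal SSE endpoint sketched with Python's asyncio streams. It is illustrative only: the request handling is deliberately naive, and the port and 5-second heartbeat interval are assumptions, not part of the original analysis. The key point is that each client costs one long-lived coroutine rather than one thread, which is what makes tens of thousands of open connections per server feasible:

```python
import asyncio

def format_sse(data: str, event: str = "") -> bytes:
    """Encode one message in the text/event-stream wire format."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    lines.append(f"data: {data}")
    lines.append("")  # blank line terminates the event
    return ("\n".join(lines) + "\n").encode()

async def handle(reader: asyncio.StreamReader, writer: asyncio.StreamWriter) -> None:
    await reader.readline()  # consume the request line; headers ignored for brevity
    writer.write(b"HTTP/1.1 200 OK\r\n"
                 b"Content-Type: text/event-stream\r\n"
                 b"Cache-Control: no-cache\r\n\r\n")
    try:
        while True:  # the connection stays open; one coroutine per client
            writer.write(format_sse("tick", event="heartbeat"))
            await writer.drain()
            await asyncio.sleep(5)  # throttle update frequency to save bandwidth/CPU
    except ConnectionResetError:
        pass
    finally:
        writer.close()

async def main() -> None:
    server = await asyncio.start_server(handle, "0.0.0.0", 8080)
    async with server:
        await server.serve_forever()

# asyncio.run(main())  # uncomment to run the server
```

A real deployment would sit behind a reverse proxy such as Nginx with proxy buffering disabled, so the proxy rather than the application absorbs slow clients.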
Back-of-Envelope Cost Analysis

Assuming 10,000 users with SSE or long polling:

  • Each server handles ~10,000 concurrent connections (high but possible with optimized servers).
  • Each connection sends a small message every 5 seconds -> 0.2 messages per second per connection.
  • Total messages per second = 10,000 users * 0.2 messages/sec = 2,000 messages/sec.
  • Bandwidth per message ~1 KB -> 2,000 KB/s = ~2 MB/s bandwidth per server.
  • Memory per connection ~2 KB -> 10,000 connections * 2 KB = ~20 MB RAM just for connections.
  • CPU usage depends on message processing; expect moderate CPU load.

Scaling to 1 million users requires ~100 servers, 200,000 messages/sec, and ~200 MB/s bandwidth total.
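The arithmetic above can be reproduced as a short script; the constants mirror the assumptions stated in the list (a message every 5 seconds, ~1 KB per message, ~2 KB of state per connection):

```python
USERS = 10_000
MSG_INTERVAL_S = 5      # one small message per connection every 5 seconds
MSG_SIZE_B = 1_000      # ~1 KB per message
CONN_MEM_B = 2_000      # ~2 KB of state per open connection

msgs_per_sec = USERS / MSG_INTERVAL_S               # 2,000 messages/sec
bandwidth_mb_s = msgs_per_sec * MSG_SIZE_B / 1e6    # ~2 MB/s per server
conn_memory_mb = USERS * CONN_MEM_B / 1e6           # ~20 MB RAM for connection state

# Scaling to 1 million users at 10,000 connections per server:
servers = 1_000_000 // USERS                        # ~100 servers
total_msgs_per_sec = 1_000_000 / MSG_INTERVAL_S     # 200,000 messages/sec

print(msgs_per_sec, bandwidth_mb_s, conn_memory_mb, servers, total_msgs_per_sec)
```

Encoding the estimate this way makes it easy to re-run the numbers in an interview when an assumption (message size, interval, users per server) changes.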

Interview Tip

Start by explaining how long polling and SSE keep connections open and why that matters for scaling. Identify the server resource limits (memory, CPU, network). Discuss horizontal scaling and connection management techniques. Mention alternatives like WebSockets and CDNs. Always quantify with rough numbers to show understanding of scale.

Self Check

Your server handles 1,000 QPS with long polling connections. Traffic grows 10x to 10,000 QPS. What do you do first and why?

Key Result
Long polling and Server-Sent Events scale well up to tens of thousands of concurrent connections per server. Beyond that, server memory, CPU, and network bandwidth become bottlenecks. Horizontal scaling with load balancers and connection-efficient protocols is essential for millions of users.