| Users | Connections | Server Load | Network Usage | Latency |
|---|---|---|---|---|
| 100 users | 100 concurrent connections | Low CPU and memory | Low bandwidth | Low latency, near real-time |
| 10,000 users | 10,000 concurrent connections | High CPU and memory on a few servers | Moderate bandwidth | Some delay due to server load |
| 1,000,000 users | 1,000,000 concurrent connections | Too much for a few servers; memory and CPU become the bottleneck | High bandwidth; risk of network saturation | Increased latency; possible dropped connections |
| 100,000,000 users | 100,000,000 concurrent connections | Beyond a single data center; requires global distribution | Very high bandwidth; CDNs and edge servers needed | Latency depends on geo-distribution; complex load balancing |
## Long Polling and Server-Sent Events in HLD: Scalability & System Analysis
The first bottleneck is the server's ability to maintain many concurrent open connections. Both long polling and Server-Sent Events (SSE) hold one connection open per client, consuming memory and CPU for each. A single server typically hits its resource limits somewhere around 10,000 to 50,000 concurrent connections, depending on tuning. Network bandwidth also becomes a concern, since every open connection periodically carries data.
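To make the "held connection" cost concrete, here is a minimal sketch of the core of a long-poll handler: each client request blocks on the server until new data arrives or the poll times out, which is exactly what ties up per-connection memory and a waiting thread. `LongPollChannel` and its version counter are hypothetical names for illustration, not part of any framework.

```python
import threading

class LongPollChannel:
    """Sketch of long-poll state: each poll() call parks a client
    connection until a newer message exists or the timeout expires."""

    def __init__(self):
        self._cond = threading.Condition()
        self._version = 0       # increments on every published message
        self._latest = None

    def publish(self, message):
        """Store a new message and wake every parked poller."""
        with self._cond:
            self._version += 1
            self._latest = message
            self._cond.notify_all()

    def poll(self, last_seen_version: int, timeout: float = 30.0):
        """Block until a message newer than last_seen_version arrives.
        On timeout, return no data; the client immediately re-polls."""
        with self._cond:
            self._cond.wait_for(lambda: self._version > last_seen_version,
                                timeout=timeout)
            if self._version > last_seen_version:
                return self._version, self._latest
            return last_seen_version, None
```

Every blocked `poll()` here corresponds to one open HTTP request in a real deployment; multiply that by tens of thousands of clients and the memory and scheduling cost of "just waiting" becomes the bottleneck described above.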
- Horizontal scaling: Add more servers behind a load balancer to distribute connections.
- Connection multiplexing: Use protocols like HTTP/2 or WebSockets to reduce overhead per connection.
- Use of reverse proxies: Employ Nginx or Envoy to efficiently manage many open connections.
- Offload static content: Use CDNs to reduce server bandwidth for static assets.
- Sharding users: Distribute users across multiple servers or regions to reduce load per server.
- Switch to WebSockets: For very high scale, WebSockets can be more efficient than long polling or SSE.
- Optimize message frequency: Reduce how often servers send updates to reduce bandwidth and CPU.
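The sharding bullet above can be sketched with a stable hash that maps each user to a fixed connection server, so reconnects land on the same shard. This is a simplified illustration (the function name and shard count are assumptions); production systems often use consistent hashing instead so that adding a server remaps only a fraction of users.

```python
import hashlib

def shard_for_user(user_id: str, num_shards: int) -> int:
    """Deterministically map a user to one of num_shards connection
    servers. A stable hash keeps a user's long-lived connection state
    on the same server across reconnects."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Example: spread 1,000,000 users across 100 servers of ~10,000
# connections each; the same user always resolves to the same shard.
shard = shard_for_user("user-42", 100)
```

Note the trade-off: plain modulo hashing is simple, but changing `num_shards` remaps almost every user, forcing mass reconnects.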
Assuming 10,000 users with SSE or long polling:
- Each server handles ~10,000 concurrent connections (high but possible with optimized servers).
- Each connection sends a small message every 5 seconds -> 0.2 messages per second per connection.
- Total messages per second = 10,000 users * 0.2 messages/sec = 2,000 messages/sec.
- Bandwidth per message ~1 KB -> 2,000 KB/s = ~2 MB/s bandwidth per server.
- Memory per connection ~2 KB -> 10,000 connections * 2 KB = ~20 MB RAM just for connections.
- CPU usage depends on message processing; expect moderate CPU load.
Scaling to 1 million users requires ~100 servers, 200,000 messages/sec, and ~200 MB/s bandwidth total.
Start by explaining how long polling and SSE keep connections open and why that matters for scaling. Identify the server resource limits (memory, CPU, network). Discuss horizontal scaling and connection management techniques. Mention alternatives like WebSockets and CDNs. Always quantify with rough numbers to show understanding of scale.
Your server handles 1,000 QPS with long polling connections. Traffic grows 10x to 10,000 QPS. What do you do first and why?