## WebSocket for Real-Time Communication in HLD: Scalability & System Analysis

| Users | Connections | Server Load | Network Traffic | Data Storage |
|---|---|---|---|---|
| 100 | ~100 | Single server handles all | Low: a few messages/sec | Minimal, mostly ephemeral |
| 10,000 | ~10,000 | Multiple servers behind a load balancer | Moderate: hundreds of messages/sec | Some persistent logs |
| 1,000,000 | ~1,000,000 | Many servers, horizontal scaling | High: thousands of messages/sec | Large logs, analytics data |
| 100,000,000 | ~100,000,000 | Massive cluster, sharding | Very high: millions of messages/sec | Distributed storage, archiving |
The first bottleneck is the WebSocket server's ability to maintain concurrent connections. Because each WebSocket is a long-lived TCP connection, a single server can typically sustain roughly 1,000 to 5,000 concurrent connections, depending on hardware, OS limits (e.g., file descriptors), and software optimizations. Beyond that point, CPU and memory usage spike, causing increased latency and dropped connections.
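Combining that per-server ceiling with the user tiers in the table gives a quick sizing estimate. A minimal sketch, assuming the ~5,000-connection upper bound from above (the function name is illustrative, not from any library):

```python
import math

# Upper-bound capacity from the text: ~5,000 concurrent connections per server.
CONNECTIONS_PER_SERVER = 5_000

def servers_needed(concurrent_connections: int,
                   per_server: int = CONNECTIONS_PER_SERVER) -> int:
    """Minimum server count to hold the given number of concurrent connections."""
    return math.ceil(concurrent_connections / per_server)

for users in (100, 10_000, 1_000_000, 100_000_000):
    print(f"{users:>11,} users -> {servers_needed(users):>6,} servers")
```

At 100 million concurrent users this works out to roughly 20,000 servers, which is why the largest tier in the table calls for a massive cluster with sharding rather than a flat pool behind one load balancer.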
- Horizontal scaling: Add more WebSocket servers behind a load balancer to distribute connections.
- Connection sharding: Partition users by region or user ID to route connections to specific servers.
- Use message brokers: Employ systems like Redis Pub/Sub or Kafka to distribute messages between servers.
- Caching: Cache frequent messages or state to reduce backend load.
- CDN for static content: Offload static assets to CDN to reduce server bandwidth.
- Optimize protocols: Use binary frames and compression to reduce bandwidth.
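The connection-sharding strategy above can be sketched as a stable hash from user ID to server. This is an illustrative helper, not a specific library's API; note that a cryptographic digest is used instead of Python's built-in `hash()`, which is salted per process and would route the same user differently after a restart:

```python
import hashlib

def shard_for_user(user_id: str, num_servers: int) -> int:
    """Map a user to a WebSocket server index with a stable hash.

    Every component (load balancer, message router) that computes this
    gets the same answer for the same user, so cross-server message
    delivery knows exactly which server holds the user's connection.
    """
    digest = hashlib.sha256(user_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_servers

# Routing is deterministic across processes and restarts:
print(shard_for_user("user-42", 8))
```

Trade-off: plain modulo hashing remaps most users whenever `num_servers` changes; consistent hashing reduces that churn and is the usual choice when servers are added or removed frequently.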
- At 10,000 users sending 1 message/sec each: 10,000 messages/sec to handle.
- Each message is ~1 KB -> ~10 MB/s of bandwidth.
- Storage for logs: 10,000 messages/sec x 1 KB x 3,600 sec = ~36 GB/hour.
- Network: a 1 Gbps link carries ~125 MB/s, so a single server's link easily absorbs 10 MB/s.
- CPU: each server can comfortably handle ~3,000-5,000 connections.
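The estimates above can be reproduced with a few lines of arithmetic. A sketch using decimal units (1 KB = 1,000 bytes) to match the text's round numbers:

```python
MESSAGES_PER_SEC = 10_000   # 10,000 users x 1 message/sec
MESSAGE_SIZE_KB = 1

# KB/s -> MB/s
bandwidth_mb_per_sec = MESSAGES_PER_SEC * MESSAGE_SIZE_KB / 1_000

# KB accumulated over one hour -> GB
storage_gb_per_hour = MESSAGES_PER_SEC * MESSAGE_SIZE_KB * 3_600 / 1_000_000

print(f"Bandwidth:   {bandwidth_mb_per_sec:.0f} MB/s")
print(f"Log storage: {storage_gb_per_hour:.0f} GB/hour")
```

Both results (10 MB/s, 36 GB/hour) confirm that at this scale the binding constraint is connection count and CPU, not network bandwidth.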
Start by explaining how WebSocket maintains persistent connections and why that limits per-server concurrency. Then discuss horizontally scaling servers and distributing connections across them. Mention message brokers for cross-server communication. Finally, cover bandwidth and storage considerations. Keep answers structured: bottleneck -> solution -> trade-offs.
Your WebSocket server handles 1,000 QPS (messages per second). Traffic grows 10x to 10,000 QPS. What do you do first and why?
Answer: Add more WebSocket servers and use a load balancer to distribute connections and message load. This horizontally scales capacity to handle increased concurrent connections and message throughput.
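The sizing behind that answer can be made concrete. A rough sketch; the 70% headroom factor is an assumption (common practice to absorb spikes), not something stated in the text:

```python
import math

CURRENT_QPS = 10_000     # traffic after the 10x growth
PER_SERVER_QPS = 1_000   # observed capacity of one server (from the question)
HEADROOM = 0.7           # assumption: target ~70% utilization to absorb spikes

# Servers needed behind the load balancer at the target utilization.
servers = math.ceil(CURRENT_QPS / (PER_SERVER_QPS * HEADROOM))
print(servers)
```

Without headroom the minimum is 10 servers; the utilization target pushes that to 15, which is the kind of trade-off (cost vs. spike tolerance) worth naming explicitly in an interview.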