WebSocket for real-time communication in HLD - System Design Exercise

Design: Real-Time Communication System using WebSocket
Design covers WebSocket server architecture, client connection management, message routing, and scaling strategies. Out of scope are client UI details and persistent message storage.
Functional Requirements
FR1: Support real-time bidirectional communication between clients and server
FR2: Handle up to 50,000 concurrent WebSocket connections
FR3: Deliver messages with p99 latency under 100ms
FR4: Support user authentication and authorization before connection
FR5: Allow broadcasting messages to multiple clients efficiently
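FR6: Preserve message ordering within each conversation or channel and provide a defined delivery guarantee (for example, at-least-once with duplicate suppression)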
FR7: Provide reconnection support for clients after network interruptions
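FR6 is easier to discuss with a concrete mechanism in mind. The following is a minimal sketch, assuming the server stamps every message in a channel with a monotonically increasing sequence number. The `seq` parameter and the `OrderedReceiver` class are illustrative names, not part of the original design. The receiver buffers out-of-order arrivals, ignores duplicates, and releases payloads strictly in order.

```python
class OrderedReceiver:
    """Delivers messages in sequence order, tolerating reordering and duplicates."""

    def __init__(self, first_seq: int = 1) -> None:
        self.next_seq = first_seq   # next sequence number that may be delivered
        self.pending = {}           # seq -> payload, waiting for a gap to be filled

    def receive(self, seq: int, payload: str) -> list:
        """Accept one message; return every payload that is now deliverable, in order."""
        if seq < self.next_seq or seq in self.pending:
            return []               # duplicate (for example a retransmission): ignore it
        self.pending[seq] = payload
        delivered = []
        while self.next_seq in self.pending:
            delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return delivered


receiver = OrderedReceiver()
out = []
# Message 3 arrives before 2, and 2 is retransmitted once.
for seq, payload in [(1, "a"), (3, "c"), (2, "b"), (2, "b"), (4, "d")]:
    out.extend(receiver.receive(seq, payload))
print(out)  # ['a', 'b', 'c', 'd']
```

A production receiver would also bound the size of `pending` and request retransmission when a gap persists; both are omitted here for brevity.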
Non-Functional Requirements
NFR1: System must maintain 99.9% uptime (approx. 8.77 hours downtime per year)
NFR2: Scale horizontally to handle increasing number of connections
NFR3: Use industry standard protocols and technologies
NFR4: Secure communication with TLS encryption
NFR5: Minimize server resource usage per connection
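The numbers in NFR1 and FR2 can be sanity-checked with a few lines of arithmetic. This sketch reproduces the downtime budget for 99.9% availability and estimates a server count. The figure of 10,000 connections per server and the single spare server are hypothetical planning assumptions, not values from the requirements.

```python
import math

HOURS_PER_YEAR = 365.25 * 24  # 8766 hours


def downtime_budget_hours(availability: float) -> float:
    """Allowed downtime per year for a given availability target (0.999 = 99.9%)."""
    return (1.0 - availability) * HOURS_PER_YEAR


def servers_needed(connections: int, per_server: int, spares: int = 1) -> int:
    """Servers required to hold all connections, plus spare capacity for failover."""
    return math.ceil(connections / per_server) + spares


budget = downtime_budget_hours(0.999)
print(f"{budget:.2f} hours/year")        # 8.77 hours/year
print(servers_needed(50_000, 10_000))    # 6 (5 for load, 1 spare; assumed capacity)
```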
Think Before You Design
Key Components
Load balancer to distribute incoming WebSocket connections
WebSocket server cluster to handle connections
Authentication service for validating users
Message broker or pub/sub system for routing messages
In-memory data store for session and connection state
TLS termination for secure communication
Design Patterns
Publish-Subscribe pattern for message distribution
Sticky sessions or session affinity for connection consistency
Horizontal scaling with WebSocket servers that hold only their own live connections and keep shared state in an external store
Backpressure handling to avoid server overload
Heartbeat and ping/pong for connection health checks
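Backpressure is the least obvious pattern in the list above, so here is one possible shape for it: a bounded per-connection send buffer. This is a sketch under assumptions. The `OutboundQueue` class is a hypothetical name, and the drop-oldest policy is only one option; disconnecting slow consumers is a common alternative.

```python
from collections import deque


class OutboundQueue:
    """Per-connection send buffer with a hard cap (drop-oldest policy, an assumption)."""

    def __init__(self, max_messages: int) -> None:
        self.buffer = deque(maxlen=max_messages)
        self.dropped = 0  # how many messages were evicted because the client was slow

    def enqueue(self, message: str) -> None:
        if len(self.buffer) == self.buffer.maxlen:
            self.dropped += 1  # a deque with maxlen evicts its oldest item on append
        self.buffer.append(message)

    def drain(self, n: int) -> list:
        """Remove and return up to n messages, oldest first, as the socket becomes writable."""
        return [self.buffer.popleft() for _ in range(min(n, len(self.buffer)))]


queue = OutboundQueue(max_messages=3)
for i in range(1, 6):
    queue.enqueue(f"m{i}")
print(queue.dropped)    # 2
print(queue.drain(2))   # ['m3', 'm4']
```

Capping the buffer keeps one slow client from consuming unbounded server memory, which ties this pattern back to NFR5.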
Reference Architecture
Client1 ---\
Client2 ----> Load Balancer ---> WebSocket Server Cluster ---> Message Broker ---> Other Servers
ClientN ---/

Authentication Service <--> WebSocket Server Cluster

In-Memory Store <--> WebSocket Server Cluster

TLS Termination at Load Balancer
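The fan-out path in the diagram (server, then broker, then other servers) can be simulated in a single process. The sketch below is illustrative only: `Broker` stands in for Redis Pub/Sub or Kafka, `WebSocketServer` records outgoing messages in an `outbox` list instead of writing to sockets, and all class and method names are assumptions.

```python
from collections import defaultdict


class Broker:
    """In-process stand-in for a pub/sub broker."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in list(self.subscribers[topic]):
            callback(topic, message)


class WebSocketServer:
    """One node of the cluster; tracks which local clients are in which room."""

    def __init__(self, name, broker):
        self.name = name
        self.broker = broker
        self.rooms = defaultdict(set)  # room -> client ids connected to this node
        self.outbox = []               # (client_id, message) pairs "sent" to local clients

    def connect(self, client_id, room):
        if not self.rooms[room]:
            # First local member of the room: subscribe this node to the topic once.
            self.broker.subscribe(room, self._on_broker_message)
        self.rooms[room].add(client_id)

    def on_client_message(self, room, message):
        self.broker.publish(room, message)  # never deliver directly; go through the broker

    def _on_broker_message(self, room, message):
        for client_id in sorted(self.rooms[room]):
            self.outbox.append((client_id, message))


broker = Broker()
ws1, ws2 = WebSocketServer("ws-1", broker), WebSocketServer("ws-2", broker)
ws1.connect("alice", "lobby")
ws2.connect("bob", "lobby")
ws2.connect("carol", "lobby")
ws1.on_client_message("lobby", "hi")
print(ws1.outbox)  # [('alice', 'hi')]
print(ws2.outbox)  # [('bob', 'hi'), ('carol', 'hi')]
```

In this sketch the sender's own server also receives the message back from the broker, so the sender sees an echo. Whether to suppress that echo is a design decision left open here.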
Components
Load Balancer
Nginx or AWS ALB
Distributes incoming WebSocket connection requests evenly across WebSocket servers and handles TLS termination
WebSocket Server Cluster
Node.js with ws library or similar
Manages WebSocket connections, authenticates users, sends and receives messages
Authentication Service
OAuth2- or JWT-based service
Validates user credentials before allowing WebSocket upgrade
Message Broker
Redis Pub/Sub (low latency, at-most-once delivery) or Apache Kafka (durable log, ordering per partition)
Routes messages between WebSocket servers for broadcasting and private messaging
In-Memory Data Store
Redis
Stores session data, connection metadata, and supports quick lookup for routing
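To illustrate how the Authentication Service and the WebSocket servers could cooperate, the sketch below issues and verifies an HMAC-signed token before the upgrade is accepted. The token format is a simplified, hypothetical one (base64 payload, a dot, base64 signature) and is not a real JWT. `SECRET`, `issue_token`, and `verify_token` are illustrative names, and a production system would use a maintained JWT or OAuth2 library instead.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical; load from a secret manager in practice


def _sign(payload_b64: bytes) -> bytes:
    digest = hmac.new(SECRET, payload_b64, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest)


def issue_token(user_id: str, expires_at: int) -> str:
    """Authentication Service side: sign the user id and an expiry timestamp."""
    payload = json.dumps({"sub": user_id, "exp": expires_at}).encode()
    payload_b64 = base64.urlsafe_b64encode(payload)
    return (payload_b64 + b"." + _sign(payload_b64)).decode()


def verify_token(token: str, now: int):
    """WebSocket server side: return the user id if authentic and unexpired, else None."""
    try:
        payload_b64, signature = token.encode().split(b".")
    except ValueError:
        return None  # not exactly two dot-separated parts
    if not hmac.compare_digest(signature, _sign(payload_b64)):
        return None  # payload was altered or signed with a different key
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    return claims["sub"] if claims["exp"] > now else None


token = issue_token("u42", expires_at=2000)
print(verify_token(token, now=1000))  # u42
print(verify_token(token, now=3000))  # None (expired)
```

Verifying the signature locally lets each WebSocket server reject bad handshakes without a network round trip to the Authentication Service, at the cost of sharing the signing key.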
Request Flow
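1. Client sends an HTTPS request with an Upgrade: websocket header to the Load Balancer to initiate the WebSocket handshake.
2. Load Balancer terminates TLS and forwards the request to a WebSocket server.
3. WebSocket server validates the client's credentials with the Authentication Service before accepting the upgrade.
4. On successful authentication, the server replies with HTTP 101 Switching Protocols and the WebSocket connection is established.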
5. Client sends messages over WebSocket to server.
6. WebSocket server publishes messages to Message Broker for distribution.
7. Other WebSocket servers subscribed to Message Broker receive messages and forward to connected clients.
8. Heartbeat messages (ping/pong) maintain connection health.
9. If a client disconnects, it reconnects and the server restores its session from the In-Memory Data Store.
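The reconnection step matters for scale: if thousands of clients retry at the same instant after an outage, they can overload the load balancer. A common mitigation is exponential backoff with jitter on the client. The sketch below assumes a base delay of 0.5 seconds and a cap of 30 seconds; both values and the `backoff_delays` name are hypothetical.

```python
import random


def backoff_delays(attempts: int, base: float = 0.5, cap: float = 30.0, rng=None) -> list:
    """Exponential backoff with full jitter.

    The delay before attempt i is drawn uniformly from [0, min(cap, base * 2**i)].
    """
    rng = rng or random.Random()
    return [rng.uniform(0.0, min(cap, base * 2 ** i)) for i in range(attempts)]


# A seeded generator is used here only to make the example repeatable.
delays = backoff_delays(8, rng=random.Random(1))
for attempt, delay in enumerate(delays):
    print(f"attempt {attempt}: wait up to {min(30.0, 0.5 * 2 ** attempt):.1f}s, chose {delay:.2f}s")
```

The randomization spreads reconnect attempts over time, so a restored server cluster sees a ramp of handshakes rather than a spike.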
Database Schema
Entities:
- User: user_id (PK), username, password_hash, auth_token
- Session: session_id (PK), user_id (FK), connection_id, last_active_timestamp
- Message: message_id (PK), sender_user_id (FK), recipient_user_id (nullable), group_id (nullable), content, timestamp
Relationships:
- User 1:N Session
- User 1:N Message (as sender)
- Message may be private (recipient_user_id) or broadcast (group_id)
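The entities above can be written down as plain data classes to make the nullable fields explicit. This is a sketch, not a storage layer. The field types are assumptions, and the rule that a message has exactly one of `recipient_user_id` or `group_id` is one reading of the schema rather than something it states outright.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    user_id: int
    username: str
    password_hash: str
    auth_token: str


@dataclass
class Session:
    session_id: str
    user_id: int              # references User.user_id
    connection_id: str
    last_active_timestamp: float


@dataclass
class Message:
    message_id: int
    sender_user_id: int       # references User.user_id
    content: str
    timestamp: float
    recipient_user_id: Optional[int] = None  # set for private messages
    group_id: Optional[int] = None           # set for broadcast messages

    def __post_init__(self) -> None:
        # Assumed invariant: a message targets either one user or one group, never both.
        if (self.recipient_user_id is None) == (self.group_id is None):
            raise ValueError("set exactly one of recipient_user_id or group_id")

    def is_private(self) -> bool:
        return self.recipient_user_id is not None


private = Message(1, sender_user_id=7, content="hi", timestamp=0.0, recipient_user_id=9)
broadcast = Message(2, sender_user_id=7, content="hello all", timestamp=1.0, group_id=3)
print(private.is_private(), broadcast.is_private())  # True False
```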
Scaling Discussion
Bottlenecks
WebSocket server CPU and memory limits due to large number of concurrent connections
Load balancer capacity to handle many simultaneous TLS handshakes
Message broker throughput limits when broadcasting to many clients
Network bandwidth constraints for high message volume
Session state synchronization across servers
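The bandwidth bottleneck is easy to underestimate, because a broadcast multiplies every message by the number of recipients. The estimate below uses hypothetical inputs (200-byte messages, 10 broadcasts per second) and ignores WebSocket framing and TLS overhead, so it is a lower bound.

```python
def broadcast_egress_mbps(recipients: int, message_bytes: int, messages_per_second: float) -> float:
    """Outbound bandwidth, in megabits per second, to send each message to every recipient."""
    bits_per_second = recipients * message_bytes * 8 * messages_per_second
    return bits_per_second / 1_000_000


# 50,000 recipients (FR2), assumed 200-byte payload, assumed 10 broadcasts per second.
print(broadcast_egress_mbps(50_000, 200, 10))  # 800.0
```

At roughly 800 Mbit/s for this modest workload, a single network interface would be close to saturation, which is why the egress has to be spread across the server cluster.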
Solutions
Scale WebSocket servers horizontally and use stateless design with shared session store
Use multiple load balancers with DNS round-robin or cloud provider scaling
Partition message topics in message broker and use sharding
Implement message compression and rate limiting to reduce bandwidth
Use distributed in-memory stores like Redis Cluster for session state replication
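Partitioning topics across broker shards needs a mapping that every server computes identically. The sketch below shows one simple option, a stable hash taken modulo the partition count. The `partition_for` name and the `room:<id>` topic format are hypothetical.

```python
import hashlib


def partition_for(topic: str, num_partitions: int) -> int:
    """Map a topic name to a partition index that is stable across processes.

    Python's built-in hash() is randomized per process for strings, so a
    cryptographic digest is used to get the same answer on every server.
    """
    digest = hashlib.sha256(topic.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


topics = [f"room:{i}" for i in range(1000)]
counts = [0] * 8
for topic in topics:
    counts[partition_for(topic, 8)] += 1
print(counts)  # roughly even spread across the 8 partitions
```

Plain modulo hashing remaps most topics whenever the partition count changes. Consistent hashing reduces that churn and is worth mentioning as the next refinement.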
Interview Tips
Time: Spend 10 minutes clarifying requirements and constraints, 20 minutes designing architecture and data flow, 10 minutes discussing scaling and trade-offs, 5 minutes summarizing.
Explain why WebSocket is chosen for bidirectional real-time communication
Discuss authentication before WebSocket upgrade for security
Describe how load balancer and server cluster handle scale and availability
Highlight message broker role in decoupling and efficient message routing
Address reconnection and session management for reliability
Mention trade-offs between consistency, latency, and resource usage