
Long polling and Server-Sent Events in HLD - System Design Exercise

Design: Real-time Notification System using Long Polling and Server-Sent Events
This design covers server architecture, client-server communication using Long Polling and Server-Sent Events, connection management, and fallback mechanisms. Message content design and authentication mechanisms are out of scope.
Functional Requirements
FR1: Deliver real-time notifications from server to clients
FR2: Support up to 10,000 concurrent clients
FR3: Ensure notifications are delivered with latency under 1 second
FR4: Allow clients to receive updates without frequent polling
FR5: Support fallback from Server-Sent Events to Long Polling if SSE is not supported
FR6: Handle client reconnections gracefully without losing messages
Non-Functional Requirements
NFR1: System must maintain 99.9% uptime (about 8.77 hours downtime per year)
NFR2: API response latency p99 under 500ms for connection establishment
NFR3: Support clients on modern browsers as well as legacy browsers without EventSource support
NFR4: Minimize server resource usage for idle connections
Think Before You Design
Questions to Ask
❓ Question 1: Must notifications be delivered in order, and is the delivery guarantee at-least-once or at-most-once?
❓ Question 2: How long should missed notifications be retained for clients that reconnect (FR6)?
❓ Question 3: Are notifications broadcast to all clients, or targeted per user or per topic?
❓ Question 4: What is the expected peak notification rate (events per second)?
❓ Question 5: Which legacy browsers must be supported, and do any clients sit behind proxies that buffer streaming responses?
Key Components
HTTP server capable of handling long-lived connections
Event source endpoint for Server-Sent Events
Long polling endpoint for fallback
Message queue or broker for notification delivery
Connection manager to track active clients
Client-side logic to handle SSE and fallback to long polling
Design Patterns
Publish-Subscribe pattern for notifications
Fallback pattern from SSE to Long Polling
Heartbeat or ping messages to keep connections alive
Exponential backoff for client reconnection attempts
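The reconnection backoff pattern above can be sketched as a pure delay calculator. The base delay, cap, and jitter ratio below are illustrative assumptions, not values fixed by the design; jitter is added so that many clients dropped at once do not reconnect in lockstep.

```javascript
// Compute the delay before reconnection attempt `attempt` (0-based):
// double the base delay each attempt, cap it, then add random jitter.
function backoffDelayMs(attempt, baseMs = 1000, capMs = 30000, jitterRatio = 0.2) {
  const exp = Math.min(capMs, baseMs * 2 ** attempt);
  const jitter = exp * jitterRatio * Math.random();
  return exp + jitter;
}
```

A client would call this after each failed connection attempt and reset the attempt counter once a connection succeeds.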
Reference Architecture
                    +----------------------+
                    |  Notification Source |
                    +----------+-----------+
                               |
                               v
                    +----------------------+
                    |    Message Broker    |
                    +----------+-----------+
                               |
        +----------------------+-----------------------+
        |                                              |
+-------v---------+                           +--------v-------+
| HTTP Server SSE |                           | HTTP Server LP |
| (Event Source)  |                           | (Long Polling) |
+-------+---------+                           +--------+-------+
        |                                              |
+-------v---------+                           +--------v-------+
| Connection      |                           | Connection     |
| Manager         |                           | Manager        |
+-------+---------+                           +--------+-------+
        |                                              |
        +----------------------+-----------------------+
                               |
                               v
                    +----------------------+
                    |       Clients        |
                    +----------------------+
Components
Notification Source
Any backend service or application
Generates notifications or events to be sent to clients
Message Broker
Redis Pub/Sub or RabbitMQ
Queues and distributes notifications to HTTP servers
HTTP Server SSE
Node.js with Express, streaming text/event-stream responses consumed by the browser EventSource API
Handles Server-Sent Events connections, pushes events to clients
HTTP Server LP
Node.js with Express
Handles long polling requests as fallback for clients without SSE support
Connection Manager
In-memory store or Redis
Tracks active client connections and manages reconnections
Clients
Modern browsers with EventSource API and fallback JavaScript
Receive notifications via SSE or long polling, handle reconnections
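The client-side fallback described above can be sketched as feature detection plus two connection paths. The `/events` and `/poll` endpoint names are assumptions for illustration; real code would also apply the reconnection backoff from the design patterns section.

```javascript
// Pick a transport based on what the runtime supports (FR5).
function pickTransport(env = globalThis) {
  return typeof env.EventSource === 'function' ? 'sse' : 'long-polling';
}

// Sketch of the client connection loop. With SSE, the browser keeps one
// persistent stream open; with long polling, each response immediately
// triggers the next request.
function connect(onNotification) {
  if (pickTransport() === 'sse') {
    const es = new EventSource('/events');
    es.onmessage = (e) => onNotification(JSON.parse(e.data));
  } else {
    const poll = () =>
      fetch('/poll')
        .then((res) => res.json())
        .then((msgs) => { msgs.forEach(onNotification); poll(); })
        .catch(() => setTimeout(poll, 1000)); // retry after transient errors
    poll();
  }
}
```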
Request Flow
1. Notification Source publishes an event to the Message Broker.
2. HTTP Servers subscribe to the Message Broker to receive notifications.
3. When a notification arrives, HTTP Server SSE pushes it to connected clients via SSE.
4. Clients with SSE support receive events in real time over a persistent HTTP connection.
5. Clients without SSE support send long polling requests to HTTP Server LP.
6. HTTP Server LP holds each request open until a notification is available or a timeout occurs.
7. Upon notification, HTTP Server LP responds, and the client immediately sends a new long polling request.
8. Connection Manager tracks active connections and handles reconnections with exponential backoff.
9. Heartbeat messages are sent periodically to keep connections alive and detect dropped clients.
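Steps 5 to 7 (holding a long-poll request open until a notification arrives or a timeout fires) can be sketched with an in-memory hub. This is a minimal single-process sketch; the class name and timeout behavior are illustrative, and a real deployment would drive publish() from the broker subscription.

```javascript
// In-memory hub implementing the long-polling hold (request flow steps 5-7).
class NotificationHub {
  constructor() {
    this.waiters = []; // open long-poll requests waiting for a message
    this.backlog = []; // messages published while no request was waiting
  }

  // Deliver a message to every held request, or queue it if none is open.
  publish(message) {
    const pending = this.waiters.splice(0);
    if (pending.length > 0) {
      pending.forEach((w) => w.deliver(message));
    } else {
      this.backlog.push(message);
    }
  }

  // Resolve immediately if messages are queued; otherwise hold the
  // request until publish() fires or the timeout elapses (step 6).
  wait(timeoutMs) {
    if (this.backlog.length > 0) {
      return Promise.resolve(this.backlog.splice(0));
    }
    return new Promise((resolve) => {
      const waiter = {};
      const timer = setTimeout(() => {
        this.waiters = this.waiters.filter((w) => w !== waiter);
        resolve([]); // timed out empty; the client re-polls (step 7)
      }, timeoutMs);
      waiter.deliver = (message) => {
        clearTimeout(timer);
        resolve([message]);
      };
      this.waiters.push(waiter);
    });
  }
}
```

An Express handler for `/poll` would simply `await hub.wait(timeout)` and send the resulting array as JSON.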
Database Schema
Entities:
- ClientConnection: {client_id (PK), connection_type (SSE or LP), last_heartbeat, status}
- Notification: {notification_id (PK), content, timestamp}
- Subscription: {client_id (FK), topic}
Relationships:
- One ClientConnection can have multiple Subscriptions
- Notifications are published to topics which clients subscribe to
- Connection Manager uses ClientConnection to track active clients and their connection types
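The last_heartbeat field is what lets the Connection Manager detect dropped clients (request flow step 9). A minimal pruning pass over ClientConnection-shaped records might look like the following; the 30-second threshold is an assumed tuning value, not part of the schema.

```javascript
// Mark connections stale when no heartbeat arrived within the threshold.
// `connections` mirrors the ClientConnection entity: an array of
// { client_id, connection_type, last_heartbeat (ms epoch), status }.
function pruneStaleConnections(connections, nowMs, thresholdMs = 30000) {
  return connections.map((c) =>
    nowMs - c.last_heartbeat > thresholdMs
      ? { ...c, status: 'disconnected' }
      : c
  );
}
```

Run periodically, this keeps the Connection Manager's view of active clients accurate without waiting for TCP-level failures to surface.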
Scaling Discussion
Bottlenecks
HTTP server resource exhaustion due to many open SSE connections
Message Broker overload with high notification volume
Connection Manager becoming a single point of failure
Network bandwidth limits for pushing notifications to many clients
Solutions
Use multiple HTTP server instances behind a load balancer with sticky sessions
Employ horizontal scaling and clustering for Message Broker
Distribute Connection Manager state using Redis or consistent hashing
Implement message batching and compression to reduce bandwidth usage
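The batching solution above can be sketched as a simple grouping step before messages are written to a response; the batch size of 50 is an assumed tuning value.

```javascript
// Group pending notifications into batches of at most `maxSize`, so each
// SSE push or long-poll response carries several messages instead of one.
function batchNotifications(messages, maxSize = 50) {
  const batches = [];
  for (let i = 0; i < messages.length; i += maxSize) {
    batches.push(messages.slice(i, i + maxSize));
  }
  return batches;
}
```

Combined with gzip or Brotli compression on the HTTP response, this reduces both per-message overhead and total bandwidth.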
Interview Tips
Time: Spend 10 minutes clarifying requirements and constraints, 20 minutes designing architecture and data flow, 10 minutes discussing scaling and trade-offs, 5 minutes summarizing.
Explain difference between Long Polling and Server-Sent Events clearly
Discuss fallback strategy and client compatibility
Highlight connection management and reconnection handling
Address scalability challenges and solutions
Mention latency and availability targets and how design meets them