
Pub/sub pattern in HLD - System Design Exercise


Design: Publish/Subscribe Messaging System

Design the core pub/sub messaging system including message routing, topic management, and delivery guarantees. Out of scope: detailed client SDKs, security/authentication mechanisms, and UI.

Functional Requirements

FR1: Allow multiple publishers to send messages without knowing who the subscribers are

FR2: Allow multiple subscribers to receive messages they are interested in

FR3: Support message filtering by topic or category

FR4: Ensure messages are delivered to all relevant subscribers

FR5: Handle at least 10,000 concurrent publishers and subscribers

FR6: Provide message delivery latency under 200ms (p99)

FR7: Support message persistence for reliability

Non-Functional Requirements

NFR1: System must be highly available with 99.9% uptime

NFR2: Must scale horizontally to handle increasing load

NFR3: Messages should be delivered in near real-time

NFR4: Subscribers should not block publishers
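To make NFR1 concrete, the arithmetic below converts 99.9% uptime into a downtime budget. It is only a sketch: the function name and the 365-day and 30-day periods are illustrative assumptions, not part of the exercise.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def downtime_budget_seconds(availability: float, period_seconds: float) -> float:
    """Return the maximum downtime allowed by an availability target over a period."""
    return (1.0 - availability) * period_seconds


# 99.9% uptime from NFR1, evaluated over two assumed reporting periods.
yearly = downtime_budget_seconds(0.999, 365 * SECONDS_PER_DAY)
monthly = downtime_budget_seconds(0.999, 30 * SECONDS_PER_DAY)

print(f"per year: {yearly / 3600:.2f} hours")       # per year: 8.76 hours
print(f"per 30 days: {monthly / 60:.1f} minutes")   # per 30 days: 43.2 minutes
```

A budget of roughly 43 minutes per month leaves little room for manual failover, which is one reason the design leans on broker clustering.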

Think Before You Design


Key Components

Message Broker or Broker Cluster

Publisher API

Subscriber API

Topic Manager

Message Queue or Buffer

Delivery Worker or Dispatcher

Persistence Storage for messages

Cache for fast topic lookup

Design Patterns

Fan-out pattern for message distribution

Message queue for buffering

Event-driven architecture

Load balancing across brokers

Backpressure handling for slow consumers
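Two of these patterns, fan-out and backpressure, can be shown together in a few lines. The sketch below is a single-process, in-memory illustration only. The class `FanOutBroker`, the `max_buffer` parameter, and the drop-oldest policy are assumptions chosen for the example; a real broker might block, spill to disk, or disconnect slow consumers instead.

```python
from collections import defaultdict, deque


class FanOutBroker:
    """Copies each published message into a bounded buffer per subscriber."""

    def __init__(self, max_buffer: int = 3):
        self.max_buffer = max_buffer
        # topic -> {subscriber_id: buffer of undelivered payloads}
        self.subscribers = defaultdict(dict)
        # subscriber_id -> number of messages dropped because the buffer was full
        self.dropped = defaultdict(int)

    def subscribe(self, topic: str, subscriber_id: str) -> None:
        self.subscribers[topic][subscriber_id] = deque()

    def publish(self, topic: str, payload) -> int:
        """Fan the payload out to every subscriber of the topic; never blocks."""
        delivered = 0
        for sub_id, buf in self.subscribers.get(topic, {}).items():
            if len(buf) >= self.max_buffer:
                buf.popleft()              # assumed policy: drop the oldest message
                self.dropped[sub_id] += 1  # count drops to detect slow consumers
            buf.append(payload)
            delivered += 1
        return delivered

    def poll(self, topic: str, subscriber_id: str):
        """Return the next buffered payload for a subscriber, or None if empty."""
        buf = self.subscribers[topic][subscriber_id]
        return buf.popleft() if buf else None


broker = FanOutBroker(max_buffer=2)
broker.subscribe("orders", "fast")
broker.subscribe("orders", "slow")
for i in range(3):
    broker.publish("orders", i)
    broker.poll("orders", "fast")  # the fast consumer drains as it goes

print(list(broker.subscribers["orders"]["slow"]), broker.dropped["slow"])
```

Because `publish` never waits on a subscriber, a slow consumer costs memory up to `max_buffer` and then loses its oldest messages, while publishers stay unblocked (NFR4).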

Reference Architecture

  +-------------+       +----------------+       +--------------+
  |  Publishers | ----> | Message Broker | ----> | Subscribers  |
  +-------------+       +----------------+       +--------------+
                             |       ^
                             |       |
                       +-----------+  |
                       | Topic Mgr |--+
                       +-----------+

Components

Message Broker

Kafka / RabbitMQ / Custom Broker

Receives messages from publishers, routes them to subscribers based on topics

Topic Manager

In-memory store or distributed cache (e.g., Redis)

Manages topic subscriptions and metadata

Publisher API

REST/gRPC endpoints

Allows publishers to send messages to topics

Subscriber API

WebSocket / REST / gRPC

Allows subscribers to register interest and receive messages

Message Queue

Broker internal queues

Buffers messages for delivery to subscribers

Persistence Storage

Distributed log or database

Stores messages for durability and replay

Request Flow

1. Publisher sends a message to the Message Broker via the Publisher API, specifying the topic.

2. Message Broker receives the message and stores it in the Message Queue and Persistence Storage.

3. Topic Manager identifies the subscribers interested in the topic.

4. Message Broker dispatches the message to all subscribers registered for the topic.

5. Subscribers receive messages asynchronously via the Subscriber API.

6. Acknowledgments from subscribers are optionally processed for delivery guarantees.
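The flow above can be traced in a small in-memory sketch. Everything here is a simplification under stated assumptions: the `Broker` class, its method names, and the use of plain lists and dicts in place of the Message Queue, Persistence Storage, and Topic Manager are hypothetical, and delivery is synchronous rather than asynchronous.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Message:
    id: int
    topic: str
    payload: str


class Broker:
    def __init__(self):
        self._ids = itertools.count(1)
        self.log = []            # stands in for Persistence Storage
        self.subscriptions = {}  # stands in for Topic Manager: topic -> subscriber ids
        self.inboxes = {}        # subscriber id -> delivered messages
        self.pending_acks = {}   # message id -> subscriber ids that have not acked

    def subscribe(self, subscriber_id: str, topic: str) -> None:
        self.subscriptions.setdefault(topic, set()).add(subscriber_id)
        self.inboxes.setdefault(subscriber_id, [])

    def publish(self, topic: str, payload: str) -> int:
        msg = Message(next(self._ids), topic, payload)       # step 1: accept message
        self.log.append(msg)                                 # step 2: persist it
        targets = set(self.subscriptions.get(topic, set()))  # step 3: look up subscribers
        for sub_id in targets:                               # step 4: dispatch to each one
            self.inboxes[sub_id].append(msg)                 # step 5: subscriber receives
        if targets:
            self.pending_acks[msg.id] = targets
        return msg.id

    def ack(self, subscriber_id: str, message_id: int) -> None:
        """Step 6: record an acknowledgment; forget the message once all have acked."""
        waiting = self.pending_acks.get(message_id)
        if waiting is None:
            return
        waiting.discard(subscriber_id)
        if not waiting:
            del self.pending_acks[message_id]


broker = Broker()
broker.subscribe("a", "orders")
broker.subscribe("b", "orders")
mid = broker.publish("orders", "order created")
broker.ack("a", mid)
print(broker.pending_acks)  # subscriber "b" has not acknowledged yet
```

Tracking unacknowledged subscribers per message is what would let a fuller design redeliver from the log, which is the basis of an at-least-once guarantee.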

Database Schema

Entities:

- Topic: id (PK), name

- Subscriber: id (PK), connection_info

- Subscription: id (PK), subscriber_id (FK), topic_id (FK)

- Message: id (PK), topic_id (FK), payload, timestamp

Relationships:

- One Topic has many Subscriptions

- One Subscriber can subscribe to many Topics (many-to-many via Subscription)

- Messages belong to one Topic
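One way to check the schema is to express it as SQL and run a subscriber lookup against it. The sketch below uses SQLite from the Python standard library purely for illustration. The column types, the `NOT NULL` and `UNIQUE` constraints, and the sample connection string are assumptions that the entity list does not specify.

```python
import sqlite3

SCHEMA = """
CREATE TABLE topic (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE subscriber (
    id              INTEGER PRIMARY KEY,
    connection_info TEXT NOT NULL
);
CREATE TABLE subscription (
    id            INTEGER PRIMARY KEY,
    subscriber_id INTEGER NOT NULL REFERENCES subscriber(id),
    topic_id      INTEGER NOT NULL REFERENCES topic(id),
    UNIQUE (subscriber_id, topic_id)  -- assumed: one subscription per pair
);
CREATE TABLE message (
    id        INTEGER PRIMARY KEY,
    topic_id  INTEGER NOT NULL REFERENCES topic(id),
    payload   BLOB NOT NULL,
    timestamp TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript(SCHEMA)

conn.execute("INSERT INTO topic (name) VALUES (?)", ("orders",))
conn.execute(
    "INSERT INTO subscriber (connection_info) VALUES (?)",
    ("ws://example.invalid/a",),  # placeholder address
)
conn.execute("INSERT INTO subscription (subscriber_id, topic_id) VALUES (1, 1)")

# The lookup the Topic Manager performs: who is subscribed to a given topic?
rows = conn.execute(
    """
    SELECT s.connection_info
    FROM subscription AS sub
    JOIN subscriber AS s ON s.id = sub.subscriber_id
    JOIN topic AS t ON t.id = sub.topic_id
    WHERE t.name = ?
    """,
    ("orders",),
).fetchall()
print(rows)
```

In the reference architecture this lookup sits on the hot path of every publish, which is why the component list pairs the schema with a cache for fast topic lookup.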

Scaling Discussion

Bottlenecks

Message Broker CPU and memory limits under high message throughput

Network bandwidth for delivering messages to many subscribers

Topic Manager becoming a single point of failure or bottleneck

Persistence Storage write/read latency under load

Handling slow or offline subscribers causing backpressure

Solutions

Scale Message Broker horizontally with partitioning and clustering

Use load balancers and CDN-like edge nodes for subscriber delivery

Distribute Topic Manager using consistent hashing or sharding

Use high-performance distributed storage (e.g., Kafka logs, Cassandra)

Implement backpressure strategies like message buffering, dropping, or slow consumer detection
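As an illustration of distributing the Topic Manager with consistent hashing, the sketch below maps topic names onto a ring of nodes. The `HashRing` class, the node names, the choice of SHA-256, and the default of 100 virtual nodes per server are all assumptions made for the example.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring that assigns each topic to one Topic Manager node."""

    def __init__(self, nodes, vnodes: int = 100):
        # Each node is placed on the ring at `vnodes` pseudo-random positions.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        # First 8 bytes of SHA-256 as an integer position on the ring.
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def node_for(self, topic: str) -> str:
        # The first ring position clockwise from the topic's hash owns the topic.
        idx = bisect.bisect_right(self._keys, self._hash(topic)) % len(self._keys)
        return self._ring[idx][1]


nodes = ["tm-1", "tm-2", "tm-3"]
topics = [f"topic-{i}" for i in range(1000)]

before = HashRing(nodes)
after = HashRing(nodes + ["tm-4"])

moved = [t for t in topics if before.node_for(t) != after.node_for(t)]
print(f"{len(moved)} of {len(topics)} topics moved after adding tm-4")
```

The property that matters for scaling is that adding a node only moves topics onto the new node; topics never shuffle between the existing nodes, so most cached subscription metadata stays valid.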

Interview Tips

Time: 10 minutes for requirements and clarifications, 15 minutes for architecture and components, 10 minutes for scaling and trade-offs, 10 minutes for Q&A

Clarify delivery guarantees and subscriber types early

Explain how decoupling publishers and subscribers improves scalability

Describe how topics and subscriptions are managed

Discuss message persistence and reliability

Highlight how system scales horizontally and handles failures

Mention trade-offs between latency, consistency, and availability