
One-to-One Messaging (HLD) - System Design Exercise

Design: One-to-one Messaging System
Design covers backend architecture, data storage, and message delivery mechanisms. Client UI and encryption details are out of scope.
Functional Requirements
FR1: Allow users to send and receive messages privately between two users
FR2: Support message delivery in real-time with low latency
FR3: Store message history for users to view past conversations
FR4: Support message read receipts to show if a message was read
FR5: Allow users to be online or offline; deliver messages accordingly
FR6: Support up to 1 million active users with up to 10,000 concurrent connections
FR7: Ensure message order is preserved between two users
Non-Functional Requirements
NFR1: System should have p99 latency under 200ms for message delivery
NFR2: Availability target of 99.9% uptime (about 8.77 hours downtime per year)
NFR3: Messages must be durable and not lost
NFR4: Support mobile and web clients
NFR5: Privacy: messages are only visible to the two users involved
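FR7 (per-conversation message ordering) is commonly met by assigning a monotonically increasing sequence number to each message on the server side, so clients can sort by sequence regardless of arrival order. A minimal sketch, assuming a single assigner per shard; the class and method names are hypothetical:

```python
import itertools
from collections import defaultdict

class SequenceAssigner:
    """Assigns a monotonically increasing sequence number per conversation.

    Illustrative sketch for FR7: clients sort received messages by
    (conversation_id, seq) to reconstruct the send order.
    """

    def __init__(self):
        # Each conversation gets its own counter, starting at 0.
        self._counters = defaultdict(itertools.count)

    def next_seq(self, conversation_id: str) -> int:
        return next(self._counters[conversation_id])
```

In a distributed deployment this only works if all messages for a conversation pass through the same assigner, which is one reason to partition by conversation ID (see the Scaling Discussion below).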
Key Components
API Gateway or Load Balancer
Authentication Service
Messaging Service (handles sending and receiving)
Message Queue or Pub/Sub system for real-time delivery
Persistent Storage (database) for message history
Cache layer for recent messages or user presence
Notification Service for offline users
Design Patterns
Publish-Subscribe pattern for message delivery
Event-driven architecture for message processing
CQRS (Command Query Responsibility Segregation) for separating reads and writes
Sharding and partitioning for database scalability
WebSocket or long polling for real-time communication
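The publish-subscribe pattern listed above can be sketched as a tiny in-process broker. This is illustrative only (a production system would use Kafka or Redis Streams, as noted later); the `PubSub` class and topic naming are assumptions:

```python
from collections import defaultdict
from typing import Callable

class PubSub:
    """Minimal in-process publish-subscribe broker.

    The sender publishes to a per-user topic; the receiver's connection
    handler subscribes to that topic, decoupling the two sides.
    """

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of handlers

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Fan out to every subscriber of the topic.
        for handler in self._subscribers[topic]:
            handler(message)
```

A usage sketch: the Messaging Service subscribes a delivery callback to `"user:bob"`, and any instance that receives a message for Bob publishes to that topic.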
Reference Architecture
Client (Web/Mobile)
    |
    v
API Gateway / Load Balancer
    |
Authentication Service
    |
Messaging Service <--> Cache (User presence, recent messages)
    |
Message Queue / Pub-Sub
    |
Persistent Storage (Database)
    |
Notification Service (for offline users)
Components
API Gateway / Load Balancer
Nginx, AWS ALB
Route client requests to backend services and balance load
Authentication Service
OAuth 2.0, JWT
Verify user identity and issue tokens
Messaging Service
Node.js or Go microservice
Handle sending, receiving, and ordering of messages
Message Queue / Pub-Sub
Apache Kafka or Redis Streams
Enable real-time message delivery and decouple sender and receiver
Persistent Storage
PostgreSQL or Cassandra
Store message history and user metadata
Cache
Redis
Store user presence status and recent messages for fast access
Notification Service
Firebase Cloud Messaging or APNs
Send push notifications to offline users
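Presence in the cache layer is typically stored with a TTL, mimicking Redis SETEX/EXPIRE semantics: clients heartbeat periodically, and a crashed client eventually reads as offline once its entry expires. A hedged in-memory sketch (the class name and injectable `clock` parameter are assumptions made for testability):

```python
import time

class PresenceCache:
    """In-memory presence store with TTL, mimicking Redis key expiry.

    A user is "online" if a heartbeat arrived within the last `ttl_seconds`.
    """

    def __init__(self, ttl_seconds: float = 30.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable for deterministic tests
        self._last_seen = {}         # user_id -> timestamp of last heartbeat

    def heartbeat(self, user_id: str) -> None:
        self._last_seen[user_id] = self._clock()

    def is_online(self, user_id: str) -> bool:
        ts = self._last_seen.get(user_id)
        return ts is not None and self._clock() - ts < self._ttl
```

The TTL value trades presence accuracy against heartbeat traffic: a shorter TTL detects disconnects faster but forces clients to heartbeat more often.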
Request Flow
1. User client connects to the API Gateway and authenticates via the Authentication Service.
2. Client opens a WebSocket connection to the Messaging Service for real-time communication.
3. When User A sends a message to User B, the Messaging Service receives the message.
4. The Messaging Service writes the message to Persistent Storage to ensure durability.
5. The Messaging Service publishes the message to the Message Queue / Pub-Sub.
6. The Messaging Service subscribes to the queue and, if User B is online, delivers the message over the WebSocket.
7. If User B is offline, the Notification Service sends a push notification.
8. User B receives the message and sends a read receipt back through the Messaging Service.
9. The Messaging Service updates the message status and notifies User A.
10. The Cache stores user presence and recent messages to optimize delivery and UI updates.
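The core of steps 3-7 can be condensed into one function: persist first for durability, then deliver over the open connection if the receiver is online, otherwise hand off to the notification path. A minimal sketch with in-memory stand-ins for storage, WebSocket connections, and the push queue (all names are hypothetical):

```python
class MessagingService:
    """Condensed sketch of steps 3-7 of the request flow."""

    def __init__(self):
        self.storage = []        # stand-in for the persistent message store
        self.connections = {}    # user_id -> list acting as a WebSocket buffer
        self.push_queue = []     # stand-in for the Notification Service

    def connect(self, user_id):
        """Register an 'open WebSocket' for a user (step 2)."""
        self.connections[user_id] = []

    def send(self, sender, receiver, content):
        message = {"from": sender, "to": receiver,
                   "content": content, "status": "sent"}
        self.storage.append(message)          # step 4: durable write first
        if receiver in self.connections:      # steps 5-6: deliver if online
            message["status"] = "delivered"
            self.connections[receiver].append(message)
        else:                                 # step 7: offline -> push
            self.push_queue.append((receiver, content))
        return message
```

Writing to storage before attempting delivery is what makes NFR3 (no message loss) hold even if the delivery step fails mid-flight.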
Database Schema
Entities:
- User: user_id (PK), username, status
- Message: message_id (PK), sender_id (FK User), receiver_id (FK User), content, timestamp, status (sent, delivered, read)
- Conversation: conversation_id (PK), user1_id (FK User), user2_id (FK User)

Relationships:
- One Conversation between two Users (1:1)
- Messages belong to one Conversation (1:N)
- Message sender and receiver reference the User entity
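The schema above can be expressed as DDL. A hedged sketch using SQLite for portability; column types and constraints are assumptions, and a production PostgreSQL or Cassandra schema would differ in details:

```python
import sqlite3

# Hypothetical DDL matching the entities above.
DDL = """
CREATE TABLE user (
    user_id   INTEGER PRIMARY KEY,
    username  TEXT NOT NULL,
    status    TEXT NOT NULL DEFAULT 'offline'
);
CREATE TABLE conversation (
    conversation_id INTEGER PRIMARY KEY,
    user1_id INTEGER NOT NULL REFERENCES user(user_id),
    user2_id INTEGER NOT NULL REFERENCES user(user_id)
);
CREATE TABLE message (
    message_id  INTEGER PRIMARY KEY,
    sender_id   INTEGER NOT NULL REFERENCES user(user_id),
    receiver_id INTEGER NOT NULL REFERENCES user(user_id),
    content     TEXT NOT NULL,
    timestamp   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    status      TEXT NOT NULL CHECK (status IN ('sent', 'delivered', 'read'))
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
```

Note that a write-heavy store like Cassandra would instead model messages as a wide partition keyed by conversation_id and clustered by timestamp, trading relational constraints for write throughput.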
Scaling Discussion
Bottlenecks
Messaging Service CPU and memory limits with high concurrent connections
Database write throughput for storing messages
Message Queue throughput and latency under heavy load
Cache size and eviction policies for user presence and recent messages
Notification Service limits for push notifications
Solutions
Scale Messaging Service horizontally with stateless design and sticky sessions or distributed session management
Partition database by user or conversation ID (sharding) to distribute load
Use a high-throughput distributed message queue like Kafka with multiple partitions
Implement cache sharding and optimize eviction policies; use TTL for presence data
Batch notifications and use multiple notification providers to distribute load
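Partitioning by conversation ID (rather than by individual user) keeps both directions of a conversation on the same shard, which is also what makes per-conversation ordering enforceable. A sketch of the mapping, assuming a fixed shard count and a canonical conversation ID built from the sorted pair of user IDs (both conventions are assumptions):

```python
import hashlib

NUM_SHARDS = 16  # assumed shard count for illustration

def conversation_id(user_a: str, user_b: str) -> str:
    # Sort the pair so (A, B) and (B, A) yield the same conversation ID.
    return ":".join(sorted((user_a, user_b)))

def shard_for(conversation_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a conversation ID to a shard deterministically.

    Hashing (rather than e.g. range partitioning) spreads hot
    conversations evenly across shards.
    """
    digest = hashlib.sha256(conversation_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

With plain modulo hashing, changing `NUM_SHARDS` remaps most keys; consistent hashing or Kafka-style fixed partition counts are the usual ways to avoid mass resharding when capacity grows.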
Interview Tips
Time: Spend 10 minutes clarifying requirements and constraints, 15 minutes designing architecture and data flow, 10 minutes discussing scaling and trade-offs, 10 minutes for questions and wrap-up.
Emphasize real-time delivery with WebSocket and message queue
Discuss durability and ordering guarantees with persistent storage
Explain how offline users are handled with notifications
Highlight scalability strategies like sharding and horizontal scaling
Mention security and privacy considerations for one-to-one messaging