Notification system design in HLD - System Design Exercise

Design: Notification System
This design covers the backend services, data storage, and delivery mechanisms for notifications. Detailed UI design and the internals of third-party SMS/email providers are out of scope.
Functional Requirements
FR1: Send notifications to users via multiple channels: email, SMS, and push notifications
FR2: Support both real-time and scheduled notifications
FR3: Allow users to subscribe or unsubscribe from different notification types
FR4: Guarantee delivery by retrying failed notifications
FR5: Provide an admin interface to create, schedule, and monitor notifications
FR6: Handle up to 100,000 notifications per minute during peak times
FR7: Support user preferences for notification channels and quiet hours
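FR3 and FR7 together imply a per-user filter that runs before anything is sent. The following is a minimal sketch of that filter, not a definitive implementation. The names `Preference`, `in_quiet_hours`, and `deliverable_channels` are hypothetical, and the sketch assumes quiet hours are stored as wall-clock times in the user's local time zone.

```python
from dataclasses import dataclass
from datetime import time
from typing import List, Optional


@dataclass(frozen=True)
class Preference:
    # Hypothetical record; one row per (notification type, channel).
    notification_type: str
    channel: str
    subscribed: bool
    quiet_hours_start: Optional[time] = None
    quiet_hours_end: Optional[time] = None


def in_quiet_hours(now: time, start: Optional[time], end: Optional[time]) -> bool:
    """Return True if `now` falls inside the quiet window [start, end)."""
    if start is None or end is None or start == end:
        return False  # no quiet window configured
    if start < end:
        return start <= now < end
    # Window wraps past midnight, e.g. 22:00 to 07:00.
    return now >= start or now < end


def deliverable_channels(
    prefs: List[Preference], notification_type: str, now: time
) -> List[str]:
    """Channels the user is subscribed to and that are not muted right now."""
    return [
        p.channel
        for p in prefs
        if p.notification_type == notification_type
        and p.subscribed
        and not in_quiet_hours(now, p.quiet_hours_start, p.quiet_hours_end)
    ]
```

A notification that falls inside quiet hours could be dropped or deferred until the window ends; the requirements do not say which, so that choice should be clarified with the interviewer.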
Non-Functional Requirements
NFR1: System should have 99.9% uptime
NFR2: API response latency for sending notifications should be under 200ms (p99)
NFR3: Notifications must be delivered within 1 minute of scheduled time
NFR4: Data privacy and security must be ensured for user information
NFR5: System should be horizontally scalable to handle load spikes
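It helps to turn the stated targets into concrete numbers before designing. The short calculation below uses only FR6 and NFR1; the 30-day month is a simplifying assumption.

```python
# Back-of-envelope numbers derived from FR6 and NFR1.
PEAK_PER_MINUTE = 100_000   # FR6: peak notifications per minute
UPTIME_TARGET = 0.999       # NFR1: 99.9% uptime

peak_per_second = PEAK_PER_MINUTE / 60
downtime_hours_per_year = (1 - UPTIME_TARGET) * 365 * 24
# Assumes a 30-day month for simplicity.
downtime_minutes_per_month = (1 - UPTIME_TARGET) * 30 * 24 * 60

print(f"Peak throughput: {peak_per_second:.0f} notifications/second")
print(f"Allowed downtime: {downtime_hours_per_year:.2f} hours/year")
print(f"Allowed downtime: {downtime_minutes_per_month:.1f} minutes/month")
```

This works out to roughly 1,667 notifications per second at peak, and a downtime budget of about 8.76 hours per year, or about 43 minutes per month.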
Think Before You Design
Key Components
API Gateway or Notification API
User Preferences Service
Notification Scheduler
Message Queue for decoupling
Notification Dispatcher for each channel
Retry and Failure Handler
Database for storing user data and notification logs
Admin Dashboard backend
Design Patterns
Publish-Subscribe pattern for event-driven notifications
Circuit Breaker for third-party service failures
Bulkhead pattern to isolate channel failures
Retry with exponential backoff
CQRS for separating read and write models of user preferences
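Of the patterns above, the circuit breaker is the one most often asked about in detail. The following is a minimal single-threaded sketch, assuming a simple consecutive-failure threshold and a fixed reset timeout. `CircuitBreaker` and `CircuitOpenError` are hypothetical names, not part of any provider SDK.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.reset_timeout:
            return "half-open"  # allow one trial call through
        return "open"

    def call(self, fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self.state == "open":
            raise CircuitOpenError("provider temporarily disabled")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many consecutive failures,
            # opens (or re-opens) the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Each dispatcher would wrap its provider call in its own breaker, which also gives the bulkhead effect: an SMS provider outage opens only the SMS circuit.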
Reference Architecture
        +-------------------+
        |  Admin Dashboard  |
        +---------+---------+
                  |
                  v
        +-------------------+
        | Notification API  |
        +---------+---------+
                  |
        +---------+---------+
        |                   |
+-------v-------+   +-------v-------+
| User Pref.    |   | Notification  |
| Service       |   | Scheduler     |
+---------------+   +-------+-------+
                            |
                            v
                    +---------------+
                    | Message Queue |
                    +-------+-------+
                            |
      +--------------+------+-------+--------------+
      |              |              |              |
+-----v------+ +-----v------+ +-----v------+ +-----v------+
| Email      | | SMS        | | Push       | | Retry &    |
| Dispatcher | | Dispatcher | | Dispatcher | | Failure    |
+-----+------+ +-----+------+ +-----+------+ +------------+
      |              |              |
      +--------------+--------------+
                     |
             +-------v-------+
             | Third-party   |
             | Providers     |
             +---------------+
Components
Notification API
RESTful API with Node.js or Python Flask
Receives notification requests and user subscription changes
User Preferences Service
Relational DB (PostgreSQL) with REST API
Stores user notification preferences and subscription status
Notification Scheduler
Cron jobs or a distributed scheduler such as Apache Airflow
Schedules notifications for future delivery
Message Queue
Apache Kafka or RabbitMQ
Decouples notification creation from delivery, buffers load
Notification Dispatchers
Microservices per channel (Email, SMS, Push) using appropriate SDKs
Send notifications to users via respective channels
Retry & Failure Handler
Service with retry logic and dead-letter queue
Handles failed deliveries with retries and alerts
Database
PostgreSQL for user data and notification logs
Stores user info, preferences, notification history
Admin Dashboard
Web frontend with backend APIs
Allows admins to create, schedule, and monitor notifications
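The Notification Scheduler is the component whose internals are least obvious from the list above. One common approach is a priority queue ordered by due time, polled on a short interval. The sketch below keeps the queue in memory, which is an assumption made for brevity; a production scheduler would persist pending notifications so they survive restarts. `NotificationScheduler` and its methods are hypothetical names.

```python
import heapq
from typing import List, Tuple


class NotificationScheduler:
    """In-memory min-heap of (due_time, notification_id) pairs."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int]] = []

    def schedule(self, notification_id: int, due_time: float) -> None:
        # due_time is a Unix timestamp in seconds.
        heapq.heappush(self._heap, (due_time, notification_id))

    def pop_due(self, now: float) -> List[int]:
        """Remove and return IDs of all notifications due at or before `now`."""
        due: List[int] = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[1])
        return due
```

A polling loop would call `pop_due` every few seconds and publish the returned IDs to the Message Queue, which keeps delivery well inside the one-minute window in NFR3.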
Request Flow
1. Admin creates or schedules notification via Admin Dashboard.
2. Notification API receives request and validates it.
3. User Preferences Service is queried to filter users based on subscription.
4. Notification Scheduler schedules immediate or future notifications.
5. Scheduled notifications are pushed to Message Queue.
6. Notification Dispatchers consume messages from queue per channel.
7. Dispatchers send notifications via third-party providers.
8. Retry & Failure Handler monitors delivery status and retries failures.
9. Delivery status and logs are stored in Database for auditing.
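Steps 6 to 9 of the flow can be sketched as a single loop: consume a message, try to send it, retry on failure, and log every outcome. This is an in-memory illustration under stated assumptions, not a definitive implementation. A `deque` stands in for the Message Queue, the `senders` callables stand in for provider SDK calls, and `Message`, `backoff_delay`, and `run_dispatch` are hypothetical names. The sketch re-enqueues failed messages immediately; a real handler would wait `backoff_delay(attempt)` seconds, usually with added jitter.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, List, Tuple


@dataclass
class Message:
    notification_id: int
    user_id: int
    channel: str
    attempt: int = 0


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry number `attempt` (1-based), doubling each time."""
    return min(cap, base * 2 ** (attempt - 1))


LogRow = Tuple[int, int, str, str]  # (notification_id, user_id, channel, status)


def run_dispatch(
    queue: Deque[Message],
    senders: Dict[str, Callable[[Message], None]],
    max_attempts: int = 3,
) -> Tuple[List[LogRow], List[Message]]:
    """Drain the queue; return the delivery log and the dead-letter list."""
    log: List[LogRow] = []
    dead_letters: List[Message] = []
    while queue:
        msg = queue.popleft()
        msg.attempt += 1
        try:
            # An unknown channel raises KeyError and is treated as a failure.
            senders[msg.channel](msg)
            status = "delivered"
        except Exception:
            if msg.attempt >= max_attempts:
                dead_letters.append(msg)  # give up; keep for inspection
                status = "failed"
            else:
                queue.append(msg)  # retry later
                status = "retrying"
        log.append((msg.notification_id, msg.user_id, msg.channel, status))
    return log, dead_letters
```

With the defaults, `backoff_delay` yields 1, 2, and 4 seconds for the first three retries and never exceeds 60 seconds.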
Database Schema
Entities:
- User: user_id (PK), name, email, phone, push_token
- Notification: notification_id (PK), content, type, scheduled_time, status
- UserPreferences: user_id (FK), notification_type, channel, subscribed (bool), quiet_hours_start, quiet_hours_end
- NotificationLog: log_id (PK), notification_id (FK), user_id (FK), channel, status, timestamp
Relationships:
- UserPreferences linked to User by user_id
- NotificationLog linked to Notification and User by their IDs
- Notification stores metadata about each notification event
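To make the schema concrete, here is one way it could be expressed as tables. The sketch uses Python's built-in `sqlite3` module so it runs anywhere; the design itself names PostgreSQL, where the column types would differ. Several details are assumptions not stated in the schema: the table is named `users` (because `user` is a reserved word in PostgreSQL), UserPreferences gets a composite primary key, timestamps are stored as ISO 8601 text, and the index on the log table is added for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    user_id    INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    email      TEXT,
    phone      TEXT,
    push_token TEXT
);
CREATE TABLE notification (
    notification_id INTEGER PRIMARY KEY,
    content         TEXT NOT NULL,
    type            TEXT NOT NULL,
    scheduled_time  TEXT,
    status          TEXT NOT NULL DEFAULT 'pending'
);
CREATE TABLE user_preferences (
    user_id           INTEGER NOT NULL REFERENCES users(user_id),
    notification_type TEXT NOT NULL,
    channel           TEXT NOT NULL,
    subscribed        INTEGER NOT NULL DEFAULT 1,
    quiet_hours_start TEXT,
    quiet_hours_end   TEXT,
    PRIMARY KEY (user_id, notification_type, channel)  -- assumed key
);
CREATE TABLE notification_log (
    log_id          INTEGER PRIMARY KEY,
    notification_id INTEGER NOT NULL REFERENCES notification(notification_id),
    user_id         INTEGER NOT NULL REFERENCES users(user_id),
    channel         TEXT NOT NULL,
    status          TEXT NOT NULL,
    timestamp       TEXT NOT NULL
);
CREATE INDEX idx_log_notification ON notification_log(notification_id);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    # SQLite enforces foreign keys only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    create_schema(connection)
    print("schema created")
```

NotificationLog is the table that grows fastest, at one row per user per channel per notification, which is why the scaling discussion treats log writes as a bottleneck.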
Scaling Discussion
Bottlenecks
Message Queue throughput limits under very high load
Third-party provider rate limits and failures
Database write contention for logging large volumes
Notification Dispatchers becoming overwhelmed
User Preferences Service latency under heavy read load
Solutions
Partition and shard message queues, use multiple topics for channels
Implement circuit breakers and fallback channels, batch requests to providers
Use write-optimized storage or separate logging database, batch writes
Scale dispatchers horizontally with autoscaling and load balancing
Cache user preferences in Redis (or a similar in-memory cache) to reduce DB hits
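The preference-caching solution is typically implemented as a cache-aside lookup with a time-to-live. The sketch below illustrates the pattern with a plain dictionary standing in for the cache, so it makes no claims about any cache client's API. `PreferenceCache`, `load_from_db`, and the 300-second TTL are hypothetical choices.

```python
import time
from typing import Any, Callable, Dict, Tuple


class PreferenceCache:
    """Cache-aside lookup: read the cache first, fall back to the database."""

    def __init__(
        self,
        load_from_db: Callable[[int], Any],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[int, Tuple[float, Any]] = {}  # user_id -> (expiry, value)
        self.hits = 0
        self.misses = 0

    def get(self, user_id: int) -> Any:
        now = self._clock()
        entry = self._entries.get(user_id)
        if entry is not None and now < entry[0]:
            self.hits += 1
            return entry[1]
        # Missing or expired: load from the database and repopulate.
        self.misses += 1
        value = self._load(user_id)
        self._entries[user_id] = (now + self._ttl, value)
        return value

    def invalidate(self, user_id: int) -> None:
        """Call after a preference update so the next read sees fresh data."""
        self._entries.pop(user_id, None)
```

Invalidating on write matters here: without it, a user who unsubscribes could keep receiving notifications until the cached entry expires.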
Interview Tips
Time: Spend 10 minutes clarifying requirements and constraints, 15 minutes designing components and data flow, 10 minutes discussing scaling and failure handling, 10 minutes for Q&A and trade-offs.
Emphasize decoupling with message queues for scalability
Discuss user preferences and subscription management clearly
Explain retry and failure handling for delivery guarantees
Mention channel-specific dispatchers for modularity
Highlight scaling strategies and monitoring for reliability