Notification system in LLD - System Design Exercise

Design: Notification System
Design covers backend notification processing, delivery, and subscription management. Out of scope: UI design for user preferences and third-party integrations beyond email, SMS, and push.
Functional Requirements
FR1: Send notifications to users via multiple channels: email, SMS, and push notifications
FR2: Support scheduling notifications for future delivery
FR3: Allow users to subscribe or unsubscribe from different notification types
FR4: Ensure delivery guarantees with retries on failure
FR5: Provide an API for other services to trigger notifications
Non-Functional Requirements
NFR1: Sustain at least 10,000 notifications per second
NFR2: Handle spikes of up to 50,000 notifications per second during peak times
NFR3: Keep p99 latency for notification delivery under 500 ms
NFR4: Maintain 99.9% system availability
NFR5: Deliver notifications in order per user
NFR6: Comply with data privacy requirements for user contact information
NFR7: Scale horizontally
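A rough back-of-the-envelope calculation can make the throughput targets more concrete. The sketch below applies Little's law (work in flight = arrival rate x time per item) to the sustained and peak rates. The 50 ms average send time and the 30% headroom are illustrative assumptions, not figures from the requirements.

```python
SUSTAINED_PER_SECOND = 10_000  # sustained throughput target
PEAK_PER_SECOND = 50_000       # peak spike target

# Assumptions for illustration only (not part of the requirements):
AVG_SEND_MS = 50        # hypothetical average time for one outbound send
HEADROOM_PERCENT = 130  # hypothetical safety margin (30% extra capacity)


def concurrent_sends_needed(rate_per_second: int) -> int:
    """Little's law with headroom, using integer math to avoid float rounding."""
    numerator = rate_per_second * AVG_SEND_MS * HEADROOM_PERCENT
    denominator = 1000 * 100  # ms per second x percent scale
    return -(-numerator // denominator)  # ceiling division


def notifications_per_day(rate_per_second: int) -> int:
    """Daily volume if the given rate were held for 24 hours."""
    return rate_per_second * 24 * 60 * 60


print(concurrent_sends_needed(SUSTAINED_PER_SECOND))  # 650
print(concurrent_sends_needed(PEAK_PER_SECOND))       # 3250
print(notifications_per_day(SUSTAINED_PER_SECOND))    # 864000000
```

Under these assumptions, the worker fleet would need roughly 650 concurrent sends in steady state and about 3,250 at peak. Different provider latencies change the numbers proportionally.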
Think Before You Design
Key Components
API Gateway for receiving notification requests
User Subscription Service to manage preferences
Notification Queue to buffer and order notifications
Worker Services to process and send notifications
Channel-specific Delivery Services (Email, SMS, Push)
Database for storing user preferences and notification logs
Cache for quick access to subscription data
Retry and Dead Letter Queue for failed notifications
Monitoring and Logging system
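To illustrate how the cache component might sit in front of the subscription database, here is a minimal cache-aside sketch with a time-to-live. A plain dictionary stands in for a real cache, and the names `SubscriptionCache`, `load_from_db`, and the 60-second TTL are hypothetical choices for this example.

```python
import time
from typing import Callable, Dict, Tuple


class SubscriptionCache:
    """Cache-aside lookup with a TTL; a dict stands in for a real cache."""

    def __init__(
        self,
        load_from_db: Callable[[str, str], bool],
        ttl_seconds: float = 60.0,  # assumed TTL, tune to tolerance for stale data
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        # (user_id, channel) -> (is_subscribed, expires_at)
        self._entries: Dict[Tuple[str, str], Tuple[bool, float]] = {}

    def is_subscribed(self, user_id: str, channel: str) -> bool:
        key = (user_id, channel)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[1] > now:
            return hit[0]  # fresh cache hit, no database call
        value = self._load(user_id, channel)  # miss or expired: read through
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, user_id: str, channel: str) -> None:
        """Call when a user changes a preference so the next read is fresh."""
        self._entries.pop((user_id, channel), None)
```

Invalidating on preference updates matters here: without it, an unsubscribed user could keep receiving notifications until the cached entry expires.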
Design Patterns
Publish-Subscribe for decoupling notification producers and consumers
Queue-based Load Leveling to handle spikes
Circuit Breaker for external channel failures
Idempotency to avoid duplicate notifications
Event Sourcing for audit and replay
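As one example of the patterns above, a circuit breaker around an external channel provider could look roughly like the following. This is a simplified sketch: the class name, the threshold of 3 failures, and the 30-second reset timeout are assumptions for illustration, and production libraries usually add concurrency control and metrics.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,   # assumed value
        reset_timeout: float = 30.0,  # assumed value, in seconds
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self._reset_timeout:
            return "half_open"  # timeout elapsed: allow a trial call
        return "open"

    def call(self, func: Callable[[], T]) -> T:
        if self.state == "open":
            raise CircuitOpenError("provider circuit is open; skipping call")
        try:
            result = func()
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self._threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A worker would wrap each provider call, for instance `breaker.call(lambda: client.send(message))` with a hypothetical `client`, and treat `CircuitOpenError` as a signal to route the message to the retry queue instead of hammering a failing provider.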
Reference Architecture
          +-------------------+
          |   API Gateway     |
          +---------+---------+
                    |
                    v
          +-------------------+          +---------------------+
          | User Subscription |<-------->|    Database         |
          |    Service        |          | (User prefs, logs)  |
          +---------+---------+          +---------------------+
                    |
                    v
          +-------------------+
          | Notification Queue|<-------------------+
          +---------+---------+                    |
                    |                              |
          +---------v---------+                    |
          | Worker Services   |                    |
          +----+----+----+----+                    |
               |    |    |                         |
       +-------+    |    +--------+                |
       |            |             |                |
+------+--+    +----+----+   +----+-----+          |
| Email   |    | SMS     |   | Push     |          |
| Service |    | Service |   | Service  |          |
+---------+    +---------+   +----------+          |
       |            |             |                |
       +------------+-------------+----------------+
                    |
          +---------v---------+
          | Retry & DLQ       |
          +-------------------+
Components
API Gateway (RESTful HTTP server): Receives notification requests from clients and other services.
User Subscription Service (microservice with a relational DB): Manages user preferences and subscription status.
Notification Queue (distributed message queue, e.g., Kafka or RabbitMQ): Buffers notifications and maintains order per user.
Worker Services (stateless microservices): Consume notifications from the queue, apply business logic, and dispatch to channels.
Email Service (SMTP or an email API, e.g., SendGrid): Sends email notifications.
SMS Service (SMS gateway API, e.g., Twilio): Sends SMS notifications.
Push Service (push notification service, e.g., Firebase Cloud Messaging): Sends push notifications to mobile and web clients.
Retry & Dead Letter Queue (message queue with retry logic): Retries failed notifications and stores permanently failed messages.
Database (relational DB, e.g., PostgreSQL): Stores user subscriptions, notification logs, and metadata.
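One way the worker could stay independent of the individual channel services is to dispatch through a shared interface. The sketch below assumes hypothetical names (`Notification`, `ChannelSender`, `RecordingSender`, `Worker`); the recording sender is a stand-in for real provider clients, whose actual APIs are not shown here.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass(frozen=True)
class Notification:
    notification_id: str
    user_id: str
    channel: str  # "email", "sms", or "push"
    content: str


class ChannelSender(Protocol):
    """Interface each channel-specific delivery service would implement."""

    def send(self, notification: Notification) -> bool: ...


class RecordingSender:
    """Stand-in for a real provider client; records what it was asked to send."""

    def __init__(self) -> None:
        self.sent: List[Notification] = []

    def send(self, notification: Notification) -> bool:
        self.sent.append(notification)
        return True  # a real sender would report the provider's result


class Worker:
    """Routes each notification to the sender registered for its channel."""

    def __init__(self, senders: Dict[str, ChannelSender]) -> None:
        self._senders = senders

    def dispatch(self, notification: Notification) -> bool:
        sender = self._senders.get(notification.channel)
        if sender is None:
            raise ValueError(f"unsupported channel: {notification.channel}")
        return sender.send(notification)
```

With this shape, adding a new channel means registering another sender rather than changing the worker.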
Request Flow
1. Client or service sends notification request to API Gateway.
2. API Gateway forwards request to User Subscription Service to verify user preferences.
3. If user is subscribed, notification is placed into Notification Queue.
4. Worker Services consume notifications from the queue, preserving order per user.
5. Workers send notifications to appropriate channel services (Email, SMS, Push).
6. Channel services attempt delivery and report success or failure.
7. On failure, notification is sent to Retry & Dead Letter Queue, which retries delivery and stores the notification as permanently failed once retries are exhausted.
8. Successful deliveries and failures are logged in the Database.
9. Users can update subscription preferences via User Subscription Service API.
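The flow above can be sketched end to end with in-memory stand-ins. Everything here is illustrative: the class names are hypothetical, a `deque` replaces the distributed queue, a list replaces the notification log, and the limit of 3 attempts is an assumption since the flow does not fix one.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List, Set, Tuple

MAX_ATTEMPTS = 3  # assumed retry limit


@dataclass
class QueuedNotification:
    notification_id: str
    user_id: str
    channel: str
    content: str
    attempts: int = 0


class NotificationPipeline:
    def __init__(
        self,
        subscriptions: Set[Tuple[str, str]],         # opted-in (user_id, channel) pairs
        send: Callable[[QueuedNotification], bool],  # stand-in for a channel service
    ) -> None:
        self.subscriptions = subscriptions
        self.send = send
        self.queue: Deque[QueuedNotification] = deque()
        self.dead_letters: List[QueuedNotification] = []
        self.log: List[Tuple[str, str]] = []  # (notification_id, status)

    def submit(self, notification: QueuedNotification) -> bool:
        """Steps 1-3: accept the request, check preferences, enqueue."""
        if (notification.user_id, notification.channel) not in self.subscriptions:
            self.log.append((notification.notification_id, "skipped"))
            return False
        self.queue.append(notification)
        return True

    def drain(self) -> None:
        """Steps 4-8: consume, send, retry on failure, dead-letter, and log."""
        while self.queue:
            item = self.queue.popleft()
            item.attempts += 1
            if self.send(item):
                self.log.append((item.notification_id, "delivered"))
            elif item.attempts < MAX_ATTEMPTS:
                self.queue.append(item)  # schedule another attempt
            else:
                self.dead_letters.append(item)
                self.log.append((item.notification_id, "failed"))
```

Note that re-queueing a failed message at the tail can reorder a user's notifications, and this sketch retries immediately rather than with backoff. A real design that must preserve per-user order would need to address both.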
Database Schema
Entities:
- User (user_id PK, name, contact_info)
- Subscription (subscription_id PK, user_id FK, channel ENUM, is_subscribed BOOLEAN)
- Notification (notification_id PK, user_id FK, content TEXT, channel ENUM, status ENUM, created_at TIMESTAMP, delivered_at TIMESTAMP)
- RetryLog (retry_id PK, notification_id FK, attempt_count INT, last_attempt TIMESTAMP, status ENUM)
Relationships:
- User 1:N Subscription
- User 1:N Notification
- Notification 1:1 RetryLog (optional)
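The schema can be tried out quickly with Python's built-in `sqlite3` module. This is a sketch under several assumptions: SQLite has no ENUM type, so `CHECK` constraints stand in for it; the status values, the plural table names, and the `UNIQUE (user_id, channel)` constraint are illustrative additions that the schema above does not specify; and a production system would likely use PostgreSQL instead.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    user_id      INTEGER PRIMARY KEY,
    name         TEXT NOT NULL,
    contact_info TEXT NOT NULL
);
CREATE TABLE subscriptions (
    subscription_id INTEGER PRIMARY KEY,
    user_id         INTEGER NOT NULL REFERENCES users(user_id),
    channel         TEXT NOT NULL CHECK (channel IN ('email', 'sms', 'push')),
    is_subscribed   INTEGER NOT NULL DEFAULT 1 CHECK (is_subscribed IN (0, 1)),
    UNIQUE (user_id, channel)  -- assumption: one row per user and channel
);
CREATE TABLE notifications (
    notification_id INTEGER PRIMARY KEY,
    user_id         INTEGER NOT NULL REFERENCES users(user_id),
    content         TEXT NOT NULL,
    channel         TEXT NOT NULL CHECK (channel IN ('email', 'sms', 'push')),
    -- assumed status values; the schema only says ENUM
    status          TEXT NOT NULL CHECK (status IN ('pending', 'delivered', 'failed')),
    created_at      TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    delivered_at    TEXT
);
CREATE TABLE retry_logs (
    retry_id        INTEGER PRIMARY KEY,
    -- UNIQUE enforces the optional 1:1 link to a notification
    notification_id INTEGER NOT NULL UNIQUE REFERENCES notifications(notification_id),
    attempt_count   INTEGER NOT NULL DEFAULT 0,
    last_attempt    TEXT,
    status          TEXT NOT NULL CHECK (status IN ('retrying', 'succeeded', 'exhausted'))
);
"""


def create_schema() -> sqlite3.Connection:
    """Create the tables in an in-memory database with foreign keys enforced."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn
```

Since contact information is personal data, a real deployment would also need to consider encryption and access controls for the `contact_info` column, which this sketch does not cover.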
Scaling Discussion
Bottlenecks
Notification Queue can become a bottleneck under high load
Worker Services may be overwhelmed processing many notifications
External channel services (Email, SMS, Push) may have rate limits
Database can become a bottleneck for subscription and logging queries
Solutions
Partition Notification Queue by user ID to distribute load and maintain order
Scale Worker Services horizontally with auto-scaling based on queue length
Implement rate limiting and circuit breakers for external channel APIs
Use caching (e.g., Redis) for user subscription data to reduce DB load
Archive old notification logs to reduce database size and improve performance
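Partitioning by user ID, the first solution above, relies on mapping each user to the same partition every time so that one consumer sees that user's notifications in order. A minimal sketch of such a mapping follows; the function name and the choice of SHA-256 are assumptions for illustration, and brokers such as Kafka can do keyed partitioning themselves when the user ID is supplied as the message key.

```python
import hashlib


def partition_for(user_id: str, num_partitions: int) -> int:
    """Map a user ID to a stable partition index in [0, num_partitions)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # Python's built-in hash() is salted per process for strings, so a
    # cryptographic digest is used to get the same result on every worker.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions
```

One trade-off of simple modulo mapping is that changing the partition count remaps most users, which can briefly disturb per-user ordering during a resize.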
Interview Tips
Time: Spend 10 minutes clarifying requirements and constraints, 20 minutes designing architecture and data flow, 10 minutes discussing scaling and trade-offs, 5 minutes summarizing.
Clarify notification channels and delivery guarantees early
Emphasize decoupling producers and consumers with queues
Discuss user subscription management and privacy
Explain retry mechanisms and failure handling
Highlight scalability strategies and bottleneck mitigation
Mention monitoring and alerting importance