| Users | Notifications/Day | Key Changes |
|---|---|---|
| 100 | ~1,000 | Simple queue, single server, direct DB writes |
| 10,000 | ~100,000 | Message queue introduced, caching user preferences, DB indexing |
| 1,000,000 | ~10,000,000 | Multiple app servers, distributed queue, read replicas, push notification services |
| 100,000,000 | ~1,000,000,000 | Sharded DB, global CDN for media, microservices, event-driven architecture |
Designing a Notification System (HLD): Scalability & System Analysis
At around 10,000 users, the database becomes the first bottleneck: writing and reading notification data for many concurrent users drives up latency and exhausts connection limits, and a single server with a simple in-memory queue can no longer keep up with the volume.
- Horizontal Scaling: Add more application servers behind a load balancer to handle more notification requests.
- Message Queues: Use distributed queues (e.g., Kafka, RabbitMQ) to decouple notification generation from delivery.
- Caching: Cache user notification preferences and recent notifications to reduce DB load.
- Database Read Replicas: Use replicas to distribute read traffic and reduce load on the primary DB.
- Sharding: Partition the database by user ID or region to scale writes and storage.
- Push Notification Services: Use external services (e.g., Firebase, APNs) for mobile push notifications to offload delivery.
- CDN: Use CDN for static media in notifications to reduce bandwidth and latency.
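The queue-based decoupling described above can be sketched in-process. This is a minimal illustration, with Python's standard-library `queue.Queue` and a thread standing in for Kafka/RabbitMQ and a separate delivery worker service; the payload shape is an assumption:

```python
import queue
import threading

# In production this queue would be Kafka or RabbitMQ, and the worker a
# separate delivery service calling APNs/FCM; here it just records events.
notification_queue = queue.Queue()
delivered = []

def enqueue_notification(user_id, message):
    """Producer side: the app server enqueues and returns immediately."""
    notification_queue.put({"user_id": user_id, "message": message})

def delivery_worker():
    """Consumer side: drains the queue and hands events to a delivery channel."""
    while True:
        event = notification_queue.get()
        if event is None:  # sentinel to stop the worker
            break
        delivered.append(event)  # stand-in for the actual push/email send
        notification_queue.task_done()

worker = threading.Thread(target=delivery_worker)
worker.start()
enqueue_notification(42, "Your order shipped")
notification_queue.put(None)
worker.join()
print(delivered)
```

The key property is that `enqueue_notification` returns as soon as the event is durable in the queue, so notification generation is no longer coupled to delivery latency.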
- At 1M users sending 10 notifications/day: ~10M notifications/day ≈ 115 notifications/sec.
- Database: Needs to handle ~115 writes/sec plus read traffic; a single well-tuned instance can handle ~5,000 QPS, so writes alone fit comfortably on one instance. Read fan-out (users polling their feeds) can multiply total QPS well beyond the write rate, which is why replicas and caching matter.
- Message Queue: Must support ~115 enqueue/dequeue operations per second, well within Kafka or RabbitMQ capabilities.
- Bandwidth: Assuming 1 KB per notification, ~115 KB/s ≈ 0.9 Mbps, easily handled by a 1 Gbps network.
- Storage: 10M notifications/day × 1 KB ≈ 10 GB/day; plan for archiving and tiered storage.
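The estimates above are simple arithmetic and can be verified with a back-of-envelope script; all inputs (1M users, 10 notifications/day, 1 KB per notification) are the assumptions stated in the text:

```python
# Back-of-envelope capacity check for the 1M-user scenario.
USERS = 1_000_000
NOTIFS_PER_USER_PER_DAY = 10
BYTES_PER_NOTIF = 1024          # ~1 KB per notification (assumption)
SECONDS_PER_DAY = 86_400

notifs_per_day = USERS * NOTIFS_PER_USER_PER_DAY              # 10 million
notifs_per_sec = notifs_per_day / SECONDS_PER_DAY             # ~115/sec
bandwidth_mbps = notifs_per_sec * BYTES_PER_NOTIF * 8 / 1e6   # ~0.95 Mbps
storage_gb_per_day = notifs_per_day * BYTES_PER_NOTIF / 1024**3  # ~9.5 GB

print(f"{notifs_per_sec:.0f}/sec, {bandwidth_mbps:.2f} Mbps, "
      f"{storage_gb_per_day:.1f} GB/day")
```

Note the day-level average hides peaks: real notification traffic is bursty, so provisioning should target peak rates (often 3-10x the average), not these steady-state numbers.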
Start by clarifying notification types and user scale, then walk the data flow from event generation to delivery. Identify the bottleneck at each order of magnitude and propose incremental scaling fixes: caching, queues, read replicas, then sharding. Close with trade-offs and real-world constraints such as latency, delivery guarantees, and cost.
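The sharding step mentioned above is typically a deterministic hash of the partition key (here, user ID). A minimal sketch, where the shard count of 4 is an arbitrary assumption for illustration:

```python
import hashlib

NUM_SHARDS = 4  # assumption for illustration; real systems pick a larger count

def shard_for(user_id: int) -> int:
    """Map a user ID to a shard deterministically via a hash."""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# Because the mapping is deterministic, all of a user's notifications land on
# the same shard, so per-user feed reads never cross shard boundaries.
print({uid: shard_for(uid) for uid in range(5)})
```

A modulo scheme like this forces a large data reshuffle when `NUM_SHARDS` changes; consistent hashing or a directory-based lookup is the usual answer to that, at the cost of extra machinery.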
Your database handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?
Answer: Add read replicas to distribute read traffic and reduce load on the primary database. Also, introduce caching for frequent reads and consider message queues to decouple processing.
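The caching half of the answer is the cache-aside pattern: check the cache first, fall back to a (replica) database read on a miss, then populate the cache. A minimal sketch, where the dict cache and `db_read` function stand in for Redis and a real replica query:

```python
cache = {}
db_calls = 0  # counts how often we actually hit the database

def db_read(user_id):
    """Stand-in for a read against a replica; payload shape is illustrative."""
    global db_calls
    db_calls += 1
    return {"user_id": user_id, "prefs": {"push": True}}

def get_preferences(user_id):
    if user_id in cache:        # cache hit: no database load at all
        return cache[user_id]
    value = db_read(user_id)    # cache miss: one replica read
    cache[user_id] = value      # populate so the next read is a hit
    return value

get_preferences(7)
get_preferences(7)
print(db_calls)  # repeated reads for the same user cost one DB call
```

In production this needs a TTL or explicit invalidation on preference updates; a stale-forever cache is the classic failure mode of cache-aside.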
