| Scale | Users | Notifications/Second | System Changes |
|---|---|---|---|
| Small | 100 | 10-50 | Single server handles notification dispatch; simple queue; direct DB writes |
| Medium | 10,000 | 1,000-5,000 | Introduce message queue (e.g., Kafka); use worker pool; cache user preferences |
| Large | 1,000,000 | 100,000+ | Multiple distributed queues; microservices for notification types; horizontal scaling; CDN for push notifications |
| Very Large | 100,000,000 | 10,000,000+ | Global distributed system; sharded databases; multi-region queues; edge computing; advanced rate limiting |

Notification to all parties in LLD - Scalability & System Analysis
The first bottleneck is the message queue and notification dispatch system. As user count and notification volume grow, workers consume messages more slowly than producers enqueue them, so the backlog grows and delivery is delayed. The database storing notification status is the next bottleneck, because every notification triggers status writes and later status reads.
- Horizontal scaling: Add more worker servers to process notifications concurrently.
- Message queue partitioning: Use multiple partitions or topics to distribute load.
- Caching: Cache user notification preferences to reduce DB hits.
- Sharding: Split notification data by user ID ranges or regions.
- CDN and push services: Serve static notification assets (such as images) from a content delivery network, and hand device delivery to push notification services (e.g., APNs, FCM) to offload delivery.
- Rate limiting and batching: Control notification bursts and batch notifications to reduce load.
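
Two of the steps above, queue partitioning and batching, can be sketched as follows. This is a minimal illustration, not a production design: `NUM_PARTITIONS`, `BATCH_SIZE`, `partition_for`, and `batch_by_partition` are hypothetical names chosen for this example, and the in-memory grouping stands in for a real message queue.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

NUM_PARTITIONS = 8  # assumed partition count, for illustration only
BATCH_SIZE = 3      # assumed maximum notifications per batch

Notification = Tuple[str, str]  # (user_id, message)


def partition_for(user_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a user ID to a partition using a stable hash.

    The same user always lands on the same partition, which keeps
    that user's notifications in order within the partition.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def batch_by_partition(
    notifications: Iterable[Notification],
    batch_size: int = BATCH_SIZE,
    num_partitions: int = NUM_PARTITIONS,
) -> Dict[int, List[List[Notification]]]:
    """Group notifications by partition, then split each group into batches."""
    per_partition: Dict[int, List[Notification]] = defaultdict(list)
    for user_id, message in notifications:
        per_partition[partition_for(user_id, num_partitions)].append((user_id, message))
    return {
        partition: [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
        for partition, items in per_partition.items()
    }
```

Each partition's batches could then be handed to a separate worker, so adding partitions and workers spreads the load horizontally.
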
At 1 million users sending 100,000 notifications per second:
- Requests per second: ~100,000 (notification dispatch calls)
- Storage: Assuming 1 KB per notification record, 100,000 notifications/sec = ~100 MB/sec = ~8.6 TB/day (100 MB/sec × 86,400 sec/day)
- Network bandwidth: 100,000 notifications/sec × 1 KB = ~100 MB/s (800 Mbps) of payload before protocol overhead, which requires high-bandwidth servers and network infrastructure
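
The arithmetic behind these estimates can be reproduced with a short helper. This is a back-of-the-envelope sketch: `estimate_capacity` is a hypothetical function written for this example, and it assumes decimal units (1 KB = 1,000 bytes) and a constant notification rate over the whole day.

```python
from typing import Dict

SECONDS_PER_DAY = 86_400


def estimate_capacity(notifications_per_sec: int,
                      bytes_per_notification: int = 1_000) -> Dict[str, float]:
    """Estimate throughput, bandwidth, and daily storage for a notification rate.

    Assumes decimal units (1 KB = 1,000 bytes) and a steady rate all day.
    """
    bytes_per_sec = notifications_per_sec * bytes_per_notification
    return {
        "mb_per_sec": bytes_per_sec / 1e6,                     # payload throughput
        "mbps": bytes_per_sec * 8 / 1e6,                       # network bandwidth in megabits/s
        "tb_per_day": bytes_per_sec * SECONDS_PER_DAY / 1e12,  # storage written per day
    }


estimate = estimate_capacity(100_000)
# For 100,000 notifications/sec at 1 KB each this gives
# 100 MB/s, 800 Mbps, and 8.64 TB/day.
```

Changing `bytes_per_notification` or the rate shows how sensitive the storage figure is to record size, which is useful when justifying retention or compression choices.
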
Start by clarifying notification types and delivery guarantees. Discuss expected user scale and notification volume. Identify bottlenecks early (queue, DB, network). Propose incremental scaling steps: caching, horizontal scaling, sharding. Mention trade-offs like latency vs cost. Use real numbers to justify design choices.
Your database handles 1,000 QPS for notification status updates. Traffic grows 10x to 10,000 QPS. What do you do first?
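Answer: Introduce read replicas and caching to take read traffic off the primary database, and consider sharding notification data to distribute writes, since replicas alone do not add write capacity. Also, optimize notification processing to reduce unnecessary DB writes, for example by batching status updates.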
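
One way to reduce unnecessary DB writes is to coalesce status updates in memory and flush them in bulk. The sketch below is an assumption-laden illustration: `StatusUpdateBuffer` and `flush_size` are hypothetical names, the `flushed_batches` list stands in for a bulk database write, and the approach assumes only the latest status per notification needs to be stored. Updates held in memory are lost if the process crashes before a flush, so this trades durability for write volume.

```python
from typing import Dict, List, Tuple


class StatusUpdateBuffer:
    """Coalesce status updates so each notification is written once per flush."""

    def __init__(self, flush_size: int = 100) -> None:
        self.flush_size = flush_size
        self._pending: Dict[str, str] = {}  # notification_id -> latest status
        # Stand-in for the database: each entry represents one bulk write.
        self.flushed_batches: List[List[Tuple[str, str]]] = []

    def record(self, notification_id: str, status: str) -> None:
        """Keep only the most recent status for each notification."""
        self._pending[notification_id] = status
        if len(self._pending) >= self.flush_size:
            self.flush()

    def flush(self) -> None:
        """Emit all pending updates as a single batch and clear the buffer."""
        if not self._pending:
            return
        self.flushed_batches.append(list(self._pending.items()))
        self._pending.clear()


buffer = StatusUpdateBuffer(flush_size=2)
buffer.record("n1", "sent")
buffer.record("n1", "delivered")  # overwrites "sent"; no extra write is queued
buffer.record("n2", "sent")       # second distinct ID triggers one bulk flush
```

Here three status updates result in a single batched write covering two notifications. In a real system a time-based flush would normally accompany the size-based one so that low-traffic periods do not delay updates indefinitely.
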