Notification on state change in LLD - Scalability & System Analysis

Scalability Analysis - Notification on state change
Growth Table: Notification on State Change
| Users | Notifications per second | System Changes | Storage & Bandwidth |
|---|---|---|---|
| 100 | ~10-50 | Single server handles events and notifications synchronously | Minimal storage, low bandwidth |
| 10,000 | ~1,000-5,000 | Introduce async processing with message queues; database indexing for state changes | Moderate storage for logs, moderate bandwidth |
| 1,000,000 | ~100,000+ | Horizontal scaling of app servers; distributed message queues; caching notifications; database sharding | High storage for notification history; high bandwidth; CDN for static content |
| 100,000,000 | ~10,000,000+ | Multi-region deployment; global load balancing; event streaming platforms; advanced sharding and partitioning; real-time analytics | Massive storage with tiered archival; very high bandwidth; CDN and edge computing |
First Bottleneck

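The database is usually the first component to become a bottleneck, because every state change and every notification status is written to and read from it. At small scale a single instance copes, but as users grow, synchronous writes and reads on the request path overload the DB.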

Scaling Solutions
  • Async Processing: Use message queues (e.g., Kafka, RabbitMQ) to decouple state change detection from notification sending.
  • Horizontal Scaling: Add more application servers behind load balancers to handle increased notification processing.
  • Caching: Cache frequent state queries and notification templates to reduce DB load.
  • Database Sharding: Partition the database by user ID or region to distribute load.
  • CDN & Edge: Use a CDN to deliver the static content that notifications reference (e.g., images and other assets) and reduce latency.
  • Event Streaming: Employ event streaming platforms for real-time, scalable state change propagation.
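The async-processing idea from the list above can be sketched with an in-process queue. This is a minimal illustration only: `queue.Queue` stands in for a real broker such as Kafka or RabbitMQ, and the names `on_state_change`, `notification_worker`, and `sent` are hypothetical.

```python
import queue
import threading

# Hypothetical in-process stand-in for a broker such as Kafka or RabbitMQ.
events = queue.Queue()
sent = []  # records delivered notifications, for illustration only


def on_state_change(entity_id, old_state, new_state):
    """Producer: enqueue the change and return immediately."""
    if old_state != new_state:
        events.put({"entity_id": entity_id, "old": old_state, "new": new_state})


def notification_worker():
    """Consumer: drain the queue and send notifications off the request path."""
    while True:
        event = events.get()
        if event is None:  # sentinel value: shut the worker down
            break
        # A real sender would call an email/push/SMS provider here.
        sent.append(f"{event['entity_id']}: {event['old']} -> {event['new']}")


worker = threading.Thread(target=notification_worker, daemon=True)
worker.start()

on_state_change("order-1", "PLACED", "SHIPPED")
on_state_change("order-2", "PLACED", "PLACED")  # no change, nothing queued
on_state_change("order-1", "SHIPPED", "DELIVERED")

events.put(None)  # ask the worker to stop after the queued events
worker.join()     # wait until everything queued has been processed
print(sent)
```

The producer only pays the cost of an enqueue, so a slow notification provider does not slow down the code path that records the state change. With a real broker, the worker would typically run in separate processes that can be scaled independently.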
Back-of-Envelope Cost Analysis
  • At 1M users with 0.1 notifications/user/sec -> 100,000 notifications/sec.
  • Each notification ~1KB payload -> 100 MB/s bandwidth needed.
  • Storage for notification logs: 100,000 notifications/sec x 1KB x 3600 sec = ~360 GB/hour.
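  • Database QPS: 100,000+ writes/sec (assuming at least one write per notification), requiring sharding and replicas.
  • Network bandwidth: 100 MB/s is about 800 Mbps, which nearly saturates a single 1 Gbps link, so plan for multiple 1 Gbps links or 10 Gbps aggregation.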
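The arithmetic above can be reproduced in a few lines, which makes it easy to rerun with different inputs. This sketch assumes decimal units (1 KB = 1,000 bytes, 1 GB = 1,000 MB); the variable names are illustrative, and the per-day storage figure is simply the hourly figure multiplied by 24.

```python
users = 1_000_000
notifications_per_user_per_sec = 0.1
payload_bytes = 1_000  # ~1 KB per notification, decimal units assumed

notifications_per_sec = users * notifications_per_user_per_sec
bandwidth_mb_per_sec = notifications_per_sec * payload_bytes / 1e6
bandwidth_gbps = bandwidth_mb_per_sec * 8 / 1000       # megabytes/s -> gigabits/s
storage_gb_per_hour = bandwidth_mb_per_sec * 3600 / 1000
storage_tb_per_day = storage_gb_per_hour * 24 / 1000

print(f"notifications/sec: {notifications_per_sec:,.0f}")
print(f"bandwidth: {bandwidth_mb_per_sec:.0f} MB/s ({bandwidth_gbps:.1f} Gbps)")
print(f"log storage: {storage_gb_per_hour:.0f} GB/hour, {storage_tb_per_day:.2f} TB/day")
```

Changing `notifications_per_user_per_sec` or `payload_bytes` shows how sensitive the bandwidth and storage estimates are to those two assumptions.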
Interview Tip

Start by clarifying notification volume and latency needs. Identify the bottleneck (usually DB). Discuss async decoupling, horizontal scaling, caching, and data partitioning. Show awareness of trade-offs like consistency vs latency.

Self Check

Your database handles 1000 QPS. Traffic grows 10x. What do you do first?

Answer: Introduce read replicas and caching to take read load off the primary database, and move notification processing to async queues so that write spikes are buffered instead of hitting the DB directly.
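To illustrate the caching part of this answer, here is a minimal cache-aside sketch with a time-to-live. It is an in-memory illustration under stated assumptions, not a production cache: `StateCache`, `fake_db_read`, and the 60-second TTL are hypothetical, and a real deployment would more likely use a shared cache such as Redis or Memcached.

```python
import time


class StateCache:
    """Cache-aside wrapper; `load_from_db` is a hypothetical DB read function."""

    def __init__(self, load_from_db, ttl_seconds=5.0, clock=time.monotonic):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]        # cache hit: no DB query
        value = self._load(key)    # miss or expired entry: query the DB
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key):
        """Call on a state change so readers do not keep seeing the old state."""
        self._entries.pop(key, None)


db_reads = []  # tracks how often the fake DB is queried


def fake_db_read(key):
    db_reads.append(key)
    return {"order-1": "SHIPPED"}.get(key, "UNKNOWN")


cache = StateCache(fake_db_read, ttl_seconds=60.0)
for _ in range(10):
    cache.get("order-1")
print(len(db_reads))  # 1: only the first of the ten reads reaches the DB
```

The trade-off is staleness: until an entry expires or is invalidated, readers may see a state that has already changed, which is the consistency vs latency tension mentioned in the interview tip.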

Key Result
The database is the first bottleneck as users and notifications grow; scaling requires async processing, horizontal app scaling, caching, and database sharding to handle high notification volumes efficiently.