Notification system design in HLD - Scalability & System Analysis

| Scale | Users | Notifications/Day | Key Changes |
|---|---|---|---|
| Small | 100 | 1,000 | Single server handles API and DB; simple queue; direct push |
| Medium | 10,000 | 100,000 | Introduce message queue; add DB read replicas; cache user preferences |
| Large | 1,000,000 | 10,000,000 | Multiple app servers; sharded DB; distributed queue; platform push services |
| Very Large | 100,000,000 | 1,000,000,000 | Global load balancers; multi-region DB shards; CDN for static content; advanced throttling |
At small to medium scale, the database is the first bottleneck: it struggles with the high write volume from notification records and user-preference updates. As traffic grows, the message queue and application servers become bottlenecks as well, which shows up as processing and delivery delays. Scaling strategies by component:
- Database: Use read replicas for reads, write sharding by user ID, and caching for user settings.
- Application Servers: Horizontally scale with load balancers to handle more concurrent connections.
- Message Queue: Use distributed queues like Kafka or RabbitMQ to handle high throughput and ensure reliable delivery.
- Push Delivery: Integrate with platform push services (APNs, FCM) and use CDN for static notification content.
- Throttling & Batching: Batch notifications and throttle to avoid overwhelming users and systems.
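A minimal sketch of the batching-plus-throttling idea from the last bullet: a token bucket limits how many batches go out per second so downstream push services are not flooded, and notifications are grouped into batches before sending. The class name, batch size, and rate are illustrative assumptions, and `sent_batches` stands in for a real push-service call.

```python
import time
from collections import deque

class NotificationBatcher:
    """Collects notifications and flushes them in batches, rate-limited
    by a token bucket so downstream push services are not overwhelmed."""

    def __init__(self, batch_size=100, rate_per_sec=50, capacity=200):
        self.batch_size = batch_size
        self.rate = rate_per_sec        # tokens refilled per second
        self.capacity = capacity        # maximum burst of batches
        self.tokens = capacity
        self.last_refill = time.monotonic()
        self.pending = deque()
        self.sent_batches = []          # stand-in for the real push call

    def _refill(self):
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last_refill) * self.rate)
        self.last_refill = now

    def enqueue(self, notification):
        self.pending.append(notification)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        self._refill()
        # Each batch sent costs one token; stop when the bucket is empty.
        while self.pending and self.tokens >= 1:
            batch = [self.pending.popleft()
                     for _ in range(min(self.batch_size, len(self.pending)))]
            self.tokens -= 1
            self.sent_batches.append(batch)
```

In a real system the flush would publish to a distributed queue (e.g. a Kafka topic) rather than append to a list, and leftover tokens would gate retries as well.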
Back-of-envelope estimates:
- Requests per second: 1M users x 10 notifications/user/day = 10M/day, or ~115 QPS on average (10,000,000 / 86,400 s).
- Storage: assuming ~1KB per notification, 10M notifications/day is ~10GB/day of storage.
- Bandwidth: push payloads are small (~1KB each), so 10M notifications mean roughly 10GB of outbound traffic per day.
- Server capacity: one app server handles roughly 3,000 concurrent connections; scale horizontally as users grow.
- Database QPS: a single PostgreSQL instance handles on the order of 5,000 QPS; use sharding and replicas beyond that.
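The estimates above can be wrapped in a small calculator so different scale tiers are easy to compare. The `peak_factor` parameter is an assumption (peak traffic at roughly 3x the daily average); the per-node limits mirror the bullets above.

```python
import math

def estimate_capacity(users, notifs_per_user_per_day, payload_bytes=1024,
                      qps_per_db_node=5000, peak_factor=3):
    """Back-of-envelope sizing for a notification system.

    peak_factor is an assumed peak-to-average traffic ratio; the
    per-node QPS limit mirrors the rough PostgreSQL figure above."""
    daily = users * notifs_per_user_per_day
    avg_qps = daily / 86_400                           # seconds per day
    peak_qps = avg_qps * peak_factor
    storage_gb_per_day = daily * payload_bytes / 1024**3
    db_nodes = math.ceil(peak_qps / qps_per_db_node)   # shards/replicas needed
    return {
        "daily_notifications": daily,
        "avg_qps": avg_qps,
        "peak_qps": peak_qps,
        "storage_gb_per_day": storage_gb_per_day,
        "db_nodes": db_nodes,
    }

# The "Large" tier from the table: 1M users, 10 notifications each per day.
large = estimate_capacity(1_000_000, 10)
```

For the 1M-user tier this reproduces the numbers above: ~115 QPS average and just under 10GB/day of storage, comfortably within a single database node even at assumed peak.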
In an interview, start by clarifying notification types (push, email, SMS, in-app) and delivery guarantees. Discuss user scale and traffic patterns, then identify bottlenecks step by step: database, queue, delivery. Propose incremental scaling solutions with clear reasoning, and mention trade-offs such as latency vs. cost and their impact on user experience.
Question: Your database handles 1,000 QPS and traffic grows 10x. What do you do first?
Answer: First add read replicas to offload read queries and cache frequently read data, since reads typically dominate and replicas are the fastest win. For the write path, shard by user ID or batch writes to reduce per-node load.
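The answer combines two patterns that can be sketched together: cache-aside reads (hit the cache first, fall back to a replica, then populate the cache) and writes routed to a shard chosen by hashing the user ID. The class and its dict-backed cache/shards are hypothetical stand-ins for a real cache and database nodes.

```python
import hashlib

class PreferenceStore:
    """Cache-aside reads for user preferences, with writes routed to a
    shard picked by hashing the user ID. Dicts stand in for the real
    cache (e.g. Redis) and the sharded database nodes."""

    def __init__(self, num_shards=4):
        self.cache = {}
        self.shards = [dict() for _ in range(num_shards)]
        self.cache_hits = 0
        self.db_reads = 0

    def _shard(self, user_id):
        # Stable hash so the same user always maps to the same shard.
        h = int(hashlib.md5(str(user_id).encode()).hexdigest(), 16)
        return self.shards[h % len(self.shards)]

    def get_prefs(self, user_id):
        if user_id in self.cache:       # cache hit: no database load
            self.cache_hits += 1
            return self.cache[user_id]
        self.db_reads += 1              # miss: read from the shard (replica)
        prefs = self._shard(user_id).get(user_id, {})
        self.cache[user_id] = prefs     # populate cache for next time
        return prefs

    def set_prefs(self, user_id, prefs):
        self._shard(user_id)[user_id] = prefs
        self.cache.pop(user_id, None)   # invalidate stale cached entry
```

Note the invalidate-on-write choice: dropping the cached entry on update avoids serving stale preferences at the cost of one extra database read on the next access.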
