
One-to-one messaging in HLD - Scalability & System Analysis

Scalability Analysis - One-to-one messaging
Growth Table: One-to-one Messaging
Users | Messages/Day | Server Load | Database Load | Network | Notes
--- | --- | --- | --- | --- | ---
100 | 1,000 | Single app server handles all | Single DB instance handles writes/reads | Low bandwidth, no CDN needed | Simple setup, no caching needed
10,000 | 100,000 | Multiple app servers behind load balancer | DB starts to see high write/read load | Moderate bandwidth, consider caching | Introduce Redis cache for recent messages
1,000,000 | 10,000,000 | Hundreds of app servers, autoscaling | DB bottleneck: read replicas, sharding needed | High bandwidth, CDN for media files | Use message queues for delivery, partition users
100,000,000 | 1,000,000,000 | Thousands of app servers, geo-distributed | Multi-region DB clusters, advanced sharding | Very high bandwidth, global CDN | Strong consistency challenges; eventual consistency for some data
First Bottleneck

As the system grows beyond small scale, the database is typically the first bottleneck because it must handle every message write and read. As users grow, database CPU and disk I/O become saturated, which slows down both message delivery and retrieval.

Scaling Solutions
  • Horizontal scaling: Add more app servers behind a load balancer to handle concurrent connections.
  • Database read replicas: Offload read queries to replicas to reduce load on primary DB.
  • Sharding: Split user data across multiple database instances by user ID to distribute load.
  • Caching: Use Redis or Memcached to cache recent messages and user presence info.
  • Message queues: Use queues like Kafka or RabbitMQ to decouple message ingestion and delivery.
  • CDN: For media files (images, videos), use CDN to reduce bandwidth on origin servers.
  • Geo-distribution: Deploy servers and databases closer to users to reduce latency.
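The sharding point above can be sketched concretely. A minimal approach, assuming a hypothetical fixed shard count of 16, is to hash the user ID with a stable hash so the same user always maps to the same database instance:

```python
import hashlib

NUM_SHARDS = 16  # hypothetical shard count; real systems often use many more


def shard_for_user(user_id: str) -> int:
    """Map a user ID to a database shard using a stable hash.

    Using a cryptographic hash (rather than Python's built-in hash())
    keeps the mapping stable across processes and restarts.
    """
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS
```

Note that a fixed modulo makes resharding expensive (most keys move when NUM_SHARDS changes); consistent hashing is the usual refinement at larger scale.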
Back-of-Envelope Cost Analysis

Assuming 1 million users sending 10 messages/day each:

  • Messages per second (QPS): ~115 (10M messages / 86400 seconds)
  • Database writes: 115 QPS (each message is a write)
  • Database reads: 230 QPS (assuming 2 reads per message for delivery and retrieval)
  • Storage: 10M messages/day * 1KB/message = ~10GB/day
  • Network bandwidth: 10M messages/day * 1KB = ~115 KB/s on average; peak traffic could be several times higher
  • Assuming one DB instance can handle roughly 5,000 QPS, a single instance can absorb this load, but with little headroom for growth or traffic spikes.
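The back-of-envelope numbers above can be reproduced with a few lines of arithmetic. This sketch assumes an average message size of 1 KB, as in the bullets:

```python
# Back-of-envelope capacity estimate for 1M users at 10 messages/day each.
users = 1_000_000
messages_per_user_per_day = 10
seconds_per_day = 86_400
message_size_bytes = 1024  # assumed average message size (1 KB)

messages_per_day = users * messages_per_user_per_day          # 10,000,000
write_qps = messages_per_day / seconds_per_day                # ~115.7 writes/s
read_qps = 2 * write_qps                                      # delivery + retrieval
storage_per_day_gb = messages_per_day * message_size_bytes / 1024**3  # ~9.5 GB/day
avg_bandwidth_kb_s = messages_per_day * message_size_bytes / 1024 / seconds_per_day
```

These are averages; real traffic is bursty, so provisioning usually targets a multiple of these figures.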
Interview Tip

Start by explaining the user scale and expected message volume. Identify the database as the first bottleneck. Discuss horizontal scaling of app servers, caching, and database read replicas. Then explain sharding and geo-distribution for large scale. Always justify why each solution fits the bottleneck.

Self Check

Your database handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?

Answer: Add read replicas to offload read queries and reduce load on the primary database. Also consider caching frequently accessed data to reduce DB hits.
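The read-replica answer above boils down to routing logic in the application's data layer. A minimal sketch, with hypothetical host names and a crude query-prefix heuristic, might look like:

```python
import random


class DbRouter:
    """Send writes to the primary and spread reads across replicas."""

    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        self.replicas = replicas

    def route(self, query: str) -> str:
        # Crude heuristic: SELECTs go to a random replica,
        # everything else (INSERT/UPDATE/DELETE) goes to the primary.
        if query.lstrip().lower().startswith("select"):
            return random.choice(self.replicas)
        return self.primary


router = DbRouter("db-primary", ["db-replica-1", "db-replica-2"])
```

In practice this lives inside an ORM or proxy (and must account for replication lag when a user reads their own just-sent message), but the routing decision is the same.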

Key Result
The database is the first bottleneck in one-to-one messaging as user and message volume grow. Scaling requires adding read replicas, caching, and sharding to distribute load, along with horizontal scaling of app servers and CDN for media.