
One-to-one messaging in HLD - Scalability & System Analysis

Scalability Analysis - One-to-one messaging
Growth Table: One-to-one Messaging
Users | Messages/Day | Server Load | Database Load | Network | Notes
--- | --- | --- | --- | --- | ---
100 | 1,000 | Single app server handles all | Single DB instance handles writes/reads | Low bandwidth, no CDN needed | Simple setup, no caching needed
10,000 | 100,000 | Multiple app servers behind load balancer | DB starts to see high write/read load | Moderate bandwidth, consider caching | Introduce Redis cache for recent messages
1,000,000 | 10,000,000 | Hundreds of app servers, autoscaling | DB bottleneck: read replicas, sharding needed | High bandwidth, CDN for media files | Use message queues for delivery, partition users
100,000,000 | 1,000,000,000 | Thousands of app servers, geo-distributed | Multi-region DB clusters, advanced sharding | Very high bandwidth, global CDN | Strong consistency challenges; eventual consistency for some data
First Bottleneck

As the system grows beyond small scale, the database is typically the first bottleneck because it must handle every message write and read. As users grow, database CPU and disk I/O become saturated, which slows down both message delivery and retrieval.

Scaling Solutions
  • Horizontal scaling: Add more app servers behind a load balancer to handle concurrent connections.
  • Database read replicas: Offload read queries to replicas to reduce load on primary DB.
  • Sharding: Split user data across multiple database instances by user ID to distribute load.
  • Caching: Use Redis or Memcached to cache recent messages and user presence info.
  • Message queues: Use queues like Kafka or RabbitMQ to decouple message ingestion and delivery.
  • CDN: For media files (images, videos), use CDN to reduce bandwidth on origin servers.
  • Geo-distribution: Deploy servers and databases closer to users to reduce latency.
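The sharding point above can be sketched concretely. A minimal approach, assuming a hypothetical fixed shard count of 16, is to hash the user ID with a stable hash so the same user always maps to the same database instance:

```python
import hashlib

NUM_SHARDS = 16  # hypothetical shard count; real systems often use many more


def shard_for_user(user_id: str) -> int:
    """Map a user ID to a database shard using a stable hash.

    Using a cryptographic hash (rather than Python's built-in hash())
    keeps the mapping stable across processes and restarts.
    """
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS
```

Note that a fixed modulo makes resharding expensive (most keys move when NUM_SHARDS changes); consistent hashing is the usual refinement at larger scale.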
Back-of-Envelope Cost Analysis

Assuming 1 million users sending 10 messages/day each:

  • Messages per second (QPS): ~115 (10M messages / 86400 seconds)
  • Database writes: 115 QPS (each message is a write)
  • Database reads: 230 QPS (assuming 2 reads per message for delivery and retrieval)
  • Storage: 10M messages/day * 1KB/message = ~10GB/day
  • Network bandwidth: 10M messages/day * 1KB = ~115 KB/s on average; peak traffic could be several times higher
  • Assuming one DB instance can handle roughly 5,000 QPS, a single instance can absorb this load, but with little headroom for growth or traffic spikes.
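The back-of-envelope numbers above can be reproduced with a few lines of arithmetic. This sketch assumes an average message size of 1 KB, as in the bullets:

```python
# Back-of-envelope capacity estimate for 1M users at 10 messages/day each.
users = 1_000_000
messages_per_user_per_day = 10
seconds_per_day = 86_400
message_size_bytes = 1024  # assumed average message size (1 KB)

messages_per_day = users * messages_per_user_per_day          # 10,000,000
write_qps = messages_per_day / seconds_per_day                # ~115.7 writes/s
read_qps = 2 * write_qps                                      # delivery + retrieval
storage_per_day_gb = messages_per_day * message_size_bytes / 1024**3  # ~9.5 GB/day
avg_bandwidth_kb_s = messages_per_day * message_size_bytes / 1024 / seconds_per_day
```

These are averages; real traffic is bursty, so provisioning usually targets a multiple of these figures.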
Interview Tip

Start by explaining the user scale and expected message volume. Identify the database as the first bottleneck. Discuss horizontal scaling of app servers, caching, and database read replicas. Then explain sharding and geo-distribution for large scale. Always justify why each solution fits the bottleneck.

Self Check

Your database handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?

Answer: Add read replicas to offload read queries and reduce load on the primary database. Also consider caching frequently accessed data to reduce DB hits.
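The read-replica answer above boils down to routing logic in the application's data layer. A minimal sketch, with hypothetical host names and a crude query-prefix heuristic, might look like:

```python
import random


class DbRouter:
    """Send writes to the primary and spread reads across replicas."""

    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        self.replicas = replicas

    def route(self, query: str) -> str:
        # Crude heuristic: SELECTs go to a random replica,
        # everything else (INSERT/UPDATE/DELETE) goes to the primary.
        if query.lstrip().lower().startswith("select"):
            return random.choice(self.replicas)
        return self.primary


router = DbRouter("db-primary", ["db-replica-1", "db-replica-2"])
```

In practice this lives inside an ORM or proxy (and must account for replication lag when a user reads their own just-sent message), but the routing decision is the same.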

Key Result
The database is the first bottleneck in one-to-one messaging as user and message volume grow. Scaling requires adding read replicas, caching, and sharding to distribute load, along with horizontal scaling of app servers and CDN for media.