
Why database choice impacts architecture in HLD - Scalability Evidence

Scalability Analysis - Why database choice impacts architecture
Growth Table: Impact of Database Choice at Different Scales
Users / Traffic | Database Behavior | Architecture Impact
100 users | Single-instance DB handles requests easily | Simple architecture, direct DB connection
10,000 users | DB load increases, latency may rise | Introduce connection pooling, caching layers
1 million users | Single DB instance becomes a bottleneck; scaling limits reached | Need read replicas, sharding, or NoSQL options
100 million users | Massive data volume and traffic; complex queries slow | Distributed DB clusters, multi-region replication, polyglot persistence
First Bottleneck: Database

The database is usually the first component to break as user count and data grow. This happens because databases have limits on how many queries they can process per second and how much data they can store efficiently. When the DB is overwhelmed, response times increase and the whole system slows down.
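
To illustrate why response times climb sharply as the database nears its limit, here is a minimal sketch using the textbook M/M/1 queueing formula (mean time in system = 1 / (capacity - arrival rate)). Treating a database as a single M/M/1 queue is a simplifying assumption, and the 5,000 QPS capacity and the function name are illustrative only.

```python
def mean_response_time_ms(arrival_qps: float, capacity_qps: float) -> float:
    """Mean time a query spends in an M/M/1 queue, in milliseconds.

    Assumption: the database is modeled as one server with random
    (Poisson) arrivals and exponential service times. Real databases
    are more complex; this only shows the shape of the curve.
    """
    if arrival_qps >= capacity_qps:
        # Arrivals outpace service: the queue grows without bound.
        return float("inf")
    return 1000.0 / (capacity_qps - arrival_qps)


# Latency stays flat at low load and climbs steeply near capacity.
for load in (500, 2500, 4500, 4950):
    print(f"{load} QPS -> {mean_response_time_ms(load, 5000):.2f} ms")
```

Under this model, going from 90% to 99% utilization multiplies the mean response time by ten, which is why a database that looked healthy can seem to fail suddenly.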

Scaling Solutions for Database Bottlenecks
  • Vertical Scaling: Upgrade to more powerful servers with faster CPUs, more RAM, and SSD storage.
  • Read Replicas: Create copies of the database to handle read queries, reducing load on the main DB.
  • Sharding: Split data horizontally across multiple database servers to distribute load.
  • Caching: Use in-memory caches like Redis or Memcached to serve frequent queries quickly.
  • NoSQL Databases: Use databases optimized for specific data types or access patterns (e.g., document stores, key-value stores) for better scalability.
  • Polyglot Persistence: Combine multiple database types to handle different parts of data efficiently.
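
Two of the techniques above, sharding and read replicas, can be combined in a small routing layer. The sketch below is one possible shape, not a production design: the `ShardRouter` class, the server names, and the choice of hash-modulo sharding are all assumptions made for illustration.

```python
import hashlib
import itertools


class ShardRouter:
    """Pick a database server for a key (hypothetical helper).

    Writes go to the shard's primary; reads rotate across its replicas.
    """

    def __init__(self, shards):
        # shards: list of {"primary": str, "replicas": [str, ...]}
        self._shards = shards
        # Fall back to the primary when a shard has no replicas.
        self._read_cycles = [
            itertools.cycle(s["replicas"] or [s["primary"]]) for s in shards
        ]

    def shard_index(self, key: str) -> int:
        # Use a stable hash: Python's built-in hash() of a str varies
        # between processes, which would break routing.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self._shards)

    def route(self, key: str, is_write: bool) -> str:
        i = self.shard_index(key)
        if is_write:
            return self._shards[i]["primary"]
        return next(self._read_cycles[i])


router = ShardRouter([
    {"primary": "db0-primary", "replicas": ["db0-replica-a", "db0-replica-b"]},
    {"primary": "db1-primary", "replicas": ["db1-replica-a"]},
])
print(router.route("user:42", is_write=True))
print(router.route("user:42", is_write=False))
```

Hash-modulo routing is simple, but changing the number of shards remaps most keys; consistent hashing or a lookup table are common alternatives when shards are added often. Reads served from replicas may also be slightly stale, which matters for the consistency trade-offs discussed in an interview.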
Back-of-Envelope Cost Analysis
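  • At 1 million users, assuming 10 requests per user per day, total requests = 10 million/day ≈ 116 requests/sec on average.
  • A single PostgreSQL instance can handle roughly 5,000 QPS for simple queries (the real figure depends heavily on hardware, schema, and query complexity), so the average load fits with ample headroom. The margin shrinks at peak hours, with more requests per user, or with heavier queries.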
  • Storage needs grow with data size; 100 million users may require terabytes of storage.
  • Network bandwidth must support data transfer; 1 Gbps network can handle ~125 MB/s, enough for many applications but may need upgrades at large scale.
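
The estimates above can be reproduced with a few lines of arithmetic. This is a rough sketch: the peak factor of 5 and the 10 KB of stored data per user are assumptions added for illustration, while the user counts, request rate, 5,000 QPS capacity, and 1 Gbps link come from the list above.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_qps(users: int, requests_per_user_per_day: float) -> float:
    """Average requests per second, assuming traffic is spread evenly."""
    return users * requests_per_user_per_day / SECONDS_PER_DAY


avg = average_qps(1_000_000, 10)

PEAK_FACTOR = 5          # assumption: peak traffic is 5x the daily average
DB_CAPACITY_QPS = 5_000  # rough single-instance figure from the text
peak = avg * PEAK_FACTOR
headroom = DB_CAPACITY_QPS / peak

BYTES_PER_USER = 10_000  # assumption: about 10 KB stored per user
storage_tb = 100_000_000 * BYTES_PER_USER / 1e12

link_mb_per_s = 1e9 / 8 / 1e6  # 1 Gbps expressed in megabytes per second

print(f"average load: {avg:.0f} QPS, assumed peak: {peak:.0f} QPS")
print(f"headroom at peak: {headroom:.1f}x")
print(f"storage for 100M users: {storage_tb:.1f} TB")
print(f"1 Gbps link: {link_mb_per_s:.0f} MB/s")
```

Changing the assumed constants is the point of the exercise: doubling requests per user or the peak factor halves the headroom, which shows how quickly a comfortable margin can disappear.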
Interview Tip: Structuring Scalability Discussion

Start by identifying the database's role and limits in your system. Discuss expected traffic and data growth. Explain how the database choice affects read/write patterns, consistency, and latency. Then describe bottlenecks and propose scaling solutions like replication, sharding, or caching. Finally, mention trade-offs and cost implications.

Self-Check Question

Your database handles 1000 queries per second (QPS). Traffic grows 10x to 10,000 QPS. What do you do first and why?

Key Result
Database choice impacts architecture because databases have limits on query handling and storage. As users grow, the database becomes the first bottleneck, requiring solutions like replication, sharding, or caching to scale effectively.