HLDsystem_design~10 mins

Database sharding strategies in HLD - Scalability & System Analysis

Choose your learning style9 modes available

Learn Why Deep Arch Practice Challenge Design Recall Scale

Scalability Analysis - Database sharding strategies

Growth Table: Database Sharding Strategies

Users / Data Size	100 Users	10K Users	1M Users	100M Users
Data Volume	Small, fits on single DB	Medium, single DB still possible	Large, single DB slow, storage limits	Very large, single DB impossible
Traffic (QPS)	Low (~10-100 QPS)	Moderate (~1K QPS)	High (~50K QPS)	Very High (~5M QPS)
DB Load	Single instance handles load	Near max capacity, some slow queries	DB CPU/memory bottleneck	DB storage and CPU overwhelmed
Latency	Low latency	Latency starts increasing	Latency high without sharding	Latency very high without sharding
Sharding Need	Not needed	Optional, for growth	Required for performance	Mandatory, complex sharding

First Bottleneck

The database server is the first bottleneck as user count and data grow. It hits CPU, memory, and storage limits. Single-instance databases struggle with high query rates and large data volumes. This causes slow queries and increased latency.

Scaling Solutions for Database Sharding

Horizontal Sharding: Split data across multiple database servers by a shard key (e.g., user ID range, geographic region). Each shard handles a subset of data and queries.
Vertical Sharding: Separate tables or features into different databases to reduce load per DB.
Directory-Based Sharding: Use a lookup service to map keys to shards, allowing flexible shard assignment.
Consistent Hashing: Distribute data evenly and allow easy addition/removal of shards with minimal data movement.
Caching: Use in-memory caches (e.g., Redis) to reduce DB reads.
Read Replicas: Offload read queries to replicas to reduce master DB load.
Connection Pooling: Efficiently manage DB connections to avoid overload.

Back-of-Envelope Cost Analysis

At 1M users, expect ~50K QPS. A single DB handles ~5K QPS max, so need ~10 shards.
Storage per user assumed 1MB -> 1M users = ~1TB data, split across shards.
Network bandwidth per shard: 5K QPS * 1KB/request ≈ 5MB/s, manageable on 1Gbps links.
Adding shards increases operational complexity and cost (servers, monitoring).

Interview Tip

Start by explaining the current system limits. Identify the database as the bottleneck. Discuss sharding types and trade-offs. Mention shard key choice importance. Explain how caching and replicas help. Show awareness of operational complexity and data consistency challenges.

Self Check

Your database handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?

Answer: Introduce horizontal sharding to split load across multiple database servers. This reduces CPU and memory pressure on each DB instance and maintains low latency.

Key Result

Database sharding becomes essential beyond tens of thousands of users to handle CPU, memory, and storage limits. Horizontal sharding by a good shard key distributes load and data, enabling scalable performance.