0
0
HLDsystem_design~10 mins

Horizontal vs vertical partitioning in HLD - Scaling Approaches Compared

Choose your learning style9 modes available
Scalability Analysis - Horizontal vs vertical partitioning
Growth Table: Horizontal vs Vertical Partitioning
Users/Data SizeHorizontal PartitioningVertical Partitioning
100 users / small dataSingle database instance; simple queriesSingle database; all columns together
10,000 users / medium dataSplit data rows across shards by user ID; reduces row size per shardSplit tables by columns (e.g., user info separate from logs); reduces column size per query
1 million users / large dataMany shards; each handles subset of users; easier to scale writesTables split by feature sets; some queries need joins across vertical partitions
100 million users / huge dataHundreds of shards; complex routing; cross-shard queries costlyMultiple vertical partitions across clusters; increased join complexity; possible duplication
First Bottleneck

At small scale, the database handles all data easily. As users grow, the first bottleneck is usually the database storage and query performance.

Horizontal partitioning bottleneck: Single shard storage and write throughput limits.

Vertical partitioning bottleneck: Complex joins across partitions slow queries.

Scaling Solutions
  • Horizontal Partitioning: Add more shards to distribute rows; use consistent hashing or range-based sharding; implement routing layer to direct queries.
  • Vertical Partitioning: Separate tables by feature or column groups; optimize queries to avoid unnecessary joins; replicate frequently joined partitions.
  • Combine both: Use horizontal partitioning within vertical partitions for very large scale.
  • Use caching layers to reduce database load.
  • Use read replicas to scale read-heavy workloads.
Back-of-Envelope Cost Analysis

Assuming 1 million users, each generating 10 requests per second:

  • Total requests: 10 million QPS (queries per second).
  • Single DB instance handles ~5,000 QPS → Need ~2,000 shards for horizontal partitioning.
  • Storage per user: 1 MB → Total 1 TB data; split across shards or vertical tables.
  • Network bandwidth: 1 Gbps = 125 MB/s; large data transfers require multiple network interfaces or data centers.
Interview Tip

Start by explaining the difference: horizontal splits rows, vertical splits columns.

Discuss pros and cons of each with examples.

Identify bottlenecks based on data size and query patterns.

Suggest scaling steps logically: start with vertical if columns are large, horizontal if rows grow fast.

Mention combining both for complex systems.

Self Check

Your database handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?

Answer: Implement horizontal partitioning (sharding) to distribute load across multiple database instances, because a single instance cannot handle 10x the queries.

Key Result
Horizontal partitioning scales by splitting data rows across multiple databases to handle more users and writes, while vertical partitioning splits data columns to optimize query performance and storage. The first bottleneck is usually database throughput or join complexity, fixed by sharding or splitting tables respectively.