0
0
HLDsystem_design~7 mins

Horizontal vs vertical partitioning in HLD - Architecture Trade-offs

Choose your learning style9 modes available
Problem Statement
When a database grows large, queries slow down and maintenance becomes difficult. Storing all data in one place causes bottlenecks, long response times, and risks of downtime if the database fails.
Solution
Partitioning splits the database into smaller parts to improve performance and manageability. Horizontal partitioning divides data by rows, spreading different records across partitions. Vertical partitioning splits data by columns, grouping related fields together to optimize access patterns.
Architecture
Client App
Horizontal
Partitioning

The diagram shows two approaches: horizontal partitioning splits data into row-based partitions, while vertical partitioning splits data into column-based groups.

Trade-offs
✓ Pros
Horizontal partitioning improves query performance by limiting searches to relevant partitions.
Vertical partitioning reduces I/O by accessing only needed columns, optimizing read-heavy workloads.
Both methods improve scalability by distributing data and load.
Horizontal partitioning supports easier data archiving and deletion by partition.
✗ Cons
Horizontal partitioning can complicate joins across partitions, increasing query complexity.
Vertical partitioning may require joining multiple column groups, adding latency.
Both add complexity to database design and maintenance.
Use horizontal partitioning when the dataset is large with many rows and queries often filter by row attributes. Use vertical partitioning when tables have many columns but queries access only subsets of columns frequently.
Avoid horizontal partitioning if the dataset is small or queries always need full rows. Avoid vertical partitioning if most queries require all columns or if the schema is simple with few columns.
Real World Examples
Amazon
Uses horizontal partitioning to distribute customer order data by region, improving query speed and fault isolation.
LinkedIn
Applies vertical partitioning to separate user profile info from activity logs, optimizing read performance for profile views.
Twitter
Employs horizontal partitioning to shard tweets by user ID, enabling scalable writes and reads.
Alternatives
Sharding
Sharding is a form of horizontal partitioning but usually involves distributing data across multiple servers or nodes.
Use when: Choose sharding when you need to scale out across multiple machines rather than just splitting data within one database.
Denormalization
Denormalization duplicates data to reduce joins instead of splitting data by rows or columns.
Use when: Choose denormalization when read performance is critical and some data duplication is acceptable.
Summary
Partitioning breaks large databases into smaller parts to improve performance and scalability.
Horizontal partitioning splits data by rows, while vertical partitioning splits data by columns.
Choosing the right partitioning depends on data size, query patterns, and system scale.