0
0
HLDsystem_design~7 mins

Vertical scaling vs horizontal scaling in HLD - Architecture Trade-offs

Choose your learning style9 modes available
Problem Statement
When a system faces increased load, relying on a single server to handle all requests can cause slow response times and risk total failure if that server crashes. Without proper scaling, the system cannot handle growth or sudden spikes in traffic, leading to poor user experience and downtime.
Solution
Vertical scaling adds more power (CPU, RAM) to a single server to handle more load, while horizontal scaling adds more servers to share the load. Vertical scaling improves capacity by making one machine stronger, and horizontal scaling improves capacity by spreading work across many machines.
Architecture
Powerful
Powerful
Server 1
Load
Balancer

The diagram shows vertical scaling as upgrading a single server's resources, and horizontal scaling as adding multiple servers behind a load balancer sharing the workload.

Trade-offs
✓ Pros
Vertical scaling is simpler to implement since it involves upgrading one machine.
Horizontal scaling provides better fault tolerance because multiple servers share the load.
Horizontal scaling allows near unlimited growth by adding more servers.
Vertical scaling avoids complexity of distributed systems and data synchronization.
✗ Cons
Vertical scaling hits hardware limits and can be expensive beyond a point.
Horizontal scaling requires load balancing and data consistency mechanisms.
Horizontal scaling adds complexity in deployment, monitoring, and maintenance.
Use vertical scaling for small to medium workloads where simplicity is key and hardware upgrades are affordable. Use horizontal scaling when traffic exceeds single server capacity or high availability is required.
Avoid vertical scaling when expected load exceeds the maximum hardware limits or cost is prohibitive. Avoid horizontal scaling for very small systems where added complexity outweighs benefits.
Real World Examples
Netflix
Uses horizontal scaling with many servers behind load balancers to stream video reliably to millions of users worldwide.
Amazon
Employs horizontal scaling in its web services to handle massive traffic spikes during sales events.
Small business websites
Often use vertical scaling by upgrading their single server's CPU and RAM to handle moderate traffic increases.
Alternatives
Auto-scaling
Automatically adjusts the number of servers based on real-time load, combining horizontal scaling with automation.
Use when: Use when traffic fluctuates unpredictably and you want to optimize cost and performance.
Sharding
Splits data across multiple databases or servers to scale horizontally at the data layer.
Use when: Use when data size grows beyond single database capacity and needs distributed storage.
Summary
Vertical scaling increases a single server's capacity by adding more resources like CPU and RAM.
Horizontal scaling adds more servers to distribute the workload and improve fault tolerance.
Choosing between them depends on system size, expected load, cost, and complexity requirements.