Vertical scaling vs horizontal scaling in HLD - Design Approaches Compared

Design: Scaling a Web Application

Focus on scaling strategies for the application backend and database. Out of scope: frontend design, detailed security implementation.

Functional Requirements

FR1: Support a growing number of users without downtime

FR2: Maintain response time under 200ms for 95% of requests

FR3: Ensure 99.9% availability (roughly 8.8 hours of downtime per year)
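
The latency and availability targets above can be turned into concrete checks. The following is a minimal sketch, assuming latencies are collected as a list of millisecond values; the function names and the sample data are illustrative only and do not come from any real system.

```python
def p95(latencies_ms):
    """Return the 95th-percentile latency using the nearest-rank method."""
    ordered = sorted(latencies_ms)
    # Integer ceiling of 0.95 * n, which avoids floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def downtime_budget_hours(availability, period_hours=365 * 24):
    """Hours of allowed downtime in the period for a given availability target."""
    return (1 - availability) * period_hours


# Hypothetical sample of 20 request latencies in milliseconds.
samples = [120, 95, 180, 150, 110, 130, 90, 210, 140, 160,
           100, 125, 135, 145, 155, 165, 175, 185, 115, 105]

print(p95(samples) < 200)                       # does this sample meet the 200 ms target?
print(round(downtime_budget_hours(0.999), 2))   # yearly downtime budget at 99.9%
```

In this sample one request exceeds 200 ms, yet the 95th percentile still meets the target, which is why the requirement is stated as a percentile rather than a maximum.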

Non-Functional Requirements

NFR1: Budget limits for hardware and infrastructure

NFR2: Minimal changes to existing application code

NFR3: Support both read and write operations efficiently

Think Before You Design

Key Components

Application servers

Databases

Load balancers

Caching layers

Monitoring tools

Design Patterns

Load balancing

Database replication

Caching

Auto-scaling

Sharding

Reference Architecture

          +-------------------+
          |   Load Balancer   |
          +---------+---------+
                    |
        +-----------+-----------+
        |                       |
+-------+-------+       +-------+-------+
| App Server 1  |       | App Server 2  |
+-------+-------+       +-------+-------+
        |                       |
        +-----------+-----------+
                    |
          +---------+---------+
          |     Database      |
          |  (Vertical Scale) |
          +-------------------+


Alternative Horizontal Scaling:

                  +-------------------+
                  |   Load Balancer   |
                  +---------+---------+
                            |
        +-------------------+-------------------+
        |                   |                   |
+-------+-------+   +-------+-------+   +-------+-------+
| App Server 1  |   | App Server 2  |   | App Server 3  |
+-------+-------+   +-------+-------+   +-------+-------+
        |                   |                   |
        +-------------------+-------------------+
                            |
                 +----------+----------+
                 |  Database Cluster   |
                 | (Horizontal Scale)  |
                 +---------------------+

Components

Load Balancer

Nginx, HAProxy, or Cloud Load Balancer

Distributes incoming requests evenly across multiple servers

Application Servers

Docker containers or virtual machines

Run the application code; can be scaled vertically or horizontally

Database

PostgreSQL, MySQL, or NoSQL databases

Stores application data; can be scaled vertically by upgrading hardware or horizontally by clustering/sharding

Caching Layer

Redis or Memcached

Speeds up read operations by storing frequently accessed data

Monitoring Tools

Prometheus, Grafana

Track system performance and detect bottlenecks
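
To illustrate how a load balancer can distribute requests evenly, here is a minimal round-robin sketch. Round-robin is only one possible policy (Nginx and HAProxy also offer others, such as least-connections), and the class name and server names below are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation, one per request."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = cycle(servers)

    def pick(self):
        """Return the next server in the rotation."""
        return next(self._servers)


# Hypothetical server names for illustration.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
assignments = [balancer.pick() for _ in range(6)]
print(assignments)
# ['app-1', 'app-2', 'app-3', 'app-1', 'app-2', 'app-3']
```

Adding a fourth server to the list is all horizontal scaling requires at this layer, provided the application servers keep no request state locally.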

Request Flow

1. User sends a request to the Load Balancer.

2. Load Balancer forwards the request to one of the Application Servers.

3. Application Server processes the request, querying the Database or Cache as needed.

4. On a cache hit, the Cache returns the data without touching the Database; on a miss, the Database responds and the result can be stored in the Cache for later requests.

5. Application Server sends the response back to the Load Balancer.

6. Load Balancer returns the response to the user.
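
Steps 3 and 4 of the flow above are often implemented with the cache-aside pattern. The sketch below assumes plain dictionaries as stand-ins for the cache (Redis or Memcached in the reference architecture) and the database; the function name `get_user` and the key format are hypothetical.

```python
def get_user(user_id, cache, database):
    """Cache-aside read: try the cache first, fall back to the database."""
    key = f"user:{user_id}"
    value = cache.get(key)
    if value is not None:
        return value, "cache"          # cache hit: the database is not touched
    value = database[user_id]          # cache miss: read from the database
    cache[key] = value                 # populate the cache for later requests
    return value, "database"


# Dictionaries standing in for the real cache and database.
database = {42: {"name": "Ada"}}
cache = {}

first = get_user(42, cache, database)
second = get_user(42, cache, database)
print(first[1], second[1])
# database cache
```

A real deployment would also need an expiry or invalidation policy so that cached entries do not outlive updates to the database.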

Database Schema

Entities depend on the application domain; the scaling strategies here apply to the database as a whole rather than requiring schema changes. Vertical scaling upgrades the hardware (CPU, RAM) of a single database instance. Horizontal scaling uses clustering or sharding to distribute data across multiple nodes; sharding additionally requires choosing a shard key that determines which node holds each row.
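
As an illustration of how sharding distributes data across nodes, the following sketch routes a key to a shard by hashing it. This is one simple scheme, assumed here for illustration; the function name `shard_for` and the key format are hypothetical.

```python
import hashlib
from collections import Counter


def shard_for(key, num_shards):
    """Map a string key to a shard index in the range [0, num_shards)."""
    # A stable hash is used so the mapping is identical across processes
    # (Python's built-in hash() is randomized per process for strings).
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Spread 1000 hypothetical user keys across 4 shards and count each shard's load.
counts = Counter(shard_for(f"user-{i}", 4) for i in range(1000))
print(sorted(counts.items()))
```

Plain modulo hashing remaps most keys whenever the number of shards changes, which is why consistent hashing is a common alternative when shards are added or removed frequently.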

Scaling Discussion

Bottlenecks

Single server CPU or memory limits reached (vertical scaling limit)

Database becomes a single point of failure or performance bottleneck

Load balancer overload if not scaled properly

Network bandwidth saturation

Data consistency challenges in horizontal scaling

Solutions

Vertical scaling: Upgrade server hardware (more CPU, RAM, faster disks) to handle more load on a single machine.

Horizontal scaling: Add more servers behind a load balancer to distribute load; use database replication to spread read traffic across copies and sharding to partition data across nodes.

Implement caching to reduce database load.

Use auto-scaling groups to add/remove servers based on demand.

Use distributed databases with strong or eventual consistency models depending on requirements.
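
The auto-scaling idea above can be sketched as a proportional rule: scale the server count by the ratio of observed to target utilization, then clamp it to configured bounds. This is a simplified illustration rather than the algorithm of any particular cloud provider; the function name, parameters, and default limits are assumptions.

```python
def desired_servers(current, cpu_percent, target_percent,
                    min_servers=2, max_servers=10):
    """Return the server count that would bring average CPU near the target.

    Utilization values are integer percentages, so the ceiling division
    below stays exact.
    """
    # Integer ceiling of current * cpu_percent / target_percent.
    raw = (current * cpu_percent + target_percent - 1) // target_percent
    return max(min_servers, min(max_servers, raw))


print(desired_servers(4, 90, 60))   # overloaded: scale out to 6
print(desired_servers(4, 30, 60))   # underused: scale in to 2
print(desired_servers(8, 95, 60))   # would need 13, capped at the maximum of 10
```

The minimum keeps enough servers for availability when traffic is low, and the maximum reflects the budget constraint in NFR1.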

Interview Tips

Time (45 minutes total): Spend 10 minutes understanding requirements and clarifying constraints, 20 minutes designing and explaining the vertical and horizontal scaling approaches, 10 minutes discussing trade-offs and scaling challenges, and 5 minutes on questions.

Explain the difference between vertical and horizontal scaling with simple examples (e.g., upgrading a single computer vs adding more computers).

Discuss pros and cons of each approach (cost, complexity, limits).

Mention real-world scenarios where each scaling type is appropriate.

Highlight the importance of load balancing and caching in scaling.

Address data consistency and availability trade-offs in horizontal scaling.
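
One way to make the consistency and availability trade-off concrete in an interview is the quorum rule used by many replicated data stores: with N replicas, a write acknowledged by W replicas and a read that consults R replicas are guaranteed to overlap on at least one replica when R + W > N. The sketch below only checks that condition; the function name is hypothetical.

```python
def quorums_overlap(n, w, r):
    """True if every read quorum shares at least one replica with every write quorum."""
    return r + w > n


print(quorums_overlap(3, 2, 2))   # True: reads see at least one up-to-date replica
print(quorums_overlap(3, 1, 1))   # False: faster, but reads may return stale data
```

Smaller quorums tolerate more unavailable replicas and respond faster, at the cost of possibly stale reads, which is the trade-off the last tip asks candidates to address.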