Overview - Horizontal scaling mental model

What is it?

Horizontal scaling means adding more machines to handle more data or users. Instead of making one machine stronger, you add many machines working together. In databases like MongoDB, this helps store lots of data and serve many users quickly. It spreads the work across many servers to keep things fast and reliable.

Why it matters

Without horizontal scaling, a single machine can become too slow or crash under heavy use. This would make websites or apps slow or unavailable. Horizontal scaling solves this by sharing the load, so systems can grow smoothly as more people use them. It keeps services running well even when demand grows a lot.

Where it fits

Before learning horizontal scaling, you should understand basic database concepts and vertical scaling (making one machine stronger). After this, you can learn about sharding, replication, and distributed systems to see how horizontal scaling works in detail.

Mental Model

Core Idea

Horizontal scaling splits data and work across many machines to handle more load efficiently.

Think of it like...

Imagine a pizza shop that gets more customers than one chef can handle. Instead of making one chef work faster, the shop hires more chefs, each making pizzas for different customers at the same time. This way, more pizzas get made faster without tiring one chef.

┌───────────────┐       ┌───────────────┐       ┌───────────────┐
│   Machine 1   │       │   Machine 2   │       │   Machine 3   │
│  (Data Part)  │       │  (Data Part)  │       │  (Data Part)  │
└──────┬────────┘       └──────┬────────┘       └──────┬────────┘
       │                       │                       │
       └──────────────┬────────┴───────────────┬───────┘
                      │                        │
                 ┌────┴─────┐            ┌─────┴─────┐
                 │  Clients │            │  Clients  │
                 └──────────┘            └───────────┘

Build-Up - 7 Steps

1

FoundationUnderstanding scaling basics

Concept: Learn what scaling means and the difference between vertical and horizontal scaling.

Scaling means making a system handle more work. Vertical scaling means making one machine stronger by adding CPU, RAM, or storage. Horizontal scaling means adding more machines to share the work. Vertical scaling has limits because one machine can only be so powerful.

Result

You understand that scaling is about handling more users or data, and that horizontal scaling adds machines instead of upgrading one.

Knowing the difference between vertical and horizontal scaling helps you choose the right approach for growing systems.

2

FoundationBasics of distributed data storage

3

IntermediateHow MongoDB shards data horizontally

4

IntermediateRole of the query router in scaling

5

IntermediateBalancing data across shards

6

AdvancedHandling failures in horizontal scaling

7

ExpertTrade-offs and challenges of horizontal scaling

Under the Hood

Horizontal scaling in MongoDB works by dividing data into chunks based on a shard key. Each chunk is stored on a shard, which is a replica set of machines holding copies for safety. A query router directs client requests to the correct shards. The system balances chunks by moving them between shards to keep load even. Replication ensures data is copied across machines for fault tolerance. Internally, metadata about chunk locations is stored in config servers, which the query router consults to route queries.

Why designed this way?

MongoDB was designed for horizontal scaling to handle big data and high traffic beyond single machine limits. Early databases focused on vertical scaling but hit hardware limits. Sharding with replica sets balances performance, availability, and fault tolerance. Using a query router hides complexity from clients. Config servers centralize metadata for efficient routing. This design trades some complexity for scalability and reliability.

┌───────────────┐       ┌───────────────┐       ┌───────────────┐
│ Config Server │◄──────┤ Query Router  ├──────►│   Client App  │
└──────┬────────┘       └──────┬────────┘       └───────────────┘
       │                       │
       │                       │
┌──────┴───────┐        ┌──────┴───────┐        ┌──────┴───────┐
│  Shard 1     │        │  Shard 2     │        │  Shard 3     │
│ (Replica Set)│        │ (Replica Set)│        │ (Replica Set)│
└──────────────┘        └──────────────┘        └──────────────┘

Myth Busters - 4 Common Misconceptions

Quick: Does horizontal scaling mean every machine has all the data? Commit yes or no.

Common Belief:Horizontal scaling means copying all data to every machine for safety.

Tap to reveal reality

Quick: Is horizontal scaling always better than vertical scaling? Commit yes or no.

Common Belief:Horizontal scaling is always the best way to improve performance.

Tap to reveal reality

Quick: Does adding more shards always make queries faster? Commit yes or no.

Common Belief:More shards always mean faster queries because work is split more.

Tap to reveal reality

Quick: Can horizontal scaling guarantee immediate consistency across all data? Commit yes or no.

Common Belief:Horizontal scaling always keeps data instantly consistent everywhere.

Tap to reveal reality

Expert Zone

1

Choosing the right shard key is critical; a poor choice can cause uneven data distribution and hotspots.

2

Balancing chunk migrations can impact performance temporarily, so timing and thresholds matter.

3

Replication lag within shards can cause read inconsistencies if not managed carefully.

When NOT to use

Horizontal scaling is not ideal for small datasets or simple applications where vertical scaling suffices. Also, if strong immediate consistency is mandatory, consider single-node or tightly coupled systems instead. Alternatives include vertical scaling, caching layers, or specialized distributed databases designed for strict consistency.

Production Patterns

In production, MongoDB clusters combine sharding with replica sets for fault tolerance. Monitoring tools track shard balance and replication health. Shard keys are chosen based on query patterns to minimize cross-shard queries. Backup and recovery plans consider distributed data. Rolling upgrades and maintenance require careful coordination to avoid downtime.

Connections

Load balancing

Horizontal scaling uses load balancing to distribute requests across machines.

Understanding load balancing helps grasp how horizontal scaling spreads work evenly to avoid overload.

MapReduce (Distributed Computing)

Both split tasks across machines to process large data sets in parallel.

Knowing MapReduce clarifies how horizontal scaling breaks big problems into smaller parts handled by many machines.

Human teamwork in organizations

Horizontal scaling is like dividing work among team members to increase productivity.

Seeing horizontal scaling as teamwork helps understand the need for coordination, communication, and balancing workload.

Common Pitfalls

#1Choosing a shard key that causes uneven data distribution.

Wrong approach:Sharding on a field with few unique values, e.g., sharding all users by country when most users are from one country.

Correct approach:Shard on a field with high cardinality and even distribution, e.g., user ID or hashed value.

Root cause:Misunderstanding that shard keys must evenly split data to avoid hotspots.

#2Querying without considering shard key, causing scatter-gather queries.

Wrong approach:Running queries that do not include the shard key, forcing the router to query all shards.

Correct approach:Include the shard key in queries to target specific shards and improve performance.

Root cause:Not realizing that queries without shard keys reduce horizontal scaling benefits.

#3Ignoring replication lag leading to stale reads.

Wrong approach:Reading from secondary replicas without handling possible delays in data replication.

Correct approach:Configure read preferences carefully and understand replication lag impact.

Root cause:Overlooking that replicas may not be instantly updated, causing inconsistent reads.

Key Takeaways

Horizontal scaling adds more machines to share data and workload, enabling systems to grow beyond single machine limits.

Data is split across machines using shard keys, and a query router directs requests to the right machine, hiding complexity from users.

Replication within shards protects data and ensures availability even if some machines fail.

Choosing the right shard key and understanding query patterns are crucial for effective horizontal scaling.

Horizontal scaling introduces complexity and trade-offs, so it must be used thoughtfully with awareness of its challenges.