Ranged Sharding in MongoDB: What It Is and How It Works

In MongoDB, ranged sharding is a method of distributing data across multiple shards by dividing it into contiguous ranges of shard key values. Documents with nearby shard key values are likely to be stored on the same shard, which helps with range queries and ordered data access.

⚙️

How It Works

Imagine you have a big book and want to share it with friends so everyone can read different chapters at the same time. Instead of cutting pages randomly, you split the book into chapters in order. This is like ranged sharding in MongoDB, where data is split into continuous ranges based on shard key values.

Each shard holds a range of data, for example, all records where the key is between 1 and 1000 go to shard A, 1001 to 2000 go to shard B, and so on. This way, queries that look for data in a certain range can quickly find the right shard without searching everywhere.
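The routing idea described above can be sketched in plain JavaScript. This is only an illustration, not MongoDB's internal implementation: the `ranges` table, the shard names, and the helpers `shardForKey` and `shardsForRange` are hypothetical, and `-Infinity`/`Infinity` stand in for the lowest and highest possible key values. Each range is treated as including its lower bound and excluding its upper bound.

```javascript
// Hypothetical range table: each entry covers keys in [min, max).
const ranges = [
  { min: -Infinity, max: 1001, shard: "shardA" },
  { min: 1001, max: 2001, shard: "shardB" },
  { min: 2001, max: Infinity, shard: "shardC" },
];

// Return the shard whose range contains the given shard key value.
function shardForKey(key) {
  const match = ranges.find((r) => key >= r.min && key < r.max);
  return match.shard;
}

// Return every shard whose range overlaps the inclusive query [low, high].
function shardsForRange(low, high) {
  return ranges
    .filter((r) => low < r.max && high >= r.min)
    .map((r) => r.shard);
}

console.log(shardForKey(500));          // shardA
console.log(shardsForRange(900, 1200)); // [ 'shardA', 'shardB' ]
```

In this sketch, a query for keys 900 to 1200 only needs to visit two of the three shards, which is the benefit ranged sharding aims for.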

Ranged sharding is good when your queries often ask for data in order or within specific ranges, like dates or numbers.

💻

Example

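This example shows how to enable ranged sharding on a MongoDB collection. Passing { age: 1 } as the shard key tells MongoDB to use ranged sharding on the age field. Run it in the shell while connected to a sharded cluster.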

javascript

sh.enableSharding("myDatabase")

// Shard the collection using a ranged shard key on the field 'age'
sh.shardCollection("myDatabase.users", { age: 1 })

use myDatabase

// Insert sample documents (insertOne replaces the deprecated insert helper)
for (let i = 1; i <= 5; i++) {
  db.users.insertOne({ name: "User" + i, age: i * 10 })
}

// Query documents where age is between 10 and 30
db.users.find({ age: { $gte: 10, $lte: 30 } })

Output

{ "_id" : ObjectId(...), "name" : "User1", "age" : 10 }
{ "_id" : ObjectId(...), "name" : "User2", "age" : 20 }
{ "_id" : ObjectId(...), "name" : "User3", "age" : 30 }

🎯

When to Use

Use ranged sharding when your application often queries data in ranges or sorted order, such as time series data, user ages, or ordered IDs. It helps improve query speed by directing range queries to specific shards.

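However, ranged sharding can distribute data unevenly. If many documents share a narrow band of shard key values, the shard that owns that range holds far more data than the others. If the shard key increases monotonically, as timestamps and auto-incrementing IDs do, every new insert goes to the shard that owns the highest range, which turns it into a write hot spot. In those cases, hashed sharding may be a better fit, at the cost of range queries on the shard key being sent to all shards.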
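The imbalance described above can be illustrated with a small simulation in plain JavaScript. It is a sketch under stated assumptions: the range boundaries and shard names are hypothetical, and `hashedShard` uses a simple multiplicative hash as a stand-in, not the hash function MongoDB actually uses. It inserts 100 documents with steadily increasing keys and counts where they land under each strategy.

```javascript
const shards = ["shardA", "shardB", "shardC"];

// Ranged routing with hypothetical boundaries (lower bound inclusive).
function rangedShard(key) {
  if (key < 1001) return "shardA";
  if (key < 2001) return "shardB";
  return "shardC";
}

// Hashed routing: a simple multiplicative hash, used only as a stand-in.
function hashedShard(key) {
  const hash = (key * 2654435761) % 2 ** 32;
  return shards[hash % shards.length];
}

// Count how many keys each shard receives under a given routing function.
function countByShard(keys, route) {
  const counts = { shardA: 0, shardB: 0, shardC: 0 };
  for (const key of keys) counts[route(key)] += 1;
  return counts;
}

// 100 new documents whose keys keep increasing: 2001, 2002, ..., 2100.
const newKeys = Array.from({ length: 100 }, (_, i) => 2001 + i);

const rangedCounts = countByShard(newKeys, rangedShard);
const hashedCounts = countByShard(newKeys, hashedShard);

console.log(rangedCounts); // every insert lands on shardC
console.log(hashedCounts); // inserts are spread across all three shards
```

With ranged routing, the shard that owns the top range absorbs all of the new writes, while the hashed variant spreads them out. The trade-off is that neighbouring keys no longer sit together, so a range query has to ask every shard.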

✅

Key Points

Ranged sharding splits data into continuous ranges based on shard key values.
It groups similar data together, which is good for range queries.
It can cause uneven data distribution if the ranges are not balanced.
Best for ordered or range-based queries like dates or numbers.

✅

Key Takeaways

Ranged sharding divides data into continuous ranges by shard key values to improve range query performance.

It stores similar data together on the same shard, making ordered queries faster.

Use ranged sharding when your queries often request data within specific ranges.

Be cautious of uneven data distribution which can cause some shards to become overloaded.

If your queries mostly look up individual values rather than ranges, or your shard key grows steadily, consider other sharding methods like hashed sharding.