0
0
MongodbHow-ToBeginner · 4 min read

How to Choose Shard Key in MongoDB: Best Practices and Examples

Choose a shard key in MongoDB that evenly distributes data across shards, supports your most common queries, and has high cardinality to avoid hotspots. The key should be immutable and frequently used in queries to maximize performance and scalability.
📐

Syntax

The shard key is specified when you enable sharding on a collection using the sh.shardCollection() command. It defines the field or fields MongoDB uses to distribute data across shards.

Syntax:

sh.shardCollection("database.collection", { shardKeyField: 1 })

Here, database.collection is the target collection, and shardKeyField is the field used as the shard key. The value 1 means ascending order.

mongodb
sh.shardCollection("mydb.users", { userId: 1 })
💻

Example

This example shows how to shard a collection named orders in the shop database using the orderId field as the shard key. The orderId has high cardinality and is used often in queries.

mongodb
use shop
sh.enableSharding("shop")
sh.shardCollection("shop.orders", { orderId: 1 })
Output
{ "ok" : 1 }
⚠️

Common Pitfalls

Choosing a bad shard key can cause uneven data distribution and slow queries. Common mistakes include:

  • Using a field with low cardinality (few unique values), causing data hotspots.
  • Choosing a shard key that is frequently updated, which is not allowed.
  • Picking a shard key not used in queries, leading to scatter-gather queries and poor performance.

Example of a bad shard key (low cardinality):

mongodb
sh.shardCollection("shop.orders", { status: 1 })  // 'status' has few unique values like 'pending', 'shipped'
📊

Quick Reference

TipExplanation
High CardinalityChoose a shard key with many unique values to balance data.
Immutable FieldShard key fields cannot be changed after insertion.
Query SupportUse a shard key that appears in most queries for efficiency.
Even DistributionAvoid keys that cause data to cluster on few shards.
Avoid Monotonically Increasing KeysKeys like timestamps can cause write hotspots.

Key Takeaways

Pick a shard key with high cardinality to spread data evenly across shards.
Use a shard key that your queries frequently filter on to improve performance.
Shard keys must be immutable and cannot be updated after insertion.
Avoid shard keys with low unique values to prevent hotspots and uneven load.
Test shard key choices with your workload to ensure balanced distribution.