How to Choose Shard Key in MongoDB: Best Practices and Examples
Choose a
shard key in MongoDB that evenly distributes data across shards, supports your most common queries, and has high cardinality to avoid hotspots. The key should be immutable and frequently used in queries to maximize performance and scalability.Syntax
The shard key is specified when you enable sharding on a collection using the sh.shardCollection() command. It defines the field or fields MongoDB uses to distribute data across shards.
Syntax:
sh.shardCollection("database.collection", { shardKeyField: 1 })Here, database.collection is the target collection, and shardKeyField is the field used as the shard key. The value 1 means ascending order.
mongodb
sh.shardCollection("mydb.users", { userId: 1 })
Example
This example shows how to shard a collection named orders in the shop database using the orderId field as the shard key. The orderId has high cardinality and is used often in queries.
mongodb
use shop sh.enableSharding("shop") sh.shardCollection("shop.orders", { orderId: 1 })
Output
{ "ok" : 1 }
Common Pitfalls
Choosing a bad shard key can cause uneven data distribution and slow queries. Common mistakes include:
- Using a field with low cardinality (few unique values), causing data hotspots.
- Choosing a shard key that is frequently updated, which is not allowed.
- Picking a shard key not used in queries, leading to scatter-gather queries and poor performance.
Example of a bad shard key (low cardinality):
mongodb
sh.shardCollection("shop.orders", { status: 1 }) // 'status' has few unique values like 'pending', 'shipped'
Quick Reference
| Tip | Explanation |
|---|---|
| High Cardinality | Choose a shard key with many unique values to balance data. |
| Immutable Field | Shard key fields cannot be changed after insertion. |
| Query Support | Use a shard key that appears in most queries for efficiency. |
| Even Distribution | Avoid keys that cause data to cluster on few shards. |
| Avoid Monotonically Increasing Keys | Keys like timestamps can cause write hotspots. |
Key Takeaways
Pick a shard key with high cardinality to spread data evenly across shards.
Use a shard key that your queries frequently filter on to improve performance.
Shard keys must be immutable and cannot be updated after insertion.
Avoid shard keys with low unique values to prevent hotspots and uneven load.
Test shard key choices with your workload to ensure balanced distribution.