Choosing a good shard key in MongoDB - Step-by-Step Execution

Concept Flow - Choosing a good shard key
1. Start: need to shard a collection
2. Analyze data access patterns
3. Identify candidate shard keys
4. Evaluate shard key properties
5. Shard the collection using the chosen key
6. Monitor performance and rebalance if needed
This flow shows how to pick a shard key by analyzing data and queries, then choosing a key that balances data and query load well.
Execution Sample
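// Run in mongosh against the "shop" database.
// Index the fields used by the most common queries.
db.orders.createIndex({ orderDate: 1 })
db.orders.createIndex({ customerId: 1 })
db.orders.createIndex({ region: 1 })

// Enable sharding on the database (required before MongoDB 6.0).
sh.enableSharding("shop")

// Shard on customerId; the customerId index above supports the shard key.
sh.shardCollection("shop.orders", { customerId: 1 })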
This example creates indexes and chooses customerId as the shard key for balanced distribution and query efficiency.
Execution Table
| Step | Action | Evaluation | Result |
|------|--------|------------|--------|
| 1 | Analyze query patterns | Queries mostly filter by customerId | customerId is a good candidate |
| 2 | Check cardinality | customerId has many unique values | High cardinality confirmed |
| 3 | Check data distribution | Orders spread evenly across customers | Even data distribution |
| 4 | Check write frequency | Writes are frequent per customer | Shard key supports write scaling |
| 5 | Choose shard key | customerId meets all criteria | Shard key set to customerId |
| 6 | Shard collection | Run sh.shardCollection command | Collection sharded on customerId |
| 7 | Monitor | Check chunk distribution and query performance | Balanced load observed |
| 8 | Exit | Shard key chosen and applied | Sharding setup complete |
💡 Shard key chosen after confirming it supports balanced data and query load
Variable Tracker
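| Variable | Start | After Step 1 | After Step 2 | After Step 3 | After Step 4 | Final |
|----------|-------|--------------|--------------|--------------|--------------|-------|
| candidateShardKey | none | customerId | customerId | customerId | customerId | customerId |
| cardinality | unknown | unknown | high | high | high | high |
| dataDistribution | unknown | unknown | unknown | even | even | even |
| writeFrequency | unknown | unknown | unknown | unknown | frequent | frequent |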
Key Moments - 3 Insights
Why is high cardinality important for a shard key?
High cardinality means the field has many unique values, so the data can be split into many chunks and spread evenly across shards, as shown in Step 2 of the Execution Table.
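As a rough illustration of the cardinality check, the sketch below counts distinct values per candidate field over a small in-memory sample. The `sampleOrders` array and the `cardinality` helper are hypothetical names invented for this example; on a real deployment the sample would come from the collection itself.

```javascript
// Hypothetical sample of order documents (invented for illustration).
const sampleOrders = [
  { customerId: "c1", region: "EU", orderDate: "2024-01-01" },
  { customerId: "c2", region: "EU", orderDate: "2024-01-01" },
  { customerId: "c3", region: "US", orderDate: "2024-01-02" },
  { customerId: "c4", region: "US", orderDate: "2024-01-02" },
  { customerId: "c1", region: "EU", orderDate: "2024-01-03" },
];

// Count the distinct values of one field across the sample.
function cardinality(docs, field) {
  return new Set(docs.map((doc) => doc[field])).size;
}

// Print the distinct-value count for each candidate shard key.
for (const field of ["customerId", "region", "orderDate"]) {
  console.log(field, cardinality(sampleOrders, field));
}
```

In this sample, customerId has the most distinct values and region the fewest, which mirrors why customerId is the stronger candidate in the walkthrough.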
Can a shard key with low query usage still be good?
Usually not. Queries that include the shard key can be routed to the shard that holds the data, while queries that omit it must be sent to every shard. This is why Step 1 uses query patterns to guide the choice.
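The difference between a routed query and one sent to every shard can be sketched with a deliberately simplified model. The `routeQuery` function is hypothetical and only checks whether the filter names every shard key field; actual routing in MongoDB handles more cases, such as prefixes of compound shard keys.

```javascript
// Simplified model of query routing: a query is "targeted" when its filter
// names every field of the shard key, otherwise it is "broadcast" to all shards.
function routeQuery(filter, shardKey) {
  const keyFields = Object.keys(shardKey);
  const targeted = keyFields.every((field) => field in filter);
  return targeted ? "targeted" : "broadcast";
}

const shardKey = { customerId: 1 };

console.log(routeQuery({ customerId: "c42" }, shardKey)); // targeted
console.log(routeQuery({ region: "EU" }, shardKey)); // broadcast
```

Under this model, a workload that filters mostly by customerId stays targeted, while one that filters mostly by region would touch every shard.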
What happens if data distribution is uneven?
Uneven distribution causes some shards to be overloaded, reducing performance. Step 3 checks for even distribution to avoid this.
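To make the effect of uneven distribution concrete, here is a toy simulation that assigns documents to shards by hashing a key value. Everything in it is an assumption for illustration: `toyHash`, `docsPerShard`, and the generated `toyOrders` data are invented, and MongoDB's real chunk placement and hashing work differently.

```javascript
// Toy string hash; not the hash function MongoDB uses.
function toyHash(value) {
  let h = 0;
  for (const ch of String(value)) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0; // keep an unsigned 32-bit integer
  }
  return h;
}

// Count how many documents land on each shard when placed by `field`.
function docsPerShard(docs, field, shardCount) {
  const counts = new Array(shardCount).fill(0);
  for (const doc of docs) {
    counts[toyHash(doc[field]) % shardCount] += 1;
  }
  return counts;
}

// Hypothetical data: 1000 customers with one order each, in two regions.
const toyOrders = Array.from({ length: 1000 }, (_, i) => ({
  customerId: "c" + i,
  region: i % 2 === 0 ? "EU" : "US",
}));

console.log("by customerId:", docsPerShard(toyOrders, "customerId", 3));
console.log("by region:    ", docsPerShard(toyOrders, "region", 3));
```

With only two region values and three shards, at least one shard receives no documents at all, whereas the many distinct customerId values reach every shard.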
Visual Quiz - 3 Questions
Test your understanding
1. Look at the Execution Table. What is the candidate shard key after Step 1?
A. orderDate
B. customerId
C. region
D. none
💡 Hint
Check the 'Evaluation' column in Step 1 of the Execution Table.
2. At which step does the evaluation confirm high cardinality?
A. Step 1
B. Step 3
C. Step 2
D. Step 4
💡 Hint
Look at the 'Evaluation' column for the step that mentions cardinality.
3. If queries mostly filtered by region instead of customerId, what would change in the Variable Tracker?
A. candidateShardKey would be 'region'
B. cardinality would be 'low'
C. dataDistribution would be 'uneven'
D. writeFrequency would be 'infrequent'
💡 Hint
Refer to the candidateShardKey values in the Variable Tracker and how query patterns affect them.
Concept Snapshot
Choosing a good shard key:
- Analyze query patterns and data
- Pick a key with high cardinality
- Ensure even data distribution
- Use a key that appears frequently in queries
- Shard collection with chosen key
- Monitor and rebalance if needed
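For the monitoring step, one simple signal is how far the busiest shard sits above the average. The sketch below assumes per-shard document counts have already been collected, for example by reading the output of `db.orders.getShardDistribution()`; the `imbalanceRatio` helper and the numbers passed to it are hypothetical.

```javascript
// Ratio of the largest shard to the mean shard size.
// 1 means perfectly even; larger values mean one shard carries more load.
function imbalanceRatio(countsByShard) {
  const counts = Object.values(countsByShard);
  const total = counts.reduce((sum, n) => sum + n, 0);
  if (total === 0) return 1; // no data yet, nothing to balance
  const mean = total / counts.length;
  return Math.max(...counts) / mean;
}

// Hypothetical document counts per shard (invented for illustration).
console.log(imbalanceRatio({ shardA: 500, shardB: 480, shardC: 520 }));
console.log(imbalanceRatio({ shardA: 1200, shardB: 200, shardC: 100 }));
```

A ratio close to 1 suggests the shard key is spreading data well; a ratio that keeps growing is a cue to revisit the key choice.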
Full Transcript
Choosing a good shard key in MongoDB involves analyzing how data is accessed and distributed. First, look at which fields queries use most often. Then check if those fields have many unique values (high cardinality) to spread data evenly. Also, ensure data is balanced across shards to avoid hotspots. Finally, pick a shard key that supports your write and read patterns well. This process helps keep your database fast and scalable.