Imagine you have a huge collection of documents in MongoDB. You want to split this data across many servers to keep things fast and balanced. Why do you need a shard key for this?
Think about how data is divided and found in many places.
The shard key determines how MongoDB divides data across servers. It helps keep data balanced and makes queries efficient by knowing where to look.
You have a collection where the shard key is a field with only two possible values. What is the likely result when you shard the collection on this key?
Think about how many different values the shard key has and how data spreads.
Low cardinality means few distinct values. This causes data to cluster on few shards, leading to imbalance and slower performance.
You want to shard a collection storing user activity logs. Each document has a userId and a timestamp. Which shard key is best to balance load and support queries filtering by user and time?
db.activity.createIndex({ userId: 1, timestamp: 1 })Consider which key helps distribute data and supports common queries.
Using a compound shard key with userId and timestamp balances data by user and supports queries filtering by user and time range.
You notice one shard is overloaded because many writes target the same shard key value. What is a good way to avoid this hotspotting?
Think about spreading writes evenly across shards.
High cardinality and randomness in the shard key help distribute writes evenly, preventing one shard from becoming a hotspot.
A collection is sharded on the country field. Most documents have country set to 'USA'. What problem arises and why?
Think about how many documents share the same shard key value.
When most documents share the same shard key value, data clusters on one shard. This causes imbalance and reduces performance.