In MongoDB, choosing the right shard key affects how data is distributed and accessed. Which of the following best explains why selecting an appropriate shard key is crucial?
Think about how data is split and accessed in a distributed system.
The shard key defines how MongoDB distributes data across shards. A good shard key ensures even data distribution and efficient queries. Other options relate to different database features.
Consider a MongoDB collection sharded on the field userId. If queries mostly filter by userId, what is the expected effect on query performance?
Think about how shard keys help route queries.
When queries filter by the shard key, MongoDB routes the query to the relevant shard(s), improving performance. Scanning all shards happens if the shard key is not used in the query filter.
Which MongoDB command correctly creates a sharded collection with orderId as the shard key?
sh.enableSharding("salesDB"); sh.shardCollection("salesDB.orders", {orderId: 1});
Look for the correct syntax for specifying shard keys as a document with field and direction.
The shard key must be specified as a document with the field name and sort order (1 for ascending). Option D uses the correct syntax. Option D uses a string instead of a document. Option D is invalid syntax. Option D uses a string instead of 1 or -1.
You have a collection with a status field that has only three possible values: 'new', 'processing', and 'done'. You want to shard this collection. Which shard key choice will likely lead to better data distribution?
Think about how many unique values the shard key has and how that affects distribution.
Sharding on status alone causes only three chunks, leading to uneven distribution. Adding createdAt creates many unique shard key values, improving balance. Sharding on createdAt alone can cause hotspotting if data is inserted in order. Sharding on a constant value is invalid.
A collection is sharded on the timestamp field, which is always increasing. Over time, queries slow down and one shard becomes overloaded. What is the main reason for this problem?
Consider how data is distributed when the shard key values increase over time.
Using a monotonically increasing shard key like timestamp causes all new data to be routed to the same shard, creating a hotspot and unbalanced load. This slows down queries and affects cluster performance. MongoDB supports compound keys and indexing is automatic on shard keys.