Range-based sharding in MongoDB - Time & Space Complexity
When using range-based sharding in MongoDB, we want to understand how the time to find and store data changes as the data grows.
We ask: How does the number of operations grow when the amount of data increases?
Analyze the time complexity of a query on a range-sharded collection.
```js
// Assume a collection sharded on a range-based shard key
// Query to find documents in a range of shard key values
db.collection.find({ shardKey: { $gte: 100, $lt: 200 } })
```
This query fetches documents where the shard key is between 100 and 199, using range-based sharding.
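To see how such a query is routed, here is a minimal sketch of range-based chunk routing in Python. The chunk boundaries and shard names (`shardA`, `shardB`, the split point at 150) are hypothetical, chosen only to show that a range query can overlap more than one chunk:

```python
# Toy model of range-based sharding: each chunk covers a half-open
# range [min, max) of shard key values and lives on one shard.
# Boundaries and shard names below are made up for illustration.
chunks = [
    {"min": 0,   "max": 150, "shard": "shardA"},
    {"min": 150, "max": 300, "shard": "shardB"},
]

def shards_for_range(lo, hi):
    """Return the shards whose chunks overlap the query range [lo, hi)."""
    return sorted({c["shard"] for c in chunks if c["min"] < hi and lo < c["max"]})

print(shards_for_range(100, 200))  # the range 100..199 spans both chunks
```

Because the requested range straddles the split point, the query is sent to both shards, and each shard still has to scan its portion of the matching documents.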
Look for repeated work done by the query.
- Primary operation: Scanning documents within the specified range on the relevant shard(s).
- How many times: once per matching document — the number of documents in the range determines how much scanning the query does.
As the range widens or the data grows, the work grows in proportion to the number of matching documents.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | Scan about 10 documents |
| 100 | Scan about 100 documents |
| 1000 | Scan about 1000 documents |
Pattern observation: The work grows linearly with the number of documents in the range.
Time Complexity: O(n)
This means the time to run the query grows directly with how many documents are in the requested range.
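The linear pattern in the table can be checked with a small simulation. This is a toy model, not MongoDB itself: shard key values are assumed to be spread evenly over 0..999, so the fixed range [100, 200) always matches about 10% of the documents:

```python
# Toy model: count how many documents a range scan touches as data grows.
# Keys cycle through 0..999, so the range [100, 200) matches ~10% of docs.
def docs_scanned(n):
    keys = [i % 1000 for i in range(n)]  # stand-in for shard key values
    return sum(1 for k in keys if 100 <= k < 200)

for n in (1000, 10000, 100000):
    print(n, docs_scanned(n))
```

Doubling the data doubles the number of documents scanned for the same query — exactly the O(n) behavior described above.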
[X] Wrong: "Range queries on sharded collections always run in constant time because data is split."
[OK] Correct: Even though the data is split across shards, the query must still scan every matching document in the range, so time grows with the number of matches.
Understanding how range-based sharding affects query time helps you explain real database behavior clearly and confidently.
"What if the query used an equality match on the shard key instead of a range? How would the time complexity change?"