$addToSet accumulator for unique arrays in MongoDB - Time & Space Complexity
When using the $addToSet accumulator in MongoDB, it is important to understand how running time grows as the input data grows: how does collecting unique items into an array scale as more documents are processed?
Analyze the time complexity of the following MongoDB aggregation snippet using $addToSet.
```javascript
db.collection.aggregate([
  { $group: {
      _id: "$category",
      uniqueItems: { $addToSet: "$item" }
  }}
])
```
This groups documents by category and collects unique items into an array for each group.
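To make the grouping concrete, here is a minimal plain-JavaScript sketch of what this pipeline produces, using hypothetical documents with the same `category` and `item` fields (the field values are illustrative, not from any real collection):

```javascript
// Hypothetical input documents matching the pipeline's field names.
const docs = [
  { category: "fruit", item: "apple" },
  { category: "fruit", item: "banana" },
  { category: "fruit", item: "apple" },  // duplicate, collapsed by $addToSet
  { category: "veg",   item: "carrot" },
];

// Group by category, keeping only unique items per group,
// roughly what $group + $addToSet does server-side.
const groups = new Map();
for (const doc of docs) {
  if (!groups.has(doc.category)) groups.set(doc.category, new Set());
  groups.get(doc.category).add(doc.item);
}

const result = [...groups].map(([category, items]) => ({
  _id: category,
  uniqueItems: [...items],
}));

console.log(result);
```

Note that MongoDB does not guarantee any particular order of elements in an $addToSet array; the order here is just an artifact of the JavaScript `Set`.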
Look at what repeats as data grows:
- Primary operation: Checking if an item is already in the unique array before adding.
- How many times: For each document in a group, this check happens once.
As the number of documents in a group grows, the number of checks to keep items unique grows too.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 checks |
| 100 | About 100 checks |
| 1000 | About 1000 checks |
Pattern observation: The number of operations grows roughly in direct proportion to the number of documents.
Time Complexity: O(n), assuming each membership check takes roughly constant time (as with a hash-based set).
This means the time to build the unique array grows linearly with the number of documents processed.
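The linear pattern in the table can be reproduced with a small counting sketch, assuming a Set-based analogue of $addToSet where each document triggers exactly one membership check:

```javascript
// Count membership checks performed by a Set-based $addToSet analogue.
function countChecks(n) {
  const seen = new Set();
  let checks = 0;
  for (let i = 0; i < n; i++) {
    checks++;                   // one "already in the set?" check per document
    seen.add(`item${i % 50}`);  // hypothetical items; only 50 distinct values
  }
  return checks;
}

console.log(countChecks(10));   // 10
console.log(countChecks(100));  // 100
console.log(countChecks(1000)); // 1000
```

Even though only 50 distinct items ever land in the set, the number of checks still tracks the number of documents, not the number of unique values.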
[X] Wrong: "$addToSet adds items instantly without checking duplicates, so time stays constant."
[OK] Correct: The operation must check if each item is already in the set to keep it unique, so time grows as more items are processed.
Understanding how MongoDB handles uniqueness with $addToSet helps you explain data aggregation efficiency clearly and confidently.
"What if we replaced $addToSet with $push and then removed duplicates later? How would the time complexity change?"
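One way to reason about that question is a client-side sketch, assuming duplicates are removed after collection with a hash set (the function name is illustrative):

```javascript
// Sketch of the alternative: collect everything ($push analogue),
// then remove duplicates in a separate pass.
function pushThenDedupe(items) {
  const all = [...items];    // $push: O(n) appends, duplicates kept
  return [...new Set(all)];  // dedupe pass: O(n) with a hash set
}

console.log(pushThenDedupe(["a", "b", "a", "c", "b"])); // ["a", "b", "c"]
```

Total time stays O(n) under these assumptions, but the intermediate array holds every duplicate, so peak memory is higher than with $addToSet, which never stores a duplicate in the first place.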