Computed pattern for pre-aggregation in MongoDB - Time & Space Complexity
When we use the computed pattern for pre-aggregation in MongoDB, we want to know how the work grows as the data grows. The key question: how does the time to compute aggregated results change as more data is added?
Analyze the time complexity of the following MongoDB aggregation pipeline.
```javascript
db.orders.aggregate([
  { $match: { status: "completed" } },
  { $group: { _id: "$customerId", totalSpent: { $sum: "$amount" } } },
  { $project: { customerId: "$_id", totalSpent: 1, _id: 0 } }
])
```
This pipeline filters completed orders, groups them by customer, and sums the amount spent per customer.
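To see why this is a single pass, here is a minimal plain-JavaScript sketch of the same logic. The `orders` array is a hypothetical stand-in for the collection; the loop plays the role of `$match`, `$group`, and `$project` combined.

```javascript
// Hypothetical sample data standing in for the orders collection.
const orders = [
  { customerId: "a", status: "completed", amount: 10 },
  { customerId: "b", status: "pending",   amount: 99 },
  { customerId: "a", status: "completed", amount: 5 },
];

// One pass over all documents: this loop is the source of the O(n) cost.
const totals = {};
for (const order of orders) {
  if (order.status !== "completed") continue;                 // $match
  totals[order.customerId] =
    (totals[order.customerId] || 0) + order.amount;           // $group + $sum
}

// Reshape the accumulator into the $project output shape.
const result = Object.entries(totals).map(([customerId, totalSpent]) => ({
  customerId,
  totalSpent,
}));

console.log(result); // [ { customerId: 'a', totalSpent: 15 } ]
```

Each document is read exactly once; there is no nested loop over the data, which is why the cost scales with the number of orders rather than, say, orders × customers.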
Look for repeated work inside the pipeline.
- Primary operation: Scanning each order document once to filter and group.
- How many times: Once per document in the orders collection.
As the number of orders grows, the pipeline processes each order once.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 document checks and group updates |
| 100 | About 100 document checks and group updates |
| 1000 | About 1000 document checks and group updates |
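The table above can be checked directly by counting document touches. This sketch uses a hypothetical `makeOrders` generator; the counter increments once per document, matching the "about n operations" pattern.

```javascript
// Hypothetical data generator: n orders spread over 5 customers,
// with roughly half the orders completed.
function makeOrders(n) {
  return Array.from({ length: n }, (_, i) => ({
    customerId: "c" + (i % 5),
    status: i % 2 === 0 ? "completed" : "pending",
    amount: 1,
  }));
}

// Same single-pass aggregation as before, but counting operations.
function aggregateCountingOps(orders) {
  let ops = 0;
  const totals = {};
  for (const order of orders) {
    ops++; // every document is examined exactly once
    if (order.status !== "completed") continue;
    totals[order.customerId] = (totals[order.customerId] || 0) + order.amount;
  }
  return ops;
}

for (const n of [10, 100, 1000]) {
  console.log(n, aggregateCountingOps(makeOrders(n))); // ops === n
}
```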
Pattern observation: The work grows roughly in direct proportion to the number of orders.
Time Complexity: O(n)
Space Complexity: O(k), where k is the number of distinct customers, because $group must hold one accumulator per group in memory.
This means the time to compute the aggregation grows linearly with the number of input documents, while memory grows with the number of groups.
[X] Wrong: "Grouping by customer means the time depends on the number of customers only."
[OK] Correct: The pipeline must still scan every order document to group them, so time depends on total orders, not just customers.
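This repeated O(n) scan is exactly what the computed pattern avoids: instead of re-aggregating on every read, keep a running total that is updated once per write. Below is a minimal plain-JavaScript sketch of the idea; the function names are illustrative, not MongoDB APIs.

```javascript
// Computed-pattern sketch: maintain the aggregate at write time.
const totalsByCustomer = new Map();

// Called once per new order: O(1) extra work per write.
function recordOrder(order) {
  if (order.status !== "completed") return;
  const prev = totalsByCustomer.get(order.customerId) || 0;
  totalsByCustomer.set(order.customerId, prev + order.amount);
}

// Reads are now O(1) lookups instead of O(n) scans.
function totalSpent(customerId) {
  return totalsByCustomer.get(customerId) || 0;
}

recordOrder({ customerId: "a", status: "completed", amount: 10 });
recordOrder({ customerId: "a", status: "completed", amount: 5 });
console.log(totalSpent("a")); // 15
```

In MongoDB this is typically implemented by updating a summary document when the order is inserted, for example with `updateOne` and `$inc` (using `upsert: true`), trading a small constant cost per write for constant-time reads.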
Understanding how aggregation scales helps you explain data processing costs clearly and shows you can reason about real database workloads.
"What if we added an index on the status field used in $match? How would that affect the time complexity?"
(Hint: an index on status lets $match fetch only the completed orders instead of scanning the whole collection, so the work becomes proportional to m, the number of matching documents, rather than n, the total number of orders — still linear, just in a smaller quantity.)