Pipeline mental model (stages flow) in MongoDB - Time & Space Complexity
When using a MongoDB aggregation pipeline, we want to know how its running time grows as we add more data or more stages.
We ask: how does the pipeline's work increase when the input size or the number of steps grows?
Analyze the time complexity of the following MongoDB aggregation pipeline.
```javascript
db.orders.aggregate([
  { $match: { status: "shipped" } },
  { $group: { _id: "$customerId", total: { $sum: "$amount" } } },
  { $sort: { total: -1 } },
  { $limit: 5 }
])
```
This pipeline filters shipped orders, groups by customer to sum amounts, sorts totals descending, and limits to top 5.
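To make the stage-by-stage flow concrete, here is a minimal in-memory sketch of the same four stages using plain JavaScript array operations on hypothetical sample data (the field names match the pipeline above; the data values are invented for illustration):

```javascript
// Hypothetical sample data, standing in for the orders collection.
const orders = [
  { customerId: "a", status: "shipped", amount: 50 },
  { customerId: "b", status: "shipped", amount: 30 },
  { customerId: "a", status: "pending", amount: 99 },
  { customerId: "b", status: "shipped", amount: 70 },
  { customerId: "c", status: "shipped", amount: 20 },
];

// $match: keep only shipped orders (one linear pass over all documents)
const shipped = orders.filter(o => o.status === "shipped");

// $group: sum amounts per customerId (one linear pass, hash-map accumulation)
const totals = new Map();
for (const o of shipped) {
  totals.set(o.customerId, (totals.get(o.customerId) ?? 0) + o.amount);
}

// $sort + $limit: order the grouped results descending, keep the top 5
const top5 = [...totals]
  .map(([_id, total]) => ({ _id, total }))
  .sort((x, y) => y.total - x.total)
  .slice(0, 5);

console.log(top5);
// → [ { _id: "b", total: 100 }, { _id: "a", total: 50 }, { _id: "c", total: 20 } ]
```

Note that each document flows through the stages in sequence, and the sort runs over the grouped results (one entry per customer), not over the original documents.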
Look for repeated work inside the pipeline.
- Primary operation: Scanning all documents to filter and group.
- How many times: Each document is processed once through each stage in order.
As the number of orders grows, the pipeline must check and group more documents.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 document checks and groupings |
| 100 | About 100 document checks and groupings |
| 1000 | About 1000 document checks and groupings |
Pattern observation: The work grows roughly in direct proportion to the number of documents.
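The linear pattern in the table can be checked directly by instrumenting each stage with a touch counter. This is a hedged sketch (the `runPipeline` and `make` helpers are invented for illustration, not MongoDB APIs):

```javascript
// Count how many documents each stage touches for a given input size.
function runPipeline(orders) {
  const touches = { match: 0, group: 0, sort: 0 };

  // $match equivalent: every document is checked once
  const shipped = orders.filter(o => {
    touches.match++;
    return o.status === "shipped";
  });

  // $group equivalent: every surviving document is accumulated once
  const totals = new Map();
  for (const o of shipped) {
    touches.group++;
    totals.set(o.customerId, (totals.get(o.customerId) ?? 0) + o.amount);
  }

  // $sort operates on the grouped documents, not on all n inputs
  const groups = [...totals].map(([_id, total]) => ({ _id, total }));
  touches.sort = groups.length;
  groups.sort((a, b) => b.total - a.total);

  return { top5: groups.slice(0, 5), touches };
}

// Synthetic input: n shipped orders spread across 10 customers.
const make = n => Array.from({ length: n }, (_, i) =>
  ({ customerId: `c${i % 10}`, status: "shipped", amount: 1 }));

console.log(runPipeline(make(100)).touches); // → { match: 100, group: 100, sort: 10 }
console.log(runPipeline(make(200)).touches); // → { match: 200, group: 200, sort: 10 }
```

Doubling the input doubles the match and group work, while the sort input stays at the number of distinct customers.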
Time Complexity: O(n + k log k), where n is the number of orders and k is the number of distinct customers.
The $match and $group stages each make one linear pass over the n documents. The $sort stage then orders only the k grouped documents, adding a k log k term; in the worst case where every order has a unique customer (k = n), this degrades to O(n log n). In practice k is usually much smaller than n, so the work grows roughly linearly with input size. MongoDB also coalesces a $sort followed by a $limit into a top-k selection, so only the 5 best results need to be held in memory during the sort.
[X] Wrong: "Adding more stages always multiplies the time by the number of stages."
[OK] Correct: Each document passes through stages one after another, so time grows mostly with input size, not simply by multiplying stages.
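The correct intuition above can be demonstrated with a toy two-stage pipeline: each added stage contributes roughly one more pass over the (possibly shrinking) data, so total work is about n times the stage count, not n raised to the stage count. The `count` wrapper here is an invented helper for illustration:

```javascript
const n = 1000;
const docs = Array.from({ length: n }, (_, i) => ({ v: i }));

// Wrap a stage function so every document it touches increments a counter.
let passes = 0;
const count = f => x => { passes++; return f(x); };

docs
  .filter(count(d => d.v % 2 === 0))    // stage 1: touches all 1000 docs
  .map(count(d => ({ v: d.v * 2 })));   // stage 2: touches the 500 survivors

console.log(passes); // → 1500: two stages cost ~1.5n touches, nowhere near n²
```

Adding a third stage would add at most another n touches; the stage count is a constant factor, not a multiplier on the input size.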
Understanding how pipeline stages process data step-by-step helps you explain performance clearly and shows you know how MongoDB handles data flow.
"What if we added a $lookup stage that joins with another collection? How would the time complexity change?"