Why expressions matter in pipelines in MongoDB - Performance Analysis
When using MongoDB pipelines, the way expressions are written affects how long queries take to run.
We want to understand how the choice of expressions changes the work the database does as data grows.
Analyze the time complexity of the following MongoDB aggregation pipeline snippet.
db.orders.aggregate([
  // Keep only orders whose status is "A".
  { $match: { status: "A" } },
  // Compute two derived fields for each matched document.
  { $project: {
      totalPrice: { $multiply: ["$price", "$quantity"] },
      discountPrice: { $subtract: ["$price", "$discount"] }
    }
  }
])
This pipeline keeps only the orders with status "A" and then computes two new fields, totalPrice and discountPrice, for each of them.
Look for work that is repeated for each document passing through a stage.
- Primary operation: Applying expressions ($multiply, $subtract) to each matched document.
- How many times: Once for every document that matches the filter.
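The counting above can be sketched in plain JavaScript. This is a simulation over an in-memory array, not MongoDB's actual execution engine; `runPipeline`, `evaluations`, and the sample orders are hypothetical names and data introduced only for illustration.

```javascript
// Plain-JavaScript simulation of the pipeline; NOT MongoDB's real executor.
// `runPipeline` and `evaluations` are hypothetical names used for illustration.
function runPipeline(orders) {
  let evaluations = 0; // counts every $multiply / $subtract evaluation

  const results = orders
    // $match: keep only documents with status "A"
    .filter((doc) => doc.status === "A")
    // $project: compute both expressions once per matched document
    .map((doc) => {
      evaluations += 2; // one $multiply plus one $subtract
      return {
        _id: doc._id,
        totalPrice: doc.price * doc.quantity,
        discountPrice: doc.price - doc.discount,
      };
    });

  return { results, evaluations };
}

// Made-up sample data: two orders match, one does not.
const sampleOrders = [
  { _id: 1, status: "A", price: 10, quantity: 2, discount: 1 },
  { _id: 2, status: "B", price: 5, quantity: 4, discount: 0 },
  { _id: 3, status: "A", price: 8, quantity: 1, discount: 2 },
];

const { results, evaluations } = runPipeline(sampleOrders);
console.log(results.length, evaluations); // matched documents, expression evaluations
```

Only the two matching orders reach the projection step, so the expressions are evaluated for those two documents and never for the order with status "B".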
As the number of matching documents grows, the database must do more calculations.
| Matching documents (n) | Approx. Operations |
|---|---|
| 10 | About 10 documents projected (2 expressions each) |
| 100 | About 100 documents projected (2 expressions each) |
| 1000 | About 1000 documents projected (2 expressions each) |
Pattern observation: The work grows directly with the number of matching documents.
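To see the linear pattern concretely, the sketch below counts the projection work for the three input sizes in the table. It assumes every generated order matches the filter; `countProjectWork` and the synthetic orders are hypothetical and stand in for the real `$project` stage.

```javascript
// Hypothetical helper: counts the work a $project-like step does for
// n matching documents. A simulation, not a measurement of MongoDB itself.
function countProjectWork(n) {
  // Build n synthetic orders that all match { status: "A" }.
  const matched = Array.from({ length: n }, (_, i) => ({
    status: "A",
    price: i + 1,
    quantity: 2,
    discount: 1,
  }));

  let documentsProjected = 0;
  let expressionEvaluations = 0;
  for (const doc of matched) {
    // The computed values are discarded; only the counts matter here.
    const totalPrice = doc.price * doc.quantity;    // $multiply
    const discountPrice = doc.price - doc.discount; // $subtract
    documentsProjected += 1;
    expressionEvaluations += 2;
  }
  return { documentsProjected, expressionEvaluations };
}

const growth = [10, 100, 1000].map((n) => ({ n, ...countProjectWork(n) }));
for (const row of growth) {
  console.log(row.n, row.documentsProjected, row.expressionEvaluations);
}
```

The number of expression evaluations is always twice the number of documents, so the second expression adds a constant factor without changing how the work scales.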
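Time Complexity: O(n), where n is the number of documents that reach the $project stage.
This means the running time grows linearly with the number of documents processed. $multiply and $subtract each take constant time per document, so adding more expressions like them changes the constant factor, not the growth rate. Note that without an index on status, the $match stage must itself scan every document in the collection, which is also linear, but in the size of the whole collection rather than in the number of matches.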
[X] Wrong: "Expressions inside pipelines run once, so their cost does not grow with data size."
[OK] Correct: Each expression runs for every document that passes through, so more documents mean more work.
Understanding how expressions affect pipeline performance helps you write efficient queries and reason about how they will behave as your data grows.
What if we added a $group stage after $project that aggregates results? How would that change the time complexity?