Pipeline execution order matters in MongoDB - Time & Space Complexity
When using MongoDB pipelines, the order of steps affects how long the query takes.
We want to know how the number of operations changes as the data flows through each step.
Analyze the time complexity of the following MongoDB aggregation pipeline.
db.collection.aggregate([
  { $match: { status: "active" } },
  { $group: { _id: "$category", total: { $sum: "$amount" } } },
  { $sort: { total: -1 } }
])
This pipeline filters active records, groups them by category summing amounts, then sorts by total.
Look at the steps that repeat over the data:
- Primary operation: Scanning documents to filter with $match. Without an index on status, this is a full collection scan.
- How many times: Once for each document in the collection.
- Grouping then processes each matched document once.
- Sorting then orders the groups. For g groups this takes about g log g comparisons, not one step per group.
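The per-stage work listed above can be counted with a small plain-JavaScript model. This is only a sketch: it runs in Node.js without MongoDB, `countPipelineWork` and the sample documents are hypothetical names made up for illustration, and it assumes no index on status. The real server may execute the stages differently.

```javascript
// Plain-JavaScript model of the pipeline; not a MongoDB API.
// countPipelineWork is a hypothetical helper used only for illustration.
function countPipelineWork(docs) {
  const work = { matchChecks: 0, groupUpdates: 0, sortComparisons: 0 };

  // $match: with no index, every document is examined once.
  const matched = docs.filter((doc) => {
    work.matchChecks += 1;
    return doc.status === "active";
  });

  // $group: each matched document updates one accumulator.
  const totals = new Map();
  for (const doc of matched) {
    work.groupUpdates += 1;
    totals.set(doc.category, (totals.get(doc.category) ?? 0) + doc.amount);
  }

  // $sort: order the groups by total, descending, counting comparisons.
  const groups = [...totals].map(([_id, total]) => ({ _id, total }));
  groups.sort((a, b) => {
    work.sortComparisons += 1;
    return b.total - a.total;
  });

  return { groups, work };
}

// Made-up sample data: 4 documents, 3 of them active, 2 categories.
const sampleDocs = [
  { status: "active", category: "books", amount: 10 },
  { status: "inactive", category: "books", amount: 99 },
  { status: "active", category: "toys", amount: 25 },
  { status: "active", category: "books", amount: 5 },
];

const result = countPipelineWork(sampleDocs);
console.log(result.groups); // toys (total 25) first, then books (total 15)
console.log(result.work);   // matchChecks is 4, groupUpdates is 3
```

The match counter follows the collection size, the group counter follows the number of matched documents, and the sort counter depends only on how many groups exist.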
As the number of documents grows, the operations increase roughly like this:
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 document checks, up to 10 group updates, then a sort of a few groups |
| 100 | About 100 document checks, up to 100 group updates, then a sort of more groups |
| 1000 | About 1000 document checks, up to 1000 group updates, then a sort of many groups |
Pattern observation: The first step scans all documents, so operations grow directly with input size.
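Time Complexity: O(n log n) in the worst case
Filtering and grouping each take time proportional to the number of documents, and sorting g groups adds about g log g comparisons. When almost every matched document has its own category, g approaches n and sorting dominates, so time grows a bit faster than the number of documents. With only a few categories, the total stays close to linear in n.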
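The bound can be broken down stage by stage. The following is a rough cost model, not something taken from MongoDB's documentation: the symbols m and g and the constants c_1, c_2, c_3 are introduced here for illustration, and the model assumes no index on status and hash-based grouping. Let n be the number of documents in the collection, m the number that pass $match, and g the number of distinct categories among them.

```latex
T(n) \;\approx\; \underbrace{c_1\, n}_{\text{\$match}}
      \;+\; \underbrace{c_2\, m}_{\text{\$group}}
      \;+\; \underbrace{c_3\, g \log g}_{\text{\$sort}},
\qquad g \le m \le n

\text{Worst case } (g = m = n):\quad T(n) = O(n \log n)

\text{Bounded number of categories } (g \le k \text{ for a constant } k):\quad T(n) = O(n)
```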
[X] Wrong: "The order of pipeline steps does not affect performance."
[OK] Correct: Filtering early reduces the number of documents that later steps must process, making the whole pipeline faster. A $match at the start of the pipeline can also use an index.
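To make the effect of filtering early concrete, the sketch below compares two orderings that produce the same totals. It is a plain-JavaScript model rather than MongoDB code: `matchThenGroup`, `groupThenMatch`, and the generated documents are hypothetical, and the "group first" variant assumes the grouping key is extended with status so the filter can still be applied afterwards.

```javascript
// Filter first: only active documents ever reach the grouping step.
function matchThenGroup(docs) {
  let groupUpdates = 0;
  const totals = {};
  for (const doc of docs) {
    if (doc.status !== "active") continue; // models an early $match
    groupUpdates += 1;
    totals[doc.category] = (totals[doc.category] ?? 0) + doc.amount;
  }
  return { totals, groupUpdates };
}

// Group first: every document is grouped by (status, category),
// and the inactive groups are discarded only at the end.
function groupThenMatch(docs) {
  let groupUpdates = 0;
  const byKey = new Map();
  for (const doc of docs) {
    groupUpdates += 1;
    const key = JSON.stringify([doc.status, doc.category]);
    byKey.set(key, (byKey.get(key) ?? 0) + doc.amount);
  }
  const totals = {};
  for (const [key, total] of byKey) {
    const [status, category] = JSON.parse(key);
    if (status === "active") totals[category] = total; // models a late $match
  }
  return { totals, groupUpdates };
}

// Made-up data: 1000 documents, every 10th one active, 5 categories.
const docs = Array.from({ length: 1000 }, (_, i) => ({
  status: i % 10 === 0 ? "active" : "inactive",
  category: "cat" + (Math.floor(i / 10) % 5),
  amount: 1,
}));

const early = matchThenGroup(docs);
const late = groupThenMatch(docs);
console.log(early.groupUpdates); // 100
console.log(late.groupUpdates);  // 1000
```

Both functions return the same totals, but the second performs ten times as many group updates on this data because nothing is discarded before grouping.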
Understanding how pipeline order affects time helps you write faster queries and shows you think about efficiency.
"What if we moved the $sort step before the $match? How would the time complexity change?"