
Pipeline mental model (stages flow) in MongoDB - Step-by-Step Execution

Concept Flow - Pipeline mental model (stages flow)
Input Documents -> Stage 1: $match -> Stage 2: $group -> Stage 3: $sort -> Stage 4: $project -> Output Documents
Documents flow through each pipeline stage in order, transforming data step-by-step until the final output is produced.
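The flow described above can be sketched in plain JavaScript, with no database involved. This is only a simulation of the mental model, not how MongoDB executes pipelines internally. The `runPipeline` helper and the two stand-in stage functions are hypothetical names introduced for illustration.

```javascript
// Simulation only: each "stage" is a plain function that takes an array of
// documents and returns a new array. The output of one stage is the input
// of the next, which is the ordering the pipeline mental model describes.
function runPipeline(docs, stages) {
  const trace = [];
  let current = docs;
  for (const [name, stage] of stages) {
    current = stage(current);
    trace.push({ stage: name, count: current.length });
  }
  return { result: current, trace };
}

const sales = [
  { cust_id: 1, status: "A", amount: 100 },
  { cust_id: 2, status: "B", amount: 200 },
  { cust_id: 1, status: "A", amount: 50 },
];

// Hypothetical stand-ins for a filtering stage and an ordering stage.
const stages = [
  ["match", (docs) => docs.filter((d) => d.status === "A")],
  ["sort", (docs) => [...docs].sort((a, b) => b.amount - a.amount)],
];

const { result, trace } = runPipeline(sales, stages);
console.log(trace);  // number of documents leaving each stage
console.log(result); // documents after the last stage
```

Each stage returns a new array instead of changing its input, so the original `sales` array is untouched after the run.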
Execution Sample
MongoDB
db.sales.aggregate([
  { $match: { status: "A" } },
  { $group: { _id: "$cust_id", total: { $sum: "$amount" } } },
  { $sort: { total: -1 } },
  { $project: { total: 1, _id: 0 } }
])
This pipeline filters sales with status 'A', groups by customer ID summing amounts, sorts by total descending, then projects only the total.
Execution Table
Step | Stage | Input Documents | Action | Output Documents
1 | $match | [{cust_id:1, status:'A', amount:100}, {cust_id:2, status:'B', amount:200}, {cust_id:1, status:'A', amount:50}] | Filter documents where status = 'A' | [{cust_id:1, status:'A', amount:100}, {cust_id:1, status:'A', amount:50}]
2 | $group | [{cust_id:1, status:'A', amount:100}, {cust_id:1, status:'A', amount:50}] | Group by cust_id and sum amounts | [{_id:1, total:150}]
3 | $sort | [{_id:1, total:150}] | Sort by total descending (only one document, no change) | [{_id:1, total:150}]
4 | $project | [{_id:1, total:150}] | Include only total field, exclude _id | [{total:150}]
5 | End | [{total:150}] | Pipeline complete, output final documents | [{total:150}]
💡 All stages processed, pipeline ends with final transformed documents.
Variable Tracker
Variable | Start | After 1 | After 2 | After 3 | After 4 | Final
documents | [{cust_id:1, status:'A', amount:100}, {cust_id:2, status:'B', amount:200}, {cust_id:1, status:'A', amount:50}] | [{cust_id:1, status:'A', amount:100}, {cust_id:1, status:'A', amount:50}] | [{_id:1, total:150}] | [{_id:1, total:150}] | [{total:150}] | [{total:150}]
Key Moments - 3 Insights
Why does the $match stage reduce the number of documents?
Because $match filters documents by a condition (status = 'A'), only documents that satisfy it pass to the next stage, as shown in row 1 of the Execution Table.
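As a rough illustration of this filtering, the sketch below mimics `$match` in plain JavaScript. The `match` function is a hypothetical helper, and it assumes exact-equality conditions only; real `$match` also supports query operators, which this sketch ignores.

```javascript
// Simplified stand-in for $match: handles only exact-equality conditions
// such as { status: "A" }. Query operators are deliberately not handled.
function match(docs, query) {
  return docs.filter((doc) =>
    Object.entries(query).every(([field, value]) => doc[field] === value)
  );
}

const input = [
  { cust_id: 1, status: "A", amount: 100 },
  { cust_id: 2, status: "B", amount: 200 },
  { cust_id: 1, status: "A", amount: 50 },
];

const matched = match(input, { status: "A" });
console.log(input.length, "->", matched.length); // 3 -> 2
```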
How does $group change the shape of documents?
$group aggregates documents by a key (cust_id) and computes sums. It emits new documents that contain only the _id and total fields, so the original fields are gone (see row 2 of the Execution Table).
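The change of shape is easier to see in a small plain-JavaScript sketch. The `groupAndSum` function and its parameter names are hypothetical, and it assumes a single grouping key with a single summed field, as in the sample pipeline.

```javascript
// Simplified stand-in for:
//   { $group: { _id: "$cust_id", total: { $sum: "$amount" } } }
// keyField and sumField are illustrative parameter names.
function groupAndSum(docs, keyField, sumField) {
  const totals = new Map();
  for (const doc of docs) {
    const key = doc[keyField];
    totals.set(key, (totals.get(key) ?? 0) + doc[sumField]);
  }
  // Each output document has only _id and total; cust_id, status, and
  // amount from the input documents do not carry over.
  return [...totals].map(([key, total]) => ({ _id: key, total }));
}

const matched = [
  { cust_id: 1, status: "A", amount: 100 },
  { cust_id: 1, status: "A", amount: 50 },
];

const grouped = groupAndSum(matched, "cust_id", "amount");
console.log(grouped); // [ { _id: 1, total: 150 } ]
```

Two input documents collapse into one output document because they share the same key.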
Why does $project exclude the _id field in the final output?
$project controls which fields appear in output documents. The _id field is included by default, so setting _id: 0 is what removes it and leaves only total, as shown in row 4 of the Execution Table.
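A small plain-JavaScript sketch can show the effect of `_id: 0`. The `project` function is a hypothetical helper that assumes an inclusion-style specification (fields set to 1); it mirrors MongoDB's rule that `_id` is kept unless it is explicitly excluded, and nothing more.

```javascript
// Simplified stand-in for an inclusion-style $project: fields set to 1 are
// kept, and _id is kept unless the spec sets _id: 0.
function project(docs, spec) {
  return docs.map((doc) => {
    const out = {};
    if (spec._id !== 0 && "_id" in doc) out._id = doc._id;
    for (const [field, flag] of Object.entries(spec)) {
      if (field !== "_id" && flag === 1 && field in doc) out[field] = doc[field];
    }
    return out;
  });
}

const groupedDocs = [{ _id: 1, total: 150 }];

const withoutId = project(groupedDocs, { total: 1, _id: 0 });
const withId = project(groupedDocs, { total: 1 });

console.log(withoutId); // [ { total: 150 } ]
console.log(withId);    // [ { _id: 1, total: 150 } ]
```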
Visual Quiz - 3 Questions
Test your understanding
In the Execution Table at step 2, what is the total amount for cust_id 1?
A. 150
B. 100
C. 50
D. 200
💡 Hint
Check the Output Documents column at step 2 in the Execution Table.
At which step does the number of documents reduce from 3 to 2?
A. Step 3 ($sort)
B. Step 2 ($group)
C. Step 1 ($match)
D. Step 4 ($project)
💡 Hint
Look at the Input Documents and Output Documents columns in rows 1 and 2 of the Execution Table.
If we remove the $project stage, what would the final output documents include?
A. Only total field
B. Both _id and total fields
C. Only _id field
D. No fields
💡 Hint
Compare the output documents at steps 3 and 4 in the Execution Table; step 3 shows the documents before $project is applied.
Concept Snapshot
MongoDB aggregation pipeline processes documents through ordered stages.
Each stage transforms documents (filter, group, sort, project).
Documents flow from input to output step-by-step.
For example, $match filters, $group aggregates, $sort orders, and $project shapes the output.
Final output is the result after all stages run.
Full Transcript
The MongoDB aggregation pipeline works by passing documents through a series of stages. Each stage takes the documents from the previous stage, applies a transformation, and passes the results to the next stage. For example, $match filters documents by a condition, reducing the number of documents. Then $group aggregates documents by a key and computes sums, changing the document structure. $sort orders the documents, and $project selects which fields to keep in the output. This flow continues until all stages are processed, producing the final output documents. The execution table shows each step's input, action, and output, helping visualize how documents change through the pipeline.