
$lookup with pipeline (advanced join) in MongoDB - Time & Space Complexity

Time Complexity of $lookup with pipeline (advanced join): O(n x m)

Understanding Time Complexity

When using $lookup with a pipeline in MongoDB, we want to understand how the time it takes grows as the data gets bigger.

We ask: How does the number of operations change when joining collections with a pipeline?

Scenario Under Consideration

Analyze the time complexity of the following code snippet.


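db.orders.aggregate([
  {
    $lookup: {
      from: "products",
      // Expose each order's item field to the inner pipeline as $$order_item.
      let: { order_item: "$item" },
      pipeline: [
        // Keep only products whose name equals the current order's item.
        { $match: { $expr: { $eq: ["$name", "$$order_item"] } } },
        // Return only the price of each matching product.
        { $project: { price: 1, _id: 0 } }
      ],
      // Matching products are stored as an array in productDetails.
      as: "productDetails"
    }
  }
])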

This code joins the orders collection with products using a pipeline to match and project product details for each order item.

Identify Repeating Operations

Identify the loops, recursion, and array traversals that repeat.

  • Primary operation: For each document in orders, the pipeline runs a $match on products.
  • How many times: Once per orders document, so the total cost is the number of orders documents multiplied by the cost of one match against products.
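
The repeated work can be pictured with a small in-memory model. The sketch below is plain JavaScript, not MongoDB's actual execution engine. It assumes every lookup scans all of products (no usable index on name), and the function name simulateLookup and the sample data are hypothetical, chosen only for illustration.

```javascript
// Hypothetical in-memory model of the $lookup above (not MongoDB internals).
// Assumes every lookup scans all of products, i.e. no usable index on name.
function simulateLookup(orders, products) {
  let comparisons = 0;
  const results = orders.map((order) => {
    const productDetails = [];
    // Inner pipeline: runs once per orders document.
    for (const product of products) {
      comparisons += 1; // one equality check of name against the order's item
      if (product.name === order.item) {
        productDetails.push({ price: product.price }); // keep price only
      }
    }
    return { ...order, productDetails };
  });
  return { results, comparisons };
}

// Illustrative sample data (not from any real collection).
const sampleOrders = [{ item: "pen" }, { item: "ink" }, { item: "pen" }];
const sampleProducts = [
  { name: "pen", price: 2 },
  { name: "ink", price: 5 },
];

const outcome = simulateLookup(sampleOrders, sampleProducts);
console.log(outcome.comparisons); // 3 orders x 2 products = 6
```

In this model the outer map plays the role of the orders scan and the inner loop plays the role of the $match stage, so the comparison counter is the quantity the analysis tracks.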

How Execution Grows With Input

As the number of orders grows, the pipeline runs more times. Each run searches products for matches.

Input Size (orders)    Approx. Operations
10                     10 x cost of matching in products
100                    100 x cost of matching in products
1000                   1000 x cost of matching in products

Pattern observation: With the size of products held fixed, the total work grows roughly in direct proportion to the number of orders documents.

Final Time Complexity

Time Complexity: O(n x m)

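This means the running time grows with the number of orders documents (n) multiplied by the number of products documents (m) scanned per lookup. The bound assumes each lookup scans the whole products collection, which is the case when no index on products.name can be used.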
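
To make the n x m product concrete, here is a minimal counting sketch in plain JavaScript. It is a simplified cost model rather than a measurement of MongoDB: the function countComparisons and the value chosen for productCount are hypothetical, and it assumes one comparison per products document on every lookup.

```javascript
// Hypothetical cost model: each of n lookups scans all m products documents.
function countComparisons(n, m) {
  let count = 0;
  for (let i = 0; i < n; i++) {   // one pipeline run per orders document
    for (let j = 0; j < m; j++) { // one comparison per products document
      count += 1;
    }
  }
  return count;
}

const productCount = 50; // illustrative value for m
for (const orderCount of [10, 100, 1000]) {
  console.log(orderCount, countComparisons(orderCount, productCount));
}
// prints: 10 500, then 100 5000, then 1000 50000
```

Doubling either input doubles the count in this model: countComparisons(10, 100) returns 1000, twice the 500 obtained with 50 products.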

Common Mistake

[X] Wrong: "The $lookup with pipeline runs just once regardless of input size."

[OK] Correct: Actually, the pipeline runs once for each orders document, so the work multiplies with input size.

Interview Connect

Understanding how $lookup with pipelines scales helps you explain data joining costs clearly, a useful skill when discussing database performance.

Self-Check

What if we added an index on the products.name field used in $match? How would the time complexity change?