Joins vs embedding decision in MongoDB - Performance Comparison
When working with MongoDB, choosing between joins and embedding affects how fast queries run.
We want to understand how the time to get data changes as the data grows.
Analyze the time complexity of these two ways to get related data.
// Using embedding
db.orders.find({ _id: orderId })
// Using join (lookup)
db.orders.aggregate([
  { $match: { _id: orderId } },
  { $lookup: {
      from: 'products',
      localField: 'productIds',
      foreignField: '_id',
      as: 'products'
  }}
])
The first query returns the order with its products already embedded in it. The second joins the orders collection with the products collection to pull in the related product documents.
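To make the two shapes concrete, here is a minimal sketch of how an order might be stored under each model. Only `productIds` and the `products` collection come from the queries above; the `name` and `price` fields and all sample values are assumptions for illustration.

```javascript
// Hypothetical sample data: name, price, and the values are illustrative only.

// Embedding: product details live inside the order document itself.
const embeddedOrder = {
  _id: 1,
  products: [
    { _id: 101, name: "Keyboard", price: 49 },
    { _id: 102, name: "Mouse", price: 19 },
  ],
};

// Referencing: the order stores only IDs; details live in a separate collection.
const referencedOrder = { _id: 1, productIds: [101, 102] };
const productsCollection = [
  { _id: 101, name: "Keyboard", price: 49 },
  { _id: 102, name: "Mouse", price: 19 },
  { _id: 103, name: "Monitor", price: 199 },
];

// Conceptually, $lookup adds a `products` array holding every product
// whose _id appears in the order's productIds.
const joined = {
  ...referencedOrder,
  products: productsCollection.filter((p) =>
    referencedOrder.productIds.includes(p._id)
  ),
};

// Both models end up exposing the same product data to the reader.
console.log(
  JSON.stringify(joined.products) === JSON.stringify(embeddedOrder.products)
); // true
```

The difference is where the work happens: the embedded order already holds the result, while the referenced order has to assemble it at query time.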
Look at what repeats when running these queries.
- Primary operation: For embedding, a single document fetch; for join, matching the order and then looking up each related product document.
- How many times: Embedding fetches one document; join performs one lookup per ID in productIds.
As the number of related products grows, the work changes differently.
| Number of related products | Embedding | Join ($lookup) |
|---|---|---|
| 10 | 1 fetch | 10 lookups |
| 100 | 1 fetch | 100 lookups |
| 1000 | 1 fetch | 1000 lookups |
Pattern observation: Embedding stays at one fetch; join work grows linearly with the number of related items.
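The counts above can be reproduced with a small in-memory model. This is only a sketch of the counting argument, not a benchmark: the `Map` stands in for the products collection, and the helper names (`makeProducts`, `readEmbedded`, `readWithLookup`) are hypothetical.

```javascript
// Illustrative model only: it counts logical fetches, not real I/O or latency.

// A Map keyed by _id stands in for the products collection.
function makeProducts(count) {
  const byId = new Map();
  for (let id = 0; id < count; id++) {
    byId.set(id, { _id: id, name: `product-${id}` });
  }
  return byId;
}

// Embedding: the products are already inside the order, so one fetch suffices.
function readEmbedded(order) {
  return { products: order.products, fetches: 1 };
}

// Referencing: resolve every ID in productIds, counting one lookup per ID.
function readWithLookup(order, productsById) {
  let lookups = 0;
  const products = [];
  for (const id of order.productIds) {
    lookups += 1;
    const product = productsById.get(id);
    if (product !== undefined) products.push(product);
  }
  return { products, lookups };
}

for (const n of [10, 100, 1000]) {
  const productsById = makeProducts(n);
  const embedded = readEmbedded({ _id: 1, products: [...productsById.values()] });
  const joined = readWithLookup(
    { _id: 1, productIds: [...productsById.keys()] },
    productsById
  );
  console.log(
    `n=${n}: embedding fetches=${embedded.fetches}, join lookups=${joined.lookups}`
  );
}
```

The embedded count never depends on `n`, while the lookup count equals the length of `productIds`.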
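Time Complexity: O(1) document fetches for embedding; O(n) lookups for the join, where n is the number of related documents. Because the join here matches on `_id`, which MongoDB always indexes, each lookup is an index lookup rather than a scan of the products collection.
This means an embedded read stays at one fetch as related data grows, although the document itself gets larger (MongoDB caps a single document at 16 MB), while the join takes longer as related data grows.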
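How expensive each of the n lookups is depends on whether the joined field has an index. The sketch below is a rough cost model under stated assumptions: binary search over a sorted array stands in for a B-tree index descent, a full pass over the array stands in for a collection scan, and the sizes `m` and `n` are arbitrary. Since `_id` always has an index, the scan case applies only when `foreignField` is some other, unindexed field.

```javascript
// Illustrative cost model: counts key comparisons, not real MongoDB work.

// Unindexed field: a collection scan must examine every document.
function scanSteps(keys, target) {
  let steps = 0;
  let matches = 0;
  for (const key of keys) {
    steps += 1;
    if (key === target) matches += 1;
  }
  return { steps, matches };
}

// Indexed field: binary search over sorted keys stands in for an index descent.
function indexSteps(sortedKeys, target) {
  let lo = 0;
  let hi = sortedKeys.length - 1;
  let steps = 0;
  while (lo <= hi) {
    steps += 1;
    const mid = (lo + hi) >>> 1;
    if (sortedKeys[mid] === target) return { steps, found: true };
    if (sortedKeys[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return { steps, found: false };
}

const m = 100000; // assumed size of the products collection
const n = 100;    // assumed number of related products per order
const keys = Array.from({ length: m }, (_, i) => i); // already sorted

let scanTotal = 0;
let indexTotal = 0;
for (let i = 0; i < n; i++) {
  const target = (i * 997) % m;
  scanTotal += scanSteps(keys, target).steps;
  indexTotal += indexSteps(keys, target).steps;
}

console.log(`unindexed: ${scanTotal} comparisons for ${n} lookups`);
console.log(`indexed:   ${indexTotal} comparisons for ${n} lookups`);
```

In this model the unindexed join costs about n × m comparisons, while the indexed join costs about n × log2(m), which is why the join stays close to linear in n when the field is indexed.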
[X] Wrong: "Joins are always slow and embedding is always better."
[OK] Correct: Embedding can produce large documents that slow writes and use more memory; joins can be the more efficient choice when the related data is large or changes often.
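The "changes often" case can be illustrated by counting writes when one product's price changes. This is a sketch under assumptions: 1000 hypothetical orders each embed a copy of the same product, the embedded copies are meant to stay in sync with the source, and the `price` field and helper names are made up for the example.

```javascript
// Hypothetical data: 1000 orders that each include the same product (_id 101).
const NUM_ORDERS = 1000;

const embeddedOrders = Array.from({ length: NUM_ORDERS }, (_, i) => ({
  _id: i,
  products: [{ _id: 101, price: 49 }],
}));

const productsById = new Map([[101, { _id: 101, price: 49 }]]);

// Embedded model: every order holding a copy of the product must be rewritten.
function updatePriceEmbedded(orders, productId, newPrice) {
  let writes = 0;
  for (const order of orders) {
    let touched = false;
    for (const product of order.products) {
      if (product._id === productId) {
        product.price = newPrice;
        touched = true;
      }
    }
    if (touched) writes += 1; // one document write per affected order
  }
  return writes;
}

// Referenced model: the product exists once, so one write is enough.
function updatePriceReferenced(products, productId, newPrice) {
  const product = products.get(productId);
  if (product === undefined) return 0;
  product.price = newPrice;
  return 1;
}

console.log(updatePriceEmbedded(embeddedOrders, 101, 59)); // 1000
console.log(updatePriceReferenced(productsById, 101, 59)); // 1
```

Embedding trades cheaper reads for more expensive updates whenever the embedded data is shared and mutable.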
Understanding how data structure affects query speed helps you design databases that keep performing well as data grows.
"What if we indexed the foreignField in the join? How would the time complexity change?"