Why modeling decisions matter in MongoDB - Performance Analysis
When we design how data is stored in MongoDB, it affects how fast queries run.
We want to see how these design choices change the work the database does as data grows.
Analyze the time complexity of this query on two different data models.
// Model 1: Embedded documents
db.users.find({ "orders.product": "book" })
// Model 2: Referenced documents
// First collect the matching order IDs into an array
const orderIds = db.orders.find({ product: "book" }, { _id: 1 }).toArray().map(o => o._id);
// Then find users with those orders
const users = db.users.find({ orders: { $in: orderIds } });
This code shows two ways to find users who ordered a book: one with orders embedded in each user document, and one with separate order documents that each user references by _id in its orders array.
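To make the two shapes concrete, here is a minimal in-memory sketch in plain JavaScript. The sample documents, names, and _id values are hypothetical. Only the orders and product field names come from the queries above. The filters imitate what each query matches and are not MongoDB itself.

```javascript
// Hypothetical sample data; only "orders" and "product" come from the queries above.

// Model 1: each user document embeds its orders.
const embeddedUsers = [
  { _id: 1, name: "Ana", orders: [{ product: "book" }, { product: "pen" }] },
  { _id: 2, name: "Ben", orders: [{ product: "lamp" }] },
];

// Model 2: orders live in their own collection; users store order _ids.
const orders = [
  { _id: 101, product: "book" },
  { _id: 102, product: "pen" },
  { _id: 103, product: "lamp" },
];
const referencedUsers = [
  { _id: 1, name: "Ana", orders: [101, 102] },
  { _id: 2, name: "Ben", orders: [103] },
];

// In-memory stand-in for db.users.find({ "orders.product": "book" })
const embeddedMatch = embeddedUsers.filter(user =>
  user.orders.some(order => order.product === "book"));

// In-memory stand-in for the two-step referenced lookup
const bookOrderIds = new Set(
  orders.filter(order => order.product === "book").map(order => order._id));
const referencedMatch = referencedUsers.filter(user =>
  user.orders.some(id => bookOrderIds.has(id)));

console.log(embeddedMatch.map(user => user.name));
console.log(referencedMatch.map(user => user.name));
```

Both filters select the same user here. The difference is that the referenced version needs a pass over orders before it can look at users.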
Look at what repeats as data grows.
- Primary operation: Scanning orders inside users or scanning orders collection.
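- How many times: In the embedded model, once per user document, checking each order embedded in it; in the referenced model, once per order document, then once per user document to match the collected IDs.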
As the number of users and orders grows, the work changes differently.
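| Input Size (n) | Embedded Model (approx. operations) | Referenced Model (approx. operations) |
|---|---|---|
| 10 users, 50 orders | Scan 10 user documents (50 embedded orders checked) | Scan 50 order documents, then 10 user documents |
| 100 users, 500 orders | Scan 100 user documents (500 embedded orders checked) | Scan 500 order documents, then 100 user documents |
| 1000 users, 5000 orders | Scan 1000 user documents (5000 embedded orders checked) | Scan 5000 order documents, then 1000 user documents |
Pattern observation: In both models the work grows in step with the data. The embedded model makes one pass over users, checking every embedded order; the referenced model makes one pass over orders and then one pass over users.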
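The growth pattern can be checked with a small simulation. This is a sketch under stated assumptions, not a measurement of MongoDB: the buildData and countExamined helpers are hypothetical, every user gets 5 orders, every 10th order is a book, and no indexes exist, so each query has to look at every document in the collections it touches.

```javascript
// Hypothetical generator: userCount users with ordersPerUser orders each.
function buildData(userCount, ordersPerUser) {
  const embeddedUsers = [], referencedUsers = [], orders = [];
  let nextOrderId = 1;
  for (let u = 0; u < userCount; u++) {
    const embedded = [], refs = [];
    for (let k = 0; k < ordersPerUser; k++) {
      // Assumption: every 10th order is a "book" so the query has matches.
      const product = nextOrderId % 10 === 0 ? "book" : "other";
      embedded.push({ product });
      orders.push({ _id: nextOrderId, product });
      refs.push(nextOrderId);
      nextOrderId++;
    }
    embeddedUsers.push({ _id: u, orders: embedded });
    referencedUsers.push({ _id: u, orders: refs });
  }
  return { embeddedUsers, referencedUsers, orders };
}

// Count the documents an unindexed scan would examine in each model.
function countExamined(userCount, ordersPerUser = 5) {
  const data = buildData(userCount, ordersPerUser);

  // Embedded model: one pass over users.
  const embedded = { docsExamined: 0, matches: 0 };
  for (const user of data.embeddedUsers) {
    embedded.docsExamined++;
    if (user.orders.some(o => o.product === "book")) embedded.matches++;
  }

  // Referenced model: one pass over orders, then one pass over users.
  const referenced = { docsExamined: 0, matches: 0 };
  const ids = new Set();
  for (const order of data.orders) {
    referenced.docsExamined++;
    if (order.product === "book") ids.add(order._id);
  }
  for (const user of data.referencedUsers) {
    referenced.docsExamined++;
    if (user.orders.some(id => ids.has(id))) referenced.matches++;
  }
  return { embedded, referenced };
}

for (const n of [10, 100, 1000]) {
  console.log(n, JSON.stringify(countExamined(n)));
}
```

With 10 users the embedded scan examines 10 documents and the referenced scan examines 60 (50 orders plus 10 users). Multiplying the data by 10 multiplies both counts by 10, which is the linear growth the table describes.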
Time Complexity: O(n)
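This means that, without indexes, query time grows linearly with the number of documents scanned: user documents (and the orders embedded in them) in Model 1, and order documents plus user documents in Model 2.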
[X] Wrong: "Embedding data always makes queries faster."
[OK] Correct: Embedding can mean scanning many nested items per document, which slows queries when the embedded arrays are large.
Understanding how data design affects query speed shows you think about real database work, a key skill for any developer.
"What if we added indexes on the embedded fields or referenced fields? How would that change the time complexity?"
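One way to explore this question is the sketch below. The createIndex calls in the comments are standard mongosh commands for the fields used in this lesson. The runnable part is only a hypothetical in-memory stand-in (a Map named productIndex) that shows the idea: a lookup goes straight to the matching entries instead of visiting every document. Real MongoDB indexes are B-trees, so an indexed lookup costs roughly O(log n + k), where k is the number of matching entries, rather than O(n).

```javascript
// In mongosh, the indexes for the two models could be created like this:
//   db.users.createIndex({ "orders.product": 1 })  // Model 1 (multikey index)
//   db.orders.createIndex({ product: 1 })          // Model 2, first query
//   db.users.createIndex({ orders: 1 })            // Model 2, $in query

// Hypothetical sample data in the embedded shape.
const users = [
  { _id: 1, orders: [{ product: "book" }, { product: "pen" }] },
  { _id: 2, orders: [{ product: "lamp" }] },
  { _id: 3, orders: [{ product: "book" }] },
];

// Stand-in for an index on "orders.product": product -> Set of user _ids.
// A Map is a hash table, not a B-tree; it only illustrates skipping the scan.
function buildProductIndex(docs) {
  const index = new Map();
  for (const doc of docs) {
    for (const order of doc.orders) {
      if (!index.has(order.product)) index.set(order.product, new Set());
      index.get(order.product).add(doc._id);
    }
  }
  return index;
}

const productIndex = buildProductIndex(users);
const usersById = new Map(users.map(user => [user._id, user]));

// Look up users through the index: touches only matching entries.
function findUsersByProduct(product) {
  const ids = productIndex.get(product) || new Set();
  return [...ids].map(id => usersById.get(id));
}

const indexedResult = findUsersByProduct("book");
console.log(indexedResult.map(user => user._id));
```

The index is built once and must be maintained on every write, so the faster reads come at the cost of extra work on inserts and updates.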