Normalization vs Denormalization in MongoDB - Performance Comparison
When working with databases, how data is organized affects how fast queries run.
We want to see how the time to fetch data changes when the same data is stored in normalized or denormalized form in MongoDB.
Analyze the time complexity of fetching user orders in two ways: normalized and denormalized.
// Normalized: users and orders live in separate collections
const user = db.users.findOne({ _id: userId });
const orders = db.orders.find({ userId: user._id }).toArray();
// Denormalized: orders are embedded in the user document
const userWithOrders = db.users.findOne({ _id: userId });
const embeddedOrders = userWithOrders.orders;
This code fetches orders with a second query against a separate collection (normalized) versus reading them from an array embedded in the user document (denormalized).
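To make the two layouts concrete, here is a minimal sketch of what the documents might look like. The field names `name` and `total` and all sample values are assumptions for illustration; the snippet above only implies `_id`, `userId`, and `orders`.

```javascript
// Hypothetical sample documents; fields other than _id, userId, and orders are assumed.

// Normalized: users and orders live in separate collections, linked by userId.
const normalizedUsers = [{ _id: 1, name: "Ada" }];
const normalizedOrders = [
  { _id: 101, userId: 1, total: 25 },
  { _id: 102, userId: 1, total: 40 },
];

// Denormalized: each user document embeds its own orders array.
const denormalizedUsers = [
  {
    _id: 1,
    name: "Ada",
    orders: [
      { _id: 101, total: 25 },
      { _id: 102, total: 40 },
    ],
  },
];

// Both layouts hold the same orders; only where they are stored differs.
const fromNormalized = normalizedOrders
  .filter((order) => order.userId === normalizedUsers[0]._id)
  .map((order) => order._id);
const fromDenormalized = denormalizedUsers[0].orders.map((order) => order._id);
console.log(fromNormalized, fromDenormalized);
```

In the normalized shape the link between a user and an order is the `userId` field, so finding a user's orders means searching another collection. In the denormalized shape the orders travel with the user document.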
- Normalized primary operation: Querying the orders collection for documents with a matching userId.
- Normalized how many times: One extra query per user. Without an index on userId, that query scans the orders collection to find the user's orders.
- Denormalized primary operation: A single query to the users collection, then reading the embedded orders array.
- Denormalized how many times: One query by _id, with no second collection to search.
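How much work the normalized query does depends on whether the orders collection has an index on `userId`. The sketch below compares a full scan with an index lookup in a small in-memory model. It is not MongoDB itself: in MongoDB the index would be created with `db.orders.createIndex({ userId: 1 })`, and the `Map` used here is only a stand-in (MongoDB indexes are B-trees, so a real lookup is logarithmic rather than constant). All helper names are hypothetical.

```javascript
// In-memory model of an orders collection shared by many users (hypothetical data).
function buildOrders(userCount, ordersPerUser) {
  const orders = [];
  for (let u = 0; u < userCount; u++) {
    for (let i = 0; i < ordersPerUser; i++) {
      orders.push({ _id: `${u}-${i}`, userId: u });
    }
  }
  return orders;
}

// Without an index: every order in the collection is examined.
function findByScan(orders, userId) {
  let examined = 0;
  const found = [];
  for (const order of orders) {
    examined++;
    if (order.userId === userId) found.push(order);
  }
  return { found, examined };
}

// Stand-in for an index on userId: group orders by userId once, up front.
function buildUserIdIndex(orders) {
  const index = new Map();
  for (const order of orders) {
    if (!index.has(order.userId)) index.set(order.userId, []);
    index.get(order.userId).push(order);
  }
  return index;
}

// With the index: only the matching orders are examined.
function findByIndex(index, userId) {
  const found = index.get(userId) ?? [];
  return { found, examined: found.length };
}

const allOrders = buildOrders(1000, 10); // 10,000 orders in total
const userIdIndex = buildUserIdIndex(allOrders);
console.log(findByScan(allOrders, 42).examined); // 10000
console.log(findByIndex(userIdIndex, 42).examined); // 10
```

In this model the scan examines all 10,000 orders while the indexed lookup examines only the 10 that match, which suggests that the per-user cost of the normalized query is only proportional to that user's orders when such an index exists.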
As the number of orders grows, how does query time change?
| Input Size (orders per user) | Normalized Approx. Operations | Denormalized Approx. Operations |
|---|---|---|
| 10 | Second query finds and returns 10 matching orders | Read 10 embedded orders |
| 100 | Second query finds and returns 100 matching orders | Read 100 embedded orders |
| 1000 | Second query finds and returns 1000 matching orders | Read 1000 embedded orders |
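Pattern observation: Both approaches grow roughly linearly with the number of orders returned, but the normalized approach needs a second query against the orders collection, which adds a round trip and, without an index on userId, a scan of that collection.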
Time Complexity: O(n)
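This means the time to fetch a user's orders grows linearly with n, the number of orders that user has, whether the data is normalized or denormalized. For the normalized case, this assumes MongoDB can locate the matching orders without examining unrelated ones; otherwise the cost also grows with the total size of the orders collection.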
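The linear growth can be checked with a small counting model. This is a sketch in plain JavaScript, not a measurement of MongoDB: the helper names are hypothetical, the modeled orders collection holds only this one user's orders, and a count of touched documents stands in for real query time.

```javascript
// Plain-JavaScript model (not MongoDB): count how many documents each fetch touches.

// Build one user's data in both layouts with n orders (hypothetical helper).
function buildLayouts(n) {
  const orders = Array.from({ length: n }, (_, i) => ({ _id: i, userId: 1, total: i }));
  return {
    normalized: { user: { _id: 1 }, orders },
    denormalized: {
      user: { _id: 1, orders: orders.map(({ _id, total }) => ({ _id, total })) },
    },
  };
}

// Normalized: one step for the user document, then one step per matching order.
function costNormalized(layout) {
  let steps = 1; // fetch the user document
  for (const order of layout.orders) {
    if (order.userId === layout.user._id) steps++;
  }
  return steps;
}

// Denormalized: one step for the user document, then one step per embedded order.
function costDenormalized(layout) {
  let steps = 1; // fetch the user document
  for (const _order of layout.user.orders) steps++;
  return steps;
}

for (const n of [10, 100, 1000]) {
  const { normalized, denormalized } = buildLayouts(n);
  console.log(n, costNormalized(normalized), costDenormalized(denormalized));
}
```

Both counts come out as n + 1 for every input size, which matches the O(n) pattern: ten times more orders means roughly ten times more work in either layout.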
[X] Wrong: "Denormalized data always makes queries faster regardless of data size."
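[OK] Correct: Embedding removes the second query, but the whole user document, including every embedded order, is read on each fetch, and MongoDB caps a single document at 16 MB. Large embedded arrays can therefore slow down queries and updates, so time still grows with data size.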
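To get a feel for how quickly an embedded array grows, the sketch below estimates the size of a user document with n embedded orders. It uses JSON byte length as a rough proxy (MongoDB stores BSON, so real sizes will differ), and the order fields `total` and `status` are assumed for illustration. The constant reflects MongoDB's 16 MB maximum BSON document size.

```javascript
// Rough size estimate for a user document with n embedded orders.
// JSON byte length is only a proxy; MongoDB stores BSON, so actual sizes differ.
const MAX_BSON_BYTES = 16 * 1024 * 1024; // MongoDB's per-document size limit

function approxUserDocBytes(n) {
  const user = {
    _id: 1,
    name: "Ada", // hypothetical field
    orders: Array.from({ length: n }, (_, i) => ({
      _id: i,
      total: 19.99, // hypothetical field
      status: "shipped", // hypothetical field
    })),
  };
  return Buffer.byteLength(JSON.stringify(user), "utf8");
}

for (const n of [10, 1000, 100000]) {
  const bytes = approxUserDocBytes(n);
  const percent = ((100 * bytes) / MAX_BSON_BYTES).toFixed(2);
  console.log(`${n} orders: ~${bytes} bytes (${percent}% of the limit)`);
}
```

The estimate grows in step with the number of embedded orders, so a user with a very long order history produces a document that is both expensive to read and at risk of approaching the size limit.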
Understanding how data layout affects query time helps you design better databases and answer real questions about performance.
"What if we added an index on userId in the orders collection? How would that change the time complexity for normalized queries?"