Normalization and denormalization help organize data in a database. Normalization keeps data clean and avoids repetition. Denormalization stores data together to make reading faster.
Normalization vs denormalization default in MongoDB
Introduction
Syntax
MongoDB
Normalization: Store related data in separate collections and link them using references.
Denormalization: Store related data together in the same document by embedding.
Normalization uses references between collections.
Denormalization uses embedded documents inside one collection.
Examples
MongoDB
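// Normalization example
// Separate collections for users and orders.
// String ids keep the example short; ObjectId() accepts only a 24-character hex string.

// users collection
{ _id: "user1", name: "Alice" }

// orders collection: userId references the user's _id
{ _id: "order1", userId: "user1", product: "Book" }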
MongoDB
// Denormalization example
// Embed orders inside the user document.
// A string id is used here because ObjectId() accepts only a 24-character hex string.
{
  _id: "user1",
  name: "Alice",
  orders: [
    { product: "Book" },
    { product: "Pen" }
  ]
}
Sample Program
This program stores a user and an order in separate collections (normalized). The query then uses a $lookup stage to join each order with its user and return the user's name alongside the product.
MongoDB
// Normalization: Insert user and order separately
use shopDB;

db.users.insertOne({ _id: "user1", name: "Alice" });
db.orders.insertOne({ _id: "order1", userId: "user1", product: "Book" });

// Query: Find orders with user name
const orders = db.orders.aggregate([
  { $lookup: { from: "users", localField: "userId", foreignField: "_id", as: "user" } },
  { $unwind: "$user" },
  { $project: { product: 1, userName: "$user.name" } }
]).toArray();

printjson(orders);
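To make the pipeline above concrete, here is a minimal plain-JavaScript sketch of what the three stages compute for this sample data. It runs without a MongoDB server. The helper name ordersWithUserName is hypothetical, and the in-memory arrays only stand in for the users and orders collections; this is not how MongoDB executes $lookup internally.

```javascript
// In-memory stand-ins for the users and orders collections from the sample.
const users = [{ _id: "user1", name: "Alice" }];
const orders = [{ _id: "order1", userId: "user1", product: "Book" }];

// Hypothetical helper: mimics $lookup + $unwind + $project for this one case.
function ordersWithUserName(orders, users) {
  // Index users by _id so each order can find its user directly.
  const usersById = new Map(users.map((u) => [u._id, u]));
  const result = [];
  for (const order of orders) {
    const user = usersById.get(order.userId);
    // $unwind drops documents whose joined array is empty,
    // so orders without a matching user are skipped.
    if (user === undefined) continue;
    // $project keeps _id by default, plus the fields it lists.
    result.push({ _id: order._id, product: order.product, userName: user.name });
  }
  return result;
}

const joined = ordersWithUserName(orders, users);
console.log(JSON.stringify(joined));
// prints [{"_id":"order1","product":"Book","userName":"Alice"}]
```

The extra step of matching each order to its user is the cost that normalization adds to reads.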
Important Notes
Normalization reduces data duplication but can make queries slower because of joins.
Denormalization speeds up reads but can cause data duplication and harder updates.
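MongoDB does not enforce either approach, but its document model favors denormalization: embedding related data inside a document is the usual starting point, and references are used when data is shared, large, or changes often.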
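The "harder updates" point can be sketched in plain JavaScript, without a MongoDB server. Assume each embedded order carries its own copy of the product name, as in the denormalization example. The helper renameProduct is hypothetical and simply counts how many user documents a rename has to rewrite.

```javascript
// Denormalized shape: each user document embeds its orders, and every
// order carries its own copy of the product name.
const users = [
  { _id: "user1", name: "Alice", orders: [{ product: "Book" }, { product: "Pen" }] },
  { _id: "user2", name: "Bob", orders: [{ product: "Book" }] },
];

// Hypothetical helper: rename a product in every embedded copy.
// Returns how many user documents had to be rewritten.
function renameProduct(users, oldName, newName) {
  let documentsTouched = 0;
  for (const user of users) {
    let changed = false;
    for (const order of user.orders) {
      if (order.product === oldName) {
        order.product = newName;
        changed = true;
      }
    }
    if (changed) documentsTouched += 1;
  }
  return documentsTouched;
}

const touched = renameProduct(users, "Book", "Notebook");
console.log(touched); // prints 2

// Any copy the rename missed would still say "Book"; here none remain.
const staleCopies = users.filter((u) => u.orders.some((o) => o.product === "Book")).length;
console.log(staleCopies); // prints 0
```

In a normalized design the product name would be stored once (for example in a separate products collection, which is an assumption here), so the same rename would be a single write.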
Summary
Normalization separates data into collections linked by references.
Denormalization stores related data together inside documents.
Choose normalization for easy updates, denormalization for fast reads.
Practice
1. What is the main advantage of normalization in MongoDB databases?
easy
Solution
Step 1: Understand normalization concept
Normalization means splitting data into separate collections and linking them by references.
Step 2: Identify the main benefit
This separation makes updating data easier because changes happen in one place without duplication.
Final Answer:
It separates data into collections linked by references for easy updates. -> Option A
Quick Check:
Normalization = separate collections + easy updates [OK]
Hint: Normalization means separate collections linked by references [OK]
Common Mistakes:
- Confusing normalization with denormalization
- Thinking normalization duplicates data
- Assuming normalization speeds up reads
2. Which MongoDB document structure shows denormalization?
easy
Solution
Step 1: Identify denormalized structure
Denormalization stores related data together inside one document, like embedding orders inside user.
Step 2: Check options for embedded data
{ _id: 1, name: 'Alice', orders: [ { orderId: 101, item: 'Book' } ] } embeds orders array inside the user document, showing denormalization.
Final Answer:
{ _id: 1, name: 'Alice', orders: [ { orderId: 101, item: 'Book' } ] } -> Option B
Quick Check:
Denormalization = embedded related data [OK]
Hint: Denormalization embeds related data inside one document [OK]
Common Mistakes:
- Choosing separate collections as denormalized
- Ignoring embedded arrays as denormalization
- Confusing null fields with embedded data
3. Given these two collections:
users: { _id: 1, name: 'Bob' }
orders: { _id: 101, userId: 1, item: 'Pen' }
What is the main drawback of this normalized design when reading user orders?
medium
Solution
Step 1: Understand normalized design
Users and orders are in separate collections linked by the userId reference.
Step 2: Identify drawback when reading
To get all orders for a user, you must query the orders collection and filter by userId, which requires multiple queries or an aggregation.
Final Answer:
It requires multiple queries or a join-like operation to get all orders for a user. -> Option A
Quick Check:
Normalized read = multiple queries [OK]
Hint: Normalized data needs multiple queries to combine related info [OK]
Common Mistakes:
- Thinking normalized data duplicates info
- Assuming all data is embedded in one document
- Believing updates are harder in normalized data
4. You have a denormalized MongoDB document:
{ _id: 1, name: 'Carol', orders: [ { orderId: 201, item: 'Notebook' } ] }
Which problem can occur if you update the item name in one order but forget to update it elsewhere?
medium
Solution
Step 1: Recognize denormalization risk
Denormalization duplicates related data inside documents, so the same order info may appear in many places.
Step 2: Understand update problem
If you update one copy but not others, data becomes inconsistent and unreliable.
Final Answer:
Data inconsistency due to duplicated order info in multiple documents. -> Option D
Quick Check:
Denormalization risk = data inconsistency [OK]
Hint: Denormalization can cause inconsistent duplicated data if not updated everywhere [OK]
Common Mistakes:
- Thinking denormalization slows queries
- Believing schema changes automatically
- Confusing index loss with denormalization
5. You want to design a MongoDB schema for a blog with users and posts.
Users have many posts, and posts rarely change after creation.
Which design is best for fast reading and why?
Options:
A: Store users and posts in separate collections (normalized).
B: Embed all posts inside each user document (denormalized).
C: Duplicate posts in both users and posts collections.
D: Store posts only, with user info duplicated in each post.
hard
Solution
Step 1: Analyze data change frequency
Posts rarely change, so embedding them inside users won't cause frequent update problems.
Step 2: Choose design for fast reads
Embedding posts inside user documents allows fetching user and posts in one read, improving read speed.
Step 3: Compare options
Embedding posts inside user documents fits best: reads are fast and updates are rare. Separate collections require joins, duplicating posts in both collections risks inconsistency, and storing only posts duplicates user info unnecessarily.
Final Answer:
Embed posts inside user documents for fast reads since posts rarely change. -> Option B
Quick Check:
Denormalization + rare updates = embed for fast reads [OK]
Hint: Embed rarely changing related data for faster reads [OK]
Common Mistakes:
- Choosing normalization for fast reads
- Duplicating data causing inconsistency
- Ignoring update frequency in design
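The "one read" argument in this solution can be illustrated with a small plain-JavaScript sketch that needs no MongoDB server. CountingCollection is a hypothetical in-memory stand-in that counts lookups, and the user "Dana" and the post titles are made-up sample data. It only models the number of round trips, not real query performance.

```javascript
// Tiny in-memory "collection" that counts how many lookups it serves.
class CountingCollection {
  constructor(docs) {
    this.docs = docs;
    this.reads = 0;
  }
  find(predicate) {
    this.reads += 1;
    return this.docs.filter(predicate);
  }
}

// Embedded design: posts live inside the user document.
const embeddedUsers = new CountingCollection([
  { _id: 1, name: "Dana", posts: [{ title: "Hello" }, { title: "Schemas" }] },
]);

// Referenced design: posts are separate and point back to the user.
const users = new CountingCollection([{ _id: 1, name: "Dana" }]);
const posts = new CountingCollection([
  { _id: 10, userId: 1, title: "Hello" },
  { _id: 11, userId: 1, title: "Schemas" },
]);

// Embedded read: one lookup returns the user and all posts.
const [embedded] = embeddedUsers.find((u) => u._id === 1);
const embeddedTitles = embedded.posts.map((p) => p.title);

// Referenced read: one lookup for the user, a second for the posts.
const [user] = users.find((u) => u._id === 1);
const referencedTitles = posts.find((p) => p.userId === user._id).map((p) => p.title);

const embeddedReads = embeddedUsers.reads;
const referencedReads = users.reads + posts.reads;
console.log(embeddedReads, referencedReads); // prints 1 2
```

Both designs return the same titles; the embedded one does it in a single lookup. One caveat when applying this to real data: a MongoDB document is limited to 16 MB, so a user whose embedded posts array keeps growing may eventually need references instead.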
