Understanding Normalization vs Denormalization in MongoDB
📖 Scenario: You are working on a simple online bookstore database using MongoDB. You want to understand how to organize your data using normalization and denormalization techniques.
🎯 Goal: Model the same bookstore data two ways: a normalized version (separate collections linked by references) and a denormalized version (author data embedded in each book). Learn how to create both structures step by step.
📋 What You'll Learn
Create a books collection with book details
Create an authors collection with author details
Create a normalized structure by referencing authors in books
Create a denormalized structure by embedding author details inside books
💡 Why This Matters
🌍 Real World
Many real-world applications use MongoDB to store data. Understanding when to normalize or denormalize data helps design efficient and easy-to-use databases.
💼 Career
Database developers and backend engineers often decide how to structure data for performance and maintainability. Knowing normalization and denormalization is essential for these roles.
1
Create the authors collection
Create a collection called authors with these exact documents: { _id: 1, name: "Jane Austen" } and { _id: 2, name: "Mark Twain" }.
Hint
Use insertMany to add multiple documents to the authors collection.
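A minimal sketch of this step, assuming a mongosh session whose current database `db` is the bookstore database (the exercise does not name it). The `typeof db` guard is an addition for illustration: it lets the same snippet run under plain Node.js, where `db` does not exist, without contacting a server.

```javascript
// The two author documents from the step, with explicit _id values.
const authors = [
  { _id: 1, name: "Jane Austen" },
  { _id: 2, name: "Mark Twain" },
];

// In mongosh, `db` is the current database; outside mongosh this branch is skipped.
if (typeof db !== "undefined") {
  // insertMany creates the authors collection on first insert if it does not exist yet.
  db.authors.insertMany(authors);
}
```

Because the `_id` values are fixed, running the insert a second time against the same collection raises a duplicate key error.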
2
Create the books collection with author references (Normalization)
Create a collection called books with these exact documents: { title: "Pride and Prejudice", author_id: 1 } and { title: "Adventures of Huckleberry Finn", author_id: 2 }. Use author_id to reference authors.
Hint
Use author_id to link books to authors by their _id.
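One way this step might look in mongosh, assuming the authors documents from step 1 are already in the current database. The follow-up lookup is not part of the exercise; it is included only to show that resolving a reference takes a second query. The `typeof db` guard is an illustrative addition so the snippet also runs outside mongosh.

```javascript
// Each book stores only the author's _id, not the author's details.
const books = [
  { title: "Pride and Prejudice", author_id: 1 },
  { title: "Adventures of Huckleberry Finn", author_id: 2 },
];

if (typeof db !== "undefined") {
  // No _id is supplied, so MongoDB generates an ObjectId for each book.
  db.books.insertMany(books);

  // Resolving the reference: first read the book, then read its author.
  const book = db.books.findOne({ title: "Pride and Prejudice" });
  const author = db.authors.findOne({ _id: book.author_id });
  print(`${book.title} by ${author.name}`);
}
```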
3
Create the books_denormalized collection with embedded author details (Denormalization)
Create a collection called books_denormalized with these exact documents: { title: "Pride and Prejudice", author: { name: "Jane Austen" } } and { title: "Adventures of Huckleberry Finn", author: { name: "Mark Twain" } }. Embed author details inside each book document.
Hint
Embed the author object inside each book document instead of using a reference.
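A sketch of the denormalized variant under the same assumptions (a mongosh session with `db` as the current database; the guard is added so the snippet also runs under Node.js). The query at the end is illustrative and shows dot notation reaching into the embedded author.

```javascript
// Each book carries its own copy of the author's details.
const booksDenormalized = [
  { title: "Pride and Prejudice", author: { name: "Jane Austen" } },
  { title: "Adventures of Huckleberry Finn", author: { name: "Mark Twain" } },
];

if (typeof db !== "undefined") {
  db.books_denormalized.insertMany(booksDenormalized);

  // A single query returns the book together with its embedded author.
  printjson(db.books_denormalized.findOne({ "author.name": "Mark Twain" }));
}
```

Compared with the referenced design, no second query against authors is needed, at the cost of repeating the author's name in every book by that author.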
4
Add an index on author_id in the books collection
Create an index on the author_id field in the books collection to speed up queries that find books by author.
Hint
Use createIndex on the author_id field in the books collection.
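A possible mongosh form of this step, assuming the books collection from step 2 exists in the current database. The `explain()` call is an optional extra, not part of the exercise, for inspecting which plan the query planner picks; the `typeof db` guard is an illustrative addition.

```javascript
// Index specification: 1 requests ascending order on author_id.
const authorIndex = { author_id: 1 };

if (typeof db !== "undefined") {
  db.books.createIndex(authorIndex);

  // With the index in place, the winning plan would be expected to show
  // an index scan (IXSCAN) rather than a full collection scan (COLLSCAN).
  printjson(db.books.find({ author_id: 1 }).explain().queryPlanner.winningPlan);
}
```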
Practice
1. What is the main advantage of normalization in MongoDB databases?
easy
A. It separates data into collections linked by references for easy updates.
B. It stores all related data together in one document for faster reads.
C. It duplicates data to improve write performance.
D. It automatically creates indexes on all fields.
Solution
Step 1: Understand normalization concept
Normalization means splitting data into separate collections and linking them by references.
Step 2: Identify the main benefit
This separation makes updating data easier because changes happen in one place without duplication.
Final Answer:
It separates data into collections linked by references for easy updates. -> Option A
Quick Check:
Normalization = separate collections + easy updates [OK]
Hint: Normalization means separate collections linked by references [OK]
Common Mistakes:
Confusing normalization with denormalization
Thinking normalization duplicates data
Assuming normalization speeds up reads
2. Which MongoDB document structure shows denormalization?
Hint: Denormalization embeds related data inside one document [OK]
Common Mistakes:
Choosing separate collections as denormalized
Ignoring embedded arrays as denormalization
Confusing null fields with embedded data
3. Given these two collections:
users: { _id: 1, name: 'Bob' }
orders: { _id: 101, userId: 1, item: 'Pen' }
What is the main drawback of this normalized design when reading user orders?
medium
A. It requires multiple queries or a join-like operation to get all orders for a user.
B. It duplicates order data inside each user document.
C. It stores all orders inside the user document causing large documents.
D. It prevents updating user names easily.
Solution
Step 1: Understand normalized design
Users and orders are in separate collections linked by userId reference.
Step 2: Identify drawback when reading
To get a user together with all of their orders, you must read the user document and then query the orders collection by userId, so you need either two queries or a join-like aggregation such as $lookup.
Final Answer:
It requires multiple queries or a join-like operation to get all orders for a user. -> Option A
Quick Check:
Normalized read = multiple queries [OK]
Hint: Normalized data needs multiple queries to combine related info [OK]
Common Mistakes:
Thinking normalized data duplicates info
Assuming all data is embedded in one document
Believing updates are harder in normalized data
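The read cost described in this question can be sketched as follows. The helper `ordersForUser` is a hypothetical plain-JavaScript stand-in for the two-step read, and the pipeline shows how the same result might be requested in one round trip with `$lookup`, assuming `users` and `orders` collections exist in the current mongosh database.

```javascript
const users = [{ _id: 1, name: "Bob" }];
const orders = [{ _id: 101, userId: 1, item: "Pen" }];

// Two-step read: find the user, then collect the orders that reference it.
function ordersForUser(userId) {
  const user = users.find((u) => u._id === userId);
  if (!user) return null;
  return { ...user, orders: orders.filter((o) => o.userId === userId) };
}

// Join-like alternative: $lookup matches users._id against orders.userId.
const pipeline = [
  { $match: { _id: 1 } },
  { $lookup: { from: "orders", localField: "_id", foreignField: "userId", as: "orders" } },
];

// Skipped outside mongosh, where `db` is not defined.
if (typeof db !== "undefined") {
  printjson(db.users.aggregate(pipeline).toArray());
}
```

Either way, the database does more work than it would for a single embedded document, which is the drawback the question points at.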
4. You have a denormalized MongoDB document:
{ _id: 1, name: 'Carol', orders: [ { orderId: 201, item: 'Notebook' } ] }
Which problem can occur if you update the item name in one order but forget to update it elsewhere?
medium
A. Query performance slows down because of references.
B. Indexes on orders array are lost.
C. The database schema becomes normalized automatically.
D. Data inconsistency due to duplicated order info in multiple documents.
Solution
Step 1: Recognize denormalization risk
Denormalization duplicates related data inside documents, so the same order info may appear in many places.
Step 2: Understand update problem
If you update one copy but not others, data becomes inconsistent and unreliable.
Final Answer:
Data inconsistency due to duplicated order info in multiple documents. -> Option D
Quick Check:
Denormalization risk = data inconsistency [OK]
Hint: Denormalization can cause inconsistent duplicated data if not updated everywhere [OK]
Common Mistakes:
Thinking denormalization slows queries
Believing schema changes automatically
Confusing index loss with denormalization
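To make the inconsistency concrete, here is a small plain-JavaScript sketch. The second copy of the order (`orderSummaries`) and the `order_summaries` collection name are assumptions for illustration; the question does not say where the duplicate lives. `findMismatches` is a hypothetical helper.

```javascript
// Copy 1: the denormalized user document from the question.
const user = { _id: 1, name: "Carol", orders: [{ orderId: 201, item: "Notebook" }] };
// Copy 2 (hypothetical): the same order duplicated elsewhere.
const orderSummaries = [{ orderId: 201, item: "Notebook" }];

// Update only the embedded copy and forget the other one.
user.orders[0].item = "Sketchbook";

// Return the orderIds whose item differs between the two copies.
function findMismatches(userDoc, summaries) {
  return userDoc.orders
    .filter((o) => {
      const other = summaries.find((s) => s.orderId === o.orderId);
      return other !== undefined && other.item !== o.item;
    })
    .map((o) => o.orderId);
}

const mismatched = findMismatches(user, orderSummaries);

// In mongosh, keeping the copies consistent means updating every one of them.
if (typeof db !== "undefined") {
  db.users.updateOne({ _id: 1, "orders.orderId": 201 }, { $set: { "orders.$.item": "Sketchbook" } });
  db.order_summaries.updateOne({ orderId: 201 }, { $set: { item: "Sketchbook" } });
}
```

After the partial update, `mismatched` contains the order whose two copies disagree, which is exactly the situation option D describes.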
5. You want to design a MongoDB schema for a blog with users and posts. Users have many posts, and posts rarely change after creation. Which design is best for fast reading and why?
hard
A. Separate collections for users and posts for easy updates.
B. Store posts only with duplicated user info for simpler queries.
C. Embed posts inside user documents for fast reads since posts rarely change.
D. Duplicate posts in both collections to optimize writes.
Solution
Step 1: Analyze data change frequency
Posts rarely change, so embedding them inside users won't cause frequent update problems.
Step 2: Choose design for fast reads
Embedding posts inside user documents allows fetching a user and their posts in one read, which improves read speed as long as each user document stays within MongoDB's 16 MB document size limit.
Step 3: Compare options
Option C fits best: embedding gives single-read access, and rare updates keep the cost of maintaining embedded data low. Separate collections (A) need a second query or a join-like operation; storing only posts with duplicated user info (B) repeats user data unnecessarily; duplicating posts in both collections (D) risks inconsistency.
Final Answer:
Embed posts inside user documents for fast reads since posts rarely change. -> Option C
Quick Check:
Denormalization + rare updates = embed for fast reads [OK]
Hint: Embed rarely changing related data for faster reads [OK]
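The embedded design from option C might look like the sketch below. The document contents and the `blog_users` collection name are hypothetical, chosen only for illustration; the guard skips the database calls when the snippet runs outside mongosh.

```javascript
// Hypothetical user document with posts embedded as an array.
const userWithPosts = {
  _id: 1,
  name: "Alice",
  posts: [
    { title: "First post", body: "Hello" },
    { title: "Second post", body: "More thoughts" },
  ],
};

if (typeof db !== "undefined") {
  db.blog_users.insertOne(userWithPosts);

  // One findOne returns the user and every embedded post.
  const doc = db.blog_users.findOne({ _id: 1 });
  print(`${doc.name} has ${doc.posts.length} posts`);
}
```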