
Denormalization trade-offs in MongoDB

Introduction
Denormalization means storing related data together to make reads faster, at the cost of making updates harder. It is a good fit:
When you want to fetch data quickly without joining multiple collections.
When your app reads data far more often than it writes it.
When you want to reduce the number of database queries per request.
When your data rarely changes, so keeping duplicates in sync is not a big problem.
When you want to simplify queries by having all the needed data in one document.
Syntax
MongoDB
There is no specific syntax; denormalization is a schema-design choice.
In MongoDB it usually means embedding related data inside a document instead of referencing a separate collection.
This trades update complexity for faster reads.
Examples
Data is split into two collections, users and addresses. You need two queries to assemble the full user info.
MongoDB
/* Normalized */
{
  _id: 1,
  name: "Alice",
  address_id: 101
}

{
  _id: 101,
  street: "123 Main St",
  city: "Townsville"
}
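With the normalized layout, assembling the full user record takes two queries, or a server-side join. A minimal sketch, assuming the collections are named users and addresses as in the example:

MongoDB
// First query: fetch the user
const user = db.users.findOne({ _id: 1 });
// Second query: follow the reference to the address
const address = db.addresses.findOne({ _id: user.address_id });

// Alternatively, $lookup joins the two collections in one aggregation
db.users.aggregate([
  { $match: { _id: 1 } },
  { $lookup: {
      from: "addresses",
      localField: "address_id",
      foreignField: "_id",
      as: "address"
  } }
]);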
The address is stored inside the user document, so reading the full user info takes a single query.
MongoDB
/* Denormalized */
{
  _id: 1,
  name: "Alice",
  address: {
    street: "123 Main St",
    city: "Townsville"
  }
}
Sample Program
This inserts a user with embedded address data and then retrieves it in one query.
MongoDB
db.users.insertOne({
  _id: 1,
  name: "Alice",
  address: {
    street: "123 Main St",
    city: "Townsville"
  }
});

const user = db.users.findOne({_id: 1});
printjson(user);
Output
{
	"_id" : 1,
	"name" : "Alice",
	"address" : {
		"street" : "123 Main St",
		"city" : "Townsville"
	}
}
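Because the address is embedded, its fields can also be matched directly with dot notation, so no join is needed. A small sketch using the document inserted above:

MongoDB
// Dot notation reaches into the embedded sub-document
db.users.findOne({ "address.city": "Townsville" });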
Important Notes
Denormalization can cause data duplication, so updates must be done carefully to keep data consistent.
It is best when read speed is more important than write speed.
Embedding data works well for data that belongs only to one parent and is not shared.
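When a duplicated value does change, every embedded copy must be updated together. A minimal sketch, assuming many user documents embed the same city name and it is renamed (a hypothetical scenario):

MongoDB
// Update every embedded copy of the duplicated value in one statement
db.users.updateMany(
  { "address.city": "Townsville" },
  { $set: { "address.city": "Townsville East" } }
);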
Summary
Denormalization stores related data together to speed up reading.
It can make updating data harder because of duplication.
Use it when you read data more often than you update it.