
Embedding vs referencing decision in MongoDB - Visual Side-by-Side Comparison

Concept Flow - Embedding vs referencing decision
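1. Start: need to model related data.
2. Decide: embed or reference?
   - Embed if the data is small, tightly related, and read and updated together.
   - Reference if the data is large, shared, or changes independently.
3. Implement the chosen data model.
4. Query and update data accordingly.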
This flow shows how to decide between embedding or referencing data in MongoDB based on data size, relation, and update frequency.
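The decision step in this flow can be sketched as a small helper. This is only an illustration: the function name `chooseModel`, its parameter names, and the 16 KB cut-off for "small" are assumptions made for the example, not rules defined by MongoDB.

```javascript
// Hypothetical heuristic mirroring the flow above.
// MAX_EMBED_BYTES is an assumed threshold for "small" related data.
const MAX_EMBED_BYTES = 16 * 1024;

function chooseModel({ sizeBytes, sharedByManyParents, changesIndependently }) {
  // Shared or independently changing data points to referencing.
  if (sharedByManyParents || changesIndependently) return "reference";
  // Large related data also points to referencing.
  if (sizeBytes > MAX_EMBED_BYTES) return "reference";
  // Small, tightly related data that changes with its parent: embed.
  return "embed";
}

const homeAddress = { sizeBytes: 64, sharedByManyParents: false, changesIndependently: false };
const officeAddress = { sizeBytes: 64, sharedByManyParents: true, changesIndependently: true };

console.log(chooseModel(homeAddress));   // "embed"
console.log(chooseModel(officeAddress)); // "reference"
```

In practice the thresholds depend on the workload, so treat the function as a checklist rather than a definitive rule.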
Execution Sample
MongoDB
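// Embedding: the address is stored inside the user document.
db.users.insertOne({
  name: "Alice",
  address: { city: "NY", zip: "10001" }
})

// vs

// Referencing: the user stores only the address's _id.
// "abc123" is a placeholder; a real ObjectId takes a 24-character hex string.
db.users.insertOne({ name: "Bob", address_id: ObjectId("abc123") })
db.addresses.insertOne({ _id: ObjectId("abc123"), city: "LA", zip: "90001" })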
The first insert embeds the address inside the user document; the second pair stores the address in its own collection and references it from the user by ID.
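Reading Bob's full record back under the referencing model requires a join. One way to express it is an aggregation pipeline with a `$lookup` stage, sketched below. The collection and field names are taken from the sample; the variable name `lookupPipeline` is an assumption for the example.

```javascript
// Aggregation pipeline that joins a user to the address it references.
const lookupPipeline = [
  { $match: { name: "Bob" } },
  {
    $lookup: {
      from: "addresses",        // collection to join with
      localField: "address_id", // field in the users document
      foreignField: "_id",      // matching field in addresses
      as: "address"             // joined documents are returned as an array
    }
  },
  { $unwind: "$address" }       // replace the one-element array with the object
];

// In mongosh this would be run as: db.users.aggregate(lookupPipeline)
// The embedded model needs no join:  db.users.findOne({ name: "Alice" })
console.log(JSON.stringify(lookupPipeline, null, 2));
```

Note that `$unwind` with default options drops users whose reference matches no address, so a real query may need its `preserveNullAndEmptyArrays` option.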
Execution Table
| Step | Action | Data Model | Data Stored | Query Impact |
|---|---|---|---|---|
| 1 | Insert user with embedded address | Embedding | {name: 'Alice', address: {city: 'NY', zip: '10001'}} | Single query to get user and address |
| 2 | Insert user with referenced address | Referencing | {name: 'Bob', address_id: ObjectId('abc123')}, {_id: ObjectId('abc123'), city: 'LA', zip: '90001'} | Two queries or $lookup needed to get full data |
| 3 | Update embedded address city | Embedding | Update user document directly | Simple update, no joins |
| 4 | Update referenced address city | Referencing | Update address document separately | Separate update, user document unchanged |
| 5 | Query user with embedded address | Embedding | One document returned with all data | Fast, no joins |
| 6 | Query user with referenced address | Referencing | User document returned, address fetched separately | Slower, needs join or multiple queries |
| 7 | Exit | Decision ends | Choose embedding for small, tightly coupled data; choose referencing for large, shared, or frequently changing data | Decision based on data characteristics |
💡 Decision stops after evaluating data size, relation, and update patterns
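Steps 3 and 4 in the table differ only in which collection receives the update and how the field is addressed. The sketch below builds the filter and update documents for each case. The helper names and the new city value "Boston" are assumptions for illustration.

```javascript
// Step 3: embedded address. Dot notation targets a field inside the subdocument.
function embeddedCityUpdate(userName, newCity) {
  return {
    filter: { name: userName },
    update: { $set: { "address.city": newCity } }
  };
}

// Step 4: referenced address. Only the address document changes;
// the user document keeps the same address_id.
function referencedCityUpdate(addressId, newCity) {
  return {
    filter: { _id: addressId },
    update: { $set: { city: newCity } }
  };
}

const step3 = embeddedCityUpdate("Alice", "Boston");
const step4 = referencedCityUpdate("abc123", "Boston"); // placeholder id, as in the sample

// In mongosh these would be applied as:
//   db.users.updateOne(step3.filter, step3.update)
//   db.addresses.updateOne(step4.filter, step4.update)
console.log(JSON.stringify(step3));
console.log(JSON.stringify(step4));
```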
Variable Tracker
| Variable | Start | After Step 1 | After Step 2 | After Step 3 | After Step 4 | Final |
|---|---|---|---|---|---|---|
| User Document | {} | {name: 'Alice', address: {city: 'NY', zip: '10001'}} | {name: 'Alice', address: {city: 'NY', zip: '10001'}} | {name: 'Alice', address: {city: 'NY', zip: '10001'}} | {name: 'Alice', address: {city: 'NY', zip: '10001'}} | Embedded address inside user |
| Address Document | {} | N/A | {_id: ObjectId('abc123'), city: 'LA', zip: '90001'} | {_id: ObjectId('abc123'), city: 'LA', zip: '90001'} | {_id: ObjectId('abc123'), city: 'LA', zip: '90001'} | {_id: ObjectId('abc123'), city: 'LA', zip: '90001'} |
| User Document with Reference | {} | N/A | {name: 'Bob', address_id: ObjectId('abc123')} | {name: 'Bob', address_id: ObjectId('abc123')} | {name: 'Bob', address_id: ObjectId('abc123')} | User references address by ID |
Key Moments - 3 Insights
Why do we embed data sometimes instead of referencing?
Embedding is chosen when related data is small, changes together, and is accessed together, so one document holds all the needed information (see Execution Table rows 1 and 5).
When is referencing better than embedding?
Referencing is better when related data is large, shared by many documents, or changes independently and therefore needs separate updates (see Execution Table rows 2, 4, and 6).
Does embedding always mean faster queries?
Usually yes, because all the data is in one document (row 5). However, if the embedded data grows too large, it can slow writes or push the document past MongoDB's 16 MB document size limit.
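The size concern in the last answer can be checked roughly before embedding more data. The sketch below compares a document's JSON byte length with the 16 MB limit. It is only an approximation, since BSON size differs from JSON text size, and the helper names and the 50% warning fraction are assumptions.

```javascript
// MongoDB's maximum BSON document size.
const MAX_BSON_BYTES = 16 * 1024 * 1024;

// Rough estimate only: counts UTF-8 bytes of the JSON text, not BSON bytes.
function approxJsonBytes(doc) {
  return new TextEncoder().encode(JSON.stringify(doc)).length;
}

// Flags documents whose estimated size exceeds a fraction of the limit.
// The default fraction of 0.5 is an arbitrary safety margin for this example.
function nearSizeLimit(doc, fraction = 0.5) {
  return approxJsonBytes(doc) > MAX_BSON_BYTES * fraction;
}

const user = { name: "Alice", address: { city: "NY", zip: "10001" } };
console.log(approxJsonBytes(user), nearSizeLimit(user));
```

A document that keeps growing, such as a user with an ever-longer embedded array, is the usual case where such a check starts returning `true` and referencing becomes the safer model.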
Visual Quiz - 3 Questions
Test your understanding
In the Execution Table, at which step is the user document updated directly with address data?
A. Step 3
B. Step 2
C. Step 4
D. Step 6
💡 Hint
Check the 'Action' and 'Data Model' columns for direct updates to embedded data.
According to the Variable Tracker, what does the user document with a reference contain after Step 2?
A. Embedded address object
B. Empty document
C. Reference to address by ID
D. Separate address document
💡 Hint
Look at the 'User Document with Reference' row after Step 2.
If the address data changes frequently and is shared by many users, which data model is better according to the execution flow?
A. Embedding
B. Referencing
C. Neither
D. Both equally
💡 Hint
Refer to the Concept Flow and Key Moments sections about data change frequency and sharing.
Concept Snapshot
Embedding vs Referencing in MongoDB:
- Embed when related data is small, accessed and updated together.
- Reference when data is large, shared, or changes independently.
- Embedding stores data in one document; referencing uses IDs to link documents.
- Embedding simplifies queries; referencing supports data reuse and separate updates.
- Choose based on data size, relation, and update patterns.
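The point about data reuse and separate updates can be illustrated without a database. The sketch below uses in-memory stand-ins for the two collections. The second user "Carol", the id "addr1", and the new zip "90002" are made up for the example.

```javascript
// In-memory stand-ins for collections; no database is involved.
const addresses = new Map([["addr1", { city: "LA", zip: "90001" }]]);

// Referencing: both users point at the same address.
const usersByReference = [
  { name: "Bob", address_id: "addr1" },
  { name: "Carol", address_id: "addr1" }
];

// Embedding: each user carries its own copy of the address.
const usersEmbedded = [
  { name: "Bob", address: { city: "LA", zip: "90001" } },
  { name: "Carol", address: { city: "LA", zip: "90001" } }
];

// Referencing: one write to the shared address is seen by every user.
addresses.get("addr1").zip = "90002";
const zipsByReference = usersByReference.map(u => addresses.get(u.address_id).zip);

// Embedding: every copy must be updated, otherwise the copies drift apart.
usersEmbedded[0].address.zip = "90002";
const zipsEmbedded = usersEmbedded.map(u => u.address.zip);

console.log(zipsByReference); // both users see the new zip
console.log(zipsEmbedded);    // Carol's copy is stale
```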
Full Transcript
This visual execution shows how to decide between embedding and referencing data in MongoDB. First, you consider if the related data is small and tightly connected, or large and loosely connected. Embedding means putting related data inside one document, which makes queries faster and updates simpler when data changes together. Referencing means storing related data in separate documents and linking them by IDs, which is better when data is large, shared by many, or changes independently. The execution table traces inserting users with embedded or referenced addresses, updating them, and querying them. The variable tracker shows how user and address documents change step by step. Key moments clarify why embedding is chosen for small, tightly coupled data and referencing for large, shared data. The quiz tests understanding of these steps and decisions. The quick snapshot summarizes when to embed or reference and their effects on queries and updates.