
Embedding vs referencing decision in MongoDB - Performance Comparison

Time Complexity: Embedding vs referencing decision
O(n)
Understanding Time Complexity

When choosing between embedding and referencing in MongoDB, it's important to understand how the time to get data changes as your data grows.

The goal is to see how the storage layout affects query time as the amount of data increases.

Scenario Under Consideration

Analyze the time complexity of fetching related data using embedding vs referencing.


// Embedding example: one read returns the user and the posts together
const userWithPosts = db.users.findOne({ _id: userId });
// userWithPosts.posts is the embedded posts array

// Referencing example: read the user, then query the posts collection
const user = db.users.findOne({ _id: userId });
const posts = db.posts.find({ userId: user._id }).toArray();

This code shows two ways to get a user's posts: either embedded inside the user document or stored separately and linked by userId.
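To make the two layouts concrete, here is a minimal sketch of how the documents might look in each design. The field names `name`, `posts`, and `title` and all sample values are assumptions for illustration; only `_id` and `userId` come from the queries above.

```javascript
// Embedded design: posts live inside the user document (hypothetical fields).
const embeddedUser = {
  _id: 1,
  name: "Ada",
  posts: [
    { title: "First post" },
    { title: "Second post" },
  ],
};

// Referenced design: posts live in their own collection and point back
// to the user through userId.
const referencedUser = { _id: 1, name: "Ada" };
const referencedPosts = [
  { _id: 101, userId: 1, title: "First post" },
  { _id: 102, userId: 1, title: "Second post" },
];

// Both layouts carry the same information about the user's posts.
const embeddedTitles = embeddedUser.posts.map((p) => p.title);
const referencedTitles = referencedPosts
  .filter((p) => p.userId === referencedUser._id)
  .map((p) => p.title);
console.log(embeddedTitles, referencedTitles);
```

The difference is where the posts are stored, not what is stored: the embedded layout needs one document, the referenced layout needs one user document plus one document per post.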

Identify Repeating Operations

Look at what repeats when fetching posts for a user.

  • Primary operation: Reading the user's posts, either from the embedded array or from the separate posts collection.
  • How many times: With embedding, the posts arrive as part of the single user read. With referencing, one additional query fetches all posts for the user.
How Execution Grows With Input

Consider how the number of posts (n) affects query time.

Input Size (n) | Embedding | Referencing
10 posts | 1 read including 10 posts | 1 user read + 1 query returning 10 posts
100 posts | 1 read including 100 posts | 1 user read + 1 query returning 100 posts
1000 posts | 1 read including 1000 posts | 1 user read + 1 query returning 1000 posts

Pattern observation: Both methods read all n posts, so the work grows with n in either case. The difference is the number of operations: embedding returns the posts in the same read as the user document, while referencing needs a second query against the posts collection.
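The pattern can be sketched with a small in-memory model. This does not talk to MongoDB; it only counts the round trips and the posts read for each approach. The helper names `buildData`, `fetchEmbedded`, and `fetchReferenced` are hypothetical.

```javascript
// Build n posts for a single user in both layouts (all names hypothetical).
function buildData(n) {
  const posts = Array.from({ length: n }, (_, i) => ({
    _id: i,
    userId: 1,
    title: `post ${i}`,
  }));
  return {
    embeddedUsers: [{ _id: 1, posts: posts.map((p) => ({ title: p.title })) }],
    users: [{ _id: 1 }],
    posts,
  };
}

// Embedding: one read returns the user and every embedded post.
function fetchEmbedded(data, userId) {
  const user = data.embeddedUsers.find((u) => u._id === userId);
  return { roundTrips: 1, postsRead: user.posts.length };
}

// Referencing: one read for the user, one query for the posts.
function fetchReferenced(data, userId) {
  const user = data.users.find((u) => u._id === userId);
  const posts = data.posts.filter((p) => p.userId === user._id);
  return { roundTrips: 2, postsRead: posts.length };
}

for (const n of [10, 100, 1000]) {
  const data = buildData(n);
  console.log(n, fetchEmbedded(data, 1), fetchReferenced(data, 1));
}
```

In this model `postsRead` equals n for both approaches, while `roundTrips` stays constant at 1 or 2 regardless of n.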

Final Time Complexity

Time Complexity: O(n)

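This means the time to fetch a user's posts grows linearly with the number of posts returned, whether they are embedded or referenced. For referencing, this assumes the query does not have to scan posts that belong to other users.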
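For the referenced design, the linear bound depends on how the matching posts are located. The sketch below compares a full scan with a keyed lookup in plain JavaScript. A `Map` stands in for an index here, which is an assumption for illustration and not how MongoDB implements indexes; in mongosh an index on this field would be created with `db.posts.createIndex({ userId: 1 })`. The data and function names are hypothetical.

```javascript
// 10,000 posts spread evenly across 100 users (hypothetical data).
const allPosts = Array.from({ length: 10000 }, (_, i) => ({
  _id: i,
  userId: i % 100,
}));

// Without an index: every post is examined to find one user's posts.
function scanForUser(posts, userId) {
  let examined = 0;
  const matches = [];
  for (const post of posts) {
    examined += 1;
    if (post.userId === userId) matches.push(post);
  }
  return { examined, returned: matches.length };
}

// With an index (a Map stands in for it): go straight to the matches.
function buildIndex(posts) {
  const index = new Map();
  for (const post of posts) {
    if (!index.has(post.userId)) index.set(post.userId, []);
    index.get(post.userId).push(post);
  }
  return index;
}

function lookupForUser(index, userId) {
  const matches = index.get(userId) || [];
  return { examined: matches.length, returned: matches.length };
}

const scanResult = scanForUser(allPosts, 1);
const indexResult = lookupForUser(buildIndex(allPosts), 1);
console.log(scanResult, indexResult);
```

Both paths return the same posts, but the scan examines the whole collection while the keyed lookup examines only the user's own posts.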

Common Mistake

[X] Wrong: "Embedding always makes queries faster because all data is in one place."

[OK] Correct: If the embedded data grows very large, reading the whole document can be slow and use more memory. Referencing can be better for very large or frequently changing related data.
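A rough way to see how quickly an embedded array can grow is to estimate the document size. MongoDB limits a single BSON document to 16 MiB. The sketch below uses the UTF-8 length of the JSON form as a stand-in, which only approximates the real BSON size; the `body` field, the post counts, and the helper names are assumptions.

```javascript
const MAX_BSON_BYTES = 16 * 1024 * 1024; // MongoDB's per-document limit

// Approximation only: BSON encodes values differently from JSON.
function approxBytes(doc) {
  return new TextEncoder().encode(JSON.stringify(doc)).length;
}

// Build a user document with postCount embedded posts (hypothetical shape).
function makeUser(postCount, bodyLength) {
  const body = "x".repeat(bodyLength);
  return {
    _id: 1,
    posts: Array.from({ length: postCount }, (_, i) => ({
      title: `post ${i}`,
      body,
    })),
  };
}

const smallBytes = approxBytes(makeUser(10, 1000));
const largeBytes = approxBytes(makeUser(20000, 1000));
console.log(smallBytes, smallBytes < MAX_BSON_BYTES);
console.log(largeBytes, largeBytes > MAX_BSON_BYTES);
```

Under these assumptions, a user with 20,000 embedded posts of about 1 KB each already exceeds the limit, which is one reason an unbounded list of related items is usually a candidate for referencing.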

Interview Connect

Understanding how embedding and referencing affect query time helps you explain design choices clearly and shows you think about how data size impacts performance.

Self-Check

"What if we added an index on the referencing field? How would that change the time complexity of fetching posts?"