$size operator for array length in MongoDB - Deep Dive

Overview - $size operator for array length
What is it?
The $size operator in MongoDB is used to find the number of elements in an array. It helps you check how many items are inside an array field in your documents. This operator is useful when you want to filter or work with documents based on the length of their arrays.
Why it matters
Without the $size operator, it would be hard to query documents based on how many items they have in an array. For example, to find users with exactly three hobbies, you would have to fetch every document and count the items in application code, which is slow and wasteful. With $size, the server does the counting and returns only the matching documents.
Where it fits
Before learning $size, you should understand basic MongoDB queries and how arrays are stored in documents. After mastering $size, you can learn about other array operators like $elemMatch and aggregation pipeline stages that manipulate arrays.
Mental Model
Core Idea
$size counts how many items are inside an array field in a MongoDB document.
Think of it like...
Imagine a box filled with toys. $size is like counting how many toys are inside the box without opening it fully.
Document Example:
{
  name: "Alice",
  hobbies: ["reading", "swimming", "coding"]
}

Query with $size:
{ hobbies: { $size: 3 } }

Result: Matches documents where hobbies array has exactly 3 items.
Build-Up - 6 Steps
1
Foundation: Understanding Arrays in MongoDB
Concept: Learn what arrays are and how they are stored in MongoDB documents.
In MongoDB, an array is a list of values stored inside a single field of a document. For example, a user document might have a field called 'hobbies' that holds multiple hobbies as an array: ["reading", "swimming", "coding"]. Arrays can hold any data type, including strings, numbers, or even other documents.
Result
You can store multiple related values together inside one field in a document.
Knowing how arrays are stored helps you understand why counting their length is a common need.
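As a small illustration of the point above, here is a sketch of a document whose array fields hold different kinds of values. It is plain JavaScript so it runs without a database; the `user` object and all of its field names are hypothetical.

```javascript
// Hypothetical user document; field names are illustrative only.
const user = {
  name: "Alice",
  hobbies: ["reading", "swimming", "coding"],       // array of strings
  scores: [90, 85],                                  // array of numbers
  addresses: [{ city: "Paris" }, { city: "Oslo" }],  // array of embedded documents
};

// In plain JavaScript the element count of an array is its .length.
const hobbyCount = user.hobbies.length;
console.log(hobbyCount);
```

The same object literal could be inserted into a collection from mongosh; the length check here only shows what "array length" refers to.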
2
Foundation: Basic MongoDB Query Structure
Concept: Understand how to write simple queries to find documents based on field values.
A MongoDB query looks like { field: value } and finds documents where the field matches the value. For example, { name: "Alice" } finds documents where the name is Alice. Queries can also check array contents, but counting array length needs a special operator.
Result
You can find documents by matching exact values or array contents.
Mastering basic queries is essential before using special operators like $size.
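The two filter shapes described above can be sketched as follows. The filters are written as they would be passed to `find()` in mongosh (assuming a hypothetical `users` collection); the `matchesEquality` helper is a simplified local stand-in for these two shapes only, not MongoDB's matcher.

```javascript
// Hypothetical sample data standing in for a `users` collection.
const users = [
  { name: "Alice", hobbies: ["reading", "swimming", "coding"] },
  { name: "Bob", hobbies: ["chess"] },
];

// Filters as they would be passed to db.users.find(...) in mongosh.
const byName = { name: "Alice" };      // exact value match
const byHobby = { hobbies: "chess" };  // matches when the array contains "chess"

// Simplified local stand-in for these two filter shapes (assumption, not MongoDB code).
function matchesEquality(doc, filter) {
  return Object.entries(filter).every(([field, expected]) => {
    const actual = doc[field];
    return Array.isArray(actual) ? actual.includes(expected) : actual === expected;
  });
}

const namesByName = users.filter((d) => matchesEquality(d, byName)).map((d) => d.name);
const namesByHobby = users.filter((d) => matchesEquality(d, byHobby)).map((d) => d.name);
console.log(namesByName, namesByHobby);
```

Neither filter says anything about how many elements the array has, which is the gap that $size fills.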
3
Intermediate: Using $size to Match Array Length
🤔 Before reading on: do you think $size can find arrays with at least a certain length, or only exact length? Commit to your answer.
Concept: $size matches documents where the array has exactly the specified number of elements.
To find documents where an array field has a specific length, use { arrayField: { $size: number } }. For example, { hobbies: { $size: 3 } } finds documents where the hobbies array has exactly 3 items. It does not match arrays with more or fewer items.
Result
Only documents with arrays of the exact length specified are returned.
Understanding that $size matches exact lengths prevents confusion when queries return no results.
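A minimal sketch of the exact-match behavior, written in plain JavaScript so it runs without a database. The `users` array and the `matchesSize` helper are hypothetical; the helper only imitates the behavior described above and is not MongoDB's matcher. In mongosh the same filter would be passed as `db.users.find(filter)`, assuming a `users` collection exists.

```javascript
// Hypothetical sample data standing in for a `users` collection.
const users = [
  { name: "Alice", hobbies: ["reading", "swimming", "coding"] },
  { name: "Bob", hobbies: ["chess"] },
  { name: "Carol", hobbies: ["golf", "yoga", "art", "music"] },
  { name: "Dave" }, // no hobbies field at all
];

// The filter as it would be written for find().
const filter = { hobbies: { $size: 3 } };

// Local stand-in for the $size query operator (assumption, not MongoDB code):
// the field must be an array and its length must equal the number exactly.
function matchesSize(doc, sizeFilter) {
  return Object.entries(sizeFilter).every(
    ([field, cond]) => Array.isArray(doc[field]) && doc[field].length === cond.$size
  );
}

const matched = users.filter((d) => matchesSize(d, filter)).map((d) => d.name);
console.log(matched);
```

Only Alice is selected: Bob has too few hobbies, Carol too many, and Dave has no array to measure.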
4
Intermediate: Combining $size with Other Query Operators
🤔 Before reading on: can $size be combined with range queries like $gt or $lt directly? Commit to your answer.
Concept: $size works only for exact length matching and cannot be combined directly with range operators in a query filter.
You cannot write { hobbies: { $size: { $gt: 2 } } } to find arrays longer than 2, because the $size query operator accepts only a number. For range conditions, use $expr with the $size aggregation expression (in a find() filter or a $match stage), or fall back to $where for conditions that expressions cannot cover.
Result
Trying to combine $size with range operators in a query filter fails: current MongoDB versions reject the filter with an error because $size expects a number.
Knowing $size's limitation helps you choose the right tool for array length conditions.
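Two ways to express a length range can be sketched as filter objects. The first uses $expr with the $size aggregation expression; the second is a commonly used idiom that checks whether an element exists at a given index. The `users` data and the `hasMoreThan` helper are hypothetical, and the helper is only a local stand-in so the example runs without a database.

```javascript
// Hypothetical sample data standing in for a `users` collection.
const users = [
  { name: "Alice", hobbies: ["reading", "swimming", "coding"] },
  { name: "Bob", hobbies: ["chess"] },
  { name: "Carol", hobbies: ["golf", "yoga", "art", "music"] },
];

// Option 1: $expr with the $size aggregation expression (MongoDB 3.6+).
// Usable in find() and in a $match stage. It raises an error if
// `hobbies` is missing or not an array in an examined document.
const longerThanTwo = { $expr: { $gt: [{ $size: "$hobbies" }, 2] } };

// Option 2: a common idiom. If an element exists at index 2,
// the array has at least 3 elements.
const atLeastThree = { "hobbies.2": { $exists: true } };

// Local stand-in for "array length greater than n" (assumption, not MongoDB code).
function hasMoreThan(doc, field, n) {
  return Array.isArray(doc[field]) && doc[field].length > n;
}

const names = users.filter((d) => hasMoreThan(d, "hobbies", 2)).map((d) => d.name);
console.log(names);
```

For this data both filters would be expected to select Alice and Carol. Which one suits a given workload is a judgment call; the index-position form is a plain query filter, while the $expr form reads closer to the intent.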
5
Advanced: Using $size in Aggregation Pipelines
🤔 Before reading on: do you think $size can be used inside aggregation expressions to compute array lengths? Commit to your answer.
Concept: In aggregation pipelines, $size can be used as an expression to calculate array length dynamically for each document.
Inside aggregation stages like $project, or in $match with $expr, you can use { $size: "$arrayField" } to get the length of an array. For example, { $project: { hobbyCount: { $size: "$hobbies" } } } adds a field with the number of hobbies. The argument must resolve to an array; a missing or non-array field raises an error. This allows filtering or transforming data based on array length.
Result
You can compute and use array lengths dynamically in aggregation results.
Using $size in aggregation unlocks powerful data transformations beyond simple queries.
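The sketch below shows a $project stage that computes an array length and guards against documents where the field is missing, since the $size expression requires an array argument. The pipeline is written as it would be passed to `aggregate()` in mongosh, assuming a hypothetical `users` collection; the `projected` mapping is a local stand-in for the stage, not MongoDB's evaluator.

```javascript
// Hypothetical sample data; Dave has no `hobbies` field.
const users = [
  { name: "Alice", hobbies: ["reading", "swimming", "coding"] },
  { name: "Dave" },
];

// Pipeline as it would be passed to db.users.aggregate(pipeline).
// $isArray guards $size, which errors on missing or non-array input.
const pipeline = [
  {
    $project: {
      name: 1,
      hobbyCount: {
        $cond: [{ $isArray: "$hobbies" }, { $size: "$hobbies" }, 0],
      },
    },
  },
];

// Local stand-in for the $project stage above (assumption, not MongoDB code).
// A real $project would also keep _id unless it is excluded.
const projected = users.map((doc) => ({
  name: doc.name,
  hobbyCount: Array.isArray(doc.hobbies) ? doc.hobbies.length : 0,
}));
console.log(projected);
```

Falling back to 0 for a missing array is a design choice made for this example; reporting null or skipping such documents would be equally reasonable depending on the use case.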
6
Expert: Performance Considerations and Indexing
🤔 Before reading on: does using $size in queries benefit from indexes on array fields? Commit to your answer.
Concept: Queries using $size do not use indexes efficiently because they require scanning array lengths, which is not indexed directly.
MongoDB indexes do not store array length information, so queries with $size often result in collection scans or index scans without length filtering. For large collections, this can slow queries. To optimize, consider storing array length as a separate field and indexing it for fast queries.
Result
Using $size in queries on large datasets can be slow without additional design.
Knowing $size's performance limits helps design schemas and indexes for scalable applications.
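One way to apply the separate-field suggestion is to update the array and a counter in the same write, then index and query the counter. The sketch below assumes a hypothetical `hobbyCount` field and `users` collection; `addHobbyUpdate` and `applyUpdate` are illustrative helpers, and `applyUpdate` only imitates the $push/$inc shape used here so the example runs without a database.

```javascript
// Hypothetical document with a maintained counter field `hobbyCount`.
const user = { name: "Alice", hobbies: ["reading", "swimming"], hobbyCount: 2 };

// Update document for db.users.updateOne({ name: "Alice" }, update):
// push the new hobby and bump the counter in the same update.
function addHobbyUpdate(hobby) {
  return { $push: { hobbies: hobby }, $inc: { hobbyCount: 1 } };
}

// Index spec for db.users.createIndex(indexSpec), and a range filter
// on the counter that such an index can support.
const indexSpec = { hobbyCount: 1 };
const moreThanTwo = { hobbyCount: { $gt: 2 } };

// Local stand-in that applies only the $push/$inc shape used above.
function applyUpdate(doc, update) {
  const next = { ...doc };
  for (const [field, value] of Object.entries(update.$push ?? {})) {
    next[field] = [...(next[field] ?? []), value];
  }
  for (const [field, delta] of Object.entries(update.$inc ?? {})) {
    next[field] = (next[field] ?? 0) + delta;
  }
  return next;
}

const updated = applyUpdate(user, addHobbyUpdate("coding"));
console.log(updated.hobbies.length, updated.hobbyCount);
```

The counter stays accurate only if every write path that changes the array also adjusts the counter, so this pattern trades some write-side discipline for faster reads.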
Under the Hood
$size is evaluated document by document: MongoDB reads the array field, counts its top-level elements, and compares that count with the requested number. BSON encodes an array as an embedded document keyed by position ("0", "1", ...) with a byte-length prefix rather than a stored element count, and indexes do not record array length either, so MongoDB must examine each candidate document to evaluate the condition.
Why designed this way?
The $size query operator was designed for exact length matching and accepts a single number, which keeps the operator simple. Range conditions on array length were left out of the query operator; they are expressed with aggregation expressions instead, either in a pipeline or through $expr.
┌─────────────┐
│ MongoDB Doc │
│ {           │
│  hobbies:   │
│  [a,b,c]    │
│  length=3   │
└──────┬──────┘
       │
       ▼
┌─────────────┐
│ Query:      │
│ { hobbies:  │
│  {$size:3}} │
└──────┬──────┘
       │
       ▼
┌─────────────┐
│ Compare doc │
│ length=3 ?  │
│ size=3      │
└──────┬──────┘
       │
       ▼
┌─────────────┐
│ Match or    │
│ skip doc    │
└─────────────┘
Myth Busters - 4 Common Misconceptions
Quick: Does $size match arrays with length greater than the number specified? Commit to yes or no.
Common Belief: Many think $size matches arrays with length greater than or equal to the number given.
Reality: $size matches only arrays with exactly the specified length, not greater or smaller.
Why it matters: Assuming $size matches ranges causes queries to return no results or wrong results, leading to confusion and wasted debugging time.
Quick: Can $size be used with $gt or $lt directly in a query filter? Commit to yes or no.
Common Belief: Some believe you can combine $size with range operators like $gt in a normal query filter.
Reality: $size cannot be combined with range operators directly; it only accepts a number for exact matching.
Why it matters: Trying to combine them causes query errors or unexpected behavior, blocking complex array length queries.
Quick: Does using $size in queries automatically use indexes on array fields? Commit to yes or no.
Common Belief: Many assume $size queries benefit from indexes on the array field for fast lookup.
Reality: $size queries do not use indexes efficiently because array length is not indexed separately.
Why it matters: Ignoring this leads to slow queries on large collections and poor application performance.
Quick: Is $size available only in queries, or also in aggregation pipelines? Commit to your answer.
Common Belief: Some think $size is only for queries and cannot be used in aggregation pipelines.
Reality: $size is available as an expression in aggregation pipelines to compute array lengths dynamically.
Why it matters: Missing this limits the ability to perform advanced data transformations and filtering based on array length.
Expert Zone
1
Using $size in aggregation pipelines allows dynamic computation of array lengths, enabling complex filtering and projections not possible in simple queries.
2
Storing array length as a separate field and indexing it can dramatically improve performance for length-based queries on large datasets.
3
$size does not treat null elements specially: it counts every top-level element of the array regardless of its content.
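The counting rule in the last point can be illustrated with plain JavaScript arrays, whose `.length` also counts top-level elements regardless of content. The `samples` object is hypothetical, and using `.length` as a stand-in for $size is an assumption of this sketch rather than a statement about MongoDB internals.

```javascript
// Hypothetical arrays; .length counts top-level elements only.
const samples = {
  withNulls: [null, null, "x"], // null entries still count
  nested: [[1, 2], [3, 4, 5]],  // each inner array counts as one element
  empty: [],
};

const counts = Object.fromEntries(
  Object.entries(samples).map(([label, arr]) => [label, arr.length])
);
console.log(counts);
```

Under this rule an array of two inner arrays has size 2, no matter how many values the inner arrays hold.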
When NOT to use
Avoid using $size for range-based array length queries in normal filters; instead, use aggregation with $expr and $size or store array length in a separate field for indexing. Also, do not rely on $size for performance-critical queries on large collections without proper indexing strategies.
Production Patterns
In production, developers often store array lengths in separate fields updated by application logic or database triggers to enable fast queries. $size is used in aggregation pipelines for reporting and analytics to compute array sizes on the fly. It is also combined with $match and $project stages to filter and reshape data based on array length.
Connections
Array Length in Programming Languages
Similar concept of counting elements in an array or list.
Understanding how arrays work in programming helps grasp why counting elements is a common operation in databases too.
SQL COUNT() Function
Both count elements, but SQL counts rows or grouped items, while $size counts elements inside a single document's array.
Knowing SQL aggregation helps understand MongoDB's approach to counting within documents versus across rows.
Inventory Management Systems
Counting items in stock is like counting array elements in a document field.
Real-world inventory counting parallels how $size counts items, showing the practical need for such operations.
Common Pitfalls
#1 Trying to find documents with arrays longer than a number using $size directly.
Wrong approach: { hobbies: { $size: { $gt: 2 } } }
Correct approach: { $expr: { $gt: [ { $size: "$hobbies" }, 2 ] } }
Root cause: Overlooking that the $size query operator accepts only a number for exact matching, not an expression or operator.
#2 Expecting $size queries to be fast on large collections without indexing.
Wrong approach: db.collection.find({ hobbies: { $size: 5 } }) on a large unindexed collection expecting quick results.
Correct approach: Store array length in a separate field, index it, and query that field for performance.
Root cause: Assuming $size uses indexes on arrays, ignoring MongoDB's indexing limitations.
#3 Using $size to find arrays with at least a certain length in a normal query.
Wrong approach: { hobbies: { $size: 3 } } expecting to match arrays with 3 or more elements.
Correct approach: Use $expr with $size to filter arrays by length ranges, for example { $expr: { $gte: [ { $size: "$hobbies" }, 3 ] } }.
Root cause: Confusing exact match behavior of $size with range queries.
Key Takeaways
$size counts the exact number of elements in an array field in MongoDB documents.
It only matches arrays with the exact length specified, not ranges or minimum lengths.
$size cannot be combined with range operators in normal queries but can be used in aggregation pipelines for dynamic length calculations.
Queries using $size do not benefit from indexes on array fields, so performance can suffer on large datasets without schema design adjustments.
Storing array length separately and using aggregation pipelines are common patterns to overcome $size limitations in production.

Practice

1. What does the $size operator do in MongoDB?
easy
A. Counts the number of elements in an array
B. Calculates the sum of numbers in an array
C. Finds the largest number in an array
D. Sorts the elements of an array

Solution

  1. Step 1: Understand the purpose of $size

    The $size operator is used to count how many elements are inside an array field in a MongoDB document.
  2. Step 2: Compare with other options

    Other options describe different operations like sum, max, or sort, which are not what $size does.
  3. Final Answer:

    Counts the number of elements in an array -> Option A
  4. Quick Check:

    $size = count array elements [OK]
Hint: Remember: $size counts the items in an array; it does not sum or compare their values [OK]
Common Mistakes:
  • Confusing $size with sum or max functions
  • Thinking $size sorts arrays
  • Using $size on non-array fields
2. Which of the following is the correct syntax to use $size in a MongoDB aggregation pipeline to add a field itemCount that counts elements in the items array?
easy
A. { $addFields: { itemCount: { $length: "$items" } } }
B. { $match: { itemCount: { $size: "$items" } } }
C. { $project: { itemCount: { $size: "items" } } }
D. { $addFields: { itemCount: { $size: "$items" } } }

Solution

  1. Step 1: Identify correct operator usage in aggregation

    The $size operator is used inside an expression to count array elements. It must be inside a stage like $addFields or $project with the array field referenced as "$items".
  2. Step 2: Check syntax correctness

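    { $addFields: { itemCount: { $length: "$items" } } } uses a non-existent $length operator. { $project: { itemCount: { $size: "items" } } } omits the $ before items, so $size receives a string literal instead of the array. { $match: { itemCount: { $size: "$items" } } } filters documents instead of adding a field, and the $size query operator expects a number, not a field path.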
  3. Final Answer:

    { $addFields: { itemCount: { $size: "$items" } } } -> Option D
  4. Quick Check:

    Use $size inside $addFields with "$arrayField" [OK]
Hint: Use "$arrayField" inside $size in $addFields or $project [OK]
Common Mistakes:
  • Using $length instead of $size
  • Forgetting the $ before array field name
  • Using $size inside $match incorrectly
3. Given the collection documents:
{ "name": "Alice", "tags": ["red", "blue"] }
{ "name": "Bob", "tags": ["green"] }
{ "name": "Carol", "tags": [] }

What will be the result of this aggregation pipeline?
[{ $project: { name: 1, tagCount: { $size: "$tags" } } }]
medium
A. [{ "name": "Alice", "tagCount": 2 }, { "name": "Bob", "tagCount": 1 }, { "name": "Carol", "tagCount": 0 }]
B. [{ "name": "Alice", "tagCount": 3 }, { "name": "Bob", "tagCount": 1 }, { "name": "Carol", "tagCount": 1 }]
C. [{ "name": "Alice", "tagCount": 2 }, { "name": "Bob", "tagCount": 0 }, { "name": "Carol", "tagCount": 0 }]
D. SyntaxError

Solution

  1. Step 1: Understand $size counts array elements

    For each document, $size counts how many items are in the tags array: Alice has 2, Bob has 1, Carol has 0.
  2. Step 2: Apply $project to include name and tagCount

    The pipeline projects the name and adds tagCount with the counted size.
  3. Final Answer:

    [{ "name": "Alice", "tagCount": 2 }, { "name": "Bob", "tagCount": 1 }, { "name": "Carol", "tagCount": 0 }] -> Option A
  4. Quick Check:

    Count array lengths with $size = correct counts [OK]
Hint: Count array length per document with $size in $project [OK]
Common Mistakes:
  • Assuming empty arrays count as 1
  • Mixing up counts for different documents
  • Expecting syntax error for correct query
4. You wrote this aggregation stage to filter documents with at least 3 tags:
{ $match: { tags: { $size: { $gte: 3 } } } }

But it returns an error. What is the problem?
medium
A. The array field name is missing the $ sign
B. A range condition on array length requires $expr with the $size expression
C. The number 3 should be in quotes as "3"
D. $size cannot be used inside $match at all

Solution

  1. Step 1: Understand $size usage in $match

    $match accepts the same query operators as find(), so { tags: { $size: 3 } } is valid there and matches arrays of exactly 3 elements. The $size query operator accepts only a number, though, so passing { $gte: 3 } to it is rejected.
  2. Step 2: Use $expr to evaluate aggregation expressions in $match

    To filter by a length range, wrap the $size aggregation expression in $expr, like: { $match: { $expr: { $gte: [ { $size: "$tags" }, 3 ] } } }.
  3. Final Answer:

    A range condition on array length requires $expr with the $size expression -> Option B
  4. Quick Check:

    Use $expr for aggregation expressions in $match [OK]
Hint: Exact length: $size query operator. Length range: $expr with $size [OK]
Common Mistakes:
  • Passing a range operator such as $gte to the $size query operator
  • Forgetting $expr wrapper
  • Using quotes around numbers incorrectly
5. You want to find documents where the comments array has more than 2 elements. Which aggregation pipeline stage correctly filters these documents?
hard
A. { $match: { $size: { $gt: [ "$comments", 2 ] } } }
B. { $match: { comments: { $size: { $gt: 2 } } } }
C. { $match: { $expr: { $gt: [ { $size: "$comments" }, 2 ] } } }
D. { $match: { $expr: { $size: { $gt: [ "$comments", 2 ] } } } }

Solution

  1. Step 1: Use $expr to evaluate expressions in $match

    To compare array length, use $expr to allow aggregation expressions inside $match.
  2. Step 2: Use $gt with $size to check array length greater than 2

    The correct syntax is { $gt: [ { $size: "$comments" }, 2 ] } inside $expr.
  3. Step 3: Eliminate incorrect options

    The incorrect options either lack $expr, misuse $size placement, or have wrong syntax for $gt.
  4. Final Answer:

    { $match: { $expr: { $gt: [ { $size: "$comments" }, 2 ] } } } -> Option C
  5. Quick Check:

    Filter by array length with $expr and $gt [OK]
Hint: Use $expr with $gt and $size to filter by array length [OK]
Common Mistakes:
  • Using the $size query operator with a range condition inside $match
  • Wrong order or structure of $gt and $size
  • Missing $expr wrapper