0
0
MongodbHow-ToIntermediate · 2 min read

MongoDB Query to Find Top N Records Per Group

Use MongoDB's aggregation pipeline with $group to group documents and $push with $slice to keep top N sorted items per group, like: db.collection.aggregate([{ $sort: { groupField: 1, sortField: -1 } }, { $group: { _id: "$groupField", topN: { $push: "$ROOT" } } }, { $project: { topN: { $slice: ["$topN", n] } } }]).
📋

Examples

Input[{ group: 'A', score: 10 }, { group: 'A', score: 20 }, { group: 'B', score: 15 }, { group: 'B', score: 5 }], n=1
Output[{ _id: 'A', topN: [{ group: 'A', score: 20 }] }, { _id: 'B', topN: [{ group: 'B', score: 15 }] }]
Input[{ group: 'X', value: 5 }, { group: 'X', value: 3 }, { group: 'X', value: 8 }, { group: 'Y', value: 7 }], n=2
Output[{ _id: 'X', topN: [{ group: 'X', value: 8 }, { group: 'X', value: 5 }] }, { _id: 'Y', topN: [{ group: 'Y', value: 7 }] }]
Input[], n=3
Output[]
🧠

How to Think About It

To find the top n items per group, first sort the data by the group and the field you want to rank by. Then group the documents by the group key and collect the sorted items into an array. Finally, limit the array to the top n items using a slice operation.
📐

Algorithm

1
Sort the documents by group field ascending and ranking field descending.
2
Group documents by the group field.
3
Within each group, collect documents into an array preserving order.
4
Slice the array to keep only the top n documents per group.
5
Return the grouped documents with their top n items.
💻

Code

mongodb
db.collection.aggregate([
  { $sort: { group: 1, score: -1 } },
  { $group: {
      _id: "$group",
      topScores: { $push: "$$ROOT" }
    }
  },
  { $project: {
      topScores: { $slice: ["$topScores", 2] }
    }
  }
])
Output
[ { _id: "A", topScores: [ { group: "A", score: 20 }, { group: "A", score: 10 } ] }, { _id: "B", topScores: [ { group: "B", score: 15 }, { group: "B", score: 5 } ] } ]
🔍

Dry Run

Let's trace the example with groups A and B and scores through the aggregation pipeline.

1

Sort documents

Sort by group ascending and score descending: [{group: 'A', score: 20}, {group: 'A', score: 10}, {group: 'B', score: 15}, {group: 'B', score: 5}]

2

Group by 'group' field

Group A: [{group: 'A', score: 20}, {group: 'A', score: 10}], Group B: [{group: 'B', score: 15}, {group: 'B', score: 5}]

3

Slice top 2 per group

Keep first 2 items per group as topScores array

GroupTop Scores
A[{group: 'A', score: 20}, {group: 'A', score: 10}]
B[{group: 'B', score: 15}, {group: 'B', score: 5}]
💡

Why This Works

Step 1: Sorting

Sorting ensures that when we group, the documents are in the order we want to pick the top n from.

Step 2: Grouping

Grouping collects all documents of the same group into one place to process them together.

Step 3: Slicing

Slicing the array keeps only the top n documents per group, discarding the rest.

🔄

Alternative Approaches

Using $setWindowFields (MongoDB 5.0+)
mongodb
db.collection.aggregate([
  { $setWindowFields: {
      partitionBy: "$group",
      sortBy: { score: -1 },
      output: { rank: { $rank: {} } }
    }
  },
  { $match: { rank: { $lte: 2 } } }
])
This method uses window functions to rank documents per group and filter top n, which can be more efficient but requires MongoDB 5.0 or newer.
Using $facet with separate pipelines per group
mongodb
db.collection.aggregate([
  { $facet: {
      groupA: [ { $match: { group: 'A' } }, { $sort: { score: -1 } }, { $limit: 2 } ],
      groupB: [ { $match: { group: 'B' } }, { $sort: { score: -1 } }, { $limit: 2 } ]
    }
  }
])
This approach manually creates pipelines per group, which is less flexible and not scalable for many groups.

Complexity: O(m log m) time, O(m) space

Time Complexity

Sorting the entire collection by group and score dominates with O(m log m), where m is the number of documents.

Space Complexity

Grouping stores arrays of documents per group, which can use O(m) space in the worst case.

Which Approach is Fastest?

Using $setWindowFields can be faster for large datasets as it avoids grouping large arrays, but requires MongoDB 5.0+. The $group and $push method is more widely supported but can be slower with large groups.

ApproachTimeSpaceBest For
$group with $push and $sliceO(m log m)O(m)General use, compatible with older MongoDB versions
$setWindowFields with $rankO(m log m)O(m)Efficient ranking, requires MongoDB 5.0+
$facet with separate pipelinesO(k * n log n)O(n)Few known groups, manual approach
💡
Always sort your data before grouping to ensure the top n items are correctly ordered.
⚠️
Forgetting to sort before grouping causes incorrect top n results because the array order is not guaranteed.