Top K Frequent Elements Using Heap in DSA C++ - Time & Space Complexity
We want to understand how the time needed grows when finding the top K frequent elements using a heap.
How does the number of steps change as the input list gets bigger?
Analyze the time complexity of the following code snippet.
```cpp
#include <vector>
#include <unordered_map>
#include <queue>
#include <utility>

std::vector<int> topKFrequent(std::vector<int>& nums, int k) {
    // Count how often each value appears: O(n).
    std::unordered_map<int, int> freq;
    for (int num : nums) freq[num]++;

    // Min-heap ordered by frequency, so the least frequent of the
    // kept elements sits on top and is evicted first.
    auto cmp = [](const std::pair<int,int>& a, const std::pair<int,int>& b) {
        return a.second > b.second;
    };
    std::priority_queue<std::pair<int,int>,
                        std::vector<std::pair<int,int>>,
                        decltype(cmp)> minHeap(cmp);

    // Push each unique element and cap the heap at size k: O(m log k).
    for (const auto& p : freq) {
        minHeap.push(p);
        if (minHeap.size() > static_cast<std::size_t>(k)) minHeap.pop();
    }

    // Extract the surviving k elements: O(k log k).
    std::vector<int> result;
    while (!minHeap.empty()) {
        result.push_back(minHeap.top().first);
        minHeap.pop();
    }
    return result;
}
```
This code counts frequencies, then uses a min-heap to keep track of the top K frequent elements.
Identify the repeated work: the loops and container traversals whose cost grows with the input.
- Primary operation: Loop over all elements to count frequency, then loop over unique elements to push/pop in heap.
- How many times: Counting runs once over all n elements; heap operations run once per unique element (m times).
As input size n grows, counting frequencies grows linearly. Heap operations depend on unique elements m and k.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 (m ≤ 10) | ~10 (count) + 10 * log k (heap) |
| 100 (m ≤ 100) | ~100 + 100 * log k |
| 1000 (m ≤ 1000) | ~1000 + 1000 * log k |
Counting grows directly with n; heap operations grow with number of unique elements times log k, which is usually smaller than n.
Time Complexity: O(n + m log k)
This means the time grows linearly with n, plus the cost of pushing each of the m unique elements through a heap whose size never exceeds k. (The final extraction costs O(k log k), which is absorbed into O(m log k) since k ≤ m.)
[X] Wrong: "The heap operations take O(n log n) because we push all elements into the heap."
[OK] Correct: We only push unique elements (m), not all n elements, and the heap size is limited to k, so operations are O(m log k), not O(n log n).
Understanding this complexity helps you explain efficient ways to find top frequent items, a common real-world problem.
"What if we used a max-heap of size m instead of a min-heap of size k? How would the time complexity change?"