Median of Data Stream Using Two Heaps in DSA C++ - Time & Space Complexity
We want to understand how fast we can find the middle value when numbers keep coming in one by one.
How does the time needed grow as more numbers arrive?
Analyze the time complexity of the following code snippet.
#include <queue>
#include <vector>
std::priority_queue<int> maxHeap; // stores smaller half
std::priority_queue<int, std::vector<int>, std::greater<int>> minHeap; // stores larger half
void addNum(int num) {
if (maxHeap.empty() || num <= maxHeap.top()) maxHeap.push(num);
else minHeap.push(num);
if (maxHeap.size() > minHeap.size() + 1) {
minHeap.push(maxHeap.top()); maxHeap.pop();
} else if (minHeap.size() > maxHeap.size()) {
maxHeap.push(minHeap.top()); minHeap.pop();
}
}
double findMedian() {
if (maxHeap.size() > minHeap.size()) return maxHeap.top();
return (maxHeap.top() + minHeap.top()) / 2.0;
}
This code keeps two heaps to quickly find the median as new numbers come in.
Identify the loops, recursion, array traversals that repeat.
- Primary operation: Inserting a number into one of the two heaps and balancing them.
- How many times: Each insertion happens once per new number added.
Each new number causes one insertion and possibly one balancing move between heaps.
| Input Size (n) | Approx. Operations |
|---|---|
| 10 | About 10 insertions and up to 10 balancing moves |
| 100 | About 100 insertions and up to 100 balancing moves |
| 1000 | About 1000 insertions and up to 1000 balancing moves |
Pattern observation: The number of operations grows roughly in a straight line with the number of inputs.
Time Complexity: O(n log n)
This means that adding n numbers and keeping track of the median takes time that grows a bit faster than n, because each insertion into a heap takes some extra steps.
[X] Wrong: "Adding a number and finding the median is always O(1) because heaps are fast."
[OK] Correct: Each insertion into a heap takes time proportional to the height of the heap, which grows with the number of elements, so it is not constant time.
Understanding how to keep track of medians efficiently shows your skill in using data structures smartly to handle changing data.
"What if we used a balanced binary search tree instead of two heaps? How would the time complexity change?"