0
0
DSA C++programming

Median of Data Stream Using Two Heaps in DSA C++

Choose your learning style9 modes available
Mental Model
Keep the smaller half of numbers in one heap and the larger half in another heap to quickly find the middle value.
Analogy: Imagine two baskets: one holds smaller fruits, the other holds bigger fruits. The middle fruit is the one between these baskets.
MaxHeap (smaller half): 10 -> 5 -> 3 -> null
MinHeap (larger half): 12 -> 15 -> 20 -> null
Dry Run Walkthrough
Input: Stream: 5, 15, 1, 3
Goal: Find the median after each number is added
Step 1: Add 5 to max heap (smaller half)
MaxHeap: 5 -> null
MinHeap: null
Why: Start by adding first number to max heap to hold smaller half
Step 2: Add 15 to min heap (larger half)
MaxHeap: 5 -> null
MinHeap: 15 -> null
Why: Add next number to min heap to hold larger half
Step 3: Add 1 to max heap (smaller half)
MaxHeap: 5 -> 1 -> null
MinHeap: 15 -> null
Why: Add smaller number to max heap to keep smaller half
Step 4: No balancing needed
MaxHeap: 5 -> 1 -> null
MinHeap: 15 -> null
Why: maxHeap.size() = 2 == minHeap.size() + 1, which is allowed; median = maxHeap.top() = 5
Step 5: Add 3 to max heap (smaller half)
MaxHeap: 5 -> 1 -> 3 -> null
MinHeap: 15 -> null
Why: 3 <= maxHeap.top() = 5, so add to smaller half
Step 6: Balance heaps to keep size difference ≤ 1
MaxHeap: 3 -> 1 -> null
MinHeap: 5 -> 15 -> null
Why: maxHeap.size() = 3 > minHeap.size() + 1 = 2, so move maxHeap.top() = 5 to minHeap and pop it from maxHeap
Result:
After each insertion, medians are: 5, 10, 5, 4
Final heaps:
MaxHeap: 3 -> 1 -> null
MinHeap: 5 -> 15 -> null
Annotated Code
DSA C++
#include <iostream>
#include <queue>
#include <vector>

using namespace std;

class MedianFinder {
private:
    priority_queue<int> maxHeap; // smaller half
    priority_queue<int, vector<int>, greater<int>> minHeap; // larger half

public:
    void addNum(int num) {
        if (maxHeap.empty() || num <= maxHeap.top()) {
            maxHeap.push(num); // add to smaller half
        } else {
            minHeap.push(num); // add to larger half
        }

        // Balance the heaps so that size difference is not more than 1
        if (maxHeap.size() > minHeap.size() + 1) {
            minHeap.push(maxHeap.top());
            maxHeap.pop();
        } else if (minHeap.size() > maxHeap.size()) {
            maxHeap.push(minHeap.top());
            minHeap.pop();
        }
    }

    double findMedian() {
        if (maxHeap.size() == minHeap.size()) {
            return (maxHeap.top() + minHeap.top()) / 2.0;
        } else {
            return maxHeap.top();
        }
    }
};

int main() {
    MedianFinder mf;
    vector<int> stream = {5, 15, 1, 3};

    for (int num : stream) {
        mf.addNum(num);
        cout << mf.findMedian() << endl;
    }

    return 0;
}
if (maxHeap.empty() || num <= maxHeap.top()) {
decide to add number to smaller half if empty or number is smaller or equal to max of smaller half
maxHeap.push(num);
add number to max heap (smaller half)
minHeap.push(num);
add number to min heap (larger half)
if (maxHeap.size() > minHeap.size() + 1) {
check if smaller half is too big
minHeap.push(maxHeap.top()); maxHeap.pop();
move top from smaller half to larger half to balance
else if (minHeap.size() > maxHeap.size()) {
check if larger half is bigger
maxHeap.push(minHeap.top()); minHeap.pop();
move top from larger half to smaller half to balance
if (maxHeap.size() == minHeap.size()) {
if equal size, median is average of tops
return (maxHeap.top() + minHeap.top()) / 2.0;
calculate median as average
return maxHeap.top();
if sizes differ, median is top of bigger heap (smaller half)
OutputSuccess
5 10 5 4
Complexity Analysis
Time: O(log n) per insertion because each addNum operation pushes/pops from heaps which take log n time
Space: O(n) because all numbers are stored in the two heaps
vs Alternative: Compared to sorting all numbers each time (O(n log n)), using two heaps is much faster for streaming data
Edge Cases
empty stream
findMedian should not be called or should handle gracefully
DSA C++
if (maxHeap.empty() || num <= maxHeap.top()) {
all numbers equal
heaps balance normally, median is that number
DSA C++
if (maxHeap.empty() || num <= maxHeap.top()) {
stream with one element
median is that element
DSA C++
if (maxHeap.empty() || num <= maxHeap.top()) {
When to Use This Pattern
When you need to find the median in a stream of numbers efficiently, use two heaps to keep track of smaller and larger halves for quick median calculation.
Common Mistakes
Mistake: Not balancing the heaps after insertion, causing incorrect median calculation
Fix: Always balance heaps so their sizes differ by at most one after each insertion
Mistake: Adding all numbers to one heap only
Fix: Distribute numbers between max heap and min heap based on their value relative to current median
Summary
It finds the median of a stream of numbers by using two heaps to store smaller and larger halves.
Use it when you need fast median queries on continuously incoming data.
The key insight is to keep two balanced heaps so the median is always at the top of one or average of both.