Median from Data Stream and Sliding Window
hardDesignHeapMedian
Find Median from Data Stream (LeetCode 295)
Continuously receive integers and provide the current median at any time.
Dual-Heap Design
maxHeap: Stores the smaller half of numbers; top is the largest of smalls.minHeap: Stores the larger half of numbers; top is the smallest of larges.- Invariants:
- The size difference between the two heaps must never exceed 1.
- Specifically, we maintain
maxHeap.size() >= minHeap.size().
class MedianFinder {
private PriorityQueue<Integer> maxHeap = new PriorityQueue<>(Collections.reverseOrder());
private PriorityQueue<Integer> minHeap = new PriorityQueue<>();
public void addNum(int num) {
// Step 1: Push num to maxHeap, then take the largest of smalls and give to minHeap
maxHeap.offer(num);
minHeap.offer(maxHeap.poll());
// Step 2: If minHeap has more elements, move one back to maxHeap to maintain balance
if (maxHeap.size() < minHeap.size()) {
maxHeap.offer(minHeap.poll());
}
}
public double findMedian() {
if (maxHeap.size() > minHeap.size()) {
return (double) maxHeap.peek();
}
return (maxHeap.peek() + minHeap.peek()) / 2.0;
}
}
Complexity: addNum takes $O(\log N)$, findMedian takes $O(1)$.
Sliding Window Median (LeetCode 480)
Given a static size $k$, find the median of every window as it slides across the array.
Challenge: Unlike the simple data stream, we must remove the element leaving the window. Standard PriorityQueue in Java takes $O(N)$ for arbitrary removal, which is too slow.
Technique: Lazy Removal (Delayed Disposal)
Instead of removing an element immediately, we mark it as "to be deleted" in a Hash Map. We only perform the actual removal when that element surfaces at the top of either heap.
public double[] medianSlidingWindow(int[] nums, int k) {
PriorityQueue<Integer> maxH = new PriorityQueue<>(Collections.reverseOrder());
PriorityQueue<Integer> minH = new PriorityQueue<>();
Map<Integer, Integer> delayed = new HashMap<>(); // key: num, value: count of deletions
double[] res = new double[nums.length - k + 1];
// Initial k elements
for (int i = 0; i < k; i++) maxH.offer(nums[i]);
for (int i = 0; i < k / 2; i++) minH.offer(maxH.poll());
for (int i = k; i <= nums.length; i++) {
// Record current median
res[i - k] = (k % 2 == 1) ? (double) maxH.peek() : ((double) maxH.peek() + minH.peek()) / 2.0;
if (i == nums.length) break;
int coming = nums[i];
int leaving = nums[i - k];
// Mark leaving element for deletion
delayed.put(leaving, delayed.getOrDefault(leaving, 0) + 1);
// Adjust heap sizes logically (decrement the count of the heap containing 'leaving')
int balance = (leaving <= maxH.peek()) ? -1 : 1;
// Insert new element
if (!maxH.isEmpty() && coming <= maxH.peek()) {
maxH.offer(coming);
balance++;
} else {
minH.offer(coming);
balance--;
}
// Re-balance the heaps
if (balance < 0) { // maxH needs an element
maxH.offer(minH.poll());
} else if (balance > 0) { // minH needs an element
minH.offer(maxH.poll());
}
// Clean up invalid elements from heap tops (Lazy Deletion)
while (!maxH.isEmpty() && delayed.getOrDefault(maxH.peek(), 0) > 0) {
int top = maxH.poll();
delayed.put(top, delayed.get(top) - 1);
}
while (!minH.isEmpty() && delayed.getOrDefault(minH.peek(), 0) > 0) {
int top = minH.poll();
delayed.put(top, delayed.get(top) - 1);
}
}
return res;
}
Comparison Summary
| Feature | Data Stream Median | Sliding Window Median |
|---|---|---|
| Element Removal | No removals | Removal required ($O(\log N)$ or Lazy) |
| Complexity | $O(\log N)$ per addition | $O(N \log N)$ total |
| Primary Challenge | Simple dual-heap balance | Handling arbitrary deletion from heaps |
| 筋 |