Kth Largest Element in an Array - Priority Queue / Heap / Top K

https://algo.monster/problems/kth_largest_element_in_an_array

The explanation says we are popping the first k elements, but in the solution isn’t it popping n - k elements? Therefore the time complexity of that part should be (n - k)log n, right?

Thank you for pointing out. We can use a max heap to achieve klogn.

If we’re being technical here, since the solution is first looping through the list to make all values negative, then the time complexity is O(2n + klog(n)). If we instead heappush each (negated) element in a loop, then the complexity would be O(nlog(n) + klog(n)), which is slightly better, no?

if you are first populating the heap with every element, and every heap push operation is log height of heap, then you end up with a heap of height n. Therefore wouldn’t populating the heap yield O N(log N) ?

/Limiting heap size to K ensures height of heap is never > K.
Therefore each insertion is LogK.

We keep a size-limited minimum heap. We populate until size = k. Then if next element in array is bigger than minimum, we evict minimum and insert new element. Otherwise, skip. This ensures we are only inserting the maximum numbers,

So final time complexity is O n(logK)

function findKthLargest(nums, k) {
if(!k || !nums || nums.length === 0 || k > nums.length) return null;

const minHeap = new MinHeap();
for(const elem of nums) {
    if(minHeap.size() === k) { 
        if(elem < minHeap.peek().item) continue;
        minHeap.pop();
    }
    minHeap.push(new HeapItem(elem));
}

return minHeap.peek().item;

}

The only problem with the posted solution is the space complexity which is O(n) as you use heapify on the list. My approach improves on this by only storing K elements in the heap: we first add the first K elements, and then we loop through all the remaining elements (N-K). If any element have a value greater than the root of the heap (minimum of the K largest, O(1) operation), then pop it and insert the new element ( an O(log K) operation). At any point in time, we are only storing K elements in the heap.

def find_kth_largest(nums: List[int], k: int) -> int:
    h = []
    i = 0    
    while i < k:
        heappush(h, nums[i])
        i+=1
    while i < len(nums):
        if nums[i] > h[0]:
            heappop(h)
            heappush(h, nums[i])
        i+=1
    return heappop(h)

I think that the optimal solution for this problem is in linear time complexity o(1) space using QuickSelect algorithm. There is no need for a heap here.

LOL
return min(heapq.nlargest(k, nums))

Quick select algorithm reduces the complexity from O(nlogn) to O(n) with worst case complexity of O(n^2)

Simple quicksort solution using python:
def find_kth_largest(nums: List[int], k: int) → int:
# WRITE YOUR BRILLIANT CODE HERE
pivot =random.choice(nums)
left = [x for x in nums if x < pivot]
right = [x for x in nums if x > pivot]
mid = [x for x in nums if x == pivot]
l, r, m = len(left), len(right), len(mid)
if k > r+m:
return find_kth_largest(left, k-r-m)
elif k <= r:
return find_kth_largest(right, k)
else:
return mid[0]

JS example doesn’t implement a real O(n) heapify so it makes it quite confusing. Could you add some links or even a real code example of heapify in JS?

How is heapify O(n)? Every add operation to the heap will try to rebalance it, which would make the overall process O(nlogn).
Here is my solution that uses a min heap instead:
public static int findKthLargest(List nums, int k) {
// WRITE YOUR BRILLIANT CODE HERE
PriorityQueue minHeap = new PriorityQueue<>();
for (int i = 0; i < nums.size(); i++) { // O(n logk)
var curr = nums.get(i);
if (i < k) {
minHeap.add(curr);
continue;
}

        if (curr > minHeap.peek()) {
            minHeap.remove();
            minHeap.add(curr);
        }
    }

    return minHeap.peek();
}

Sorry about the bad formatting. Cannot edit comments so reposting the code:

public static int findKthLargest(List<Integer> nums, int k) {
    // WRITE YOUR BRILLIANT CODE HERE
    PriorityQueue<Integer> minHeap = new PriorityQueue<>();
    for (int i = 0; i < nums.size(); i++) { // O(n logk)
        var curr = nums.get(i);
        if (i < k) {
            minHeap.add(curr);
            continue;
        }

        if (curr > minHeap.peek()) {
            minHeap.remove();
            minHeap.add(curr);
        }
    }

    return minHeap.peek();
}

the analysis requires a bit of math derivation, check out this post: https://stackoverflow.com/questions/51735692/python-heapify-time-complexity

Unless I’m missing something, in Java, populating the heap takes O(n) only if the PriorityQueue object is built passing a Collection to its constructor (it internally uses heapify and siftDown methods). Calling the add method after the PriorityQueue is built takes O(log(n)) (it internally uses siftUp method). If this is correct, then the proposed solution in Java takes O(n.log(n) + k.log(n)).