Skip to content
This repository was archived by the owner on Nov 4, 2024. It is now read-only.

Files

Latest commit

2178f88 · Aug 9, 2023

History

History

heaps

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
Aug 9, 2023
Aug 8, 2023
Aug 7, 2023
Aug 8, 2023
Aug 8, 2023
Aug 8, 2023
Aug 8, 2023
Aug 2, 2023
Aug 2, 2023
Aug 8, 2023
Aug 8, 2023

heaps



  • heap is a data structure capable of giving you the smallest (or the largest) element in constant time, while adding or removing the smallest (or the largest) element in logarithmic time.

  • a heap is like a binary tree with these properties:

    • it must have all of its nodes in a specific order, and
    • its shape must be complete (all the levels of the tree must be completely filled except maybe for the last one and the last level must have the left-most nodes filled, always).
    • a max heap's root node must have all its children either greater than or equal to its children.
  • a min heap is the opposite. duplicate values are allowed.

  • heaps can be represented with linked lists, queues (arrays), or binary trees.

  • in the case of an array heap:

    • the parent node index is given by n / 2
    • the left children index is 2 * n
    • the right children index is 2 * n + 1
    • a node is a leaf when its index > n / 2
  • common applications of heap are: sort (heap sort), getting the top-k elements, and finding the kth element.

  • in python you can use heapq.heapify(array) with heapq.heappush(array, value) and heapq.heappop().


class Heap:

    def __init__(self):

        self.heap = []

    def heapify(self, n, i):

        largest = i
        left_children = 2 * i + 1
        right_children = 2 * i + 2

        if left_children < n and self.heap[i] < self.heap[left_children]:
            largest = left_children

        if right_children < n and self.heap[largest] < self.heap[right_children]:
            largest = right_children

        if largest != i:
            self.heap[i], self.heap[largest] = self.heap[largest], self.heap[i]
            self.heapify(n, largest)


    def insert(self, num):

        size = len(self.heap)

        if size == 0:
            self.heap.append(num)

        else:
            self.heap.append(num)
            for i in range((size // 2) - 1, -1, -1):
                self.heapify(size, i)


    def delete_node(self, num):

        size = len(self.heap)

        i = 0
        for i in range(size):
            if num == self.heap[i]:
                break

        self.heap[i], self.heap[size - 1] = self.heap[size - 1], self.heap[i]

        self.heap.remove(size - 1)

        for i in range((len(self.heap) // 2) - 1, -1, -1):
            self.heapify(len(self.heap), i)


heapify


  • it's cheaper to heapify an array of data (O(N)) than create an empty heap and inserting each element (O(N log(N))).

    • heapify means create a binary tree and then comparing each nodes with their children (and swapping when necessary).
    • parents node can simply exchange with their smallest child (so the max number of exchanges is N/2) and leaves are left out.
  • python's built-in heap differs from the standard implementation of a heap in two ways:

    • firstly, it uses zero-based indexing, so it stores the root node at index zero instead of the size of the heap.
    • secondly, the built-in module does not offer a direct way to create a max heap, instead, we must modify the values of each element when inserting in the heap, and when removing it from the heap.

import heapq

min_heap = [3,1,2]
heapq.heapify(min_heap)

max_heap = [-x for x in min_heap]
heapq.heapify(max_heap)

heapq.heappush(min_heap, 5)
heapq.heappush(min_heap, -5)

min_elem = min_heap[0]
max_elem = -1 * max_heap[0

heapq.heappop(min_heap)
heapq.heappop(max_heap)

size_min_heap = len(min_heap)
size_max_heap = len(max_heap)


priority queues


  • a priority queue is an abstract data type with the following properties:

    1. every item has a priority (usually an integer).
    2. an item with a high priority is dequeued before an item with low priority.
    3. two items with an equal priority are dequeued based on their order in the queue.
  • priority queues can be implemented with a stack, queue, or linked list data structures.



min heaps


  • a min heap is a complete binary tree where each node is smaller than its children (the root is the min element).

  • insert:

    • insert the element at the bottom, by finding the most rightmost node and checking its children: if left is empty, insert there, otherwise, insert on right.
    • then compare this node to each parent, exchanging them until the tree's properties are correct.
  • extract_min:

    • first, remove/return the top and then replace the tree's top with its latest element (the bottom-most rightmost).
    • then bubble down, swapping it with one of its children until the min-heap is properly restored
    • there is no need for order between right and left, so this operation would only take O(log N) runtime.
  • the code below is an example of an ad-hoc heap class in python.


class MinHeap:

    def __init__(self, size):

        self.heapsize = size
        self.minheap = [0] * (size + 1)
        self.realsize = 0

    def add(self, element):

        if self.realsize + 1 > self.heapsize:
            print("Too many elements!")
            return False

        self.realsize += 1
        self.minheap[self.realsize] = element

        index = self.realsize
        parent = index // 2

        while self.minheap[index] < self.minheap[parent] and index > 1:

            self.minheap[parent], self.minheap[index] = self.minheap[index], self.minheap[parent]
            index = parent
            parent = index // 2
    
    def peek(self):

        return self.minheap[1]
    
    def pop(self):

        if self.realsize < 1:
            print("Heap is empty.")
            return False

        else:
            remove_element = self.minheap[1]
            self.minheap[1] = self.minheap[self.realsize]
            self.realsize -= 1
            index = 1

            while index <= self.realsize // 2:

                left_children = index * 2
                right_children = (index * 2) + 1

                if self.minheap[index] > self.minheap[left_children] or \
                   self.minheap[index] > self.minheap[right_children]:

                    if self.minheap[left_children] < self.minheap[right_children]:

                        self.minheap[left_children], self.minheap[index] = self.minheap[index], self.minheap[left_children]
                        index = left_children
                    
                    else:

                        self.minheap[right_children], self.minheap[index] = self.minheap[index], self.minheap[right_children]
                        index = right_children
                else:
                    break

            return remove_element
    
    def size(self):
        return self.realsize
    
    def __str__(self):
        return str(self.minheap[1 : self.realsize + 1])


max heaps


  • a max heap is a complete binary tree where each node is larger than its children (the root is the max element).

  • insert:

    • insert the element at the bottom, at the leftmost node.
    • then compare the node to each parent, exchanging them until the tree's properties are correct.
  • extract_max:

    • remove/return the top and then replace the tree's top with its bottom rightmost element.
    • swap up until the max element is on the top.
  • the code below is an example of a max heap class built in python:


class MaxHeap:
    def __init__(self, heapsize):

        self.heapsize = heapsize
        self.maxheap = [0] * (heapsize + 1)
        self.realsize = 0

    def add(self, element):

        self.realsize += 1
        if self.realsize > self.heapsize:
            print("Too many elements!")
            self.realsize -= 1
            return False

        self.maxheap[self.realsize] = element
        index = self.realsize
        parent = index // 2

        while self.maxheap[index] > self.maxheap[parent] and index > 1:
            self.maxheap[parent], self.maxheap[index] = self.maxheap[index], self.maxheap[parent]
            index = parent
            parent = index // 2
            
    def peek(self):

        return self.maxheap[1]
    
    def pop(self):

        if self.realsize < 1:
            print("Heap is empty.")
            return False
        else:
            remove_element = self.maxheap[1]
            self.maxheap[1] = self.maxheap[self.realsize]
            self.realsize -= 1
            index = 1

            while (index <= self.realsize // 2):

                left_children = index * 2
                right_children = (index * 2) + 1

                if (self.maxheap[index] < self.maxheap[left_children] or self.maxheap[index] < self.maxheap[right_children]):
                    if self.maxheap[left_children] > self.maxheap[right_children]:
                        self.maxheap[left_children], self.maxheap[index] = self.maxheap[index], self.maxheap[left_children]
                        index = left_children
                    else:
                        self.maxheap[right_children], self.maxheap[index] = self.maxheap[index], self.maxheap[right_children]
                        index = right_children
                else:
                    break
            return remove_element
    
    def size(self):
        return self.realsize
    
    def __str__(self):
        return str(self.maxheap[1 : self.realsize + 1])


heap sort


  • the core concept of the heap sort involves constructing a heap from our input and then repeatedly removing the min/max element to sort the array.

  • the key idea for in-place heap sort involves a balance of these two ideas:

    • building a heap from an unsorted array through a "bottom-up heapification" process, and
    • using the heap to sort the input array
  • heapsort traditionally uses a max-heap to sort the array, although a min-heap also works.

  • this is not a stable sort.


def heap_sort(self, array) -> None:

      def max_heapify(heap_size, index):
        
            left, right = 2 * index + 1, 2 * index + 2
            largest = index
        
            if left < heap_size and array[left] > array[largest]:
                largest = left
            elif if right < heap_size and array[right] > array[largest]:
                largest = right
            elif largest != index:
                array[index], array[largest] = array[largest], array[index]
                max_heapify(heap_size, largest)

      for i in range(len(lst) // 2 - 1, -1, -1):
            max_heapify(len(array), i)

      for i in range(len(array) - 1, 0, -1):
            array[i], array[0] = array[0], array[i]
            max_heapify(i, 0)

      return array


compare two tops


def compare_two_tops(array) -> int:

        for i in range(len(array)):
            array[i] *= -1

        heapq.heapify(array)

        while len(array) > 1:

            val1 = heapq.heappop(array)
            val2 = heapq.heappop(array)

            if val1 != val2:
                heapq.heappush(array, val1 - val2)

        if array:
            return -heapq.heappop(array)
            
        return 0


find non-overlapping intervals


  • given an array of intervals[i] = [start_i, end_i], return the minimum the non-overlapping intervals:

def non_overlapping_invervals(intervals):
        
        if not intervals:
            return 0

        result = []
        intervals.sort(key=lambda x: x[0])
        heapq.heappush(result, intervals[0][-1])

        for interval in intervals[1:]:

            if result[0] <= interval[0]:
                heapq.heappop(result)
            
            heapq.heappush(result, interval[1])
        
        return len(result)


top k frequent values


def top_k_frequent_values(list, k):
        
        if k == len(nums):
                return nums

        # hashmap element: frequency
        counter = Counter(nums)
        return heapq.nlargest(k, counter.keys(), key=counter.get)


kth largest element in a stream


class KthLargest:

    def __init__(self, k, nums):

        self.k = k
        self.heap = nums
        heapq.heapify(self.heap)

        while len(self.heap) > k:
            heapq.heappop(self.heap)
        
    def add(self, val: int) -> int:

        heapq.heappush(self.heap, val)
        if len(self.heap) > self.k:
            heapq.heappop(self.heap)
        
        return self.heap[0]


kth smallest element in a matrix


  • given an n x n matrix where each of the rows and columns is sorted in ascending order, return the kth smallest element in the matrix.

def kth_smallest(matrix, k) -> int:

        min_heap = []

        for row in range(min(k, len(matrix))):
            min_heap.append((matrix[row][0], row, 0))

        heapq.heapify(min_heap)

        while k:

            element, row, col = heapq.heappop(min_heap)
            if col < len(matrix) - 1:
                heapq.heappush(min_heap, (matrix[row][cow + 1], row, col + 1))
            k -= 1
        
        return element