數據結構之優先級隊列

系列文章目錄

數據結構之ArrayList_arraylist o(1) o(n)-CSDN博客

數據結構之LinkedList-CSDN博客

數據結構之棧-CSDN博客

數據結構之隊列-CSDN博客

數據結構之二叉樹-CSDN博客

系列文章目錄

前言

一、優先級隊列和堆

二、堆的模擬實現

1. 堆的創建

2. 計算建堆的時間復雜度

3. 堆的插入

4. 堆的刪除

三、優先級隊列源碼

1. 構造方法

2. 擴容機制

3. 比較機制

四、topK問題

前言

本文主要介紹了優先級隊列和堆的實現原理。首先講解了堆的基本概念，包括大根堆和小根堆的存儲結構和父子節點關系。然后詳細講解了堆的模擬實現過程，包括創建堆、插入元素和刪除元素的操作步驟及時間復雜度分析。接著分析了Java中PriorityQueue的源碼實現，包括構造方法、擴容機制和比較機制。最后討論了topK問題的解決方案，提出使用大根堆/小根堆來高效獲取前k個最小/最大元素的算法思路。文章通過代碼示例展示了如何實現這些數據結構，并分析了相關操作的時間復雜度。

一、優先級隊列和堆

優先級隊列的兩個基本操作：返回高優先級對象，添加新的對象；

PriorityQueue 底層使用了堆這種數據結構，堆是在完全二叉樹的基礎上進行了一些調整；

堆是按照完全二叉樹的順序存儲（層序遍歷）的方式存儲；

小根堆：孩子節點都比根節點大；

大根堆：孩子節點都比根節點小；

如果根節點的下標是 i，左孩子節點的下標為 2 * i + 1，右孩子節點的下標為 2 * i + 2；

如果孩子節點的下標是 i，根節點的下標為（i - 1）/ 2；

二、堆的模擬實現

1. 堆的創建

思路（以大根堆為例）：

從后往前遍歷每一棵子樹，如果根節點的值比左右孩子的節點的值小，就將左右孩子節點的最大值和根節點的值交換；

交換完成后，這棵樹的左右子樹有可能出現孩子節點的值比根節點的值大的情況，因此需要將已經交換的孩子節點再次作為根節點，判斷根節點和左右孩子節點的值，直至交換完最后一棵子樹；

createHeap(): void 創建堆，從后往前遍歷每一棵子樹，向下調整根節點值；

shiftDown(int parent, int usedSize): void 計算孩子節點下標，如果孩子節點的值大于根節點的值，向下調整根節點的值；

public class Heap {private int[] elem;private int usedSize;public Heap(){this.elem = new int[10];}public Heap(int[] array){int n = array.length;this.elem = new int[n];for(int i = 0; i < array.length; i++){this.elem[i] = array[i];this.usedSize++;}}public void createHeap(){for(int parent = (this.usedSize - 1 - 1) / 2; parent >= 0; parent--){shiftDown(parent, this.usedSize);}}private  void shiftDown(int parent, int usedSize){int child = parent * 2 + 1;while(child < usedSize){if(child + 1 < usedSize && this.elem[child] < this.elem[child + 1]){child++;}if(this.elem[parent] < this.elem[child]){swap(parent, child);parent = child;child = parent * 2 + 1;}}}private void swap(int i, int j){int tmp = this.elem[i];this.elem[i] = this.elem[j];this.elem[j] = tmp;}
}

2. 計算建堆的時間復雜度

假設有 n 個節點，那么堆的高度為：h = log2^(n + 1)；

計算節點的個數：從第一層到最后一層分別為?2^0, 2^1, 2^2,..., 2^(h-3), 2^(h-2), 2^(h-1)；

計算要調整的高度：最壞情況下，從第一層到倒數第二層都需要調整，最后一層不需要調整，調整的高度分別是 h - 1, h - 2, h - 3, ..., 2, 1, 0；

因此需要花費的時間為：

T(n) = 2^0 * (h - 1) + 2^1 * (h - 2) + 2^2 * (h - 3) + 2^3 * (h - 4) + ... + 2^(h - 3) * 2?+ 2^(h - 2) * 1；

2 * T(n) = 2^1 * (h - 1) + 2^2 * (h - 2) + 2^3 * (h - 3) + ... + 2^(h - 2) * 2 + 2^(h - 1) * 1;

使用錯位相減計算：

T(n) = -2^0 * (h - 1) + 2^1 + 2^2 + 2^3 + ... + 2^(h - 3) + 2^(h - 2) + 2^(h - 1)；

T(n) = 2^0 + 2^1 + 2^2 + 2^3 + ... + 2^(h - 3) + 2^(h - 2) + 2^(h - 1) - h；

使用等比數列求和公式：

T(n) = 1 * (1 - 2^h) / (1 - 2)? - h = 2^h?- 1 - h；

將 h = log2^(n + 1) 帶入公式：T(n) = n - log2^(n + 1)；

當 n 足夠的大情況下，log2^(n + 1）相比于 n 是可以忽略的，因此建堆的時間復雜度為 O(N)；

3. 堆的插入

思路：

先在最后一個位置放上插入的元素，再將這個元素向上調整；

向上調整時，該節點作為 child 節點和根節點發生了交換，這個節點又會作為新的 child 節點，可能仍然需要繼續向上調整，因此需要循環直至 child 作為整個完全二叉樹的根節點；

如果沒有發生交換，說明已經是一個大根堆了，則跳出循環即可；

offer(int val): void 向堆中添加元素；

shiftUp(int child): void 如果根節點的值比孩子節點的值小，向上調整孩子節點的值；

    public void offer(int val){if(isFull()){this.elem = Arrays.copyOf(this.elem,  2 * this.elem.length);}this.elem[this.usedSize] = val;this.usedSize++;shiftUp(this.usedSize - 1);}private boolean isFull(){return this.usedSize == this.elem.length;}private void shiftUp(int child){while(child != 0){int parent = (child - 1) / 2;if(this.elem[child] > this.elem[parent]){swap(parent, child);}else{break;}child = parent;}}

4. 堆的刪除

思路：

將根節點和最后一個位置的元素進行交換，再將根節點位置的元素向下調整；

poll(): int 彈出堆頂元素；

    public int poll(){if(isEmpty()){return -1;}swap(0, this.usedSize - 1);this.usedSize--;shiftDown(0, this.usedSize);return this.elem[this.usedSize];}public boolean isEmpty(){return this.usedSize == 0;}

三、優先級隊列源碼

PriorityQueue 默認是小根堆；

PriorityQueue 中放置的元素必須能夠比較大小，否則會拋出 ClassCastException 異常；

PriorityQueue 中不能放置 null 對象，否則會拋出空指針異常；

插入和刪除元素的時間復雜度為 O(logN)；

1. 構造方法

兩個構造方法本質上都是調用：

public PriorityQueue(int initialCapacity,Comparator<? super E> comparator)

第一個參數為初始化容量，第二個參數為比較器；

默認的初始化容量 DEFAULT_INITIAL_CAPCITY 為 11；

public class PriorityQueue<E> extends AbstractQueue<E>implements java.io.Serializable {private static final long serialVersionUID = -7720805057305804111L;private static final int DEFAULT_INITIAL_CAPACITY = 11;transient Object[] queue; // non-private to simplify nested class accessprivate int size = 0;private final Comparator<? super E> comparator;transient int modCount = 0; // non-private to simplify nested class accesspublic PriorityQueue() {this(DEFAULT_INITIAL_CAPACITY, null);}public PriorityQueue(int initialCapacity) {this(initialCapacity, null);}public PriorityQueue(Comparator<? super E> comparator) {this(DEFAULT_INITIAL_CAPACITY, comparator);}public PriorityQueue(int initialCapacity,Comparator<? super E> comparator) {// Note: This restriction of at least one is not actually needed,// but continues for 1.5 compatibilityif (initialCapacity < 1)throw new IllegalArgumentException();this.queue = new Object[initialCapacity];this.comparator = comparator;}
}

2. 擴容機制

MAX_ARRAY_SIZE 最大的數組容量為 Integer.MAX_VALUE - 8；

grow(int minCapacity): void 擴容：

當前的容量小于 64 字節：當前容量 + 當前容量 + 2 字節；

當前容量大于等于 64 字節：當前容量 * 1.5；

如果擴容后的容量大于?MAX_ARRAY_SIZE，調用 hugeCapacity(int minCapacity): int

hugeCapacity(int minCapacity): int 大容量擴容機制：

如果所需的最小容量溢出了（變成負數），則拋出異常；

如果所需的最小容量大于?MAX_ARRAY_SIZE，擴容為 Integer.MAX_VALUE；

如果所需的最小容量小于等于?MAX_ARRAY_SIZE，擴容為??MAX_ARRAY_SIZE；

    private static final int MAX_ARRAY_SIZE = Integer.MAX_VALUE - 8;/*** Increases the capacity of the array.** @param minCapacity the desired minimum capacity*/private void grow(int minCapacity) {int oldCapacity = queue.length;// Double size if small; else grow by 50%int newCapacity = oldCapacity + ((oldCapacity < 64) ?(oldCapacity + 2) :(oldCapacity >> 1));// overflow-conscious codeif (newCapacity - MAX_ARRAY_SIZE > 0)newCapacity = hugeCapacity(minCapacity);queue = Arrays.copyOf(queue, newCapacity);}private static int hugeCapacity(int minCapacity) {if (minCapacity < 0) // overflowthrow new OutOfMemoryError();return (minCapacity > MAX_ARRAY_SIZE) ?Integer.MAX_VALUE :MAX_ARRAY_SIZE;}

3. 比較機制

offer(E e): boolean 向堆中添加元素，如果堆中沒有元素，直接添加到 0 下標，如果堆中有元素，調用 siftUp() 方法；

siftUp(int k, E x): void 向上調整，如果傳了比較器，調用 siftUpUsingComparator() 方法，否則調用 siftUpComparable() 方法；

siftUpUsingComparator(int k, E x): void 使用比較器，向上調整；

計算根節點下標，將根節點下標的元素保存在變量 e 里面；

假設比較器是 a.compareTo(b):

使用比較器（調用比較器的 compare() 方法）比較 x 和 e，如果大于等于 0（x 大于等于根節點的值），跳出循環，把 x 放到堆的下一個位置，此時建立的是一個小根堆；

如果小于 0（x 的值比根節點小），把 e 放到下一個位置，向上調整，直到 x 比某一個根節點大或者已經到 0 下標的位置，把 x 放到這個位置，此時建立的是一個小根堆；

因此比較器傳 a.compareTo(b)，建立的就是小根堆；

反之，比較器傳 b.compareTo(b)，建立的就是大根堆；

    public boolean offer(E e) {if (e == null)throw new NullPointerException();modCount++;int i = size;if (i >= queue.length)grow(i + 1);size = i + 1;if (i == 0)queue[0] = e;elsesiftUp(i, e);return true;}private void siftUp(int k, E x) {if (comparator != null)siftUpUsingComparator(k, x);elsesiftUpComparable(k, x);}private void siftUpUsingComparator(int k, E x) {while (k > 0) {int parent = (k - 1) >>> 1;Object e = queue[parent];if (comparator.compare(x, (E) e) >= 0)break;queue[k] = e;k = parent;}queue[k] = x;}private void siftUpComparable(int k, E x) {Comparable<? super E> key = (Comparable<? super E>) x;while (k > 0) {int parent = (k - 1) >>> 1;Object e = queue[parent];if (key.compareTo((E) e) >= 0)break;queue[k] = e;k = parent;}queue[k] = key;}

四、topK問題

以前 k 個最小的元素為例：

使用優先級隊列進行排序：

    public int[] smallestK(int[] arr, int k) {PriorityQueue<Integer> heap = new PriorityQueue<>();for(int x : arr) heap.offer(x);int[] ret = new int[k];for(int i = 0; i < k; i++) {ret[i] = heap.poll();}return ret;}

如果數組中元素非常多，優先級隊列可能存不下，并且這樣做是效率不高的；

推薦的做法：

使用數組中前 k 個數據建立一個容量為 k 的大根堆（堆頂元素是 k 個元素中最大的）；

遍歷后續的元素，如果元素小于堆頂元素，那么堆頂元素一定不是前 k 個最小的元素之一，堆頂元素出堆，新元素入堆；

    public int[] smallestK(int[] arr, int k) {int[] ret = new int[k];if(k == 0) return ret;PriorityQueue<Integer> heap = new PriorityQueue<>((a, b) -> b - a);for(int i = 0; i < k; i++) heap.offer(arr[i]);int n = arr.length;for(int i = k; i < n; i++){if(arr[i] < heap.peek()){heap.poll();heap.offer(arr[i]);}}for(int i = 0; i < k; i++){ret[i] = heap.poll();}return ret;}

找前 k 小的元素，建大根堆；找前 k 大的元素，建小根堆；

堆中元素就是前 k 小或者大的元素，如果是找第 k 小或者第 k 大元素，直接返回堆頂元素即可；?