從O(k*n)到O(1)：如何用哈希表終結多層if判斷的性能困局

【前言】
??本文將以哈希表重構實戰為核心，完整展示如何將傳統條件匹配邏輯(上千層if-else判斷)轉化為O(1)的哈希表高效實現。通過指紋驗證場景的代碼級解剖，您將深入理解：
??1.哈希函數設計如何規避沖突陷阱
??2.鏈式尋址法的工程實現細節
??若有出錯之處，望各位大哥大姐指出(●’?’●)

Ⅰ 背景

??最近，拿到一個場景，有一個研判規則中，需要一次匹配上千個以上規則的規則，一開始采用的是多層if判斷，但是這種在高頻事件中，明顯性能遭不住，而且在研判速度上遠遠達不到預期

最初代碼如下

bool is_finger(char finger[]){if (strlen(finger) == yyy){return 0;}if (strlen(finger) == xxx){return 0;}//………………還有幾千個規則研判
}

【目標】將程序的時間復雜度O(k*n)，降至O(1)
【實現】可以采用兩種，一是哈希表，二是字典樹

Ⅱ C程序優化實踐

說那么多，沒啥用，直接實操，沖沖沖
先定義下變量和結構體


#define HASH_SIZE 1024  // 哈希表大小，應該是質數以減少沖突typedef struct HashNode {char* key;struct HashNode* next;  // 處理沖突用的鏈表
} HashNode;typedef struct {HashNode* table[HASH_SIZE];
} HashMap;

初始化哈希表


// 哈希函數
unsigned int hash(const char* str) {unsigned int hash = 5381;int c;while ((c = *str++)) {hash = ((hash << 5) + hash) + c; // hash * 33 + c}return hash % HASH_SIZE;
}
// 初始化哈希表
HashMap* init_fingerprint_map() {HashMap* map = (HashMap*)malloc(sizeof(HashMap));memset(map->table, 0, sizeof(HashNode*) * HASH_SIZE);// 需要過濾的指紋列表const char* fingerprints[] = {"En", "nTf.n", "kno:n", "n)on", "fknn","kn", "n&n", "nn", "n&nn", "Ton",};// 插入所有指紋for (int i = 0; i < sizeof(fingerprints)/sizeof(char*); i++) {unsigned int index = hash(fingerprints[i]);HashNode* node = (HashNode*)malloc(sizeof(HashNode));node->key = strdup(fingerprints[i]);node->next = map->table[index];map->table[index] = node;}return map;
}

關鍵實現，哈希查找

// 查找函數 - O(1) 平均時間復雜度
bool is_fingerprint(HashMap* map, const char* fingerprint) {unsigned int index = hash(fingerprint);HashNode* current = map->table[index];// 在鏈表中查找while (current != NULL) {if (strcmp(current->key, fingerprint) == 0) {return false;  // 找到匹配項，返回false}current = current->next;}return true;  // 未找到匹配項
}

記得要釋放內存

// 釋放哈希表內存
void free_hashmap(HashMap* map) {for (int i = 0; i < HASH_SIZE; i++) {HashNode* current = map->table[i];while (current != NULL) {HashNode* temp = current;current = current->next;free(temp->key);free(temp);}}free(map);
}

完整代碼

#include <stdbool.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>#define HASH_SIZE 1024  // 哈希表大小，應該是質數以減少沖突typedef struct HashNode {char* key;struct HashNode* next;  // 處理沖突用的鏈表
} HashNode;typedef struct {HashNode* table[HASH_SIZE];
} HashMap;// 哈希函數
unsigned int hash(const char* str) {unsigned int hash = 5381;int c;while ((c = *str++)) {hash = ((hash << 5) + hash) + c; // hash * 33 + c}return hash % HASH_SIZE;
}// 初始化哈希表
HashMap* init_fingerprint_map() {HashMap* map = (HashMap*)malloc(sizeof(HashMap));memset(map->table, 0, sizeof(HashNode*) * HASH_SIZE);// 需要過濾的指紋列表const char* fingerprints[] = {"En", "nTf.n", "kno:n", "n)on", "fknn","kn", "n&n", "nn", "n&nn", "Ton",};// 插入所有指紋for (int i = 0; i < sizeof(fingerprints)/sizeof(char*); i++) {unsigned int index = hash(fingerprints[i]);HashNode* node = (HashNode*)malloc(sizeof(HashNode));node->key = strdup(fingerprints[i]);node->next = map->table[index];map->table[index] = node;}return map;
}// 查找函數 - O(1) 平均時間復雜度
bool is_fingerprint(HashMap* map, const char* fingerprint) {unsigned int index = hash(fingerprint);HashNode* current = map->table[index];// 在鏈表中查找while (current != NULL) {if (strcmp(current->key, fingerprint) == 0) {return false;  // 找到匹配項，返回false}current = current->next;}return true;  // 未找到匹配項
}// 釋放哈希表內存
void free_hashmap(HashMap* map) {for (int i = 0; i < HASH_SIZE; i++) {HashNode* current = map->table[i];while (current != NULL) {HashNode* temp = current;current = current->next;free(temp->key);free(temp);}}free(map);
}
int main() {// 初始化（只需要一次）HashMap* map = init_fingerprint_map();// 快速查找并打印結果bool result1 = is_fingerprint(map, "En");printf("Test 'En': %s\n", result1 ? "true" : "false");bool result2 = is_fingerprint(map, "kn");printf("Test 'kn': %s\n", result2 ? "true" : "false");bool result3 = is_fingerprint(map, "other");printf("Test 'other': %s\n", result3 ? "true" : "false");// 清理資源free_hashmap(map);return 0;
}

Ⅲ 深度解析哈希表為啥能O(1)

1. 先了解下什么是哈希表？

想象你有一個帶編號的儲物柜（這就是哈希表）：
在這里插入圖片描述

哈希函數就像一個規則，告訴你把東西放在哪個柜子里
比如：把字符串 “hello” 放到 3 號柜子

回到一開始說的"為什么說查找是 O(1)"!
當你要找 “hello” 時：

用哈希函數算出位置：3
我們直接去 3 號柜子找
這樣子，是不是就不需要查看其他柜子了，直接O(1)，起飛蕪湖~~

2. 哈希沖突到底是什么

了解了什么是哈希表，那開始熟悉下哈希沖突
假設現在：

“hello” -> 3號柜子
“world” -> 也是3號柜子

處理沖突的方式:儲物柜用鏈子連接

好了，了解了基本邏輯，基本可以上C代碼

// 假設我們有一個小型哈希表，存儲常見編程語言
#define HASH_SIZE 8  // 8個儲物柜// 存儲數據
hash("Python") -> 3
hash("Java") -> 3    // 沖突！
hash("Go") -> 5儲物柜：
0    1    2    3          4    5     6    7
[  ] [  ] [  ] [Python]-> [Java] [Go] [  ] [  ]// 查找"Java"的過程
1. hash("Java") = 3           // 計算位置
2. 檢查3號柜子的 Python      // 不是
3. 檢查下一個 Java          // 找到了！

3.哈希表為什么快

想象一個真實的哈希表

#define HASH_SIZE 1024  // 1024個儲物柜// 如果存100個數據
// 平均每個柜子只會有 100/1024 ≈ 0.1 個數據
// 也就是說，大多數柜子是空的！

聯想實際場景：圖書館找書

不需要從第一本找到最后一本
直接根據編號去對應書架
即使這個位置有幾本書，也只需要看很少幾本

??這樣子，就是不是很清晰了，其實哈希表，就是拿著key，拿到索引，然后去對應柜子找東西
按照這個思路，來解讀下剛剛寫的哈希查找代碼

bool is_fingerprint(HashMap* map, const char* fingerprint) {// 1. 計算應該去哪個儲物柜unsigned int index = hash(fingerprint);// 2. 去到那個儲物柜HashNode* current = map->table[index];// 3. 如果這個儲物柜有多個物品，挨個檢查while (current != NULL) {// 4. 檢查是不是要找的東西if (strcmp(current->key, fingerprint) == 0) {return false;  // 找到了！}current = current->next;  // 看下一個}return true;  // 沒找到
}

值得注意的是：這里的循環是很少執行，因為柜子的東西不會太多，甚至有些規則還是空的

哈希表很大（比如1024個位置）
數據相對較少（比如100個）
哈希函數會盡量均勻分布
所以每個位置平均不到1個數據

所以雖然代碼里有 while 循環，但實際上：

直接定位到具體位置（像圖書館找書架）
即使需要循環，也只需要看很少的幾個

??所以說哈希表，這就是為什么說它是 O(1) 的原因了，如果東西太多了，柜子設置太多了，就可以要用另一種方式了，那就是字典樹
再次感謝各位大哥大姐們捧場，閱讀到此，本篇結束，如有其他疑問，評論區相見~~

本文來自互聯網用戶投稿，該文觀點僅代表作者本人，不代表本站立場。本站僅提供信息存儲空間服務，不擁有所有權，不承擔相關法律責任。
如若轉載，請注明出處：http://www.pswp.cn/news/894967.shtml
繁體地址，請注明出處：http://hk.pswp.cn/news/894967.shtml
英文地址，請注明出處：http://en.pswp.cn/news/894967.shtml

如若內容造成侵權/違法違規/事實不符，請聯系多彩編程網進行投訴反饋email:809451989@qq.com，一經查實，立即刪除！