C++ 標準庫中的哈希函數：從std::hash到自定義哈希器

C++ 標準庫中的哈希函數：從 `std::hash` 到自定義哈希器

1. 引言

在上一篇中，我們介紹了哈希表為什么能夠實現 O(1) 查找。
核心秘密在于：哈希函數。

在 C++ 標準庫中，哈希表容器（如 unordered_map、unordered_set）依賴一個統一的機制：std::hash。
那么 std::hash 是什么？它能處理哪些類型？如果我要存儲自定義類型，應該怎么寫哈希函數呢？
本文將一一揭開這些問題。

2. 什么是 `std::hash`

std::hash 是 C++ 標準庫提供的 函數對象模板。它為常見的基礎類型（如 int、double、string 等）定義了默認哈希函數。

簡單理解：

std::hash<T> 是一個類模板。
它重載了 operator()，返回一個 size_t 類型的哈希值。
默認可以直接用于 unordered_map 或 unordered_set。

示例代碼

#include <iostream>
#include <string>
#include <functional>
using namespace std;int main() {hash<int> hashInt;hash<string> hashStr;cout << "Hash(42) = " << hashInt(42) << endl;cout << "Hash(\"Alice\") = " << hashStr("Alice") << endl;return 0;
}

輸出的結果是一個整數（具體值依賴編譯器實現），這個整數就是鍵的哈希值。

3. `std::hash` 支持哪些類型

標準庫中，以下常見類型都有默認實現：

所有基本數值類型：int, long, double, char …
指針類型：T*
std::string 與 std::u16string 等字符串類

因此，對于這些類型，你可以直接使用 unordered_map：

#include <unordered_map>
#include <string>
using namespace std;int main() {unordered_map<string, int> score = {{"Alice", 90},{"Bob", 85}};return 0;
}

無需額外定義哈希函數。

4. 為什么需要自定義哈希函數

如果你要存儲 自定義類型（例如坐標點、復數、pair 等），標準庫并不知道如何為其生成哈希值。
這時就需要 自己提供哈希函數。

5. 自定義哈希函數的方法

方法一：自定義函數對象

#include <iostream>
#include <unordered_set>
using namespace std;struct Point {int x, y;bool operator==(const Point& other) const {return x == other.x && y == other.y;}
};struct PointHash {size_t operator()(const Point& p) const {return hash<int>()(p.x) ^ (hash<int>()(p.y) << 1);}
};int main() {unordered_set<Point, PointHash> points;points.insert({1, 2});points.insert({3, 4});cout << "size = " << points.size() << endl;return 0;
}

這里：

operator== 用于判斷兩個對象是否相等。
PointHash 提供哈希值。
注意用位運算（如 ^ 和 <<）來混合不同字段，減少沖突。

方法二：使用 `lambda`（C++14 起）

#include <unordered_map>
#include <string>
#include <functional>
#include <utility>
using namespace std;int main() {auto pair_hash = [](const pair<int,int>& p) {return hash<int>()(p.first) ^ (hash<int>()(p.second) << 1);};unordered_map<pair<int,int>, string, decltype(pair_hash)> mp(10, pair_hash);mp[{1,2}] = "坐標(1,2)";return 0;
}

這種方式更靈活，但語法稍微復雜，需要 decltype(pair_hash) 指定哈希類型。

方法三：偏特化 `std::hash`

C++ 允許為自定義類型對 std::hash 模板進行偏特化：

#include <unordered_set>
#include <string>
using namespace std;struct Student {string name;int id;bool operator==(const Student& other) const {return name == other.name && id == other.id;}
};namespace std {template<>struct hash<Student> {size_t operator()(const Student& s) const {return hash<string>()(s.name) ^ (hash<int>()(s.id) << 1);}};
}int main() {unordered_set<Student> students;students.insert({"Alice", 1001});students.insert({"Bob", 1002});return 0;
}

這種寫法簡潔，直接擴展 std::hash，但要小心命名空間污染。

6. 哈希函數設計小技巧

避免簡單相加：如 x + y 容易沖突。
使用移位和異或：(hash1 ^ (hash2 << 1)) 是常見組合方式。
考慮質數：在復雜結構中，可用質數混合提高分布性。
保持快速：哈希函數需要高效，否則會拖慢整體性能。

7. 小結

std::hash 是 C++ 哈希表的默認入口，支持常見基礎類型。
對于自定義類型，需要提供哈希函數，可以通過函數對象、lambda 或偏特化實現。
好的哈希函數應當 分布均勻 + 高效計算，否則會導致嚴重沖突，影響性能。

本文來自互聯網用戶投稿，該文觀點僅代表作者本人，不代表本站立場。本站僅提供信息存儲空間服務，不擁有所有權，不承擔相關法律責任。
如若轉載，請注明出處：http://www.pswp.cn/pingmian/96653.shtml
繁體地址，請注明出處：http://hk.pswp.cn/pingmian/96653.shtml
英文地址，請注明出處：http://en.pswp.cn/pingmian/96653.shtml

如若內容造成侵權/違法違規/事實不符，請聯系多彩編程網進行投訴反饋email:809451989@qq.com，一經查實，立即刪除！