KMP 字符串hash算法

kmp算法

最大相同真前后綴：

如?ababa的最大真前后綴為aba，而不是ababa（真前后綴與真子集類似，不可是本身，不然沒意義）

所以next[1]? = 0；//string的下標從1開始

kmp模擬

next初始化：

注意：下標從1開始

KMP用法示例code

例題1：斤斤計較的小z

link:1.斤斤計較的小Z - 藍橋云課

code

#include "bits/stdc++.h"
using namespace std;
using ll = long long;
const ll MAXN = 1e6 + 7;
char p[MAXN], s[MAXN];
ll n, m;
ll nex[MAXN];int main()
{ll ans = 0;cin>>p+1>>s+1;// 保證字符串下標從1開始m = strlen(p + 1);n = strlen(s + 1);// init nexnex[0] = nex[1] = 0;for(int i = 2, j = 0; i <= m; i++){while(j && p[i] != p[j + 1]) j = nex[j];if(p[i] == p[j + 1]) j++;nex[i] = j;}// KMPfor(int i = 1, j = 0; i <= n; i++){while(j && s[i] != p[j + 1]) j = nex[j];if(s[i] == p[j + 1]) j++;if(j == m) ans++;}cout<<ans<<endl;return 0;
}

字符串hash

字符串hash初始化

使用ull類型，不用擔心溢出（溢出時自動取模，不會影響結果）

tips:

s[i] - 'A' + 1是為了保證其不為0，如字符串為AAA時，不加1的話，其哈希值為0，與A，AA哈希值相同，這是我們不希望的
更方便地，我們直接令h[i] = h[i-1] * base + s[i]即可
注意base值要比字符值的范圍更大，以降低hash沖突的概率（一般設置為131）

獲取字串s[l] ~ s[r]的hash

例題1的字符串hash解法

#include "bits/stdc++.h"
using namespace std;
using ll = long long;
using ull = unsigned long long;
const ll MAXN = 1e6 + 7;
ull hp[MAXN], hs[MAXN], b[MAXN];// hs[i]:s[1]~s[i]對應的hash值;  b[i]:base的i次方
ull base = 131;
char p[MAXN], s[MAXN];
ll n, m;ull getHash(ull* h, ull bg, ull ed)
{return h[ed] - h[bg] * b[ed - bg];// ull溢出相當于自動取模, 所以不用擔心溢出
}int main()
{ll ans = 0;cin>>p+1>>s+1;// 保證字符串下標從1開始m = strlen(p + 1);n = strlen(s + 1);// init hp, hs, bb[0] = 1;for(int i = 1; i <= n; i++){b[i] = base * b[i-1];hp[i] = hp[i-1] * base + p[i];hs[i] = hs[i-1] * base + s[i];}// 匹配相同字符串(利用hash值)for(int i = 1; i + m - 1 <= n; i++){if(getHash(hp, 1, m) == getHash(hs, i, i + m - 1)) ans++;}cout<<ans<<endl;return 0;
}

總結

KMP與字符串hash都是用于字符串匹配的算法，字符串hash非常簡單（雖然有極小概率的hash沖突導致判斷失誤，匹配失敗），KMP算法更加嚴謹；

本文來自互聯網用戶投稿，該文觀點僅代表作者本人，不代表本站立場。本站僅提供信息存儲空間服務，不擁有所有權，不承擔相關法律責任。
如若轉載，請注明出處：http://www.pswp.cn/web/96772.shtml
繁體地址，請注明出處：http://hk.pswp.cn/web/96772.shtml
英文地址，請注明出處：http://en.pswp.cn/web/96772.shtml

如若內容造成侵權/違法違規/事實不符，請聯系多彩編程網進行投訴反饋email:809451989@qq.com，一經查實，立即刪除！