Replacing Chinese text in PHP

// Declare the response encoding
header('Content-Type: text/html; charset=utf-8');

$words=array('我','你','他');

$content="測一測我是不是違禁詞";

$banned=generateRegularExpression($words);

// Check for banned words

$res_banned=check_words($banned,$content);

write_html($content,$res_banned);

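/**
 * Build a regular expression from an array of words.
 *
 * @param array $words
 * @return string
 */
function generateRegularExpression($words)
{
    // Escape each word, including the "/" delimiter used below
    $quoted = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $words);
    $regular = implode('|', $quoted);
    // "u" treats the pattern and subject as UTF-8; "i" ignores case
    return "/$regular/iu";
}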

/**
 * Escape a single string for use inside a regular expression.
 *
 * @param string $string
 * @return string
 */
function generateRegularExpressionString($string)
{
    return preg_quote($string, '/');
}

/**
 * Check a string for banned words.
 *
 * @param string $banned regular expression built by generateRegularExpression()
 * @param string $string text to check
 * @return array distinct banned words found, empty if none
 */
function check_words($banned, $string)
{
    $match_banned = array();
    // Collect every hit in one pass instead of re-running preg_match in a loop
    if (empty($banned) || !preg_match_all($banned, $string, $matches)) {
        return $match_banned;
    }
    foreach ($matches[0] as $word) {
        // Keep each word once, comparing case-insensitively like the "i" flag
        $key = strtolower($word);
        if ($word !== '' && !isset($match_banned[$key])) {
            $match_banned[$key] = $word;
        }
    }
    return array_values($match_banned);
}

/**
 * Print the result to the page.
 *
 * @param string $content
 * @param array $res_banned
 */
function write_html($content, $res_banned)
{
    print_r($content);
    if ($res_banned) {
        print_r(" Banned words (" . count($res_banned) . "): " . implode('|', $res_banned));
    }
    echo "\n";
}
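
The script above only reports which banned words occur. If the goal is to replace them, one possible follow-up is to mask each hit. The sketch below is not part of the original script: `mask_banned_words` is a hypothetical helper, and it assumes the mbstring extension is loaded.

```php
<?php
/**
 * Hypothetical helper: replace every banned word with asterisks,
 * one asterisk per character (not per byte).
 */
function mask_banned_words(array $words, $content)
{
    if (empty($words)) {
        return $content;
    }
    // Escape each word, including the "/" delimiter used below.
    $quoted = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $words);
    // "u" treats pattern and subject as UTF-8; "i" ignores case.
    $pattern = '/' . implode('|', $quoted) . '/iu';

    return preg_replace_callback($pattern, function ($m) {
        return str_repeat('*', mb_strlen($m[0], 'UTF-8'));
    }, $content);
}

echo mask_banned_words(array('我', '你', '他'), '測一測我是不是違禁詞'), "\n";
// prints: 測一測*是不是違禁詞
```

Counting with `mb_strlen` rather than `strlen` keeps the mask the same visible length as the word it hides, since each Chinese character occupies three bytes in UTF-8.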

1. Matching Chinese characters

$str = "中文";

preg_match_all("/[\x{4e00}-\x{9fa5}]+/u",$str,$match);
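
A short sketch of what that pattern captures in a mixed string. The sample string is made up for illustration. Note that the range `\x{4e00}-\x{9fa5}` covers the commonly used CJK Unified Ideographs only, so rarer characters from the extension blocks will not match.

```php
<?php
$str = "PHP 中文 regex 測試, 123";

// The "u" modifier makes PCRE treat pattern and subject as UTF-8,
// which is required for \x{...} code points above 0xFF.
preg_match_all('/[\x{4e00}-\x{9fa5}]+/u', $str, $match);

// $match[0] holds each full run of Chinese characters: "中文" and "測試"
print_r($match[0]);
```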

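2. Replacing Chinese characters

In the PHP file that does the replacing, add:

mb_internal_encoding("UTF-8");

mb_regex_encoding("UTF-8");

These settings let the mbstring regex functions such as mb_ereg_replace match multibyte text as UTF-8. The preg_* functions do not read them and rely on the /u modifier instead. Details: http://blog.chinaunix.net/uid-20279807-id-1711213.html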
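
As a rough illustration of those two settings in use, the sketch below replaces a run of Chinese characters with `mb_ereg_replace`. The sample string and the `#` placeholder are assumptions, and the code presumes mbstring was built with its regex support enabled, which is the default.

```php
<?php
mb_internal_encoding("UTF-8");
mb_regex_encoding("UTF-8");

$str = "abc中文def";

// mb_ereg_replace takes a bare pattern with no delimiters.
// The class [一-龥] spans U+4E00 to U+9FA5, the same range used with preg_* above.
$result = mb_ereg_replace("[一-龥]+", "#", $str);

echo $result, "\n";
// prints: abc#def
```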

3. PHP has offered four replacement functions: str_replace, preg_replace, mb_ereg_replace, and ereg_replace (deprecated in PHP 5.3 and removed in PHP 7.0).

For replacing Chinese text, preg_replace turned out to be the best fit.

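str_replace does not support regular expressions, so it cannot restrict a match to a whole field and ends up replacing fragments of longer strings. For example, with $str = "模塊一模塊一斷電", the call $str = str_replace("模塊一","module1",$str); also rewrites "模塊一斷電" to "module1斷電".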
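
To make the difference concrete, here is a small sketch comparing the two functions. The comma-separated sample string and the lookaround-based pattern are assumptions chosen for illustration; the pattern treats "模塊一" as a whole field only when no other Chinese character touches it on either side.

```php
<?php
$str = "模塊一,模塊一斷電";

// str_replace matches the substring wherever it appears.
$plain = str_replace("模塊一", "module1", $str);

// preg_replace with lookarounds: skip the hit when a Chinese character
// immediately precedes or follows it.
$han = '[\x{4e00}-\x{9fa5}]';
$pattern = '/(?<!' . $han . ')模塊一(?!' . $han . ')/u';
$exact = preg_replace($pattern, "module1", $str);

echo $plain, "\n"; // prints: module1,module1斷電
echo $exact, "\n"; // prints: module1,模塊一斷電
```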

mixed preg_replace ( mixed $pattern , mixed $replacement , mixed $subject [, int $limit = -1 [, int &$count ]] ) accepts arrays for $pattern and $replacement, but with large arrays the search and matching consumes a lot of CPU.
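
One way to sidestep the per-pattern cost might be to fold all the words into a single alternation and look up each replacement in a callback. The sketch below shows both forms side by side; the `$map` contents are hypothetical, and whether the single-pass version is actually faster depends on the data, so it is worth measuring.

```php
<?php
// Hypothetical word-to-replacement table.
$map = array('模塊一' => 'module1', '模塊二' => 'module2');
$subject = '模塊一和模塊二';

// Array form: each pattern is applied in turn over the whole subject.
$patterns = array();
foreach (array_keys($map) as $word) {
    $patterns[] = '/' . preg_quote($word, '/') . '/u';
}
$viaArrays = preg_replace($patterns, array_values($map), $subject);

// Single pass: one combined alternation, replacement looked up per match.
$quoted = array_map(function ($word) {
    return preg_quote($word, '/');
}, array_keys($map));
$combined = '/' . implode('|', $quoted) . '/u';
$viaCallback = preg_replace_callback($combined, function ($m) use ($map) {
    return $map[$m[0]];
}, $subject);

echo $viaArrays, "\n";   // prints: module1和module2
echo $viaCallback, "\n"; // prints: module1和module2
```

If one key is a prefix of another, the alternation tries them in listed order, so sorting the keys longest first avoids the shorter word winning.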

mb_ereg_replace supports regular expressions and takes the pattern without the // delimiters, but in use it failed to match some Chinese text. The cause is not yet clear.
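
Since the cause is unknown, a diagnostic along the following lines might help narrow it down. `compare_replacers` is a hypothetical helper, not a library function, and it assumes the search string contains no regex metacharacters. It runs the same literal replacement through both engines and also reports whether the subject is valid UTF-8, since a mismatch between the string's real encoding and `mb_regex_encoding()` is one plausible (unconfirmed) culprit.

```php
<?php
mb_regex_encoding('UTF-8');

/**
 * Hypothetical helper: run one literal replacement through preg_replace
 * and mb_ereg_replace and report whether the results agree.
 * Assumes $search has no regex metacharacters.
 */
function compare_replacers($search, $replacement, $subject)
{
    $viaPreg = preg_replace('/' . $search . '/u', $replacement, $subject);
    // mb_ereg_replace expects a bare pattern, so no "/" delimiters here.
    $viaMb = mb_ereg_replace($search, $replacement, $subject);

    return array(
        'valid_utf8' => mb_check_encoding($subject, 'UTF-8'),
        'preg'       => $viaPreg,
        'mb_ereg'    => $viaMb,
        'agree'      => $viaPreg === $viaMb,
    );
}

$report = compare_replacers('模塊一', 'module1', '模塊一斷電');
print_r($report);
```

Feeding the strings that fail to match through a check like this shows whether the input itself is malformed or whether only one of the two engines disagrees.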
