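In the previous post, we installed ES and confirmed that it runs normally.
Where to download the IK analysis (word segmentation) plugin for ES:
Download from GitHub Releases (recommended)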
👉 https://github.com/medcl/elasticsearch-analysis-ik/releases
or
https://release.infinilabs.com/analysis-ik/stable/
Installation:
Pick the package that matches your ES version, for example:
elasticsearch-analysis-ik-8.12.0.zip
Download command:
cd /tmp
wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v8.12.0/elasticsearch-analysis-ik-8.12.0.zip
Note: do not download the "Source code" archive; download the .zip file listed under Assets.
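Since the plugin version has to match the ES version, it can help to build the download URL from a single version string instead of editing it by hand in two places. The sketch below assumes that the `v<version>/elasticsearch-analysis-ik-<version>.zip` pattern from the wget command above also holds for the release you need; `ik_release_url` is a helper name made up for this post.

```shell
# Build the GitHub release URL for a given IK plugin version.
# Assumption: the URL pattern from the wget command above applies to
# the version you pass in; check the Releases page if the download fails.
ik_release_url() {
  local version="$1"
  # Reject empty input or anything containing characters other than digits and dots.
  case "$version" in
    ''|*[!0-9.]*)
      echo "invalid version: $version" >&2
      return 1
      ;;
  esac
  local base="https://github.com/medcl/elasticsearch-analysis-ik/releases/download"
  echo "${base}/v${version}/elasticsearch-analysis-ik-${version}.zip"
}
```

Usage would look like `cd /tmp && wget "$(ik_release_url 8.12.0)"`.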
3. Create the plugins directory (if it does not exist)
Elasticsearch plugins are installed by default under:
$ES_HOME/plugins/ik/
Create the directory:
mkdir -p $ES_HOME/plugins/ik
4. Unzip the plugin into the plugins directory
unzip elasticsearch-analysis-ik-8.12.0.zip -d $ES_HOME/plugins/ik/
$ES_HOME is your Elasticsearch installation directory, for example /data/isee/apps/elasticsearch-8.12.0.
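One thing to watch in the mkdir and unzip commands above: if $ES_HOME happens to be unset in the current shell, `$ES_HOME/plugins/ik` expands to `/plugins/ik` at the filesystem root. A small guard along these lines fails early instead. `ik_plugin_dir` is a hypothetical helper, not part of Elasticsearch.

```shell
# Print the IK plugin directory, failing early if ES_HOME is unset
# or does not point at an existing directory.
# The directory may be passed as an argument; otherwise $ES_HOME is used.
ik_plugin_dir() {
  local es_home="${1:-${ES_HOME:-}}"
  if [ -z "$es_home" ]; then
    echo "ES_HOME is not set" >&2
    return 1
  fi
  if [ ! -d "$es_home" ]; then
    echo "not a directory: $es_home" >&2
    return 1
  fi
  # Strip one trailing slash so the result has no double slash.
  echo "${es_home%/}/plugins/ik"
}
```

With it, the two steps could be written as `dir="$(ik_plugin_dir)" && mkdir -p "$dir" && unzip elasticsearch-analysis-ik-8.12.0.zip -d "$dir"`.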
5. Check the directory structure
After installation, the directory structure should look like this:
$ES_HOME/plugins/ik/
├── plugin-descriptor.properties
├── plugin-security.policy
├── config/
│   ├── IKAnalyzer.cfg.xml
│   ├── main.dic
│   └── stopword.dic
└── lib/
    ├── elasticsearch-analysis-ik-8.12.0.jar
    ├── commons-codec-1.9.jar
    └── ...
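A quick way to confirm the layout is to test for the files that matter most. This sketch only checks the descriptor, the security policy, and the config files named in the tree above. It does not check the jar locations, since those may differ between plugin releases. `check_ik_layout` is a made-up helper name.

```shell
# Check that an unpacked IK plugin directory contains the descriptor,
# policy, and config files listed above.
# Prints each missing path and returns non-zero if any are absent.
check_ik_layout() {
  local plugin_dir="$1" missing=0 rel
  for rel in \
    plugin-descriptor.properties \
    plugin-security.policy \
    config/IKAnalyzer.cfg.xml \
    config/main.dic \
    config/stopword.dic
  do
    if [ ! -e "$plugin_dir/$rel" ]; then
      echo "missing: $rel"
      missing=1
    fi
  done
  return "$missing"
}
```

Running `check_ik_layout "$ES_HOME/plugins/ik"` prints nothing when everything is in place. If every file is reported missing, the zip was probably unpacked into a nested subdirectory.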
6. Edit the configuration file (optional)
Configuration file path:
$ES_HOME/plugins/ik/config/IKAnalyzer.cfg.xml
You can add custom dictionaries:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
  <comment>IK Analyzer extension configuration</comment>
  <entry key="ext_dict">custom.dic</entry>
  <entry key="ext_stopwords">stopwords.dic</entry>
</properties>
Then create custom.dic in the config/ directory and add your custom terms, one per line:
人工智能
大模型
阿里云
Qwen
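A dictionary file that is named in IKAnalyzer.cfg.xml but never created is easy to overlook. The sketch below reads the `ext_dict` and `ext_stopwords` entries and reports any file that is absent. It assumes relative paths are resolved against the config/ directory and that multiple files are separated by semicolons. It is a plain-text match, not a real XML parser, and both function names are made up for this post.

```shell
# Print the value of one <entry key="..."> element from an IK config file.
# Plain-text matching: assumes each entry sits on a single line.
ik_cfg_entry() {
  local cfg="$1" key="$2"
  sed -n "s:.*<entry key=\"${key}\">\([^<]*\)</entry>.*:\1:p" "$cfg"
}

# List dictionary files named in ext_dict / ext_stopwords that do not exist.
# Assumption: paths are relative to the directory holding IKAnalyzer.cfg.xml.
missing_ik_dicts() {
  local config_dir="$1" key
  for key in ext_dict ext_stopwords; do
    ik_cfg_entry "$config_dir/IKAnalyzer.cfg.xml" "$key" |
      tr ';' '\n' |
      while IFS= read -r file; do
        # Skip empty values (an entry with no dictionary configured).
        [ -z "$file" ] && continue
        [ -e "$config_dir/$file" ] || echo "missing: $file"
      done
  done
  return 0
}
```

For example, `missing_ik_dicts "$ES_HOME/plugins/ik/config"` prints one `missing:` line per absent file and nothing when every referenced dictionary exists.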
7. Set permissions (important)
Make sure the Elasticsearch user has permission to read the plugin:
chown -R isee:isee $ES_HOME/plugins/ik
# or whichever user runs ES
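To double-check that the chown reached every file, you can list anything under the plugin directory that is still owned by someone else. This is a minimal sketch using `find`. `not_owned_by` is a hypothetical helper, and the user can be given as a name or a numeric uid.

```shell
# List paths under a directory that are NOT owned by the given user.
# Empty output means ownership is consistent across the whole tree.
not_owned_by() {
  local dir="$1" user="$2"
  find "$dir" ! -user "$user" -print
}
```

With the user from the chown command above, that would be `not_owned_by "$ES_HOME/plugins/ik" isee`.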
8. Restart Elasticsearch
# Stop it first
ps aux | grep elasticsearch
kill <pid>
# Start it
bin/elasticsearch -d
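`kill` returns immediately, while ES may take a few seconds to shut down, so starting a new instance straight away can collide with the old one. A minimal sketch of waiting for the old process to exit first is below. `wait_for_exit` is a made-up helper, and it should be run as the same user that owns the ES process, because `kill -0` also fails when you lack permission to signal the PID.

```shell
# Wait until a process has exited, polling once per second.
# Returns 0 once the PID is gone, 1 if it is still alive after the timeout.
wait_for_exit() {
  local pid="$1" timeout="${2:-30}" waited=0
  # kill -0 sends no signal; it only tests whether the PID can be signalled.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

The restart could then be chained as `kill "$pid" && wait_for_exit "$pid" 60 && bin/elasticsearch -d`, where `$pid` holds the process ID found with ps.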
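III. Verify that the plugin installed successfully
1. Check the log
Look at $ES_HOME/logs/isee_cluster.log and confirm that there are no plugin loading errors.
2. Call the analyze API to test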
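Besides looking for errors, you can look for positive confirmation that the plugin was picked up. The sketch below assumes your ES version writes a startup line of the form `loaded plugin [analysis-ik]`. If yours words it differently, adjust the search string. `log_shows_plugin` is a hypothetical helper.

```shell
# Return 0 if the log contains a "loaded plugin [<name>]" line.
# Assumption: ES logs plugin loading in this form at startup.
log_shows_plugin() {
  local log_file="$1" plugin="$2"
  # -F treats the brackets literally; -q suppresses output.
  grep -qF "loaded plugin [${plugin}]" "$log_file"
}
```

For example: `log_shows_plugin "$ES_HOME/logs/isee_cluster.log" analysis-ik && echo "IK loaded"`. Running `bin/elasticsearch-plugin list` is another way to see which plugins are installed.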
# curl -X GET -u elastic:9yZWp=3UnEVkBxYBhnlS "https://10.10.10.10:9200/_analyze" -H "Content-Type: application/json" -d'
> {
> "analyzer": "ik_smart",
> "text": "阿里巴巴推出通義千問大模型"
> }'
curl: (60) Peer's certificate issuer has been marked as not trusted by the user.
More details here: http://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.
Then I remembered: our service runs over HTTPS with a CA certificate. For now, skip certificate verification with -k:
# curl -k -X GET -u elastic:9yZWp=3UnEVkBxYBhnlS "https://10.10.10.10:9200/_analyze" -H "Content-Type: application/json" -d'
> {
> "analyzer": "ik_smart",
> "text": "阿里巴巴推出通義千問大模型"
> }'
{"tokens":[{"token":"阿里巴巴","start_offset":0,"end_offset":4,"type":"CN_WORD","position":0},{"token":"推出","start_offset":4,"end_offset":6,"type":"CN_WORD","position":1},{"token":"通義","start_offset":6,"end_offset":8,"type":"CN_WORD","position":2},{"token":"千","start_offset":8,"end_offset":9,"type":"TYPE_CNUM","position":3},{"token":"問","start_offset":9,"end_offset":10,"type":"CN_CHAR","position":4},{"token":"大模型","start_offset":10,"end_offset":13,"type":"CN_WORD","position":5}]}
[isee@host-10-15-32-71 elasticsearch-8.12.0]$
The analyzer plugin is installed successfully.
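The raw _analyze response is hard to read at a glance. To see just the tokens, one per line, a small filter like the following can be piped after curl. It is a plain-text sketch that assumes the compact single-line JSON shown above and tokens without escaped quotes. A real JSON parser such as jq is the more robust choice when available. `extract_tokens` is a made-up name.

```shell
# Pull the "token" values out of an _analyze JSON response read from stdin.
# Plain-text matching only: assumes compact JSON and no escaped quotes in tokens.
extract_tokens() {
  # Each match looks like "token":"<value>"; the value is the 4th quote-delimited field.
  grep -o '"token":"[^"]*"' | cut -d'"' -f4
}
```

Appending `| extract_tokens` to the curl command above (and adding `-s` to silence the progress meter) prints one token per line, which makes it easier to compare ik_smart output before and after adding custom dictionary terms.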